Color for Object Recognition

Bag-of-words image representations have proven very successful for object recognition. Initially, these representations were based solely on shape, but over the last several years they have been extended with color information. Much research has been dedicated to optimal color feature design, often from a photometric invariance point of view. However, relatively little research has addressed the question of where color should be introduced in the bag-of-words pipeline. In this talk I will focus on this aspect: I will indicate several places where color can be introduced, analyze the theoretical consequences, and compare experimental results.

The two main strategies for combining multiple cues, known as early and late fusion, both suffer from significant drawbacks. Based on an analysis of these drawbacks, I will propose two novel methods for combining shape and color cues. First, I will discuss a method motivated by human color vision, called Color Attention. Here color is used to construct a top-down, category-specific attention map. The color attention map is then deployed to modulate the shape features by taking more features from regions within an image that are likely to contain an object instance. Second, I will outline an information-theoretic approach to the problem of color and shape combination. This leads to a novel approach for the construction of visual word dictionaries, which we have coined Portmanteau vocabularies. Evaluation of both approaches on several benchmark data sets shows that the proposed methods outperform both early and late fusion.
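
To make the contrast between the fusion strategies concrete, the following is a minimal sketch and not the implementation used in the talk. It assumes that every local feature has already been quantized into a shape word and a color word. Early fusion is approximated here by a product vocabulary over those two indices, which is a simplification. The function names and the matrix `p_class_given_color` (an assumed estimate of the class probability given a color word) are hypothetical and chosen for illustration only.

```python
import numpy as np

def bow_histogram(word_ids, vocab_size, weights=None):
    """L1-normalized bag-of-words histogram, optionally weighted per feature."""
    hist = np.bincount(word_ids, weights=weights, minlength=vocab_size).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def late_fusion(shape_ids, color_ids, n_shape, n_color):
    """Separate shape and color histograms, concatenated.
    The link between a feature's shape and its color is lost."""
    return np.concatenate([bow_histogram(shape_ids, n_shape),
                           bow_histogram(color_ids, n_color)])

def early_fusion(shape_ids, color_ids, n_shape, n_color):
    """Assumption: approximate early fusion with a product vocabulary,
    where each (shape word, color word) pair is one joint word."""
    joint_ids = shape_ids * n_color + color_ids
    return bow_histogram(joint_ids, n_shape * n_color)

def color_attention_histograms(shape_ids, color_ids, n_shape, p_class_given_color):
    """One shape histogram per class. Each feature is weighted by how strongly
    its color word suggests that class (hypothetical attention weights)."""
    per_class = []
    for class_index in range(p_class_given_color.shape[0]):
        attention = p_class_given_color[class_index, color_ids]
        per_class.append(bow_histogram(shape_ids, n_shape, weights=attention))
    return np.concatenate(per_class)

# Toy image with four local features, 3 shape words and 2 color words.
shape_ids = np.array([0, 1, 1, 2])
color_ids = np.array([0, 0, 1, 1])
# Hypothetical attention table: rows are classes, columns are color words.
p_class_given_color = np.array([[0.9, 0.1],
                                [0.1, 0.9]])

late = late_fusion(shape_ids, color_ids, 3, 2)
early = early_fusion(shape_ids, color_ids, 3, 2)
attended = color_attention_histograms(shape_ids, color_ids, 3, p_class_given_color)
```

In this sketch the late fusion vector has 3 + 2 entries and the product vocabulary has 3 × 2, while the attention-weighted representation keeps a shape histogram for each class and lets color decide how much each feature contributes to it.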