A typical image contains 7 different object categories
PASCAL 07 SUN 09
Tree-structured Context Model
Context Model
Prior Model: Co-occurrences Prior, Spatial Prior
Measurement Model: Global Image Features, Local Detector Outputs
Prior Model
Co-occurrences Prior: Encodes object co-occurrence statistics using a tree-structured model over binary presence variables.
Spatial Prior: Captures the typical relative positions at which objects appear with respect to one another.
The locations L_i are modeled as jointly Gaussian. When an image contains multiple instances of the same category, L_i represents the median location of all instances.
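As a minimal illustration of the median convention, the sketch below collapses several instances of one category into a single location. It assumes each instance is summarized by the (x, y) centre of its bounding box and that the median is taken coordinate-wise; the function name `median_location` and the 2D representation are illustrative assumptions, not part of the model description.

```python
from statistics import median


def median_location(centers):
    """Coordinate-wise median of a list of (x, y) instance centres (assumed form)."""
    xs, ys = zip(*centers)
    return (median(xs), median(ys))


# Three instances of the same category in one image (hypothetical values)
cars = [(10, 20), (30, 40), (50, 0)]
print(median_location(cars))  # (30, 20)
```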
The joint distribution of all binary and Gaussian variables is finally represented as:
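A factorization consistent with the description above can be sketched as follows. The notation is an assumption introduced for illustration: b_i is the binary presence variable of category i, L_i is its location, pa(i) is the parent of node i in the tree, and the products run over all non-root nodes.

```latex
p(b, L) = p(b)\, p(L \mid b)
        = p(b_{\mathrm{root}}) \prod_{i} p\bigl(b_i \mid b_{\mathrm{pa}(i)}\bigr)
          \; p(L_{\mathrm{root}} \mid b_{\mathrm{root}}) \prod_{i} p\bigl(L_i \mid L_{\mathrm{pa}(i)}, b_i, b_{\mathrm{pa}(i)}\bigr)
```

Under this reading, the binary variables and the Gaussian location variables share the same tree, so each factor involves only a node and its parent.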
Measurement Model
Incorporating Global Image Features: Uses the gist descriptor to estimate whether an object is present in an image (scene).
Integrating Local Detector Outputs: Takes candidate windows from a baseline object detector and learns from the training set the likelihood that each detection is correct, from which the expected location of an object is obtained.
Alternating Inference
Given the gist g, the candidate window locations W, and their scores s, the algorithm infers the presence of objects b, the correctness of detections c, and the expected object locations L by solving the optimization problem:
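Read as a maximum a posteriori problem, which is an assumption here since the objective is not reproduced in the text, the optimization can be written with the symbols defined above as:

```latex
\hat{b}, \hat{c}, \hat{L} = \arg\max_{b,\, c,\, L} \; p(b, c, L \mid g, W, s)
```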
Learning the Dependency Structure
The dependency structure among objects is learned from a set of fully labeled images using the Chow-Liu algorithm.
It computes the empirical mutual information of all pairs of variables, using the sample values in the set of labeled images.
It then finds the maximum weight spanning tree, with edge weights equal to the mutual information.
Once the tree structure is learned, a root node is selected arbitrarily.
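The three steps above can be sketched in plain Python. This is a minimal illustration rather than the implementation used for the reported results: it assumes each labeled image is reduced to a tuple of 0/1 presence indicators, builds the spanning tree with Kruskal's algorithm, and uses hypothetical names (`mutual_information`, `chow_liu_tree`) and toy data.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), n_ab in pxy.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), written in terms of counts
        mi += (n_ab / n) * math.log(n_ab * n / (px[a] * py[b]))
    return mi


def chow_liu_tree(samples, root=0):
    """Return {node: parent} for the Chow-Liu tree; the root maps to None.

    samples: list of equal-length tuples, one tuple of 0/1 indicators per image.
    """
    num_vars = len(samples[0])
    columns = list(zip(*samples))

    # Step 1: empirical mutual information for every pair of variables
    edges = [(mutual_information(columns[i], columns[j]), i, j)
             for i, j in combinations(range(num_vars), 2)]

    # Step 2: maximum weight spanning tree (Kruskal with union-find)
    edges.sort(reverse=True)
    component = list(range(num_vars))

    def find(u):
        while component[u] != u:
            component[u] = component[component[u]]  # path compression
            u = component[u]
        return u

    adjacency = {i: [] for i in range(num_vars)}
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it creates no cycle
            component[ri] = rj
            adjacency[i].append(j)
            adjacency[j].append(i)

    # Step 3: orient the undirected tree away from the chosen root
    parents = {root: None}
    stack = [root]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in parents:
                parents[v] = u
                stack.append(v)
    return parents


# Toy data: variable 1 tracks variable 0, and variable 2 tracks variable 1
samples = [(0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 1),
           (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 0, 0)]
print(chow_liu_tree(samples))  # {0: None, 1: 0, 2: 1}
```

Because the root is arbitrary, changing `root` only re-orients the edges; the undirected tree stays the same.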
Results: Performance on PASCAL 07 (Object Recognition Performance)
Results: Performance on SUN 09 (Image Annotation Performance)