INTERACTIVE SEARCH FOR IMAGE CATEGORIES BY MENTAL MATCHING

      Donald Geman
  Johns Hopkins University


   Frontiers in Computer Vision
       M.I.T., August 2011
REFERENCE




 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 31, NO. 6, JUNE 2009, P. 1087




          A Statistical Framework for
  Image Category Search from a Mental Picture
                              Marin Ferecatu and Donald Geman, Senior Member, IEEE

      Abstract—Starting from a member of an image database designated the “query image,” traditional image retrieval techniques, for
      example, search by visual similarity, allow one to locate additional instances of a target category residing in the database. However, in
      many cases, the query image or, more generally, the target category, resides only in the mind of the user as a set of subjective visual
      patterns, psychological impressions, or “mental pictures.” Consequently, since image databases available today are often unstructured
      and lack reliable semantic annotations, it is often not obvious how to initiate a search session; this is the “page zero problem.” We
      propose a new statistical framework based on relevance feedback to locate an instance of a semantic category in an unstructured
      image database with no semantic annotation. A search session is initiated from a random sample of images. At each retrieval round,
      the user is asked to select one image from among a set of displayed images—the one that is closest in his opinion to the target class.
      The matching is then “mental.” Performance is measured by the number of iterations necessary to display an image which satisfies the
      user, at which point standard techniques can be employed to display other instances. Our core contribution is a Bayesian formulation
      which scales to large databases. The two key components are a response model which accounts for the user’s subjective perception of
      similarity and a display algorithm which seeks to maximize the flow of information. Experiments with real users and two databases of
SCENARIO




OUTLINE



   Standard Image Retrieval
   Mental Matching
   Experiments
   Statistical Framework (maybe)
   Modeling Human Behavior (maybe)




CONVENTIONAL QUERY-BY-EXAMPLE (QBE)



   Start from a query image in a database. Find other images
   which are “close” or “closest”
       in overall color, texture or shape, or
       in a semantic sense, or . . .
   Matching is performed by the system.
   Good results in limited domains, e.g., comparing paintings,
   plants and landscapes.




EXAMPLE: IKONA SEARCH ENGINE (INRIA)




EXAMPLE (CONT)




“PAGE ZERO” PROBLEM




   QBE requires a starting point - a query image.
   Dilemma: Without a starting point, randomly sampling a large
   database is too slow in practice.




EXTERNAL IMAGES



   Mental Picture: The user has a picture “in mind”, e.g., a
   face or painting or house.
   Viewed Image: The user is looking at a picture, e.g., in a
   magazine or on the web.
   Physical Object: The user is holding an object.




WHO IS THAT PERSON?




MENTAL CATEGORY SEARCH


   Assume this “external query” is represented in our
   database, either by
       a version of the same image (e.g., same person), or
       variations on a theme, i.e., a category of images (e.g.,
       similar houses).
   Objective: Find an efficient way to display this version or
   representatives of this category.
   Applications: Image retrieval (“page zero”); web browsing;
   security; art management; plant science; e-commerce;
   etc.



INTERACTIVE SEARCH


   The object of the search is a class S (variations on an
   image or theme).
   Single target search is the special case |S| = 1.
   Assume the user always recognizes an instance of his
   target.
   At each iteration, some images are displayed, typically two
   to sixteen.
   The user responds by either
       signaling a target if present; or
       choosing the one deemed “closest”.



INTERACTIVE SEARCH (CONT)



   Based on this feedback, the system chooses another set of
   images to display.
   Goal: Minimize the number of iterations until an exemplar of
   the target is displayed.
   Then display other examples (“page zero”) for specialization
   and refinement.
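
The interaction protocol described above can be sketched as a simple loop. This is only an illustration under stated assumptions, not the authors' system: `choose_display` and `user_response` are hypothetical callbacks, the toy "images" are integers with distance |i - j|, and the display rule shown is the random baseline that ignores feedback.

```python
import random

def interactive_search(display_size, choose_display, user_response, max_rounds=1000):
    """Run display/response rounds until the user signals a target.

    choose_display(history, display_size) -> list of image indices (hypothetical callback)
    user_response(display) -> ("target", i) or ("closest", i)     (hypothetical callback)
    Returns the number of rounds used, or None if max_rounds is exhausted.
    """
    history = []  # list of (display, chosen index) pairs seen so far
    for round_number in range(1, max_rounds + 1):
        display = choose_display(history, display_size)
        kind, chosen = user_response(display)
        if kind == "target":
            return round_number
        history.append((display, chosen))
    return None

# Toy run: images are the integers 0..N-1 and the target class is S.
N, L = 200, 8
S = {17, 18, 19}
rng = random.Random(0)

def random_display(history, display_size):
    # Baseline only: ignores the feedback and samples L distinct images.
    return rng.sample(range(N), display_size)

def simulated_user(display):
    hits = [i for i in display if i in S]
    if hits:
        return ("target", hits[0])
    # Otherwise pick the displayed image closest to any member of S.
    return ("closest", min(display, key=lambda i: min(abs(i - s) for s in S)))

rounds = interactive_search(L, random_display, simulated_user)
```

A feedback-driven system would replace `random_display` with a rule that uses `history`; the loop itself stays the same.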




BACK TO KERMIT




COMPLICATIONS



   Mental matching involves human memory, perception and
   opinions.
   People are semantically oriented. However, images are
   indexed by low-level features (“semantic gap”).
   Interest in large databases, order 10,000 to 1,000,000.




THE USER INTERFACE




MEASURES OF PERFORMANCE

   T: number of iterations until a member of S is displayed.
   P(T < t): The distribution of T over some population
   of users.
   E(T): The mean of T over this population.
   For a random search,

   $$ E(T) \cong \frac{N}{L(|S| + 1)}, $$

   where N is the size of the database and L is the number
   of images displayed per iteration.
   Coherence: The probability that the user selects the i’th
   closest image to S.
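
As a rough check of the random-search baseline E(T) ≅ N/(L(|S| + 1)), the sketch below compares the formula with a small simulation. The function names are hypothetical; N = 60,000 matches the Corel database size quoted in this deck, while L = 8 and |S| = 100 are illustrative choices.

```python
import math
import random

def expected_random_search_iterations(n_images, display_size, target_size):
    """Approximation from the slide: E(T) ~ N / (L (|S| + 1))."""
    return n_images / (display_size * (target_size + 1))

def simulate_random_search(n_images, display_size, target_size, rng):
    """One random search without replacement; returns the iteration that first shows a target."""
    # In a uniformly random ordering, the target ranks form a uniform random subset of 1..N.
    first_target_rank = min(rng.sample(range(1, n_images + 1), target_size))
    return math.ceil(first_target_rank / display_size)

rng = random.Random(0)
N, L, S_SIZE = 60_000, 8, 100  # L and |S| are illustrative assumptions
approx = expected_random_search_iterations(N, L, S_SIZE)
trials = [simulate_random_search(N, L, S_SIZE, rng) for _ in range(2000)]
empirical = sum(trials) / len(trials)
```

With these values the formula gives roughly 74 iterations, which is the scale any feedback-driven display rule has to beat.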

EXPERIMENTAL DATABASES



    Corel: N=60,000 images
    Alinari: N=20,000 images



Ground truth: 10 semantic classes of ≈ 100 hand-chosen
images




ALINARI DATABASE




PERFORMANCE: ALINARI




             Search time distribution
CONCLUSIONS


   Rich possibilities for mathematical modeling in building
   efficient man-machine interfaces.
   Mixes geometry, probability, optimization and information
   theory.
   Solving the “vision problem” is probably not around the
   corner.
   Hence extending to databases of order 1,000,000 remains
   a challenge.




DATABASE AND IMAGE METRIC
   I . . . an image
   Ω = {1, 2, ..., N} . . . a database of images
   We do not assume Ω is “structured” (partitioned into
   categories)
   {f(I1), f(I2), . . . , f(IN)} . . . “features” in R^M.
   df : R^M × R^M → [0, 1] . . . a metric on features.
   S ⊂ Ω . . . the category (semantic class) in the mind of the
   user, a random set.
   For each k = 1, ..., N, define a binary random variable
                        Yk = 1 if k ∈ S
                        Yk = 0 if k ∉ S



DISPLAY



   D ⊂ {1, 2, . . . , N} . . . a set of L distinct images.
   Dt . . . the images displayed at time t = 1, 2, . . .
   XD . . . the response of the user to D.

        For D ∩ S = ∅, XD = i means i is “closest” to S,
                  in the opinion of the user




SEARCH HISTORY

   History (“evidence”) after t steps:

   $$ B_t = \{D_1 = d_1, X_{D_1} = i_1, \ldots, D_t = d_t, X_{D_t} = i_t\}
          = \{D_1 = d_1, X_{D_1} = i_1, X_{D_2} = i_2, \ldots, X_{D_t} = i_t\} $$


   because D1 is chosen at random and Ds+1 will depend only
   on D1 and the previous answers (actually on the posterior).
   Given S and Dt , the answer XDt is independent of the
   search history:

               P(XDt = i|S, Bt ) = P(Xd = i|S, Dt = d)


DISPLAY CRITERION




   $$ D_{t+1} = \arg\max_{D} I(X_D; S \mid B_t) $$




SEPARATE BAYESIAN SYSTEMS FOR EACH k ∈ Ω
   Prior model:
                   p0 (k) = P(Yk = 1) = P(k ∈ S)


   Answer model: For k ∉ D, i ∈ D,
                  q+ (i|k, D) = P(XD = i|Yk = 1)
                  q− (i|k, D) = P(XD = i|Yk = 0)


   Posterior distribution at step t:
                        pt (k) = P(Yk = 1|Bt )


ANSWER MODELS


   Positive Model:
   $$ P(X_D = i \mid Y_k = 1) = \frac{\phi_+(d(i, k))}{\sum_{j \in D} \phi_+(d(j, k))} $$

   Negative Model:
   $$ P(X_D = i \mid Y_k = 0) = \frac{\phi_-(d(i, k))}{\sum_{j \in D} \phi_-(d(j, k))} $$
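
Both answer models have the same shape: a function φ of the distance, normalized over the display. The sketch below illustrates only that shape. The slides do not give φ+ or φ−, so `phi_plus` (exponential decay), `phi_minus` (constant) and `toy_distance` are hypothetical stand-ins.

```python
import math

def answer_probabilities(display, k, distance, phi):
    """P(X_D = i | Y_k) for each i in the display: phi(d(i, k)) normalized over D."""
    weights = [phi(distance(i, k)) for i in display]
    total = sum(weights)
    return {i: w / total for i, w in zip(display, weights)}

# Hypothetical choices, not taken from the slides:
def phi_plus(d, scale=0.1):
    return math.exp(-d / scale)   # favors displayed images close to k

def phi_minus(d):
    return 1.0                    # no preference: uniform over the display

def toy_distance(i, j):
    return min(1.0, abs(i - j) / 100.0)  # 1-D stand-in for the feature metric, in [0, 1]

display = [5, 40, 90]
q_plus = answer_probabilities(display, k=10, distance=toy_distance, phi=phi_plus)
q_minus = answer_probabilities(display, k=10, distance=toy_distance, phi=phi_minus)
```

Under these assumed forms, `q_plus` concentrates on image 5 (the one nearest to k = 10), while `q_minus` is uniform over the three displayed images.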




PARAMETER ESTIMATION (θ1)


The positive model
θ1: “no preference” threshold

Repeat M times:
 1. Fix θ and k ∈ S.
 2. Choose two images i, j such that:
    (a) d(i, k) ≈ θ
    (b) d(j, k) is chosen uniformly in [θ, 1]
 3. Display i, j and record the user’s choice.



PARAMETER ESTIMATION (θ1)

Consider two hypotheses:
    H0: “no preference”
    H1: “preference for i (closest)”
Let N_θ be the number of times the user chooses i. Under H0,

$$ N_\theta \sim \mathrm{Bin}(M, \tfrac{1}{2}). $$

Let $p(\theta) = P(\mathrm{Bin}(M, \tfrac{1}{2}) > N_\theta)$.

Choose the largest value of θ such that H0 is rejected at
p = 0.05.
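
The test above can be sketched with an exact binomial tail. This is a minimal illustration: the function names and the counts in `counts` are made up, and only the rule "largest θ with p(θ) below 0.05" comes from the slide.

```python
from math import comb

def binomial_upper_tail(m_trials, n_observed):
    """P(Bin(M, 1/2) > n_observed), computed exactly."""
    favorable = sum(comb(m_trials, r) for r in range(n_observed + 1, m_trials + 1))
    return favorable / 2 ** m_trials

def largest_rejecting_theta(counts_by_theta, m_trials, level=0.05):
    """Largest theta whose count N_theta rejects H0 at the given level, or None."""
    rejecting = [theta for theta, n in counts_by_theta.items()
                 if binomial_upper_tail(m_trials, n) < level]
    return max(rejecting) if rejecting else None

# Made-up counts for illustration: theta -> times the user chose i out of M = 20.
counts = {0.1: 19, 0.2: 17, 0.3: 15, 0.4: 12, 0.5: 10}
theta_1 = largest_rejecting_theta(counts, m_trials=20)
```

With these invented counts, H0 is rejected for θ up to 0.3 and retained for 0.4 and 0.5, so the sketch returns 0.3.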


PARAMETER ESTIMATION (θ1)




PARAMETER ESTIMATION (θ2)


The positive model
θ2: degree of coherence with system metric

Repeat M times:
 1. Fix θ and k ∈ S.
 2. Choose a display D such that:
    (a) One image i in D is very close to some k ∈ S;
    (b) All the other images in D are more than θ1 units away from k.
 3. Display D and record the user’s choice.




PARAMETER ESTIMATION (θ2)


   $$ P(X_D = x_i \mid Y_k = 1) \cong \frac{1}{1 + (n - 1)\theta_2} $$

   $$ \theta_2^{+} \cong \frac{1}{n - 1} \cdot \frac{P(X_D \neq x_i \mid Y_k = 1)}{P(X_D = x_i \mid Y_k = 1)} $$



              Corel database (M=600):
                     θ2 = 0.065
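
The two relations on this slide are inverses of each other, which a short plug-in estimate makes concrete. In this sketch n is assumed to be the number of images in the display, and the counts are hypothetical (they are not the Corel data).

```python
def estimate_theta2(n_displayed, times_close_image_chosen, n_trials):
    """Plug-in estimate: theta_2 ~ (1 / (n - 1)) * P(X_D != x_i) / P(X_D = x_i)."""
    p_hat = times_close_image_chosen / n_trials
    if not 0.0 < p_hat <= 1.0:
        raise ValueError("the close image must be chosen at least once")
    return (1.0 - p_hat) / ((n_displayed - 1) * p_hat)

def coherent_choice_probability(n_displayed, theta2):
    """Inverse relation: P(X_D = x_i | Y_k = 1) ~ 1 / (1 + (n - 1) theta_2)."""
    return 1.0 / (1.0 + (n_displayed - 1) * theta2)

# Hypothetical counts: the close image was chosen 400 times in 600 trials with n = 8.
theta2_hat = estimate_theta2(n_displayed=8, times_close_image_chosen=400, n_trials=600)
p_back = coherent_choice_probability(8, theta2_hat)
```

Here `theta2_hat` is 1/14 and `p_back` recovers the empirical frequency 2/3, confirming that the estimate simply inverts the first relation.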



UPDATE MODEL

The new posterior distribution is

$$ p_{t+1}(k) = P(Y_k = 1 \mid B_{t+1}) $$

which reduces to

$$ \frac{P(X_{D_{t+1}} = i \mid Y_k = 1, D_{t+1})\, p_t(k)}{P(X_{D_{t+1}} = i \mid Y_k = 1, D_{t+1})\, p_t(k) + P(X_{D_{t+1}} = i \mid Y_k = 0, D_{t+1})\, (1 - p_t(k))} $$

which is finally

$$ \frac{q_+(i \mid k, D_{t+1})\, p_t(k)}{q_+(i \mid k, D_{t+1})\, p_t(k) + q_-(i \mid k, D_{t+1})\, (1 - p_t(k))}. $$
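
Because each image k has its own two-hypothesis system, the update is one line of Bayes' rule per image. The sketch below assumes the likelihoods q+ and q− of the observed answer have already been evaluated for every k; the dictionaries and numbers are illustrative only.

```python
def update_posteriors(posteriors, q_plus, q_minus):
    """One Bayes step per image k: p_{t+1}(k) from p_t(k) and the two answer likelihoods.

    posteriors: {k: p_t(k)}
    q_plus, q_minus: {k: likelihood of the observed answer i given Y_k = 1 / Y_k = 0}
    """
    updated = {}
    for k, p in posteriors.items():
        numerator = q_plus[k] * p
        denominator = numerator + q_minus[k] * (1.0 - p)
        # Keep the old value if the observed answer has zero likelihood under both models.
        updated[k] = numerator / denominator if denominator > 0 else p
    return updated

# Illustrative numbers only: three images, and a uniform negative model over a
# display of size 4, so q_minus = 1/4 for every k.
p_t = {0: 0.01, 1: 0.01, 2: 0.01}
q_plus = {0: 0.70, 1: 0.25, 2: 0.05}
q_minus = {0: 0.25, 1: 0.25, 2: 0.25}
p_next = update_posteriors(p_t, q_plus, q_minus)
```

The posterior rises where the answer is more likely under Y_k = 1 than under Y_k = 0, is unchanged where the two likelihoods agree, and falls otherwise. Since the systems are separate, the values p_t(k) need not sum to one over k.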


TAKING STOCK




   So mental category search reduces to two difficult tasks:
       An optimization problem: Discover approximations to the
       optimal display.
       A modeling problem: Discover answer models which match
       human behavior.




IDEAL USER

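 Suppose i ∈ D and d(i, S) < d(j, S) for each j ∈ D, j ≠ i. Ideal user:

                           P(XD = i|S) = 1

 Since S determines XD, the conditional entropy H(XD |S, Bt ) is zero:

 $$ \begin{aligned}
 D_{t+1} &= \arg\max_{D} I(X_D; S \mid B_t) \\
         &= \arg\max_{D} \big( H(X_D \mid B_t) - H(X_D \mid S, B_t) \big) \\
         &= \arg\max_{D} H(X_D \mid B_t),
 \end{aligned} $$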

 which motivates the following choice of display:


OPTIMAL DISPLAY: THE VORONOI CELLS HAVE EQUAL MASS
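
For an ideal user, the answer is the displayed image whose Voronoi cell contains the target, so H(X_D | B_t) is the entropy of the cell masses and is largest when the masses are equal. The sketch below illustrates this under simplifying assumptions that are not in the slides: a single target (|S| = 1), images placed on a line with |i - k| as the metric, and a crude best-of-random-candidates search in place of a real optimizer.

```python
import math
import random

def voronoi_masses(display, weights, distance):
    """Posterior mass of each displayed image's Voronoi cell (nearest displayed image wins)."""
    masses = {i: 0.0 for i in display}
    for k, w in enumerate(weights):
        nearest = min(display, key=lambda i: distance(i, k))
        masses[nearest] += w
    return masses

def entropy(masses):
    """Entropy in bits of the answer distribution induced by the cell masses."""
    total = sum(masses.values())
    return -sum((m / total) * math.log2(m / total) for m in masses.values() if m > 0)

def best_of_random_displays(weights, display_size, distance, n_candidates, rng):
    """Crude search: keep the random candidate display with the highest answer entropy."""
    candidates = [rng.sample(range(len(weights)), display_size) for _ in range(n_candidates)]
    return max(candidates, key=lambda d: entropy(voronoi_masses(d, weights, distance)))

def line_distance(i, k):
    return abs(i - k)

# Toy setting: 100 images on a line with a uniform posterior over the target.
weights = [1.0 / 100] * 100
even = [12, 37, 62, 87]      # four cells of 25 images each
clumped = [0, 1, 2, 3]       # one cell swallows almost everything
h_even = entropy(voronoi_masses(even, weights, line_distance))
h_clumped = entropy(voronoi_masses(clumped, weights, line_distance))
chosen = best_of_random_displays(weights, 4, line_distance, 200, random.Random(0))
```

The evenly spread display reaches the maximum of log2(4) = 2 bits, while the clumped display yields far less, which is the sense in which equal-mass cells are optimal for an ideal user.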




Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Recently uploaded

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 

Recently uploaded (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

Fcv hum mach_geman

  • 1. INTERACTIVE SEARCH FOR IMAGE CATEGORIES BY MENTAL MATCHING. Donald Geman, Johns Hopkins University. Frontiers in Computer Vision, M.I.T., August 2011.
  • 2. REFERENCE. Marin Ferecatu and Donald Geman, Senior Member, IEEE, “A Statistical Framework for Image Category Search from a Mental Picture,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 6, June 2009, p. 1087. Abstract: Starting from a member of an image database designated the “query image,” traditional image retrieval techniques, for example, search by visual similarity, allow one to locate additional instances of a target category residing in the database. However, in many cases, the query image or, more generally, the target category, resides only in the mind of the user as a set of subjective visual patterns, psychological impressions, or “mental pictures.” Consequently, since image databases available today are often unstructured and lack reliable semantic annotations, it is often not obvious how to initiate a search session; this is the “page zero problem.” We propose a new statistical framework based on relevance feedback to locate an instance of a semantic category in an unstructured image database with no semantic annotation. A search session is initiated from a random sample of images. At each retrieval round, the user is asked to select one image from among a set of displayed images: the one that is closest in his opinion to the target class. The matching is then “mental.” Performance is measured by the number of iterations necessary to display an image which satisfies the user, at which point standard techniques can be employed to display other instances. Our core contribution is a Bayesian formulation which scales to large databases. The two key components are a response model which accounts for the user’s subjective perception of similarity and a display algorithm which seeks to maximize the flow of information. Experiments with real users and two databases of ...
  • 3. SCENARIO
  • 4. OUTLINE. Standard Image Retrieval; Mental Matching; Experiments; Statistical Framework (maybe); Modeling Human Behavior (maybe).
  • 5. CONVENTIONAL QUERY-BY-EXAMPLE (QBE). Start from a query image in a database. Find other images which are “close” or “closest” in overall color, texture or shape, or in a semantic sense, or . . . Matching is performed by the system. Good results in limited domains, e.g., comparing paintings, plants and landscapes.
  • 6. EXAMPLE: IKONA SEARCH ENGINE (INRIA)
  • 7. EXAMPLE (CONT)
  • 8. “PAGE ZERO” PROBLEM. QBE requires a starting point: a query image. Dilemma: Without a starting point, randomly sampling a large database is too slow in practice.
  • 9. EXTERNAL IMAGES. Mental Picture: The user has a picture “in mind”, e.g., a face or painting or house. Viewed Image: The user is looking at a picture, e.g., in a magazine or on the web. Physical Object: The user is holding an object.
  • 10. WHO IS THAT PERSON?
  • 11. MENTAL CATEGORY SEARCH. Assume this “external query” is represented in our database, either by a version of the same image (e.g., same person), or variations on a theme, i.e., a category of images (e.g., similar houses). Objective: Find an efficient way to display this version or representatives of this category. Applications: Image retrieval (“page zero”); web browsing; security; art management; plant science; e-commerce; etc.
  • 12. INTERACTIVE SEARCH. The object of the search is a class S (variations on an image or theme). Single-target search is the special case |S| = 1. Assume the user always recognizes an instance of his target. At each iteration, some images are displayed, typically two to sixteen. The user responds by either signaling a target, if one is present, or choosing the image deemed “closest”.
  • 13. INTERACTIVE SEARCH (CONT). Based on this feedback, the system chooses another set of images to display. Goal: Minimize the number of iterations until an exemplar of the target is displayed. Then display other examples (“page zero”) for specialization and refinement.
  • 14. BACK TO KERMIT
  • 15. COMPLICATIONS. Mental matching involves human memory, perception and opinions. People are semantically oriented. However, images are indexed by low-level features (“semantic gap”). Interest in large databases, order 10,000 to 1,000,000.
  • 16. THE USER INTERFACE
  • 17. MEASURES OF PERFORMANCE. $T$: number of iterations until a member of $S$ is displayed. $P(T < t)$: The probability distribution over some population of users. $E(T)$: The mean over this population. For a random search, $E(T) \cong N / (L(|S| + 1))$, where $N$ is the size of the database and $L$ is the number displayed per iteration. Coherence: The probability that the user selects the $i$'th closest image to $S$.
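The random-search baseline on slide 17 can be checked numerically. The sketch below is not from the slides: it assumes that each round shows `per_round` previously unseen images drawn uniformly without replacement, and the function names are illustrative. Under that assumption it computes $E(T)$ exactly as $\sum_{t \ge 0} P(T > t)$ and compares it with the approximation $N / (L(|S| + 1))$.

```python
def expected_rounds_random_search(n_images, n_targets, per_round):
    """Exact E[T] for random search without replacement (an assumed model).

    T is the first round whose display contains at least one target.
    Uses E[T] = sum over t >= 0 of P(T > t), where P(T > t) is the
    probability that the first t * per_round images contain no target.
    """
    expected = 0.0
    miss = 1.0   # probability that none of the images shown so far is a target
    shown = 0    # number of images shown so far
    while miss > 0.0 and shown < n_images:
        expected += miss  # adds P(T > t) for the current round index t
        for _ in range(per_round):
            if shown >= n_images:
                break
            non_targets_left = max(n_images - n_targets - shown, 0)
            miss *= non_targets_left / (n_images - shown)
            shown += 1
    return expected


def approximate_rounds(n_images, n_targets, per_round):
    """The slide's approximation N / (L (|S| + 1))."""
    return n_images / (per_round * (n_targets + 1))
```

With one image per round and a single target, the exact value reduces to $(N + 1)/2$. For larger displays the exact value and the approximation differ by less than one round in the cases tried here, which is the rounding cost of grouping images into displays.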
  • 18. EXPERIMENTAL DATABASES. Corel: N = 60,000 images. Alinari: N = 20,000 images. Ground truth: 10 semantic classes of ≈ 100 hand-chosen images.
  • 19. ALINARI DATABASE
  • 20. PERFORMANCE: ALINARI. Search time distribution.
  • 21. CONCLUSIONS. Rich possibilities for mathematical modeling in building efficient man-machine interfaces. Mixes geometry, probability, optimization and information theory. Solving the “vision problem” is probably not around the corner. Hence extending to databases of order 1,000,000 remains a challenge.
  • 22. DATABASE AND IMAGE METRIC. $I$: an image. $\Omega = \{1, 2, \dots, N\}$: a database of images. We do not assume $\Omega$ is “structured” (partitioned into categories). $\{f(I_1), f(I_2), \dots, f(I_N)\}$: “features” in $\mathbb{R}^M$. $d_f : \mathbb{R}^M \times \mathbb{R}^M \to [0, 1]$: a metric on features. $S \subset \Omega$: the category (semantic class) in the mind of the user, a random set. For each $k = 1, \dots, N$, define a binary random variable $Y_k = 1$ if $k \in S$ and $Y_k = 0$ if $k \notin S$.
  • 23. DISPLAY. $D \subset \{1, 2, \dots, N\}$: a set of $L$ distinct images. $D_t$: the images displayed at time $t = 1, 2, \dots$ $X_D$: the response of the user to $D$. For $D \cap S = \emptyset$, $X_D = i$ means $i$ is “closest” to $S$, in the opinion of the user.
  • 24. SEARCH HISTORY. History (“evidence”) after $t$ steps: $B_t = \{D_1 = d_1, X_{D_1} = i_1, \dots, D_t = d_t, X_{D_t} = i_t\} = \{D_1 = d_1, X_{D_1} = i_1, X_{D_2} = i_2, \dots, D_t = d_t, X_{D_t} = i_t\}$, because $D_1$ is chosen at random and $D_{s+1}$ will depend only on $D_1$ and the previous answers (actually on the posterior). Given $S$ and $D_t$, the answer $X_{D_t}$ is independent of the search history: $P(X_{D_t} = i \mid S, B_t) = P(X_d = i \mid S, D_t = d)$.
  • 25. DISPLAY CRITERION. $D_{t+1} = \arg\max_D I(X_D; S \mid B_t)$.
  • 26. SEPARATE BAYESIAN SYSTEMS FOR EACH $k \in \Omega$. Prior model: $p_0(k) = P(Y_k = 1) = P(k \in S)$. Answer model: For $k \notin D$, $i \in D$, $q_+(i \mid k, D) = P(X_D = i \mid Y_k = 1)$ and $q_-(i \mid k, D) = P(X_D = i \mid Y_k = 0)$. Posterior distribution at step $t$: $p_t(k) = P(Y_k = 1 \mid B_t)$.
  • 27. ANSWER MODELS. Positive model: $P(X_d = i \mid Y_k = 1) = \dfrac{\phi_+(d(i, k))}{\sum_{j \in D} \phi_+(d(j, k))}$. Negative model: $P(X_d = i \mid Y_k = 0) = \dfrac{\phi_-(d(i, k))}{\sum_{j \in D} \phi_-(d(j, k))}$.
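Both answer models on slide 27 have the same shape: a weight $\phi(d(j, k))$ for each displayed image, normalized over the display. A minimal sketch of that normalization follows. The slides do not give the form of $\phi_+$ or $\phi_-$, so the two functions below (an exponential decay and a constant) are purely hypothetical stand-ins chosen for illustration.

```python
import math


def answer_probabilities(distances, phi):
    """Return P(X_D = j | ...) for every displayed image j.

    distances[j] is d(j, k), the distance from displayed image j to the
    database image k; phi maps a distance to a non-negative weight.
    """
    weights = [phi(d) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def phi_plus_example(distance, scale=0.1):
    """Hypothetical positive weight: closer images are chosen more often."""
    return math.exp(-distance / scale)


def phi_minus_example(distance):
    """Hypothetical negative weight: no preference, so choices are uniform."""
    return 1.0
```

With any decreasing $\phi_+$, the displayed image nearest to $k$ gets the largest probability when $k$ belongs to the target class, while a constant $\phi_-$ makes the answer uninformative about $k$.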
  • 28. PARAMETER ESTIMATION ($\theta_1$). The positive model. $\theta_1$: “no preference” threshold. Repeat $M$ times: 1. Fix $\theta$ and $k \in S$. 2. Choose two images $i, j$ such that: (a) $d(i, k) \approx \theta$; (b) $d(j, k)$ is chosen uniformly in $[\theta, 1]$. 3. Display $i, j$ and record the user’s choice.
  • 29. PARAMETER ESTIMATION ($\theta_1$). Consider two hypotheses: H0: “no preference”; H1: “preference for $i$ (closest)”. Let $N_\theta$ be the number of times the user chooses $i$. Under H0, $N_\theta \sim \mathrm{Bin}(M, \tfrac{1}{2})$. Let $p(\theta) = P(\mathrm{Bin}(M, \tfrac{1}{2}) > N_\theta)$. Choose the largest value of $\theta$ such that H0 is rejected at $p = 0.05$.
  • 30. PARAMETER ESTIMATION ($\theta_1$)
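The test on slide 29 is a one-sided binomial test, and $p(\theta)$ can be computed exactly with integer arithmetic. The sketch below assumes the experiment has already produced, for each tested $\theta$, the count $N_\theta$ of trials in which the user chose $i$; the function names and the dictionary layout are illustrative, not taken from the slides.

```python
from math import comb


def binomial_upper_tail(m, n_chosen):
    """p(theta) = P(Bin(m, 1/2) > n_chosen), computed exactly."""
    favourable = sum(comb(m, j) for j in range(n_chosen + 1, m + 1))
    return favourable / 2 ** m


def largest_theta_rejecting_h0(counts, m, alpha=0.05):
    """Largest theta whose count rejects "no preference" at level alpha.

    counts maps each tested theta to N_theta, the number of times out of
    m trials that the user chose image i. Returns None if H0 is never
    rejected.
    """
    rejected = [theta for theta, n in counts.items()
                if binomial_upper_tail(m, n) < alpha]
    return max(rejected) if rejected else None
```

For example, with $M = 20$ trials a count of 15 gives $p(\theta) = 6196 / 2^{20} \approx 0.006$, so H0 is rejected, whereas a count of 11 gives $p(\theta) \approx 0.25$ and H0 is retained.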
  • 31. PARAMETER ESTIMATION ($\theta_2$). The positive model. $\theta_2$: degree of coherence with system metric. Repeat $M$ times: 1. Fix $\theta$ and $k \in S$. 2. Choose a display $D$ such that: (a) One image $i$ in $D$ is very close to some $k \in S$; (b) All the other images in $D$ are more than $\theta_1$ units away from $k$. 3. Display $D$ and record the user’s choice.
  • 32. PARAMETER ESTIMATION ($\theta_2$). $P(X_D = x_i \mid Y_k = 1) \cong \dfrac{1}{1 + (n - 1)\theta_2}$, hence $\theta_2 \cong \dfrac{1}{n - 1} \cdot \dfrac{P(X_D \neq x_i \mid Y_k = 1)}{P(X_D = x_i \mid Y_k = 1)}$. Corel database ($M = 600$): $\theta_2 = 0.065$.
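The two relations on slide 32 are inverses of each other, which a short sketch makes explicit. It assumes that $n$ is the number of images in the display and that the probability of choosing the close image $x_i$ is estimated by its observed frequency over the $M$ trials; the function names are illustrative.

```python
def frequency_from_theta2(theta2, n):
    """P(X_D = x_i | Y_k = 1) = 1 / (1 + (n - 1) * theta2)."""
    return 1.0 / (1.0 + (n - 1) * theta2)


def theta2_from_frequency(p_close, n):
    """Invert the relation above: theta2 = (1 - p) / ((n - 1) * p).

    p_close is the observed fraction of trials in which the user chose
    the image that is very close to the target class.
    """
    return (1.0 - p_close) / ((n - 1) * p_close)
```

Read this way, $\theta_2$ is the weight of each far image relative to the close one: $\theta_2 = 0$ describes a user who always agrees with the system metric, and $\theta_2 = 1$ a user who chooses uniformly among the $n$ displayed images.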
  • 33. UPDATE MODEL. The new posterior distribution is $p_{t+1}(k) = P(Y_k = 1 \mid B_{t+1})$, which reduces to $\dfrac{P(X_{D_{t+1}} = i \mid Y_k = 1, D_{t+1})\, p_t(k)}{P(X_{D_{t+1}} = i \mid Y_k = 1, D_{t+1})\, p_t(k) + P(X_{D_{t+1}} = i \mid Y_k = 0, D_{t+1})\, (1 - p_t(k))}$, which is finally $\dfrac{q_+(i \mid k, D_{t+1})\, p_t(k)}{q_+(i \mid k, D_{t+1})\, p_t(k) + q_-(i \mid k, D_{t+1})\, (1 - p_t(k))}$.
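The update on slide 33 is Bayes' rule applied separately to each image $k$. A minimal sketch follows; the function and argument names are illustrative, and the fallback for a zero denominator is an added assumption rather than something the slides specify.

```python
def update_posterior(prior, q_plus, q_minus):
    """One Bayes step for a single image k.

    prior   : p_t(k) = P(Y_k = 1 | B_t)
    q_plus  : q_+(i | k, D_{t+1}), likelihood of the answer if k is in S
    q_minus : q_-(i | k, D_{t+1}), likelihood of the answer if k is not in S
    Returns p_{t+1}(k).
    """
    numerator = q_plus * prior
    denominator = numerator + q_minus * (1.0 - prior)
    if denominator == 0.0:
        # Assumed fallback: an answer that is impossible under both models
        # carries no usable information, so the prior is kept.
        return prior
    return numerator / denominator


def update_all(priors, q_plus_values, q_minus_values):
    """Apply the same step independently to every image in the database."""
    return [update_posterior(p, qp, qm)
            for p, qp, qm in zip(priors, q_plus_values, q_minus_values)]
```

When $q_+ = q_-$ the answer says nothing about image $k$ and its posterior is unchanged; when $q_+ > q_-$ the posterior rises, which is how images near the user's choices accumulate probability over rounds.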
  • 34. TAKING STOCK. So mental category search reduces to two difficult tasks. An optimization problem: Discover approximations to the optimal display. A modeling problem: Discover answer models which match human behavior.
  • 35. IDEAL USER. Suppose $d(i, S) < d(j, S)$ for some $i \in D$ and each other $j \in D$. Ideal user: $P(X_D = i \mid S) = 1$. Since $S$ determines $X_D$, the term $H(X_D \mid S, B_t)$ vanishes: $D_{t+1} = \arg\max_D I(X_D; S \mid B_t) = \arg\max_D \left( H(X_D \mid B_t) - H(X_D \mid S, B_t) \right) = \arg\max_D H(X_D \mid B_t)$, which motivates the following choice of display:
  • 36. OPTIMAL DISPLAY: THE VORONOI CELLS HAVE EQUAL MASS
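For an ideal user, maximizing $H(X_D \mid B_t)$ means choosing a display whose Voronoi cells carry equal posterior mass. The sketch below illustrates this under simplifying assumptions that are not in the slides: a single target ($|S| = 1$), posterior weights that are normalized before use, a caller-supplied distance `dist(i, k)` between image indices, and a search restricted to a given list of candidate displays rather than a true optimization.

```python
import math


def cell_masses(display, weights, dist):
    """Posterior mass of each displayed image's Voronoi cell.

    Every database image k is assigned to its nearest displayed image
    (ties go to the image listed first in the display).
    """
    masses = {i: 0.0 for i in display}
    for k, w in enumerate(weights):
        nearest = min(display, key=lambda i: dist(i, k))
        masses[nearest] += w
    total = sum(masses.values())
    return [m / total for m in masses.values()]


def entropy(probs):
    """Shannon entropy in bits; zero-probability cells contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def best_display(candidates, weights, dist):
    """Candidate display whose answer has maximal entropy for an ideal user."""
    return max(candidates,
               key=lambda d: entropy(cell_masses(d, weights, dist)))
```

As an illustration, take eight images on a line with uniform weights and distance $|i - k|$. The display (1, 5) splits the database into two cells of mass 1/2 each, giving one full bit per answer, while the display (0, 1) gives cells of mass 1/8 and 7/8 and about 0.54 bits.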