Politecnico di Torino
Dipartimento di Automatica e Informatica
                                                      http://elite.polito.it




                 FaSet: A Set Theory Model for
                               Faceted Search
                Dario Bonino, Fulvio Corno, Laura Farinetti
Outline
       Faceted Search
       Goal
       The FaSet Set-Theoretical Model
       FaSet Relational Implementation




    2                    WI/IAT 2009, Milano, Italy   FaSet
Faceted Classification
       Originated in Library Science
           Ranganathan, 1962
       Content-based classification scheme
       Multi-dimensional
           Facet = classification dimension
       Multi-valued
           Focus = allowed value in one of the facets




Example

Color     Shape       Taste       (facets)
Yellow    Cube        Sweet       (allowed foci for each facet)
Red       Sphere      Bitter
Orange    Cone        Neutral
Green     Cylinder    Acid
Blue
White
Black

The item is described by a choice of foci.




Faceted Search Systems
       Faceted Classification
           Simple, intuitive, versatile, powerful
       Adopted by more and more web sites
           As a classification system for their
            products/items/documents/resources/…
           As a model for the user interface in search, filtering,
            refinement




Examples
    [Slides 6-8: example images, not reproduced in this text]
Facets in the real world
       Multi-valued classification
           During classification
           During search
           AND vs OR semantics?
       Hierarchical (nested) facets
           Parents selectable?
       Incomplete classification
       Numerical ranges

       Example facet panel:
           Color: Yellow, Red, Orange, Green, Blue, White, Black, Other
           Shape: Squared ▼ (Cube, Parallelepiped), Rounded ▼ (Sphere, Cylinder)
           Weight: 0-50 g, 50-100 g, 100+ g


Facets in the Literature

User Interfaces
    Active research field since ~2000
    Usability studies
        Mainly for search interfaces
    Application case studies
    Web vs desktop environment
    Mainly for multimedia data

Data and logic model
    Methodologies from Library science (Broughton, Vickery)
    Formal models
        Dynamic Taxonomies (Sacco)
        Uniformities, Lattices (Priss)
        Granular computing
    Less applicable results

Goal of the paper
    Propose a formal model: FaSet
    for representing
        Faceted Classification of resources
        Faceted Search Interfaces for such resource sets
        Searching, Filtering, Ranking operations
    compatible with modern web applications
        Mathematically simple
        Easy mapping to Relational Algebra
        Decouple classification and resources
    versatile and flexible
        Supports all “real-world” variations on Facets

Facets and Foci
    Facets: disjoint sets
        Fa, Fb, Fc, …
    Facet space:
        U = Fa × Fb × Fc × …
    Focus L: subset
        La ⊆ Fa
        Many foci for each facet
    Focus name: index list
        La<i,j,k,…>
    [Figure: the facet space U spanned by Fa and Fb; foci La<1>, La<2>, La<1,1>, La<1,2> inside Fa]



Hierarchy
    Hierarchical nesting of foci is represented by subset containment
        La<narrower> ⊆ La<broader>
    Focus names are chosen to represent hierarchical containment
        La<i,j,k> ⊆ La<i,j>
        Reminiscent of Dewey Decimal Classification
    Incomplete taxonomy
        No overlap allowed
        A focus may be larger than the union of its sub-foci
    [Figure: Fa with foci La<1>, La<2>, La<1,1>, La<1,2>]
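
The containment rule above can be checked on the names alone. A minimal sketch, assuming a focus name is stored as a tuple of indices; `FocusName`, `is_prefix`, and `is_narrower_or_equal` are hypothetical names chosen for illustration, not part of FaSet:

```python
from typing import Tuple

# Assumption: the focus name La<1,2> is stored as the tuple (1, 2).
FocusName = Tuple[int, ...]


def is_prefix(p: FocusName, q: FocusName) -> bool:
    """Return True when p is a prefix of q (equal names count as a prefix)."""
    return len(p) <= len(q) and q[:len(p)] == p


def is_narrower_or_equal(narrow: FocusName, broad: FocusName) -> bool:
    """La<narrow> is contained in La<broad> when broad's name is a prefix of narrow's."""
    return is_prefix(broad, narrow)


if __name__ == "__main__":
    print(is_narrower_or_equal((1, 2, 3), (1, 2)))  # La<1,2,3> inside La<1,2>
    print(is_narrower_or_equal((2,), (1,)))         # siblings never contain each other
```

Storing names as tuples rather than strings avoids confusing an index such as 1 with 10 when comparing prefixes.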



Classification (Facet)
    Resources r are classified w.r.t. the facet space
        “Projection”: r ↓ Fa
    We may only represent projections built by combining foci
        r ↓ Fa = ∪p La<p>
    Just the focus names are needed
        {<1,1>,<2>}
    [Figure: Fa with foci La<1>, La<2>, La<1,1>, La<1,2> and the projection r ↓ Fa]

Classification (Multidimensional)
    On the multi-dimensional space, the Cartesian product is taken
        r ↓ U = r↓Fa × r↓Fb × ...
    Just the focus names are needed
    [Figure: r ↓ U in the facet space, with its projections r ↓ Fa and r ↓ Fb]
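
As an illustration of how a classification can be held as focus names only, the sketch below stores one set of names per facet and expands the Cartesian product on demand. The facet labels, the `Fb` value, and the helper `cells` are assumptions made for this example; only the set {<1,1>, <2>} comes from the slides.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

FocusName = Tuple[int, ...]
Classification = Dict[str, Set[FocusName]]

# Per-facet projections of one resource, stored as focus names only.
classification: Classification = {
    "Fa": {(1, 1), (2,)},  # {<1,1>, <2>} as on the previous slide
    "Fb": {(3,)},          # hypothetical value for a second facet
}


def cells(c: Classification) -> Tuple[List[str], Set[Tuple[FocusName, ...]]]:
    """Expand the per-facet name sets into the combinations that make up r ↓ U."""
    facets = sorted(c)  # fix a facet order so the tuples are comparable
    return facets, set(product(*(c[f] for f in facets)))


if __name__ == "__main__":
    order, combos = cells(classification)
    print(order, combos)
```

In practice the product rarely needs to be materialized: keeping the per-facet sets is enough, which is the point the slide makes.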




Searching in FaSet
    Resources r
        Classified as r ↓ U
    Query q
        Expressed uniformly as q ↓ U
    Search = Filtering + Ranking
        Filtering: r is relevant to q iff (r ↓ U) ⋂ (q ↓ U) ≠ ∅
        Ranking: estimate the similarity S(q, r) of r to q
    [Figure: resources r1, r2 and query q in the space spanned by Fa and Fb]




Filtering
    All resources that match the query, even partially
        (r ↓ U) ⋂ (q ↓ U) ≠ ∅
    May be easily computed by checking focus names
    Prefix-compatibility: La<p1> ≍ La<p2> iff
        p1 = p2, or
        p1 is a prefix of p2, or
        p2 is a prefix of p1
    At least one pair of foci, for each facet, must be
     prefix-compatible
        ∀Fa : ∃ La<p1> ∈ q, La<p2> ∈ r : La<p1> ≍ La<p2>
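
The filtering condition can be sketched directly from the definition. The code below assumes focus names are tuples of indices and that query and resource are dictionaries from a facet label to a set of names. It also assumes that a facet missing from the resource counts as a non-match; the slides do not specify this case. The names `prefix_compatible` and `matches` are hypothetical.

```python
from typing import Dict, Set, Tuple

FocusName = Tuple[int, ...]
Classification = Dict[str, Set[FocusName]]


def prefix_compatible(p1: FocusName, p2: FocusName) -> bool:
    """True when the names are equal or one is a prefix of the other."""
    n = min(len(p1), len(p2))
    return p1[:n] == p2[:n]


def matches(query: Classification, resource: Classification) -> bool:
    """For every facet in the query, some query focus must be
    prefix-compatible with some resource focus in the same facet."""
    return all(
        any(
            prefix_compatible(p1, p2)
            for p1 in foci_q
            for p2 in resource.get(facet, set())
        )
        for facet, foci_q in query.items()
    )


if __name__ == "__main__":
    q = {"Fa": {(1,)}, "Fb": {(2,)}}
    print(matches(q, {"Fa": {(1, 3)}, "Fb": {(2, 1)}}))
    print(matches(q, {"Fa": {(1, 3)}, "Fb": {(3,)}}))
```

Under this encoding the root focus is the empty tuple, which is prefix-compatible with every name, so it can stand for an unconstrained facet in a query.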

Example
    Focus hierarchy (one facet):
        L<>
            L<1>
                L<1,1>   L<1,2>   L<1,3>
            L<2>
                L<2,1>   L<2,2>

    Query and resources (focus names):
        q:   <1,3>, <2>
        r1:  <1>
        r2:  <2,2>
        r3:  <1,2>, <1,3>
        r4:  <1,1>, <1,2>
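
Reading each row of the example as the set of focus names assigned to q, r1, r2, r3, and r4 in a single facet (an interpretation of the slide layout), the prefix-compatibility rule from the Filtering slide can be applied mechanically. The function names below are hypothetical.

```python
from typing import Dict, Set, Tuple

FocusName = Tuple[int, ...]


def prefix_compatible(p1: FocusName, p2: FocusName) -> bool:
    """True when the names are equal or one is a prefix of the other."""
    n = min(len(p1), len(p2))
    return p1[:n] == p2[:n]


def relevant(query: Set[FocusName], resource: Set[FocusName]) -> bool:
    """Single-facet filter: at least one compatible pair of foci."""
    return any(prefix_compatible(a, b) for a in query for b in resource)


QUERY: Set[FocusName] = {(1, 3), (2,)}
RESOURCES: Dict[str, Set[FocusName]] = {
    "r1": {(1,)},
    "r2": {(2, 2)},
    "r3": {(1, 2), (1, 3)},
    "r4": {(1, 1), (1, 2)},
}

if __name__ == "__main__":
    for name, foci in RESOURCES.items():
        print(name, relevant(QUERY, foci))
```

With these inputs r1 matches because <1> is a prefix of <1,3>, r2 because <2> is a prefix of <2,2>, and r3 because it contains <1,3> itself; r4 has no focus compatible with either query focus and is filtered out.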




Ranking
    Compute similarity between resource and query
    Often neglected by Faceted Search Interfaces
    Define a Similarity Measure S(q, r) ∈ [0,1]
        Compute similarity between matching foci (deeper
         matches give higher scores)
        Aggregate focus-based similarity measures in the same
         facet (fuzzy sum)
        Normalize facet-level results
        Aggregate facet-based similarity measures across all
         facets (fuzzy product)
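
The slides name the steps of the ranking but not the formulas, so the following is only a sketch under stated assumptions: focus-level similarity is the length of the shared name divided by a hypothetical `max_depth` parameter, the fuzzy sum is taken to be the probabilistic sum a + b − ab, and the fuzzy product is the ordinary product. With these choices every facet score already lies in [0,1], so the normalization step is the identity here. None of these choices is confirmed by the source.

```python
from functools import reduce
from typing import Dict, Iterable, Set, Tuple

FocusName = Tuple[int, ...]
Classification = Dict[str, Set[FocusName]]


def focus_similarity(p1: FocusName, p2: FocusName, max_depth: int) -> float:
    """Assumed measure: 0 for incompatible names, otherwise the depth of the
    shallower name relative to max_depth (deeper matches score higher)."""
    n = min(len(p1), len(p2))
    if p1[:n] != p2[:n]:
        return 0.0
    return min(n / max_depth, 1.0)


def fuzzy_sum(values: Iterable[float]) -> float:
    """Probabilistic sum, one common choice of fuzzy OR."""
    return reduce(lambda a, b: a + b - a * b, values, 0.0)


def fuzzy_product(values: Iterable[float]) -> float:
    """Algebraic product, one common choice of fuzzy AND."""
    return reduce(lambda a, b: a * b, values, 1.0)


def similarity(query: Classification, resource: Classification,
               max_depth: int = 2) -> float:
    """S(q, r): fuzzy sum within each facet, fuzzy product across facets."""
    facet_scores = []
    for facet, foci_q in query.items():
        pair_scores = [
            focus_similarity(p1, p2, max_depth)
            for p1 in foci_q
            for p2 in resource.get(facet, set())
        ]
        facet_scores.append(fuzzy_sum(pair_scores))
    return fuzzy_product(facet_scores)


if __name__ == "__main__":
    q = {"Fa": {(1, 3), (2,)}}
    print(similarity(q, {"Fa": {(1, 2), (1, 3)}}))
    print(similarity(q, {"Fa": {(1,)}}))
```

A resource that fails the filter gets a score of 0 in at least one facet, so the product across facets is 0 and ranking stays consistent with filtering.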



FaSet Relational Implementation
    The FaSet classification requires
        A constant set of Facets
        A constant set of Foci
        An “index” table storing the list of focus names for each
         resource
    [Figure labels: “constant”, “Resource Database”]




FaSet Relational Implementation
    The FaSet search algorithm uses
        Set operations
        Universal and existential quantification
        Aggregate operations for computing ranking measures
    Directly supported by Relational DBMS primitives
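
To make the mapping to relational primitives concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`resource_focus`, `query_focus`), the facet label `'Fa'`, and the dotted string encoding of focus names are assumptions made for this example, not the schema from the paper. The join with `LIKE` plays the role of the existential quantifier, and the `GROUP BY ... HAVING COUNT` clause plays the role of the universal quantifier over facets.

```python
import sqlite3
from typing import List, Tuple


def encode(name: Tuple[int, ...]) -> str:
    """Encode <1,3> as '1.3.' so that name prefixes become string prefixes."""
    return "".join(f"{i}." for i in name)


def run_example() -> List[str]:
    """Load the single-facet example and return the matching resources."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE resource_focus (resource TEXT, facet TEXT, focus TEXT);
        CREATE TABLE query_focus (facet TEXT, focus TEXT);
        """
    )
    resources = {
        "r1": [(1,)],
        "r2": [(2, 2)],
        "r3": [(1, 2), (1, 3)],
        "r4": [(1, 1), (1, 2)],
    }
    conn.executemany(
        "INSERT INTO resource_focus VALUES (?, 'Fa', ?)",
        [(r, encode(n)) for r, names in resources.items() for n in names],
    )
    conn.executemany(
        "INSERT INTO query_focus VALUES ('Fa', ?)",
        [(encode(n),) for n in [(1, 3), (2,)]],
    )
    rows = conn.execute(
        """
        SELECT rf.resource
        FROM resource_focus AS rf
        JOIN query_focus AS qf
          ON rf.facet = qf.facet
         AND (rf.focus LIKE qf.focus || '%' OR qf.focus LIKE rf.focus || '%')
        GROUP BY rf.resource
        HAVING COUNT(DISTINCT rf.facet) =
               (SELECT COUNT(DISTINCT facet) FROM query_focus)
        ORDER BY rf.resource
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(run_example())
```

The trailing dot in the encoding keeps an index such as 1 from being mistaken for a prefix of 10. A production schema would likely add indexes on `(facet, focus)`, which is outside the scope of this sketch.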




Future work
    Experimentation of FaSet on sample data sets
        Performance evaluation
    Integration with front-end AJAX interfaces
        CMS module
        MIT Exhibit
    Evaluation of the ranking
     algorithm from the
     Information Retrieval
     point of view



Conclusions - FaSet
    Formally defined faceted Representation & Search
     model
    Light formalism
    Supports hierarchies, nesting, multiple classification,
     incomplete specifications, …
    Compatible with modern web development
     technologies

                                                      Thank
                                                       you!

