Presentation of the paper "FaSet: A Set Theory Model for Faceted Search" by D. Bonino, F. Corno, L. Farinetti at the 2009 IEEE/WIC/ACM International Conference on Web Intelligence
1. Politecnico di Torino
Dipartimento di Automatica e Informatica
http://elite.polito.it
FaSet: A Set Theory Model for
Faceted Search
Dario Bonino, Fulvio Corno, Laura Farinetti
3. Faceted Classification
Originated in Library Science
Ranganathan, 1962
Content-based classification scheme
Multi-dimensional
Facet = classification dimension
Multi-valued
Focus = allowed value in one of the facets
3 WI/IAT 2009, Milano, Italy FaSet
4. Example
Facets and the allowed foci for each facet:
Color: Yellow, Red, Orange, Green, Blue, White, Black
Shape: Cube, Sphere, Cone, Cylinder
Taste: Sweet, Bitter, Neutral, Acid
An item is described by a choice of foci
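As an illustration, the example can be written down as a small Python structure. This is only a sketch: the `FACETS` mapping, the `invalid_foci` helper and the `lemon` item are hypothetical names chosen here, not part of the presentation.

```python
from typing import Dict, Set

# Facets and their allowed foci, copied from the example slide.
FACETS: Dict[str, Set[str]] = {
    "Color": {"Yellow", "Red", "Orange", "Green", "Blue", "White", "Black"},
    "Shape": {"Cube", "Sphere", "Cone", "Cylinder"},
    "Taste": {"Sweet", "Bitter", "Neutral", "Acid"},
}


def invalid_foci(item: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per facet, the chosen foci that are not allowed in that facet."""
    problems = {}
    for facet, chosen in item.items():
        not_allowed = chosen - FACETS.get(facet, set())
        if not_allowed:
            problems[facet] = not_allowed
    return problems


# Hypothetical item, described by choosing foci in each facet.
lemon = {"Color": {"Yellow"}, "Shape": {"Sphere"}, "Taste": {"Acid"}}
print(invalid_foci(lemon))
```

An empty result means every chosen focus belongs to the allowed set of its facet.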
5. Faceted Search Systems
Faceted Classification
Simple, intuitive, versatile, powerful
Adopted by more and more web sites
As a classification system for their
products/items/documents/resources/…
As a model for the user interface in search, filtering,
refinement
9. Facets in the real world
Multi-valued classification
During classification
During search
AND vs OR semantics?
Hierarchical (nested) facets
Parents selectable?
Incomplete classification
Numerical ranges
Example facets shown on the slide:
Color: Yellow, Red, Orange, Green, Blue, White, Black, Other
Shape: Squared (Cube, Parallelepiped), Rounded (Sphere, Cylinder)
Weight: 0-50 g, 50-100 g, 100+ g
10. Facets in the Literature
User Interfaces
Active research field since ~2000
Usability studies
Mainly for search interfaces
Application case studies
Web vs desktop environment
Mainly for multimedia data
Data and logic model
Methodologies from Library science (Broughton, Vickery)
Formal models
Dynamic Taxonomies (Sacco)
Uniformities, Lattices (Priss)
Granular computing
Less applicable results
11. Goal of the paper
Propose a formal model: FaSet
for representing
Faceted Classification of resources
Faceted Search Interfaces for such resource sets
Searching, Filtering, Ranking operations
compatible with modern web applications
Mathematically simple
Easy mapping to Relational Algebra
Decouple classification and resources
versatile and flexible
Supports all “real-world” variations on Facets
12. Facets and Foci
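Facets: disjoint sets Fa, Fb, Fc, …
Facet space: U = Fa × Fb × Fc × …
Focus L: subset of a facet, La ⊆ Fa
Many foci for each facet
Focus name: index list, La<i,j,k,…>
Figure: facet space U with axes Fa and Fb; facet Fa contains foci La<1> and La<2>, with La<1,1> and La<1,2> nested inside La<1>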
13. Hierarchy
Hierarchical nesting of foci is represented by subset containment
La<narrower> ⊆ La<broader>
Focus names are chosen to represent hierarchical containment
La<i,j,k> ⊆ La<i,j>
Reminiscent of the Dewey Decimal Classification
Incomplete taxonomy
No overlap allowed
A focus may be larger than the union of its sub-foci
Figure: facet Fa containing foci La<1> and La<2>, with La<1,1> and La<1,2> nested inside La<1>
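The naming convention means that containment between foci can be decided from the names alone. A minimal sketch in Python, assuming focus names are encoded as tuples of integers (La<1,2> becomes `(1, 2)`); the function name `is_narrower` is hypothetical and not taken from the paper.

```python
from typing import Tuple

FocusName = Tuple[int, ...]  # assumed encoding: La<1,2> -> (1, 2)


def is_narrower(narrow: FocusName, broad: FocusName) -> bool:
    """True if `narrow` names a strict sub-focus of `broad`.

    Since La<i,j,k> is contained in La<i,j>, containment reduces to a
    strict-prefix test on the index lists.
    """
    return len(narrow) > len(broad) and narrow[:len(broad)] == broad


print(is_narrower((1, 2), (1,)))  # La<1,2> inside La<1>: True
print(is_narrower((2,), (1,)))    # La<2> and La<1> are siblings: False
```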
14. Classification (Facet)
Resources r are classified w.r.t. the facet space
“Projection” of r onto a facet: r Fa
We may only represent projections built by combining foci
r Fa = ∪p La<p>
Just the focus names are needed, e.g. {<1,1>,<2>}
Figure: facet Fa with foci La<1>, La<2>, La<1,1>, La<1,2> and the projection r Fa
15. Classification (Multidimensional)
On the multi-dimensional space, the cartesian product is taken
r U = r Fa × r Fb × ...
Just the focus names are needed
Figure: classification r U in the facet space, with its projections r Fa and r Fb
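Because only focus names are needed, a classification can be stored as one set of focus names per facet, with the cartesian product left implicit. The sketch below assumes tuple-encoded focus names and a hypothetical resource `r`; the `cells` helper is an illustration that enumerates the product explicitly, which a real system would rarely need to do.

```python
from itertools import product
from typing import Dict, Iterator, Set, Tuple

FocusName = Tuple[int, ...]                  # assumed encoding: La<1,1> -> (1, 1)
Classification = Dict[str, Set[FocusName]]   # facet name -> focus names

# Hypothetical resource: its projection on Fa is La<1,1> ∪ La<2>,
# and its projection on Fb is Lb<3>.
r: Classification = {
    "Fa": {(1, 1), (2,)},
    "Fb": {(3,)},
}


def cells(c: Classification) -> Iterator[Dict[str, FocusName]]:
    """Enumerate the focus combinations that make up the cartesian product."""
    facets = sorted(c)
    for combo in product(*(sorted(c[f]) for f in facets)):
        yield dict(zip(facets, combo))


cell_list = list(cells(r))
for cell in cell_list:
    print(cell)
```

Storing two names for Fa and one for Fb is enough to describe both cells of the product.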
16. Searching in FaSet
Resources r
Classified as r U
Query q
Expressed uniformly as q U
Search = Filtering + Ranking
Filtering: r is relevant to q iff (r U) ⋂ (q U) ≠ ∅
Ranking: estimate the similarity S(q, r) of r to q
Figure: resources r1 and r2 and query q in the facet space with axes Fa and Fb
17. Filtering
All resources that match the query, even partially
(r U) ⋂ (q U) ≠ ∅
May be easily computed by checking focus names
Prefix-compatibility: La<p1> ≍ La<p2> iff
p1 = p2, or
p1 is a prefix of p2, or
p2 is a prefix of p1
At least one pair of foci, for each facet, must be prefix-compatible
∀Fa : ∃ La<p1> ∈ q, La<p2> ∈ r : La<p1> ≍ La<p2>
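The filtering rule can be sketched directly on focus names. The Python below assumes tuple-encoded focus names and a dictionary per query or resource; the names `prefix_compatible` and `is_relevant` are hypothetical. It also assumes that a facet left unclassified on either side never matches, which is one possible reading and is not stated on the slide.

```python
from typing import Dict, Set, Tuple

FocusName = Tuple[int, ...]                  # assumed encoding: La<1,1> -> (1, 1)
Classification = Dict[str, Set[FocusName]]   # facet name -> focus names


def prefix_compatible(p1: FocusName, p2: FocusName) -> bool:
    """True if p1 == p2 or one index list is a prefix of the other."""
    n = min(len(p1), len(p2))
    return p1[:n] == p2[:n]


def is_relevant(q: Classification, r: Classification) -> bool:
    """For every facet, at least one (query, resource) pair of foci must match.

    Assumption: a facet missing from q or r yields no pair, hence no match.
    """
    facets = set(q) | set(r)
    return all(
        any(prefix_compatible(p1, p2)
            for p1 in q.get(f, ()) for p2 in r.get(f, ()))
        for f in facets
    )


q = {"Fa": {(1,)}, "Fb": {(3,)}}
r1 = {"Fa": {(1, 1), (2,)}, "Fb": {(3,)}}   # La<1,1> is compatible with La<1>
r2 = {"Fa": {(2,)}, "Fb": {(3,)}}           # nothing in Fa is compatible
print(is_relevant(q, r1), is_relevant(q, r2))
```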
19. Ranking
Compute similarity between resource and query
Often neglected by Faceted Search Interfaces
Define a Similarity Measure S(q, r) ∈ [0,1]
Compute similarity between matching foci (deeper
matches give higher scores)
Aggregate focus-based similarity measures in the same
facet (fuzzy sum)
Normalize facet-level results
Aggregate facet-based similarity measures across all
facets (fuzzy product)
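The slide names the ranking steps but not their formulas, so the following Python sketch fills them in with explicit assumptions: focus similarity is the matched depth divided by a hypothetical `max_depth` parameter, the fuzzy sum is the probabilistic sum a + b − ab, and the fuzzy product is ordinary multiplication. None of these choices is confirmed by the presentation; they only show how the four steps fit together.

```python
from functools import reduce
from typing import Dict, Iterable, Set, Tuple

FocusName = Tuple[int, ...]                  # assumed encoding: La<1,1> -> (1, 1)
Classification = Dict[str, Set[FocusName]]   # facet name -> focus names


def focus_similarity(p1: FocusName, p2: FocusName, max_depth: int) -> float:
    """Assumed measure: matched depth / max_depth, 0 if not prefix-compatible."""
    n = min(len(p1), len(p2))
    if p1[:n] != p2[:n]:
        return 0.0
    return min(n / max_depth, 1.0)


def fuzzy_sum(values: Iterable[float]) -> float:
    """Probabilistic sum (an assumed choice of fuzzy sum); stays in [0, 1]."""
    return reduce(lambda a, b: a + b - a * b, values, 0.0)


def fuzzy_product(values: Iterable[float]) -> float:
    """Algebraic product (an assumed choice of fuzzy product)."""
    return reduce(lambda a, b: a * b, values, 1.0)


def similarity(q: Classification, r: Classification, max_depth: int = 2) -> float:
    """S(q, r) in [0, 1]: fuzzy sum inside each facet, fuzzy product across facets."""
    facet_scores = []
    for facet in sorted(set(q) | set(r)):
        pair_scores = [focus_similarity(p1, p2, max_depth)
                       for p1 in q.get(facet, ()) for p2 in r.get(facet, ())]
        # The probabilistic sum already lies in [0, 1], so this sketch needs
        # no separate facet-level normalization step.
        facet_scores.append(fuzzy_sum(pair_scores))
    return fuzzy_product(facet_scores)


q = {"Fa": {(1, 1)}, "Fb": {(3,)}}
r1 = {"Fa": {(1, 1)}, "Fb": {(3,)}}   # exact match at depth 2 in Fa
r2 = {"Fa": {(1,)}, "Fb": {(3,)}}     # match only at depth 1 in Fa
print(similarity(q, r1), similarity(q, r2))
```

With these assumptions the deeper match in Fa ranks r1 above r2, and a facet with no compatible pair drives the whole score to zero.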
20. FaSet Relational Implementation
The FaSet classification requires
A constant set of Facets
A constant set of Foci
An “index” table storing the list of focus names for each resource
Figure: diagram labelled “constant” and “Resource Database”
21. FaSet Relational Implementation
The FaSet search algorithm uses
Set operations
Universal and existential quantification
Aggregate operations for computing ranking measures
Directly supported by Relational DBMS primitives
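One way the relational mapping might look is sketched below with Python's built-in `sqlite3` module. The table and column names (`facet`, `focus`, `resource_index`, `query_focus`, `path`) are hypothetical, and focus names are assumed to be stored as dotted strings such as `1.1` so that prefix-compatibility becomes a `LIKE` test. The universal quantifier over facets is expressed by counting matched facets in a `HAVING` clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE facet (id TEXT PRIMARY KEY);                 -- constant set of facets
CREATE TABLE focus (facet_id TEXT REFERENCES facet(id),   -- constant set of foci
                    path TEXT,
                    PRIMARY KEY (facet_id, path));
CREATE TABLE resource_index (resource_id TEXT,            -- focus names per resource
                             facet_id TEXT,
                             path TEXT);
CREATE TABLE query_focus (facet_id TEXT, path TEXT);      -- the query, same shape
""")
conn.executemany("INSERT INTO facet VALUES (?)", [("Fa",), ("Fb",)])
conn.executemany("INSERT INTO focus VALUES (?, ?)", [
    ("Fa", "1"), ("Fa", "1.1"), ("Fa", "1.2"), ("Fa", "2"), ("Fb", "3"),
])
conn.executemany("INSERT INTO resource_index VALUES (?, ?, ?)", [
    ("r1", "Fa", "1.1"), ("r1", "Fa", "2"), ("r1", "Fb", "3"),
    ("r2", "Fa", "2"), ("r2", "Fb", "3"),
])
conn.executemany("INSERT INTO query_focus VALUES (?, ?)", [("Fa", "1"), ("Fb", "3")])

# A resource passes the filter when every facet has at least one
# prefix-compatible pair (equal paths, or one path extends the other).
FILTER_SQL = """
SELECT ri.resource_id
FROM resource_index AS ri
JOIN query_focus AS qf ON qf.facet_id = ri.facet_id
WHERE ri.path = qf.path
   OR ri.path LIKE qf.path || '.%'
   OR qf.path LIKE ri.path || '.%'
GROUP BY ri.resource_id
HAVING COUNT(DISTINCT ri.facet_id) = (SELECT COUNT(*) FROM facet)
ORDER BY ri.resource_id
"""
relevant = [row[0] for row in conn.execute(FILTER_SQL)]
print(relevant)
```

The dot separator keeps `1` from being treated as a prefix of `11`; ranking could be added with aggregate functions over the same join.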
22. Future work
Experimentation of FaSet on sample data sets
Performance evaluation
Integration with front-end AJAX interfaces
CMS module
MIT Exhibit
Evaluation of the ranking algorithm from the Information Retrieval point of view
23. Conclusions - FaSet
Formally defined faceted Representation & Search
model
Light formalism
Supports hierarchies, nesting, multiple classification,
incomplete specifications, …
Compatible with modern web development
technologies
Thank you!