Semantic Faceted Search 
with SemFacet 
Evgeny Kharlamov 
Information Systems Group 
Department of Computer Science 
University of Oxford
Finding Data with Keywords is Hard 
§ Keyword search is the paradigm 
to access data on the Web, 
company websites, etc. 
§ Limitations of keyword search 
§ Too many docs contain the keywords 
§ Meaning is not built into keywords 
§ Becomes the art of 
“finding the best combination” 
§ Limited control over search
How to Improve Search Experience? 
§ Improve the search paradigm 
§ End-user oriented query formulation interfaces 
§ Faceted search 
§ Improve the data model 
§ Semantic Web models 
§ Our proposal: 
§ do both and combine 
§ Faceted search 
§ Semantic Web model
Enhancing Keyword Search with Facets 
§ A facet = control mechanism 
§ Name 
§ Set of values 
§ Facets in action 
§ Choose a value 
§ Restrict search result 
§ Advantages of facets 
§ Allow you to say what you 
really mean 
§ Give control over 
search
Faceted Search in a Nutshell 
§ Search over 
one set of items 
§ Items annotated with 
§ Strings 
§ Search result: 
subset of items 
[Figure: facets "stars" (3-stars, 4-stars, 5-stars) and "restaurant" (Asian, Italian, French)] 
Example query: Find 4-star hotels with French restaurants
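
The hotel example can be sketched in a few lines of Python. This is only an illustration of classic faceted search over one set of string-annotated items, not SemFacet code; the hotel identifiers and the helper names `facets` and `restrict` are made up for the example.

```python
from collections import defaultdict

# Hypothetical toy data: each hotel is annotated with string values per facet.
HOTELS = {
    "hotel_a": {"stars": {"4-stars"}, "restaurant": {"French", "Italian"}},
    "hotel_b": {"stars": {"4-stars"}, "restaurant": {"Asian"}},
    "hotel_c": {"stars": {"5-stars"}, "restaurant": {"French"}},
    "hotel_d": {"stars": {"3-stars"}, "restaurant": {"Italian"}},
}


def facets(items):
    """Collect each facet name together with the set of values seen in `items`."""
    result = defaultdict(set)
    for annotations in items.values():
        for name, values in annotations.items():
            result[name] |= values
    return dict(result)


def restrict(items, name, value):
    """Keep only the items whose facet `name` contains `value`."""
    return {k: v for k, v in items.items() if value in v.get(name, set())}


# "Find 4-star hotels with French restaurants"
step1 = restrict(HOTELS, "stars", "4-stars")
answer = restrict(step1, "restaurant", "French")

print(sorted(answer))                        # hotels matching both choices
print(sorted(facets(step1)["restaurant"]))   # values still offered after step 1
```

Each choice narrows the result to a subset of the same item set, and the facets are recomputed from whatever items remain.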
Faceted Search (F-Search) is the De Facto Standard
Semantic Web Models 
§ RDF data model 
§ objects annotated with strings and objects 
§ OWL 2 ontologies 
§ structure vocabularies of annotations 
[Figure: example RDF graph with edge labels "stars", "restaurant", "type" and "walking distance to", and node labels "4-stars" and "French"] 
A French restaurant is a restaurant that offers French cuisine. 
FrenchRestaurant ⊑ Restaurant ⊓ ∃offers.FrenchCuisine
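
A minimal sketch of how such an axiom can enrich RDF-style data, assuming triples are stored as plain Python tuples. All identifiers (`hotel_a`, `chez_marie`, `materialise_types`) are hypothetical, and only the atomic part of the axiom (FrenchRestaurant ⊑ Restaurant) is applied; the existential conjunct is deliberately left out because handling it needs a proper OWL 2 reasoner.

```python
# Triples are (subject, predicate, object) tuples; all names are illustrative.
TRIPLES = {
    ("hotel_a", "stars", "4-stars"),           # object annotated with a string
    ("hotel_a", "restaurant", "chez_marie"),   # object annotated with an object
    ("chez_marie", "type", "FrenchRestaurant"),
}

# Atomic part of the axiom: FrenchRestaurant ⊑ Restaurant.
SUBCLASS_OF = {"FrenchRestaurant": {"Restaurant"}}


def materialise_types(triples, subclass_of):
    """Add inferred `type` triples until nothing new can be derived."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            if p != "type":
                continue
            for parent in subclass_of.get(o, ()):
                if (s, "type", parent) not in closed:
                    closed.add((s, "type", parent))
                    changed = True
    return closed


inferred = materialise_types(TRIPLES, SUBCLASS_OF)
print(("chez_marie", "type", "Restaurant") in inferred)
```

With the inferred triple in place, a search for "Restaurant" also finds items that were only annotated as "FrenchRestaurant".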
Enhancing Search with SW in Practice 
Hello, my name is John Doe. 
I study at the University of Dreams. 
My daughter is Alice.... 
embedding semantic annotations 
<section itemscope itemtype="http://dava-vocabulary.org/Person" 
itemid="http://myitems/john-doe-1234"> 
Hello, my name is 
<span itemprop="name">John Doe</span>. 
I study at the 
<span itemprop="affiliation">University of Dreams</span>. 
My daughter is 
<span itemtype="http://dava-vocabulary.org/children" 
itemid="http://myitems/alice-doe-5678"> 
Alice </span> 
....
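
To show how a consumer might read such annotations back, here is a small sketch using Python's standard `html.parser`. It is not a full microdata processor: the class name `ItemPropCollector` is invented for the example, the page is a shortened, well-formed variant of the markup above with a placeholder `itemtype` URL, and only flat `itemprop` values are collected.

```python
from html.parser import HTMLParser


class ItemPropCollector(HTMLParser):
    """Collect the text of elements that carry an `itemprop` attribute."""

    def __init__(self):
        super().__init__()
        self.current = None      # itemprop of the element being read, if any
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("itemprop")
        if prop:
            self.current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self.current:
            self.properties[self.current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self.current = None


# Shortened variant of the annotated page; the itemtype URL is a placeholder.
PAGE = """
<section itemscope itemtype="http://example.org/Person">
Hello, my name is <span itemprop="name">John Doe</span>.
I study at the <span itemprop="affiliation">University of Dreams</span>.
</section>
"""

collector = ItemPropCollector()
collector.feed(PAGE)
print(collector.properties)
```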
Semantic Web Models 
§ RDF data model 
§ objects annotated with strings and objects 
§ OWL 2 ontologies 
§ structure vocabularies of annotations 
From 2011 to 2012, the fraction of structured data grew from 
3.5% to 13%
How to Improve Search Experience? 
§ Improve the search paradigm 
§ End-user oriented query formulation interfaces 
§ Faceted Search 
§ Improve the data model 
§ Semantic Web models 
§ RDF Data 
§ OWL 2 ontologies 
§ Our proposal: 
§ Semantic Faceted Search that combines 
§ Faceted search 
§ Semantic Web model
Semantic Faceted Search in a Nutshell 
§ Search over 
several sets of items 
§ Items annotated with 
§ Strings 
§ Items 
§ Search result: 
§ user-chosen 
subset of items 
[Figure: facets "stars" (3-stars, 4-stars, 5-stars), "restaurant" (Asian, Italian, French), "type" and "walking distance to"] 
Example query: Find 4-star hotels with French restaurants 
that are within walking distance of the Eiffel Tower
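
The difference from classic faceted search is that items can point to other items, so a choice made on one set (restaurants) restricts another set (hotels). A rough Python sketch over a hypothetical triple set follows; the graph contents and the helpers `subjects` and `subjects_linked_to` are assumptions made for illustration only.

```python
# Hypothetical RDF-style graph as (subject, predicate, object) triples.
GRAPH = {
    ("hotel_a", "type", "Hotel"),
    ("hotel_a", "stars", "4-stars"),
    ("hotel_a", "restaurant", "chez_marie"),
    ("hotel_a", "walking distance to", "eiffel_tower"),
    ("hotel_b", "type", "Hotel"),
    ("hotel_b", "stars", "4-stars"),
    ("hotel_b", "restaurant", "chez_marie"),
    ("hotel_c", "type", "Hotel"),
    ("hotel_c", "stars", "4-stars"),
    ("hotel_c", "restaurant", "sushi_bar"),
    ("hotel_c", "walking distance to", "eiffel_tower"),
    ("chez_marie", "type", "French"),
    ("sushi_bar", "type", "Asian"),
}


def subjects(predicate, obj):
    """Items annotated with the given string or item under `predicate`."""
    return {s for s, p, o in GRAPH if p == predicate and o == obj}


def subjects_linked_to(predicate, targets):
    """Items linked via `predicate` to at least one item in `targets`."""
    return {s for s, p, o in GRAPH if p == predicate and o in targets}


# First select in the set of restaurants, then use it to restrict hotels.
french_restaurants = subjects("type", "French")
answer = (
    subjects("type", "Hotel")
    & subjects("stars", "4-stars")
    & subjects_linked_to("restaurant", french_restaurants)
    & subjects("walking distance to", "eiffel_tower")
)
print(sorted(answer))
```

Here `hotel_b` is dropped because it has no walking-distance link, and `hotel_c` because its restaurant is not French.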
Research Contributions 
§ Solid foundation for Semantic F-Search 
§ Projection of ontologies on 
graph data structures 
§ Allows ontologies to be incorporated 
into faceted search 
§ Gives better faceted interfaces 
§ Generate more facets / Prune irrelevant facets 
§ Scalable algorithms to 
§ generate and update facets from 
§ Data and Ontologies 
§ Algorithms to evaluate faceted queries over semantic data 
§ Exploits bottom-up query evaluation 
[Screenshot: faceted interface for the keyword search "politicians", showing the facet labels "type", "USpres", "Country", "has child" (ANY) and "is graduated from" (Stanford Uni., Harvard Uni., Georgetown Uni.), with "More Focus" and "Remove" controls, and an answer snippet for http://en.wikipedia.org/wiki/Bill_Clinton]
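
One way to picture bottom-up evaluation of a nested faceted query is the following sketch. It is a simplified illustration, not the algorithm from the SemFacet work: the query is assumed to be tree-shaped, the dictionary layout (`type`, `value`, `edges`) is invented for the example, and the individuals are placeholders rather than real people.

```python
# Hypothetical data; identifiers are placeholders, not real individuals.
GRAPH = {
    ("president_1", "type", "USpres"),
    ("president_1", "has child", "child_1"),
    ("child_1", "is graduated from", "Stanford Uni."),
    ("president_2", "type", "USpres"),
    ("president_2", "has child", "child_2"),
    ("child_2", "is graduated from", "Harvard Uni."),
}


def evaluate(node, graph):
    """Return the items matching `node`, evaluating its sub-queries first."""
    # Start from every node of the graph, then apply the local constraints.
    candidates = {s for s, _, _ in graph} | {o for _, _, o in graph}
    if "type" in node:
        candidates &= {s for s, p, o in graph
                       if p == "type" and o == node["type"]}
    if "value" in node:
        candidates &= {node["value"]}
    # Bottom-up step: each child's answers are computed once and then used
    # to filter the parent's candidates.
    for predicate, child in node.get("edges", []):
        child_answers = evaluate(child, graph)
        candidates &= {s for s, p, o in graph
                       if p == predicate and o in child_answers}
    return candidates


# "US presidents who have (ANY) child that is graduated from Stanford Uni."
QUERY = {
    "type": "USpres",
    "edges": [
        ("has child", {
            "edges": [("is graduated from", {"value": "Stanford Uni."})],
        }),
    ],
}

print(sorted(evaluate(QUERY, GRAPH)))
```

The innermost facet ("is graduated from") is resolved first, and its answers propagate upwards through "has child" to the root.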
SemFacet System 
§ Integration of 
§ Keyword search and 
§ Semantic faceted search 
§ Main features 
§ Automatic generation of f-search interfaces 
over RDF data and OWL 2 ontologies 
§ In-memory 
§ Online and offline reasoning 
§ Efficient on millions of triples 
§ Flexible configuration 
§ Interchangeable triple stores 
§ RDFox, PAGOdA, HermiT, Sesame 
§ Configurable answers (snippets) 
§ Support for OR and AND facets 
[Architecture diagram] 
Presentation Layer: Faceted Query Interface, Answers as Snippets 
Application Layer: Facet Generator, Query Converter, Snippet Generator, Keyword Based Search (KBS) Engine 
Data Layer: Triple Store with Ontology and Data (RDFox, PAGOdA, HermiT, Sesame), Inverted Index (e.g. DBpedia Abstracts)
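
The distinction between OR and AND facets can be illustrated with a short sketch. The reading assumed here is that an OR facet keeps items matching at least one selected value and an AND facet keeps items matching every selected value; the `select` helper and the data are hypothetical and do not reflect SemFacet's actual API.

```python
# Hypothetical items annotated under a single multi-valued facet.
ITEMS = {
    "p1": {"is graduated from": {"Stanford Uni.", "Harvard Uni."}},
    "p2": {"is graduated from": {"Stanford Uni."}},
    "p3": {"is graduated from": {"Georgetown Uni."}},
}


def select(items, facet, chosen, mode):
    """Filter `items` by the `chosen` values of `facet`.

    mode="or":  keep items having at least one chosen value.
    mode="and": keep items having every chosen value.
    """
    if mode not in ("or", "and"):
        raise ValueError(f"unknown mode: {mode!r}")
    combine = any if mode == "or" else all
    return {
        name
        for name, annotations in items.items()
        if combine(v in annotations.get(facet, set()) for v in chosen)
    }


chosen = {"Stanford Uni.", "Harvard Uni."}
print(sorted(select(ITEMS, "is graduated from", chosen, "or")))
print(sorted(select(ITEMS, "is graduated from", chosen, "and")))
```

Note that with an empty selection this sketch returns nothing in OR mode and everything in AND mode, which follows from how `any` and `all` treat empty input.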
SemFacet Team 
§ Marcelo Arenas 
§ Bernardo Cuenca Grau 
§ Evgeny Kharlamov 
§ Sarunas Marciuska 
§ Dmitriy Zheleznyakov
