In fact, some of these searches are so hard that the users don’t even try them anymore
With ads, the situation is even worse due to the sparsity problem. Note how poor the ads are…
Search is a form of content aggregation
Semantic search can be seen as a retrieval paradigm Centered on the use of semantics Incorporates the semantics entailed by the query and (or) the resources into the matching process, it essentially performs semantic search.
Close to the topic of keyword-search in databases, except knowledge-bases have a schema-oblivious design Different papers assume vastly different query needs even on the same type of data
Semantic Search Peter Mika Yahoo! Research
Yahoo! serves over 680 million users in 25 countries
Yahoo! Research: visit us at research.yahoo.com
george bush (and I mean the beer brewer in Arizona)
paris hilton sexy
Imprecise or overly precise searches
pictures of strong adventures people
Searches for descriptions
countries in africa
32 year old computer scientist living in barcelona
reliable digital camera under 300 dollars
Many of these queries would not be asked by users, who learned over time what search technology can and can not do.
Dealing with sparse collections Note: don’t solve the sparsity problem where it doesn’t exist
Contextual Search: content-based recommendations Hovering over an underlined phrase triggers a search for related news items.
Contextual Search: personalization Machine Learning based ‘search’ algorithm selects the main story and the three alternate stories based on the users demographics (age, gender etc.) and previous behavior. Display advertizing is a similar top-1 search problem on the collection of advertisements.
Contextual Search: new devices Show related content Connect to friends watching the same
Aggregation across different dimensions Hyperlocal: showing content from across Yahoo that is relevant to a particular neighbourhood.
Searching data instead or in addition to searching documents
Providing direct answers (possibly with explanations)
Support for novel search tasks
Direct answers in search Information box with content from and links to Yahoo! Travel Points of interest in Vienna, Austria Since Aug, 2010, ‘regular’ search results are ‘Powered by Bing’ Products from Yahoo! Shopping
Makes use of the structure of the data or explicit schemas to understand user intent and the meaning of content
Exploits this understanding at some part of the search process
Combination of document and data retrieval
Documents with metadata
Metadata may be embedded inside the document
I’m looking for documents that mention countries in Africa.
Structured data, but searchable text fields
I’m looking for directors , who have directed movies where the synopsis mentions dinosaurs.
Web search vs. vertical/enterprise/desktop search
Keyword search in databases
Natural Language Retrieval
Semantics at every step of the IR process bla bla bla? q=“bla” * 3 Document processing bla bla bla Indexing Ranking Query interpretation Result presentation The IR engine The Web bla bla bla bla bla bla “ bla” θ (q,d)
Each resource (thing, entity) is identified by a URI
Globally unique identifiers
Locators of information
Data is broken down into individual facts
Triples of (subject, predicate, object)
A set of triples (an RDF graph) is published together in an RDF document
example:roi “ Roi Blanco” name type foaf:Person RDF document
Linked Data: interlinked RDF documents example:roi “ Roi Blanco” name foaf:Person sameAs example:roi2 worksWith example:peter “ firstname.lastname@example.org” email type type Roi’s homepage Yahoo Friend-of-a-Friend ontology
RDFa: metadata embedded in HTML … <p typeof=”foaf:Person" about="http://example.org/roi"> <span property=”foaf:name” >Roi Blanco</span>. <a rel=”owl:sameAs" href="http://research.yahoo.com/roi"> Roi Blanco </a>. You can contact him at <a rel=”foaf:mbox" href="mailto:email@example.com"> via email </ a>. </p> ... Roi’s homepage