Making the Web Searchable
Presented by Peter Mika, Yahoo Labs | June 25, 2015
Agenda
 Web Search
› How it works… and where it fails
 Semantic Web
› The promise and the reality
 Semantic Search
› Research and Applications at Yahoo
 What’s next?
› More intelligence!
Search is really fast, but not particularly intelligent
What it’s like to be a machine?
Roi Blanco
What it’s like to be a machine?
↵⏏☐ģ
✜Θ♬♬ţğ√∞§®ÇĤĪ✜★♬☐✓✓
ţğ★✜
✪✚✜ΔΤΟŨŸÏĞÊϖυτρ℠≠⅛⌫
≠=⅚©§★✓♪ΒΓΕ℠
✖Γ♫⅜±⏎↵⏏☐ģğğğμλκσςτ
⏎⌥°¶§ΥΦΦΦ✗✕☐
⏎↵⏏☐ģğğğ
Real problem
Search needs (more) intelligence
 Current paradigm: ad-hoc document retrieval
› Set of documents that each may fulfill the information need
› Relevance of those documents is clearly established by
• Textual similarity
• Authority
 Queries outside this paradigm are hard
› Semantic gap between the query and the content
• Queries that require a deeper understanding of the world at large
› Complex needs where no document contains the answer
 Analysis of aggregate behavior is a challenge
› We may answer two queries perfectly, without knowing how they are related
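The ad-hoc retrieval paradigm above can be illustrated with a minimal sketch. It assumes a toy corpus, a bag-of-words cosine similarity as the textual signal and a precomputed authority value per document. The linear combination and the weight `alpha` are illustrative assumptions, not how any production engine ranks.

```python
import math
from collections import Counter

def cosine(query_tokens, doc_tokens):
    """Cosine similarity between two bags of words (raw term counts)."""
    q, d = Counter(query_tokens), Counter(doc_tokens)
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

def rank(query, docs, authority, alpha=0.7):
    """Rank documents by a weighted mix of textual similarity and authority.

    `authority` maps doc id -> score in [0, 1]; `alpha` is a hypothetical weight.
    """
    q = query.lower().split()
    scored = [
        (alpha * cosine(q, text.lower().split())
         + (1 - alpha) * authority.get(doc_id, 0.0), doc_id)
        for doc_id, text in docs.items()
    ]
    return [doc_id for score, doc_id in sorted(scored, reverse=True)]

# Toy corpus and authority values, invented for illustration.
docs = {
    "d1": "jaguar cars official site",
    "d2": "the jaguar is a big cat",
    "d3": "paris travel guide",
}
authority = {"d1": 0.9, "d2": 0.4, "d3": 0.8}
ranking = rank("jaguar", docs, authority)
```

Both `d1` and `d2` match the keyword on the surface; nothing in this scoring knows which sense of "jaguar" the user meant.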
 Semantic gap
› Ambiguity
• jaguar
• paris hilton
› Secondary meaning
• george bush (and I mean the beer brewer
in Arizona)
› Subjectivity
• reliable digital camera
• paris hilton sexy
› Imprecise or overly precise searches
• jim hendler
 Complex needs
› Missing information
• brad pitt zombie
• florida man with 115 guns
• 35 year old computer scientist living in
barcelona
› Category queries
• countries in africa
• barcelona nightlife
› Transactional or computational queries
• 120 dollars in euros
• digital camera under 300 dollars
• world temperature in 2020
Examples of hard queries
Are there even
any true
keyword
queries?
Users may
have stopped
asking them
The Semantic Web
Enter the Semantic Web
 A social-technical ecosystem where the
meaning of content is explicit and shared
among agents (humans and machines)
› Shared identifiers for real-world entities
› Standards for exchanging structured data
• Data modeled as a graph
› Shared formal, schema languages
• Names of entity and relationship types
• Constraints on the entities, relationships and attributes
 “At the doctor's office, Lucy instructed her
Semantic Web agent through her
handheld Web browser. The agent
promptly retrieved information about
Mom's prescribed treatment from the
doctor's agent, looked up several lists of
providers, and checked for the ones in-
plan for Mom's insurance within a 20-mile
radius of her home and with a rating of
excellent or very good on trusted rating
services. It then began trying to find a
match between available appointment
times (supplied by the agents of individual
providers through their Web sites) and
Pete's and Lucy's busy schedules.”
› The Semantic Web. Tim Berners-Lee, James
Hendler, Ora Lassila. Appeared in: Scientific
American 284(5):34-43 (May 2001)
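The idea of data modeled as a graph with shared identifiers can be sketched as a set of subject-predicate-object triples. This is a minimal illustration only: the `ex:` identifiers, property names and values below are invented, loosely following the scenario in the quote, and do not come from any real vocabulary.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical, chosen only for illustration.
triples = [
    ("ex:Lucy", "ex:hasMother", "ex:Mom"),
    ("ex:Mom", "ex:prescribedTreatment", "ex:PhysicalTherapy"),
    ("ex:ProviderA", "ex:offers", "ex:PhysicalTherapy"),
    ("ex:ProviderA", "ex:rating", "excellent"),
    ("ex:ProviderB", "ex:offers", "ex:PhysicalTherapy"),
    ("ex:ProviderB", "ex:rating", "fair"),
]

def build_graph(triples):
    """Index triples as subject -> predicate -> set of objects."""
    graph = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        graph[s][p].add(o)
    return graph

def providers_for(graph, treatment, accepted_ratings):
    """Find subjects that offer a treatment and have an accepted rating."""
    return sorted(
        s for s, props in graph.items()
        if treatment in props.get("ex:offers", set())
        and props.get("ex:rating", set()) & accepted_ratings
    )

graph = build_graph(triples)
matches = providers_for(graph, "ex:PhysicalTherapy", {"excellent", "very good"})
```

Because both providers use the same identifier for the treatment, an agent can join their data without any text matching.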
Huge expectations
In a perfect world, the Semantic Web is the end-game for IR
The view from IR: skepticism
 Mismatch to the IR problem
› End-users
• Do not know the identifiers of things
• Not aware of the schema of the data
• Can’t be expected to learn complex query languages
› Focus on document retrieval
• Limited recognition of the value of direct answers (initially)
 “Who is going to do the annotation?”
› Focus still largely on text
› Automated annotation/extraction tools produce poor results
 Early experiments using external knowledge are unsuccessful
› Query/document expansion using thesauri etc.
The Semantic Web off to a slow start
 Complex technology
› Set of standards sold as a stack
• URI, RDF, RDF/XML, RDFa, JSON-LD,
OWL, RIF, SPARQL, OWL-S, POWDER …
› Not very developer friendly
 Ideals of knowledge representation
hard to enforce on the Web
 No clear value proposition
› Chicken and egg problem
• No users/use cases, hence no data
• No data, because no users/use cases
… and became practical
 Simplified technology
› Fewer, more developer friendly representations
› Focusing on the lower layers of the stack
› Data first, schemas/logic second
 Giving up ideals about knowledge representation
› Shared identifiers
› Logical consistency
› Distinction between real-world entities and web resources
 Motivation for adoption
› Search engines, Facebook, Pinterest, Twitter etc. investing in information extraction
Two important achievements
 Linked Open Data
› Social movement to (re)publish existing datasets
• 100B+ triples of data
• Encyclopedic, governmental, geo, scientific datasets
• Impact: background knowledge
– Basis for knowledge graphs in search engines
 Metadata inside HTML pages
› Facebook’s OGP and schema.org
• Over 15% of all pages have schema.org markup (2013)
• Personal information, images, videos, reviews, recipes etc.
• Impact: remove the need for automated extraction
Example
View source
Caveat: not your perfect Semantic Web
 Outdated, incorrect or incomplete data
› Lack of write access or feedback mechanisms
 Mistakes made by tools
› Noisy information extraction
› Entity linking (reconciliation)
 Limited or no reuse of identifiers
 Metadata not always representative of content
Semantic Search at Yahoo
Semantic Search research (2007-)
 Emergence of the Semantic Search field
› Intersection of IR, NLP, DB and SemWeb
• ESAIR at SIGIR
• SemSearch at ESWC/WWW
• EOS and JIWES at SIGIR
• Semantic Search at VLDB
 Exploiting semantic understanding in the retrieval process
› User intent and resources are represented using semantic models
• Semantic models typically differ across NLP, DB and Semantic Web
› Semantic models are exploited in the matching and ranking of resources
Semantic Search – a process view
Query Construction
• Keywords
• Forms
• NL
• Formal language
Query Processing
• IR-style matching & ranking
• DB-style precise matching
• KB-style matching & inferences
Result Presentation
• Query visualization
• Document and data presentation
• Summarization
Query Refinement
• Implicit feedback
• Explicit feedback
• Incentives
Document Representation
Knowledge Representation
Semantic Models
Resources
Documents
Result presentation using metadata
Personal and private homepage of the same person (clear from the snippet, but it could also be automatically de-duplicated)
Conferences he plans to attend and his vacations from homepage, plus bio events from LinkedIn
Geolocation
“Microsearch” internal prototype (2007)
Yahoo SearchMonkey (2008)
1. Extract structured data
› Semantic Web markup
• Example:
<span property="vcard:city">Santa Clara</span>
<span property="vcard:region">CA</span>
› Information Extraction
2. Presentation
› Fixed presentation templates
• One template per object type
› Applications
• Third-party modules to display data (SearchMonkey)
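The two steps above (extract structured data from markup, then render it with a fixed template) can be sketched with Python's standard `html.parser`. This is not a full RDFa processor and not SearchMonkey's implementation: it ignores nesting, prefix declarations and `content` attributes, and the `render_address` template is a hypothetical example of "one template per object type".

```python
from html.parser import HTMLParser

class PropertyExtractor(HTMLParser):
    """Collect (property, text) pairs from elements carrying a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # property name of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop

    def handle_data(self, data):
        if self._current and data.strip():
            self.pairs.append((self._current, data.strip()))

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None

def render_address(record):
    """A fixed presentation template for one (hypothetical) object type."""
    return f"{record['vcard:city']}, {record['vcard:region']}"

html = (
    '<span property="vcard:city">Santa Clara</span>, '
    '<span property="vcard:region">CA</span>'
)
parser = PropertyExtractor()
parser.feed(html)
parser.close()
record = dict(parser.pairs)
snippet = render_address(record)
```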
Effectiveness of enhanced results
 Explicit user feedback
› Side-by-side editorial evaluation (A/B testing)
• Editors are shown a traditional search result and enhanced result for the same page
• Users prefer enhanced results in 84% of the cases and traditional results in 3% (N=384)
 Implicit user feedback
› Click-through rate analysis
• Long dwell time limit of 100s (Ciemiewicz et al. 2010)
• 15% increase in ‘good’ clicks
› User interaction model
• Enhanced results lead users to relevant documents (IV) even though they are less likely to be clicked than textual results (III)
• Enhanced results effectively reduce bad clicks!
 See
› Kevin Haas, Peter Mika, Paul Tarjan, Roi Blanco: Enhanced results for web search. SIGIR 2011: 725-734
Adoption among consumers of web content
 Google announces Rich Snippets - June, 2009
› Faceted search for recipes - Feb, 2011
 Bing tiles – Feb, 2011
 Facebook’s Like button and the Open Graph Protocol (2010)
› Shows up in profiles and news feed
› Site owners can later reach users who have liked an object
schema.org
 Collaborative effort sponsored by large consumers of Web data
› Bing, Google, and Yahoo! as initial founders (June, 2011)
› Yandex joins schema.org in Nov, 2011
 Agreement on a shared set of schemas for the Web
› Available at schema.org in HTML and machine readable formats
› Free to use under W3C Royalty Free terms
Yahoo’s Knowledge Graph
[Figure: example knowledge graph. Entities: Chicago Cubs, Chicago, Barack Obama, Carlos Zambrano, "10% off tickets", Brad Pitt, Angelina Jolie, Steven Soderbergh, George Clooney, Ocean’s Twelve, E/R, Fight Club, Dust Brothers. Relation labels: for, plays for, plays in, lives in, partner, directs, casts in, takes place in, music by.]
Nicolas Torzec: Making knowledge reusable at Yahoo!: a Look at the Yahoo! Knowledge Base
(SemTech 2013)
Building the Knowledge Graph
 Information extraction
› Automated information extraction
• e.g. wrapper induction
› Metadata from HTML pages
• Focused crawler
› Public datasets (e.g. DBpedia)
› Proprietary data
 Data fusion
› Manual mapping from the source
schemas to the ontology
› Supervised entity reconciliation
 Ontology management
› Editorially maintained OWL ontology
with 300+ classes
› Covering the domains of interest of
Yahoo
 Curation and quality assessment
› Editors and user feedback still play a
large role
Bellare et al: WOO: A Scalable and Multi-tenant Platform for Continuous Knowledge Base Synthesis. PVLDB 2013
Welch et al.: Fast and accurate incremental entity resolution relative to an entity knowledge base. CIKM 2012
 Entity linking/entity retrieval
› Identifying the most relevant entity to
the query
 Entity recommendation
› Given that the user is interested in one
entity, which entity to recommend next?
Roi Blanco, Berkant Barla Cambazoglu,
Peter Mika, Nicolas Torzec: Entity
Recommendations in Web Search. ISWC
2013
Entity displays in web search
The importance of entities
 Entity mention query = <entity> {+ <intent>}
› ~70% of queries contain a named entity (entity mention queries)
• brad pitt height
› ~50% of queries have an entity focus (entity seeking queries)
• brad pitt attacked by fans
› ~10% of queries are looking for a class of entities
• brad pitt movies
› Jeffrey Pound, Peter Mika, Hugo Zaragoza:
Ad-hoc object retrieval in the web of data. WWW 2010: 771-780
 Intent is typically an additional word or phrase
• Disambiguate, most often by type e.g. brad pitt actor
• Specify action or aspect e.g. brad pitt net worth, toy story trailer
[Figure: the intent in "brad pitt height" can be phrased in several ways: "how tall is", "tall", …]
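The pattern `<entity> {+ <intent>}` can be made concrete with a small sketch that splits a query into an entity mention and the remaining intent words. It assumes a hand-written alias dictionary (`ENTITY_ALIASES`, hypothetical) and a longest-match rule; real systems would use much larger dictionaries and scoring rather than exact lookup.

```python
# Hypothetical alias dictionary: surface form -> entity identifier.
ENTITY_ALIASES = {
    "brad pitt": "Brad_Pitt",
    "toy story": "Toy_Story",
}

def split_entity_intent(query):
    """Return (entity id, remaining intent words) using the longest alias match.

    Returns (None, query) when no known alias occurs in the query.
    """
    tokens = query.lower().split()
    # Try longer spans first so that the longest known alias wins.
    for length in range(len(tokens), 0, -1):
        for start in range(len(tokens) - length + 1):
            span = " ".join(tokens[start:start + length])
            if span in ENTITY_ALIASES:
                rest = tokens[:start] + tokens[start + length:]
                return ENTITY_ALIASES[span], " ".join(rest)
    return None, " ".join(tokens)
```

For example, `split_entity_intent("brad pitt height")` and `split_entity_intent("how tall is brad pitt")` resolve to the same entity with different intent phrasings.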
 Inverted index
› Inspired by text retrieval
• Match individual keywords
• Score and aggregate
 Parsing
› Inspired by text parsing
• Find potential mentions of entities (spots)
in query
• Score candidates for each spot
Two broad approaches to entity retrieval
[Figure: the query "brad pitt" split into the spots "brad", "pitt" and "brad pitt", each with candidate entity types such as (actor), (boxer), (city) and (lake)]
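The first of the two approaches, inverted-index retrieval, can be sketched as follows: match each keyword individually, then score and aggregate per entity. The entity descriptions are invented, and the scoring (one point per matched keyword) is a deliberately naive stand-in for a real ranking function.

```python
from collections import defaultdict

# Hypothetical entity descriptions; ids and text are illustrative only.
ENTITIES = {
    "Brad_Pitt_(actor)": "brad pitt american actor fight club",
    "Brad_Pitt_(boxer)": "brad pitt australian boxer",
    "Pittsburgh": "pittsburgh city pennsylvania",
}

def build_index(entities):
    """Map each token to the set of entity ids whose description contains it."""
    index = defaultdict(set)
    for entity_id, text in entities.items():
        for token in text.split():
            index[token].add(entity_id)
    return index

def search(index, query):
    """Match keywords individually, then aggregate matches per entity."""
    scores = defaultdict(int)
    for token in query.lower().split():
        for entity_id in index.get(token, ()):
            scores[entity_id] += 1  # one point per matched keyword
    # Sort by score (descending), then by id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

index = build_index(ENTITIES)
results = search(index, "brad pitt actor")
```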
Retrieval-based approach
 Experimented with different index structures
› Horizontal: one field for text and one for property name
› Vertical: One field per property
› Combination: one field per property weight (best performance in both AND/OR mode)
[Figure: the index layouts compared: Horizontal, Vertical, R-Vertical]
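The three index layouts can be pictured with plain dictionaries. This is one reading of the slide, not the Glimmer implementation: the example entity, the property names and the weight classes in `WEIGHT_CLASS` are all assumptions made for illustration.

```python
# A hypothetical entity described by property -> literal value.
entity = {
    "name": "brad pitt",
    "occupation": "actor producer",
    "comment": "born in oklahoma",
}

# Hypothetical weight classes for properties (an assumption for illustration).
WEIGHT_CLASS = {"name": "important", "occupation": "neutral", "comment": "neutral"}

def horizontal(entity):
    """Two parallel fields: the tokens, and the property each token came from."""
    tokens, properties = [], []
    for prop, value in entity.items():
        for tok in value.split():
            tokens.append(tok)
            properties.append(prop)
    return {"token": tokens, "property": properties}

def vertical(entity):
    """One field per property."""
    return {prop: value.split() for prop, value in entity.items()}

def reduced_vertical(entity, weight_class):
    """One field per property weight class, merging properties of equal weight."""
    fields = {}
    for prop, value in entity.items():
        fields.setdefault(weight_class[prop], []).extend(value.split())
    return fields
```

The vertical layout needs as many fields as there are properties, which can be very large for Web data; grouping properties by weight keeps the number of fields small while still letting the ranker treat them differently.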
Retrieval-based approach
 Ranking based on BM25F
› R. Blanco, P. Mika, S. Vigna: Effective and Efficient Entity Search in RDF Data. ISWC 2011
› 42% improvement in MAP over best method in SemSearch 2010
› <100ms time for simple conjunctive queries
 Open source implementation and demo using WebDataCommons data
› glimmer.research.yahoo.com
› https://github.com/yahoo/Glimmer/
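For readers unfamiliar with BM25F, the sketch below shows a basic textbook form of the function: per-field term frequencies are length-normalized, weighted and summed into one pseudo-frequency, which is then saturated and multiplied by an IDF term. It is not the code from the cited paper or from Glimmer. The single `b` shared by all fields, the IDF variant and all parameter values are simplifying assumptions.

```python
import math

def bm25f_score(query_terms, doc, field_weights, avg_len, doc_freq, n_docs,
                k1=1.2, b=0.75):
    """Score one document with a basic form of BM25F.

    doc: field name -> list of tokens
    field_weights: field name -> weight (boost)
    avg_len: field name -> average field length in the collection
    doc_freq: term -> number of documents containing the term
    """
    score = 0.0
    for term in query_terms:
        # Combine per-field term frequencies into one weighted pseudo-frequency.
        pseudo_tf = 0.0
        for field, tokens in doc.items():
            tf = tokens.count(term)
            if tf == 0:
                continue
            norm = 1 - b + b * len(tokens) / avg_len[field]
            pseudo_tf += field_weights[field] * tf / norm
        if pseudo_tf == 0:
            continue
        n = doc_freq.get(term, 0)
        # A non-negative IDF variant (an assumption; other variants exist).
        idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score

# Hypothetical documents and collection statistics.
doc_a = {"name": ["brad", "pitt"], "comment": ["actor"]}
doc_b = {"name": ["someone"], "comment": ["met", "brad", "pitt", "once"]}
weights = {"name": 3.0, "comment": 1.0}
avg_len = {"name": 2.0, "comment": 3.0}
doc_freq = {"brad": 2, "pitt": 2}

score_a = bm25f_score(["brad", "pitt"], doc_a, weights, avg_len, doc_freq, n_docs=10)
score_b = bm25f_score(["brad", "pitt"], doc_b, weights, avg_len, doc_freq, n_docs=10)
```

With these weights, a match in the `name` field counts for more than the same match in `comment`, so `doc_a` outranks `doc_b`.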
[Figure: MapReduce indexing pipeline, from documents (Doc) through map and reduce stages to the index]
Entity linking approach
 Large-scale entity/alias dictionaries
› Alias mining from usage data, Wikipedia etc.
 Dynamic segmentation
 Novel method for scoring alias matches
› Completely unsupervised
› Combination of
• Keyphraseness: how likely is a segment to be an entity mention?
• Commonness: How likely that a linked segment refers to a particular entity?
• Context-model based on word2vec representation
› Roi Blanco, Giuseppe Ottaviano and Edgar Meij.
Fast and space-efficient entity linking in queries. WSDM 2015
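To illustrate how keyphraseness, commonness and dynamic segmentation fit together, here is a simplified sketch. It is not the Fast Entity Linker: the alias counts are invented, the score is simply keyphraseness times commonness, the segmentation maximizes the sum of segment scores, and the word2vec context model is left out entirely.

```python
# Hypothetical alias statistics (all counts invented for illustration).
# occurrences: how often the segment appears in text;
# linked: how often it appears as a mention of some entity;
# targets: how often a linked occurrence refers to each entity.
ALIASES = {
    "brad pitt": {"occurrences": 1000, "linked": 900,
                  "targets": {"Brad_Pitt_(actor)": 880, "Brad_Pitt_(boxer)": 20}},
    "brad": {"occurrences": 5000, "linked": 250,
             "targets": {"Brad_Pitt_(actor)": 150, "Brad,_Texas": 100}},
    "pitt": {"occurrences": 4000, "linked": 400,
             "targets": {"University_of_Pittsburgh": 300, "Brad_Pitt_(actor)": 100}},
}

def best_link(segment):
    """Return (score, entity) for a segment, or (0.0, None) if it is not an alias."""
    stats = ALIASES.get(segment)
    if stats is None:
        return 0.0, None
    keyphraseness = stats["linked"] / stats["occurrences"]
    entity, count = max(stats["targets"].items(), key=lambda kv: kv[1])
    commonness = count / stats["linked"]
    return keyphraseness * commonness, entity

def link_query(query):
    """Pick the segmentation whose linked segments have the highest total score."""
    tokens = query.lower().split()
    n = len(tokens)
    # best[i] = (total score, links) for the first i tokens.
    best = [(0.0, [])] + [None] * n
    for end in range(1, n + 1):
        for start in range(end):
            segment = " ".join(tokens[start:end])
            score, entity = best_link(segment)
            total = best[start][0] + score
            links = best[start][1] + ([(segment, entity)] if entity else [])
            if best[end] is None or total > best[end][0]:
                best[end] = (total, links)
    return best[n][1]
```

In this toy data, "brad pitt" as one segment scores far higher than "brad" and "pitt" linked separately, so `link_query("brad pitt height")` links the whole phrase and leaves "height" as unlinked intent.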
Results: effectiveness
 Significant improvement over external baselines and internal system
› Measured on public Webscope dataset Yahoo Search Query Log to Entities
[Figure: effectiveness of the compared systems: search over Bing (top Wikipedia result); state-of-the-art in literature; a trivial search engine over Wikipedia; our method, Fast Entity Linker (FEL); FEL + context]
 Two orders of magnitude faster
than state-of-the-art
› Simplifying assumptions at scoring time
› Adding context independently
› Dynamic pruning
 Small memory footprint
› Compression techniques, e.g. 10x
reduction in word2vec storage
Results: efficiency
Related entity recommendations
 Some users are short on time
› Need for direct answers
› Query expansion, question-answering, information boxes, rich results…
 Other users have time on their hands
› Long term interests such as sports, celebrities, movies and music
› Long running tasks such as travel planning
Example user sessions
Spark system for related entity recommendations
[Figure: Spark pipeline. Components: entity graph, data preprocessing, feature extraction, model learning, feature sources, editorial judgements, datapack, ranking model, ranking and disambiguation, entity data, features.]
Machine learned ranking
 Features from the Knowledge Graph and large-scale text sources
› Unary
• Popularity features from text: probability, entropy, wiki id popularity …
• Graph features: PageRank on the entity graph, wikipedia, web graph
• Type features: entity type
› Binary
• Co-occurrence features from text: conditional probability, joint probability …
• Graph features: common neighbors …
• Type features: relation type
 Regression model using Gradient Boosted Decision Trees (GBDT)
› Trained on editorial data (cf. clicks)
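As an illustration of the unary and binary text features listed above, the sketch below derives probability, joint probability and conditional probability from co-occurrence counts. The session data is invented and the feature names are hypothetical; the actual Spark feature set and its sources are not specified beyond the slide.

```python
from collections import Counter
from itertools import combinations

# Hypothetical user sessions: each is the set of entities seen in one session.
sessions = [
    {"Brad_Pitt", "Angelina_Jolie"},
    {"Brad_Pitt", "Angelina_Jolie", "George_Clooney"},
    {"Brad_Pitt", "Fight_Club"},
    {"George_Clooney"},
]

def cooccurrence_features(sessions):
    """Return a function computing unary and binary features for an entity pair."""
    n = len(sessions)
    unary = Counter(e for s in sessions for e in s)
    pairs = Counter(
        frozenset(p) for s in sessions for p in combinations(sorted(s), 2)
    )

    def features(a, b):
        p_a = unary[a] / n
        joint = pairs[frozenset((a, b))] / n
        return {
            "p_a": p_a,                                   # unary: popularity of a
            "p_b": unary[b] / n,                          # unary: popularity of b
            "joint": joint,                               # binary: P(a, b)
            "cond_b_given_a": joint / p_a if p_a else 0.0,  # binary: P(b | a)
        }

    return features

features = cooccurrence_features(sessions)
f = features("Brad_Pitt", "Angelina_Jolie")
```

Feature vectors of this kind, paired with editorial relevance labels, are the sort of input a gradient boosted regression model would be trained on; the training step itself is omitted here.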
Evaluation
1. 10-fold cross-validation
2. Side-by-side testing
› More appropriate for judging sets of results
• “Blondie and Mickey Gilley are 70’s performers and do not belong on a list of 60’s musicians.”
3. Online evaluation (bucket testing)
› Small % of search traffic redirected to test system, another small % to the baseline system
› Data collection over at least a week, looking for statistically significant differences that are also stable over time
› Metrics
• Coverage and Click-through Rate (CTR)
• Searches per browser-cookie (SPBC)
• Other key metrics should not be impacted negatively, e.g. abandonment and retry rate, Daily Active Users (DAU), Revenue Per Search (RPS), etc.
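One common way to check whether a CTR difference between a baseline bucket and a test bucket is statistically significant is a two-proportion z-test. The slide does not say which test was used, so the choice of test and the counts below are assumptions for illustration only.

```python
import math

def ctr(clicks, impressions):
    """Click-through rate: clicks divided by impressions."""
    return clicks / impressions if impressions else 0.0

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Return (z, two-sided p-value) for the difference between two CTRs."""
    p_a, p_b = ctr(clicks_a, n_a), ctr(clicks_b, n_b)
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical bucket counts, invented for illustration:
# bucket A is the baseline system, bucket B the test system.
z, p_value = two_proportion_z(clicks_a=500, n_a=10000, clicks_b=600, n_b=10000)
```

A single significant result is not enough on its own; as the slide notes, the difference should also remain stable over the collection period.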
Click-through rate (CTR) before and after the new system
[Figure: CTR plotted over days 0 to 80, with series "CTR before Spark", "Trend before Spark", "CTR after Spark" and "Trend after Spark", and a marker where Spark is deployed in production. Before release: gradually degrading performance due to lack of fresh data. After release: learning effect, users are starting to use the tool again.]
What’s next?
Summary
 Information Retrieval
› Reached the limits of the ad-hoc text retrieval paradigm
› Needs to go beyond syntactic representations
 Semantic Web
› Provides means for knowledge representation and reasoning across the Web
› Adoption has been slow, but picking up steadily
 Applications in Web Search
› Entity-based experiences
• Rich results, information boxes and related entities
› Question-answering
Search needs even more intelligence
 Representation
› Modeling the World, not just what is on the Web
› Modeling personal information and preferences
› Modeling of intents (actions that can be taken on the World)
 Understanding
› Need better understanding of context
› User profile, history and current state
 Retrieval
› (Guided) interaction
› Predictive search
Q&A
 Many thanks to members of the Semantic Search team
at Yahoo Labs London and to Yahoos around the world
 Contact me
› pmika@yahoo-inc.com
› @pmika
› http://www.slideshare.net/pmika/

More Related Content

What's hot

Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Peter Mika
 
SemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorialSemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorialPeter Mika
 
Social Networks and the Semantic Web: a retrospective of the past 10 years
Social Networks and the Semantic Web: a retrospective of the past 10 yearsSocial Networks and the Semantic Web: a retrospective of the past 10 years
Social Networks and the Semantic Web: a retrospective of the past 10 yearsPeter Mika
 
From Queries to Answers in the Web
From Queries to Answers in the WebFrom Queries to Answers in the Web
From Queries to Answers in the WebRoi Blanco
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataRoi Blanco
 
Mining Web content for Enhanced Search
Mining Web content for Enhanced Search Mining Web content for Enhanced Search
Mining Web content for Enhanced Search Roi Blanco
 
Building Knowledge Graphs in DIG
Building Knowledge Graphs in DIGBuilding Knowledge Graphs in DIG
Building Knowledge Graphs in DIGPalak Modi
 
Implementing Semantic Search
Implementing Semantic SearchImplementing Semantic Search
Implementing Semantic SearchPaul Wlodarczyk
 
Building and Using a Knowledge Graph to Combat Human Trafficking
Building and Using a Knowledge Graph to Combat Human TraffickingBuilding and Using a Knowledge Graph to Combat Human Trafficking
Building and Using a Knowledge Graph to Combat Human TraffickingCraig Knoblock
 
Inspiration Architecture: The Future of Libraries
Inspiration Architecture: The Future of LibrariesInspiration Architecture: The Future of Libraries
Inspiration Architecture: The Future of LibrariesPeter Morville
 
The Social Semantic Web
The Social Semantic WebThe Social Semantic Web
The Social Semantic WebJohn Breslin
 
Haystack 2019 - Natural Language Search with Knowledge Graphs - Trey Grainger
Haystack 2019 - Natural Language Search with Knowledge Graphs - Trey GraingerHaystack 2019 - Natural Language Search with Knowledge Graphs - Trey Grainger
Haystack 2019 - Natural Language Search with Knowledge Graphs - Trey GraingerOpenSource Connections
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...John Breslin
 
Smartlogic, Semaphore and Semantically Enhanced Search – For “Discovery”
Smartlogic, Semaphore and Semantically Enhanced Search –  For “Discovery”Smartlogic, Semaphore and Semantically Enhanced Search –  For “Discovery”
Smartlogic, Semaphore and Semantically Enhanced Search – For “Discovery”VOGIN-academie
 
Tutorial: Social Semantics
Tutorial: Social SemanticsTutorial: Social Semantics
Tutorial: Social SemanticsMatthew Rowe
 
Lecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebLecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebMarina Santini
 
Wimmics Research Team 2015 Activity Report
Wimmics Research Team 2015 Activity ReportWimmics Research Team 2015 Activity Report
Wimmics Research Team 2015 Activity ReportFabien Gandon
 
What IA, UX and SEO Can Learn from Each Other
What IA, UX and SEO Can Learn from Each OtherWhat IA, UX and SEO Can Learn from Each Other
What IA, UX and SEO Can Learn from Each OtherIan Lurie
 

What's hot (19)

Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012
 
SemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorialSemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorial
 
Social Networks and the Semantic Web: a retrospective of the past 10 years
Social Networks and the Semantic Web: a retrospective of the past 10 yearsSocial Networks and the Semantic Web: a retrospective of the past 10 years
Social Networks and the Semantic Web: a retrospective of the past 10 years
 
From Queries to Answers in the Web
From Queries to Answers in the WebFrom Queries to Answers in the Web
From Queries to Answers in the Web
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Mining Web content for Enhanced Search
Mining Web content for Enhanced Search Mining Web content for Enhanced Search
Mining Web content for Enhanced Search
 
Building Knowledge Graphs in DIG
Building Knowledge Graphs in DIGBuilding Knowledge Graphs in DIG
Building Knowledge Graphs in DIG
 
Implementing Semantic Search
Implementing Semantic SearchImplementing Semantic Search
Implementing Semantic Search
 
Building and Using a Knowledge Graph to Combat Human Trafficking
Building and Using a Knowledge Graph to Combat Human TraffickingBuilding and Using a Knowledge Graph to Combat Human Trafficking
Building and Using a Knowledge Graph to Combat Human Trafficking
 
Inspiration Architecture: The Future of Libraries
Inspiration Architecture: The Future of LibrariesInspiration Architecture: The Future of Libraries
Inspiration Architecture: The Future of Libraries
 
The Social Semantic Web
The Social Semantic WebThe Social Semantic Web
The Social Semantic Web
 
Haystack 2019 - Natural Language Search with Knowledge Graphs - Trey Grainger
Haystack 2019 - Natural Language Search with Knowledge Graphs - Trey GraingerHaystack 2019 - Natural Language Search with Knowledge Graphs - Trey Grainger
Haystack 2019 - Natural Language Search with Knowledge Graphs - Trey Grainger
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...
 
Smartlogic, Semaphore and Semantically Enhanced Search – For “Discovery”
Smartlogic, Semaphore and Semantically Enhanced Search –  For “Discovery”Smartlogic, Semaphore and Semantically Enhanced Search –  For “Discovery”
Smartlogic, Semaphore and Semantically Enhanced Search – For “Discovery”
 
Tutorial: Social Semantics
Tutorial: Social SemanticsTutorial: Social Semantics
Tutorial: Social Semantics
 
Lecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic WebLecture: Ontologies and the Semantic Web
Lecture: Ontologies and the Semantic Web
 
Wimmics Research Team 2015 Activity Report
Wimmics Research Team 2015 Activity ReportWimmics Research Team 2015 Activity Report
Wimmics Research Team 2015 Activity Report
 
Web Mining
Web MiningWeb Mining
Web Mining
 
What IA, UX and SEO Can Learn from Each Other
What IA, UX and SEO Can Learn from Each OtherWhat IA, UX and SEO Can Learn from Each Other
What IA, UX and SEO Can Learn from Each Other
 

Similar to (Keynote) Peter Mika - “Making the Web Searchable”

Semantic mark-up with schema.org: helping search engines understand the Web
Semantic mark-up with schema.org: helping search engines understand the WebSemantic mark-up with schema.org: helping search engines understand the Web
Semantic mark-up with schema.org: helping search engines understand the WebPeter Mika
 
Understanding Queries through Entities
Understanding Queries through EntitiesUnderstanding Queries through Entities
Understanding Queries through EntitiesPeter Mika
 
Semantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsSemantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsPeter Mika
 
Search and social patents for 2012 and beyond
Search and social patents for 2012 and beyondSearch and social patents for 2012 and beyond
Search and social patents for 2012 and beyondBill Slawski
 
Knowledge Integration in Practice
Knowledge Integration in PracticeKnowledge Integration in Practice
Knowledge Integration in PracticePeter Mika
 
Semantic Web: introduction & overview
Semantic Web: introduction & overviewSemantic Web: introduction & overview
Semantic Web: introduction & overviewAmit Sheth
 
Bearish SEO: Defining the User Experience for Google’s Panda Search Landscape
Bearish SEO: Defining the User Experience for Google’s Panda Search LandscapeBearish SEO: Defining the User Experience for Google’s Panda Search Landscape
Bearish SEO: Defining the User Experience for Google’s Panda Search LandscapeMarianne Sweeny
 
Semtech bizsemanticsearchtutorial
Semtech bizsemanticsearchtutorialSemtech bizsemanticsearchtutorial
Semtech bizsemanticsearchtutorialBarbara Starr
 
Spivack Blogtalk 2008
Spivack Blogtalk 2008Spivack Blogtalk 2008
Spivack Blogtalk 2008Blogtalk 2008
 
Rapid Data Exploration With Hadoop
Rapid Data Exploration With HadoopRapid Data Exploration With Hadoop
Rapid Data Exploration With HadoopPeter Skomoroch
 
Driving Employee Engagement Through A Social Intranet - Federal Communicators...
Driving Employee Engagement Through A Social Intranet - Federal Communicators...Driving Employee Engagement Through A Social Intranet - Federal Communicators...
Driving Employee Engagement Through A Social Intranet - Federal Communicators...Federal Communicators Network
 
Nova Spivack - Semantic Web Talk
Nova Spivack - Semantic Web TalkNova Spivack - Semantic Web Talk
Nova Spivack - Semantic Web Talksyawal
 
Large-Scale Semantic Search
Large-Scale Semantic SearchLarge-Scale Semantic Search
Large-Scale Semantic SearchRoi Blanco
 
Entity-Centric Data Management
Entity-Centric Data ManagementEntity-Centric Data Management
Entity-Centric Data ManagementeXascale Infolab
 
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Connotate
 
Brave new search world
Brave new search worldBrave new search world
Brave new search worldvoginip
 
3 Understanding Search
3 Understanding Search3 Understanding Search
3 Understanding Searchmasiclat
 
Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science TJ Stalcup
 
Search Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignSearch Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignMarianne Sweeny
 

Similar to (Keynote) Peter Mika - “Making the Web Searchable” (20)

Semantic mark-up with schema.org: helping search engines understand the Web
Semantic mark-up with schema.org: helping search engines understand the WebSemantic mark-up with schema.org: helping search engines understand the Web
Semantic mark-up with schema.org: helping search engines understand the Web
 
Understanding Queries through Entities
Understanding Queries through EntitiesUnderstanding Queries through Entities
Understanding Queries through Entities
 
Semantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistantsSemantic search: from document retrieval to virtual assistants
Semantic search: from document retrieval to virtual assistants
 
Search and social patents for 2012 and beyond
Search and social patents for 2012 and beyondSearch and social patents for 2012 and beyond
Search and social patents for 2012 and beyond
 
Knowledge Integration in Practice
Knowledge Integration in PracticeKnowledge Integration in Practice
Knowledge Integration in Practice
 
Semantic Web: introduction & overview
Semantic Web: introduction & overviewSemantic Web: introduction & overview
Semantic Web: introduction & overview
 
Bearish SEO: Defining the User Experience for Google’s Panda Search Landscape
Bearish SEO: Defining the User Experience for Google’s Panda Search LandscapeBearish SEO: Defining the User Experience for Google’s Panda Search Landscape
Bearish SEO: Defining the User Experience for Google’s Panda Search Landscape
 
Semtech bizsemanticsearchtutorial
Semtech bizsemanticsearchtutorialSemtech bizsemanticsearchtutorial
Semtech bizsemanticsearchtutorial
 
Spivack Blogtalk 2008
Spivack Blogtalk 2008Spivack Blogtalk 2008
Spivack Blogtalk 2008
 
Rapid Data Exploration With Hadoop
Rapid Data Exploration With HadoopRapid Data Exploration With Hadoop
Rapid Data Exploration With Hadoop
 
Driving Employee Engagement Through A Social Intranet - Federal Communicators...
Driving Employee Engagement Through A Social Intranet - Federal Communicators...Driving Employee Engagement Through A Social Intranet - Federal Communicators...
Driving Employee Engagement Through A Social Intranet - Federal Communicators...
 
Nova Spivack - Semantic Web Talk
Nova Spivack - Semantic Web TalkNova Spivack - Semantic Web Talk
Nova Spivack - Semantic Web Talk
 
Not Your Mom's SEO
Not Your Mom's SEONot Your Mom's SEO
Not Your Mom's SEO
 
Large-Scale Semantic Search
Large-Scale Semantic SearchLarge-Scale Semantic Search
Large-Scale Semantic Search
 
Entity-Centric Data Management
Entity-Centric Data ManagementEntity-Centric Data Management
Entity-Centric Data Management
 
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...
 
Brave new search world
Brave new search worldBrave new search world
Brave new search world
 
3 Understanding Search
3 Understanding Search3 Understanding Search
3 Understanding Search
 
Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science
 
Search Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignSearch Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By Design
 

(Keynote) Peter Mika - “Making the Web Searchable”

  • 1. Making the Web Searchable. Presented by Peter Mika, Yahoo Labs ⎪ June 25, 2015
  • 2. Agenda  Web Search › How it works… and where it fails  Semantic Web › The promise and the reality  Semantic Search › Research and Applications at Yahoo  What’s next? › More intelligence!
  • 3. Search is really fast, but not particularly intelligent
  • 4. What it’s like to be a machine? Roi Blanco
  • 5. What it’s like to be a machine? ↵⏏☐ģ ✜Θ♬♬ţğ√∞§®ÇĤĪ✜★♬☐✓✓ ţğ★✜ ✪✚✜ΔΤΟŨŸÏĞÊϖυτρ℠≠⅛⌫ ≠=⅚©§★✓♪ΒΓΕ℠ ✖Γ♫⅜±⏎↵⏏☐ģğğğμλκσςτ ⏎⌥°¶§ΥΦΦΦ✗✕☐ ⏎↵⏏☐ģğğğ
  • 7. Search needs (more) intelligence  Current paradigm: ad-hoc document retrieval › Set of documents that each may fulfill the information need › Relevance of those documents is clearly established by • Textual similarity • Authority  Queries outside this paradigm are hard › Semantic gap between the query and the content • Queries that require a deeper understanding of the world at large › Complex needs where no document contains the answer  Analysis of aggregate behavior is a challenge › We may answer two queries perfectly, without knowing how they are related
  • 8. Examples of hard queries  Semantic gap › Ambiguity • jaguar • paris hilton › Secondary meaning • george bush (and I mean the beer brewer in Arizona) › Subjectivity • reliable digital camera • paris hilton sexy › Imprecise or overly precise searches • jim hendler  Complex needs › Missing information • brad pitt zombie • florida man with 115 guns • 35 year old computer scientist living in barcelona › Category queries • countries in africa • barcelona nightlife › Transactional or computational queries • 120 dollars in euros • digital camera under 300 dollars • world temperature in 2020 Are there even any true keyword queries? Users may have stopped asking them
  • 10. Enter the Semantic Web  A social-technical ecosystem where the meaning of content is explicit and shared among agents (humans and machines) › Shared identifiers for real-world entities › Standards for exchanging structured data • Data modeled as a graph › Shared formal, schema languages • Names of entity and relationship types • Constraints on the entities, relationships and attributes
  • 11. Huge expectations  “At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules.” › The Semantic Web. Tim Berners-Lee, James Hendler, Ora Lassila. Appeared in: Scientific American 284(5):34-43 (May 2001)
  • 12. In a perfect world, the Semantic Web is the end-game for IR #ROI_BLANCO #ROI_BLANCO #ROI_BLANCO
  • 13. The view from IR: skepticism  Mismatch to the IR problem › End-users • Do not know the identifiers of things • Not aware of the schema of the data • Can’t be expected to learn complex query languages › Focus on document retrieval • Limited recognition of the value of direct answers (initially)  “Who is going to do the annotation?” › Focus still largely on text › Automated annotation/extraction tools produce poor results  Early experiments using external knowledge are unsuccessful › Query/document expansion using thesauri etc.
  • 14. The Semantic Web off to a slow start  Complex technology › Set of standards sold as a stack • URI, RDF, RDF/XML, RDFa, JSON-LD, OWL, RIF, SPARQL, OWL-S, POWDER … › Not very developer friendly  Ideals of knowledge representation hard to enforce on the Web  No clear value proposition › Chicken and egg problem • No users/use cases, hence no data • No data, because no users/use cases
  • 15. … and became practical  Simplified technology › Fewer, more developer friendly representations › Focusing on the lower layers of the stack › Data first, schemas/logic second  Giving up ideals about knowledge representation › Shared identifiers › Logical consistency › Distinction between real-world entities and web resources  Motivation for adoption › Search engines, Facebook, Pinterest, Twitter etc. investing in information extraction
  • 16. Two important achievements  Linked Open Data › Social movement to (re)publish existing datasets • 100B+ triples of data • Encyclopedic, governmental, geo, scientific datasets • Impact: background knowledge – Basis for knowledge graphs in search engines  Metadata inside HTML pages › Facebook’s OGP and schema.org • Over 15% of all pages have schema.org markup (2013) • Personal information, images, videos, reviews, recipes etc. • Impact: remove the need for automated extraction
  • 19. Caveat: not your perfect Semantic Web  Outdated, incorrect or incomplete data › Lack of write access or feedback mechanisms  Mistakes made by tools › Noisy information extraction › Entity linking (reconciliation)  Limited or no reuse of identifiers  Metadata not always representative of content
  • 20. Semantic Search at Yahoo
  • 21. Semantic Search research (2007-)  Emergence of the Semantic Search field › Intersection of IR, NLP, DB and SemWeb • ESAIR at SIGIR • SemSearch at ESWC/WWW • EOS and JIWES at SIGIR • Semantic Search at VLDB  Exploiting semantic understanding in the retrieval process › User intent and resources are represented using semantic models • Semantic models typically differ across NLP, DB and Semantic Web › Semantic models are exploited in the matching and ranking of resources
  • 22. Semantic Search – a process view Query Construction •Keywords •Forms •NL •Formal language Query Processing •IR-style matching & ranking •DB-style precise matching •KB-style matching & inferences Result Presentation •Query visualization •Document and data presentation •Summarization Query Refinement •Implicit feedback •Explicit feedback •Incentives Document Representation Knowledge Representation Semantic Models Resources Documents
  • 23. Result presentation using metadata Personal and private homepage of the same person (clear from the snippet but it could be also automatically de-duplicated) Conferences he plans to attend and his vacations from homepage plus bio events from LinkedIn Geolocation “Microsearch” internal prototype (2007)
  • 24. Yahoo SearchMonkey (2008) 1. Extract structured data › Semantic Web markup • Example: <span property="vcard:city">Santa Clara</span> <span property="vcard:region">CA</span> › Information Extraction 2. Presentation › Fixed presentation templates • One template per object type › Applications • Third-party modules to display data (SearchMonkey)
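The extraction step on the SearchMonkey slide can be illustrated with a minimal Python sketch that pulls (property, value) pairs out of markup like the vcard example. This is an assumption-laden illustration, not SearchMonkey's implementation: the class `PropertyExtractor` and the function `extract_properties` are hypothetical names, the sketch assumes well-formed markup in which every element is closed, and a real RDFa processor would also resolve prefixes, nesting and `content` attributes.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, text) pairs from elements that carry a `property` attribute.

    Hypothetical helper for illustration; assumes every opened element is closed.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._stack = []  # one entry per open element: its property name or None

    def handle_starttag(self, tag, attrs):
        self._stack.append(dict(attrs).get("property"))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Only keep text that sits directly inside an annotated element.
        if text and self._stack and self._stack[-1]:
            self.pairs.append((self._stack[-1], text))


def extract_properties(html):
    parser = PropertyExtractor()
    parser.feed(html)
    return parser.pairs


snippet = ('<span property="vcard:city">Santa Clara</span> '
           '<span property="vcard:region">CA</span>')
print(extract_properties(snippet))
```

The resulting pairs are the kind of structured data that a fixed presentation template per object type could then render.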
  • 25. Effectiveness of enhanced results  Explicit user feedback › Side-by-side editorial evaluation (A/B testing) • Editors are shown a traditional search result and enhanced result for the same page • Users prefer enhanced results in 84% of the cases and traditional results in 3% (N=384)  Implicit user feedback › Click-through rate analysis • Long dwell time limit of 100s (Ciemiewicz et al. 2010) • 15% increase in ‘good’ clicks › User interaction model • Enhanced results lead users to relevant documents (IV) even though they are less likely to be clicked than textual results (III) • Enhanced results effectively reduce bad clicks!  See › Kevin Haas, Peter Mika, Paul Tarjan, Roi Blanco: Enhanced results for web search. SIGIR 2011: 725-734
  • 26. Adoption among consumers of web content  Google announces Rich Snippets - June, 2009 › Faceted search for recipes - Feb, 2011  Bing tiles – Feb, 2011  Facebook’s Like button and the Open Graph Protocol (2010) › Shows up in profiles and news feed › Site owners can later reach users who have liked an object
  • 27. schema.org  Collaborative effort sponsored by large consumers of Web data › Bing, Google, and Yahoo! as initial founders (June, 2011) › Yandex joins schema.org in Nov, 2011  Agreement on a shared set of schemas for the Web › Available at schema.org in HTML and machine readable formats › Free to use under W3C Royalty Free terms
  • 28. Yahoo’s Knowledge Graph. [Figure: example entity graph with nodes Chicago Cubs, Chicago, Barack Obama, Carlos Zambrano, Brad Pitt, Angelina Jolie, Steven Soderbergh, George Clooney, Ocean’s Twelve, E/R, Fight Club, Dust Brothers and a “10% off tickets” offer; edge labels: for, plays for, plays in, lives in, partner, directs, casts in, takes place in, music by] Nicolas Torzec: Making knowledge reusable at Yahoo!: a Look at the Yahoo! Knowledge Base (SemTech 2013)
  • 29. Building the Knowledge Graph  Information extraction › Automated information extraction • e.g. wrapper induction › Metadata from HTML pages • Focused crawler › Public datasets (e.g. DBpedia) › Proprietary data  Data fusion › Manual mapping from the source schemas to the ontology › Supervised entity reconciliation  Ontology management › Editorially maintained OWL ontology with 300+ classes › Covering the domains of interest of Yahoo  Curation and quality assessment › Editors and user feedback still play a large role Bellare et al.: WOO: A Scalable and Multi-tenant Platform for Continuous Knowledge Base Synthesis. PVLDB 2013 Welch et al.: Fast and accurate incremental entity resolution relative to an entity knowledge base. CIKM 2012
  • 33. Entity displays in web search  Entity linking/entity retrieval › Identifying the most relevant entity to the query  Entity recommendation › Given that the user is interested in one entity, which entity to recommend next? Roi Blanco, Berkant Barla Cambazoglu, Peter Mika, Nicolas Torzec: Entity Recommendations in Web Search. ISWC 2013
  • 34. The importance of entities  Entity mention query = <entity> {+ <intent>} › ~70% of queries contain a named entity (entity mention queries) • brad pitt height › ~50% of queries have an entity focus (entity seeking queries) • brad pitt attacked by fans › ~10% of queries are looking for a class of entities • brad pitt movies › Jeffrey Pound, Peter Mika, Hugo Zaragoza: Ad-hoc object retrieval in the web of data. WWW 2010: 771-780  Intent is typically an additional word or phrase • Disambiguate, most often by type e.g. brad pitt actor • Specify action or aspect e.g. brad pitt net worth, toy story trailer brad pitt height how tall is tall …
  • 35. Two broad approaches to entity retrieval  Inverted index › Inspired by text retrieval • Match individual keywords • Score and aggregate  Parsing › Inspired by text parsing • Find potential mentions of entities (spots) in query • Score candidates for each spot [Figure: candidate entity types such as (actor), (boxer), (city) and (lake) for the spots “brad”, “pitt” and “brad pitt”]
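The parsing approach on slide 35 (find spots in the query, then score candidates per spot) can be sketched in Python as follows. Everything here is illustrative: the alias dictionary `ALIASES`, its entity identifiers and its scores are made-up toy data, and `find_spots` is a hypothetical helper rather than any system described in the talk.

```python
# Toy alias dictionary (assumption): alias -> {entity id: prior score}.
ALIASES = {
    "brad pitt": {"Brad_Pitt_(actor)": 0.95, "Brad_Pitt_(boxer)": 0.05},
    "brad": {"Brad_Pitt_(actor)": 0.7, "Brad_(city)": 0.3},
    "pitt": {"Pitt_(lake)": 0.6, "Brad_Pitt_(actor)": 0.4},
}


def find_spots(query, aliases=ALIASES):
    """Return every token n-gram of the query that is a known alias,
    together with its candidate entities ranked by score (best first)."""
    tokens = query.lower().split()
    spots = []
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            span = " ".join(tokens[start:end])
            if span in aliases:
                ranked = sorted(aliases[span].items(),
                                key=lambda kv: kv[1], reverse=True)
                spots.append((span, ranked))
    return spots


for span, candidates in find_spots("brad pitt height"):
    print(span, "->", candidates)
```

In this toy run the leftover token "height" matches no alias, which corresponds to the intent part of an entity mention query.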
  • 36. Retrieval-based approach  Experimented with different index structures › Horizontal: one field for text and one for property name › Vertical: One field per property › Combination: one field per property weight (best performance in both AND/OR mode) Horizontal Vertical R-Vertical
  • 37. Retrieval-based approach  Ranking based on BM25F › R. Blanco, P. Mika, S. Vigna: Effective and Efficient Entity Search in RDF Data. ISWC 2011 › 42% improvement in MAP over best method in SemSearch 2010 › <100ms time for simple conjunctive queries  Open source implementation and demo using WebDataCommons data › glimmer.research.yahoo.com › https://github.com/yahoo/Glimmer/ [Figure: MapReduce indexing pipeline from documents through map and reduce stages to the index]
  • 38. Entity linking approach  Large-scale entity/alias dictionaries › Alias mining from usage data, Wikipedia etc.  Dynamic segmentation  Novel method for scoring alias matches › Completely unsupervised › Combination of • Keyphraseness: how likely is a segment to be an entity mention? • Commonness: how likely is it that a linked segment refers to a particular entity? • Context-model based on word2vec representation › Roi Blanco, Giuseppe Ottaviano and Edgar Meij. Fast and space-efficient entity linking in queries. WSDM 2015
  • 39. Results: effectiveness  Significant improvement over external baselines and internal system › Measured on public Webscope dataset Yahoo Search Query Log to Entities › Chart annotations: Search over Bing, top Wikipedia result; State-of-the-art in literature; A trivial search engine over Wikipedia; Our method: Fast Entity Linker (FEL); FEL + context
  • 40. Results: efficiency  Two orders of magnitude faster than state-of-the-art › Simplifying assumptions at scoring time › Adding context independently › Dynamic pruning  Small memory footprint › Compression techniques, e.g. 10x reduction in word2vec storage
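To make the BM25F ranking on slide 37 concrete, here is a minimal Python sketch of the textbook BM25F formulation: per-field term frequencies are length-normalized, combined with field weights into a pseudo-frequency, and then saturated. It is a generic sketch, not the configuration from the cited ISWC 2011 paper; the function `bm25f_score`, the field names, the weights, the toy documents and the parameter values `k1` and `b` are all assumptions, and a single `b` is shared across fields for brevity.

```python
import math


def bm25f_score(query_terms, doc, field_weights, avg_len, doc_freq, n_docs,
                k1=1.2, b=0.75):
    """Score one fielded document (field -> list of tokens) for a query."""
    score = 0.0
    for term in query_terms:
        pseudo_tf = 0.0
        for field, weight in field_weights.items():
            tokens = doc.get(field, [])
            tf = tokens.count(term)
            if tf == 0:
                continue
            # Length normalization relative to the average length of this field.
            norm = 1.0 - b + b * len(tokens) / avg_len[field]
            pseudo_tf += weight * tf / norm
        if pseudo_tf == 0.0:
            continue
        df = doc_freq.get(term, 0)
        # Non-negative idf variant (assumption; other variants exist).
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score


# Toy entity "documents" with one field per property (assumed data).
docs = {
    "e1": {"name": ["brad", "pitt"], "description": ["american", "actor"]},
    "e2": {"name": ["fight", "club"],
           "description": ["film", "starring", "brad", "pitt"]},
}
field_weights = {"name": 3.0, "description": 1.0}
avg_len = {f: sum(len(d[f]) for d in docs.values()) / len(docs)
           for f in field_weights}
doc_freq = {"brad": 2, "pitt": 2}

scores = {eid: bm25f_score(["brad", "pitt"], d, field_weights, avg_len,
                           doc_freq, len(docs))
          for eid, d in docs.items()}
print(sorted(scores, key=scores.get, reverse=True))
```

With a higher weight on the name field, the entity whose name matches the query outranks the one that only mentions the terms in its description.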
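The keyphraseness and commonness signals on slide 38, combined with dynamic segmentation, can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the Fast Entity Linker from the cited WSDM 2015 paper: the counts in `STATS` are invented toy numbers, the segment score is simply keyphraseness times commonness, segment scores are summed, and the word2vec context model is left out entirely.

```python
# Toy alias statistics (assumption): how often a segment was seen, how often
# it was linked to some entity, and which entities it was linked to.
STATS = {
    "brad pitt": {"seen": 100, "linked": 90,
                  "targets": {"Brad_Pitt_(actor)": 85, "Brad_Pitt_(boxer)": 5}},
    "brad": {"seen": 200, "linked": 20,
             "targets": {"Brad_Pitt_(actor)": 15, "Brad_(city)": 5}},
    "pitt": {"seen": 100, "linked": 10,
             "targets": {"Pitt_(lake)": 6, "Brad_Pitt_(actor)": 4}},
}


def best_link(segment, stats=STATS):
    """Return (entity, score) for the most common target of a segment,
    where score = keyphraseness * commonness; (None, 0.0) if unknown."""
    entry = stats.get(segment)
    if not entry or not entry["targets"]:
        return None, 0.0
    keyphraseness = entry["linked"] / entry["seen"]
    total = sum(entry["targets"].values())
    entity, count = max(entry["targets"].items(), key=lambda kv: kv[1])
    return entity, keyphraseness * (count / total)


def segment_and_link(query, stats=STATS):
    """Dynamic programming over token positions: pick the segmentation whose
    linked segments have the highest total score. Unlinked segments are
    restricted to single tokens and contribute a score of zero."""
    tokens = query.lower().split()
    best = [(0.0, [])] + [None] * len(tokens)
    for end in range(1, len(tokens) + 1):
        options = []
        for start in range(end):
            seg = " ".join(tokens[start:end])
            entity, score = best_link(seg, stats)
            if entity is None and end - start > 1:
                continue
            prev_score, prev_segs = best[start]
            options.append((prev_score + score, prev_segs + [(seg, entity)]))
        best[end] = max(options, key=lambda o: o[0])
    return best[len(tokens)][1]


print(segment_and_link("brad pitt height"))
```

Because "brad pitt" is almost always an entity mention in the toy counts, the two-token segment wins over linking "brad" and "pitt" separately.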
  • 41. Related entity recommendations  Some users are short on time › Need for direct answers › Query expansion, question-answering, information boxes, rich results…  Other users have time on their hands › Long-term interests such as sports, celebrities, movies and music › Long-running tasks such as travel planning
  • 43. Spark system for related entity recommendations [Diagram: components labelled entity graph, data preprocessing, feature extraction, model learning, feature sources, editorial judgements, datapack, ranking model, ranking and disambiguation, entity data, features]
  • 44. Machine-learned ranking  Features from the Knowledge Graph and large-scale text sources › Unary • Popularity features from text: probability, entropy, wiki id popularity … • Graph features: PageRank on the entity graph, Wikipedia, web graph • Type features: entity type › Binary • Co-occurrence features from text: conditional probability, joint probability … • Graph features: common neighbors … • Type features: relation type  Regression model using Gradient Boosted Decision Trees (GBDT) › Trained on editorial data (cf. clicks)
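Some of the binary features listed on this slide can be computed directly from counts. The sketch below assumes hypothetical input data: `sessions` stands in for a text source (sets of entities seen together) and `graph` for the entity graph. The resulting feature values would then be fed to a regression model such as GBDT, which is not shown here.

```python
def cooccurrence_features(sessions, a, b):
    """Joint and conditional probability of entity b co-occurring with entity a."""
    n = len(sessions)
    n_a = sum(1 for s in sessions if a in s)
    n_ab = sum(1 for s in sessions if a in s and b in s)
    return {
        "joint_probability": n_ab / n if n else 0.0,          # P(a, b)
        "conditional_probability": n_ab / n_a if n_a else 0.0,  # P(b | a)
    }

def common_neighbors(graph, a, b):
    """Number of nodes adjacent to both a and b in an undirected entity graph."""
    return len(graph.get(a, set()) & graph.get(b, set()))

# Hypothetical data, invented for illustration.
sessions = [
    {"Brad_Pitt", "Angelina_Jolie"},
    {"Brad_Pitt", "Fight_Club"},
    {"Brad_Pitt", "Angelina_Jolie", "Fight_Club"},
    {"Angelina_Jolie"},
]
graph = {
    "Brad_Pitt": {"Fight_Club", "Troy"},
    "Edward_Norton": {"Fight_Club"},
}

features = cooccurrence_features(sessions, "Brad_Pitt", "Angelina_Jolie")
features["common_neighbors"] = common_neighbors(graph, "Brad_Pitt", "Edward_Norton")
print(features)
```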
  • 45. Evaluation 1. 10-fold cross-validation 2. Side-by-side testing › More appropriate for judging sets of results • “Blondie and Mickey Gilley are 70’s performers and do not belong on a list of 60’s musicians.” 3. Online evaluation (bucket testing) › Small % of search traffic redirected to test system, another small % to the baseline system › Data collection over at least a week, looking for statistically significant differences that are also stable over time › Metrics • Coverage and Click-through Rate (CTR) • Searches per browser-cookie (SPBC) • Other key metrics should not be impacted negatively, e.g. Abandonment and retry rate, Daily Active Users (DAU), Revenue Per Search (RPS), etc.
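The slide does not say which statistical test is applied to the bucket data. As one common choice, a two-proportion z-test on CTR is sketched below; the click and view counts are invented, and the function name is hypothetical.

```python
import math

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for a difference in CTR between bucket A and bucket B."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Pooled CTR under the null hypothesis that both buckets share one rate.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / views_a + 1.0 / views_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical bucket counts: baseline (A) versus test system (B).
z, p_value = two_proportion_z_test(500, 10_000, 600, 10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A single significant result is not sufficient by the criteria on the slide: the difference should also remain stable when the test is repeated over successive days of data.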
  • 46. Click-through rate (CTR) before and after the new system › Before release: gradually degrading performance due to lack of fresh data › After release: learning effect, users are starting to use the tool again [Chart: CTR over days 0 to 80, with CTR and trend lines before and after Spark, and a marker where Spark is deployed in production]
  • 48. Summary  Information Retrieval › Reached the limits of the ad-hoc text retrieval paradigm › Needs to go beyond syntactic representations  Semantic Web › Provides means for knowledge representation and reasoning across the Web › Adoption has been slow, but picking up steadily  Applications in Web Search › Entity-based experiences • Rich results, information boxes and related entities › Question-answering
  • 49. Search needs even more intelligence  Representation › Modeling the World, not just what is on the Web › Modeling personal information and preferences › Modeling of intents (actions that can be taken on the World)  Understanding › Need better understanding of context › User profile, history and current state  Retrieval › (Guided) interaction › Predictive search
  • 50. Q&A  Many thanks to members of the Semantic Search team at Yahoo Labs London and to Yahoos around the world  Contact me › pmika@yahoo-inc.com › @pmika › http://www.slideshare.net/pmika/