Semantic markup with schema.org:
helping search engines understand the Web
Presented by Peter Mika, Director of Research, Yahoo Labs | March 26, 2015
Real problem
What it’s like to be a machine?
Roi Blanco
What it’s like to be a machine?
↵⏏☐ģ
✜Θ♬♬ţğ√∞§®ÇĤĪ✜★♬☐✓✓
ţğ★✜
✪✚✜ΔΤΟŨŸÏĞÊϖυτρ℠≠⅛⌫
≠=⅚©§★✓♪ΒΓΕ℠
✖Γ♫⅜±⏎↵⏏☐ģğğğμλκσςτ
⏎⌥°¶§ΥΦΦΦ✗✕☐
What can we do?
 Improve Information Retrieval
› Harder and harder given the same data
• Exploited term-based relevance models, hyperlink structure and interaction data
• Combination of features using machine learning
• Heavy investment in computational power
– real-time indexing, instant search, datacenters and edge services
 Improve the Web
› Make the Web more searchable?
The Semantic Web (2001-)
 Part of Tim Berners-Lee’s original proposal for the Web
 Beginning of a research community
› Formal ontology
› Logical reasoning
› Agents, web services
 Rough start in deployment
› Misplaced expectations
› Lack of adoption
Misplaced expectations?
 The Semantic Web, May 2001
 “At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules.”
 (The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)
Lack of adoption
 Standardization ahead of adoption
› URI, RDF, RDF/XML, RDFa, JSON-LD, OWL, RIF, SPARQL, OWL-S, POWDER …
 Chicken and egg problem
› No users/use cases, hence no data
› No data, because no users/use cases
 By 2007, some modest progress
› Metadata in HTML: microformats
› Linked Data: simplifying the stack
Microsearch internal prototype (2007)
 Personal and private homepage of the same person (clear from the snippet, but it could also be automatically de-duplicated)
 Conferences he plans to attend and his vacations from homepage, plus bio events from LinkedIn
 Geolocation
Yahoo SearchMonkey (2008)
1. Extract structured data
› Semantic Web markup
• Example:
<span property="vcard:city">Santa Clara</span>
<span property="vcard:region">CA</span>
› Information Extraction
2. Presentation
› Fixed presentation templates
• One template per object type
› Applications
• Third-party modules to display data (SearchMonkey)
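The extraction step can be sketched in a few lines. This is a minimal illustration and not SearchMonkey's actual implementation: the class name `PropertyExtractor` and the helper `extract_properties` are hypothetical, and the sketch assumes flat markup in which elements carrying a `property` attribute contain only text (no nested tags).

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, text) pairs from elements that carry a `property` attribute.

    Assumption: annotated elements are not nested and contain plain text only.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []        # extracted (property, value) tuples
        self._current = None   # property name of the element being read, if any
        self._buffer = []      # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop is not None:
            self._current = prop
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._current is not None:
            self.pairs.append((self._current, "".join(self._buffer).strip()))
            self._current = None


def extract_properties(html):
    """Return the list of (property, value) pairs found in an HTML fragment."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


markup = (
    '<span property="vcard:city">Santa Clara</span>, '
    '<span property="vcard:region">CA</span>'
)
print(extract_properties(markup))
# [('vcard:city', 'Santa Clara'), ('vcard:region', 'CA')]
```

A fixed presentation template for the object type could then be filled from these pairs.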
Effectiveness of enhanced results
 Explicit user feedback
› Side-by-side editorial evaluation (A/B testing)
• Editors are shown a traditional search result and enhanced result for the same page
• Users prefer enhanced results in 84% of the cases and traditional results in 3% (N=384)
 Implicit user feedback
› Click-through rate analysis
• Long dwell time limit of 100s (Ciemiewicz et al. 2010)
• 15% increase in ‘good’ clicks
› User interaction model
• Enhanced results lead users to relevant documents (IV) even though they are less likely to be clicked than textual results (III)
• Enhanced results effectively reduce bad clicks!
 See
› Kevin Haas, Peter Mika, Paul Tarjan, Roi Blanco: Enhanced results for web search. SIGIR 2011: 725-734
Other applications of enhanced results
 Google Rich Snippets - June, 2009
› Faceted search for recipes - Feb, 2011
 Bing tiles – Feb, 2011
 Facebook’s Like button and the Open Graph Protocol (2010)
› Shows up in profiles and news feed
› Site owners can later reach users who have liked an object
 Twitter cards (2012)
› More visual/interactive tweets
Other types of applications: vertical search
Not just web pages: markup in email
 Google Now
 Yahoo Search/Mail
 Microsoft Cortana
Problem!
 Each of these applications requires different markup
› Different schemas and syntax
 What’s a publisher to do?
› Mark up the same content differently for every consumer
• Time consuming
• Error prone
schema.org
 Collaborative effort sponsored by large consumers of Web data
› Bing, Google, and Yahoo! as initial founders (June, 2011)
› Yandex joins schema.org in Nov, 2011
 Agreement on a shared set of schemas for the Web
› Available at schema.org in HTML and machine readable formats
› Free to use under W3C Royalty Free terms
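As a rough illustration of what a shared vocabulary buys the publisher, the sketch below describes an entity once with schema.org terms and serializes it as JSON-LD for embedding in a page. The helper name `to_jsonld_script` and the person data are hypothetical; only the `@context`, `@type`, and property names come from schema.org.

```python
import json


def to_jsonld_script(data):
    """Serialize a schema.org description as a JSON-LD <script> element."""
    payload = json.dumps(data, indent=2, sort_keys=True)
    # Escape "</" so the JSON cannot close the surrounding <script> element early.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical example data; every consumer reads the same description.
person = {
    "@context": "http://schema.org",
    "@type": "Person",
    "name": "Jane Example",
    "jobTitle": "Researcher",
    "url": "http://example.com/jane",
}

snippet = to_jsonld_script(person)
print(snippet)
```

The same description could also be expressed as RDFa or microdata attributes; JSON-LD is used here only because it is the shortest to generate.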
Example
View source
View source
schema.org structure
 Classes
› Each class has a label and descriptions
› Classes form a class hierarchy
• Multiple inheritance allowed but rare (a class with two super-classes)
 Properties
› Each property has a label and description
› Properties have domains and ranges, and inverse properties
 Datatypes
› Boolean, Date, DateTime etc.
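The structure described above can be modeled with two small tables. This is a sketch over a hand-written subset of the vocabulary, so the dictionaries below are an assumption rather than the official machine-readable schema; `ancestors` and `property_applies` are hypothetical helper names.

```python
# Hand-written subset of the class hierarchy (class -> direct super-classes).
SUPERCLASSES = {
    "Thing": [],
    "Person": ["Thing"],
    "Place": ["Thing"],
    "Organization": ["Thing"],
    "LocalBusiness": ["Organization", "Place"],  # a class with two super-classes
}

# Hand-written subset of properties with their domains and ranges.
PROPERTIES = {
    "name": {"domain": ["Thing"], "range": ["Text"]},
    "address": {
        "domain": ["Organization", "Person", "Place"],
        "range": ["PostalAddress", "Text"],
    },
}


def ancestors(cls):
    """Return the class itself plus all of its direct and indirect super-classes."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERCLASSES.get(current, []))
    return seen


def property_applies(prop, cls):
    """True if `cls` or one of its super-classes is in the property's domain."""
    return any(d in ancestors(cls) for d in PROPERTIES[prop]["domain"])


print(sorted(ancestors("LocalBusiness")))
print(property_applies("name", "LocalBusiness"))
print(property_applies("address", "Thing"))
```

Because properties are inherited through every super-class, a class with two parents picks up the properties of both branches.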
schema.org usage in practice
 Depends on the skillset of the publisher
› Instances are rarely given an identifier, or identified by the URL of the webpage
› schema.org consumers (validators etc.) are tolerant of mistakes
• e.g. accept text even when an object is required
 Driven by applications
› Publishers often provide the minimal information required in a particular context
› Validators (Bing, Google, Yandex) validate different subsets
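One way a tolerant consumer might handle text where an object is expected is sketched below. This is an assumption about how such leniency could work, not a description of any particular search engine or validator; `coerce_to_object` is a hypothetical helper.

```python
def coerce_to_object(value, expected_type):
    """Accept either an object or plain text for a property that expects an object.

    Plain text is wrapped in a minimal object whose `name` is the text.
    """
    if isinstance(value, str):
        return {"@type": expected_type, "name": value}
    if isinstance(value, dict):
        # Keep the publisher's object, filling in the type only if it is missing.
        return {"@type": expected_type, **value}
    raise TypeError("unsupported value: %r" % (value,))


# A publisher wrote plain text where a Person object was expected.
print(coerce_to_object("Peter Mika", "Person"))
# {'@type': 'Person', 'name': 'Peter Mika'}

# A publisher supplied a full object; it passes through unchanged.
print(coerce_to_object({"@type": "Person", "name": "Roi Blanco"}, "Person"))
```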
schema.org statistics
 R.V. Guha: Light at the end of the tunnel (ISWC 2013 keynote)
› Over 15% of all pages now have schema.org markup
› Over 5 million sites, over 25 billion entity references
› In other words
• Same order of magnitude as the web
 See also
› P. Mika, T. Potter. Metadata Statistics for a Large Web Corpus, LDOW 2012
• Based on Bing US corpus
• 31% of webpages, 5% of domains contain some metadata
› WebDataCommons
• Based on CommonCrawl Nov 2013
• 26% of webpages, 14% of domains contain some metadata
schema.org process
 Process
› Initial release
• Group of experts harmonizing existing vocabularies
› Regular updates based on public discussion
• Fixes
• Extensions
• Deprecation
– almost never
 Tooling
› Website (App Engine)
• Open Source
› Github
(link)
Issue
Extensions
 External proposals integrated
› News (IPTC)
› e-Commerce (GoodRelations)
› TV/Radio fixes (BBC/EBU's)
› Content Accessibility (a11ymetadata.org, IMS)
› Not-for-profit Offers (BibExtend)
› Question/Answer (StackExchange, Drupal)
 Further integration
› Automotive
› GS1
 New extension mechanism
› Coming soon
schema.org and web standards
 schema.org builds on Semantic Web standards
› RDFa, JSON-LD, HTML5 microdata
 Not a standardization effort in the classical sense
› Continuously evolving ontology
› Huge scope (‘everything on the Web’)
› Shallow depths compared to more targeted efforts
 More specialized discussions typically at more targeted forums
› e.g. W3C Community Groups
 Large enumerations and/or rapidly changing knowledge maintained elsewhere
› e.g. PlaceOfWorship
› BuddhistTemple, CatholicChurch, Church, HinduTemple, Mosque, Synagogue …
› Meanwhile over at Wikipedia:
• https://en.wikipedia.org/wiki/Place_of_worship
• https://www.wikidata.org/wiki/Q1370598
BibExtend Community Group (W3C)
What’s new?
Task completion
 We would like to help our users in task completion
› But we have trained our users to talk in nouns
• Retrieval performance decreases by adding verbs to queries
› We need to understand what the available actions are
 Schema.org Actions
› Describe what actions can be taken on a page/email
› See blog post and overview article
THING
THING
Actions
 Schema.org v1.2 (April, 2014)
› See the blog post and overview article for details, and the public-vocabs threads for even more.
Actions example
Here is a Product and a potential action (Buy):
{
  "@type": "Product",
  "url": "http://example.com/products/ipod",
  "potentialAction": {
    "@type": "BuyAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://example.com/products/ipod/buy",
      "encodingType": "application/ld+json",
      "contentType": "application/ld+json"
    },
    "result": {
      "@type": "Order",
      "url-output": "required",
      "confirmationNumber-output": "required",
      "orderNumber-output": "required",
      "orderStatus-output": "required"
    }
  }
}
After POSTing the request to the EntryPoint, here is your completed action:
{
  "@type": "BuyAction",
  "actionStatus": "CompletedActionStatus",
  "object": "https://example.com/products/ipod",
  "result": {
    "@type": "Order",
    "url": "http://example.com/orders/1199334",
    "confirmationNumber": "1ABBCDDF23234",
    "orderNumber": "1199334",
    "orderStatus": "PROCESSING"
  }
}
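A client consuming the two documents above might check that the completed action carries every output the potential action promised. The sketch below assumes the convention visible in the example, where a key ending in `-output` with the value `"required"` marks a required result property; the function names are hypothetical, and a real client would handle more cases.

```python
def required_outputs(potential_action):
    """List result properties marked as required via the `-output` suffix."""
    suffix = "-output"
    result_spec = potential_action.get("result", {})
    return [
        key[: -len(suffix)]
        for key, value in result_spec.items()
        if key.endswith(suffix) and value == "required"
    ]


def missing_outputs(potential_action, completed_action):
    """Return required result properties absent from the completed action."""
    result = completed_action.get("result", {})
    return [name for name in required_outputs(potential_action) if name not in result]


# The potentialAction and the completed action from the example above.
potential = {
    "@type": "BuyAction",
    "result": {
        "@type": "Order",
        "url-output": "required",
        "confirmationNumber-output": "required",
        "orderNumber-output": "required",
        "orderStatus-output": "required",
    },
}
completed = {
    "@type": "BuyAction",
    "actionStatus": "CompletedActionStatus",
    "object": "https://example.com/products/ipod",
    "result": {
        "@type": "Order",
        "url": "http://example.com/orders/1199334",
        "confirmationNumber": "1ABBCDDF23234",
        "orderNumber": "1199334",
        "orderStatus": "PROCESSING",
    },
}

print(required_outputs(potential))
# ['url', 'confirmationNumber', 'orderNumber', 'orderStatus']
print(missing_outputs(potential, completed))
# []
```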
Interactive search results (Yandex Islands)
(Possible) example: quick unsubscribe
How do I unsubscribe?
Not very visible to humans…
Q&A
 Many thanks to
› The schema.org group and the many contributors to schema.org
› Dan Brickley
 Get involved
› Join the discussion at public-vocabs@w3.org
› File a bug, fork a schema, track releases on GitHub
 Contact me
› pmika@yahoo-inc.com
› @pmika
› http://www.slideshare.net/pmika/

Semantic mark-up with schema.org: helping search engines understand the Web

  • 1. S e m a n t i c m a r k u p w i t h s c h e ma . o r g : h e l p i n g s e a r c h e n g i n e s u n d e r s t a n d t h e We b P R E S E N T E D B Y P e t e r M i k a , D i r e c t o r o f R e s e a r c h , Y a h o o L a b s ⎪ M a r c h 2 6 , 2 0 1 5
  • 3. What it’s like to be a machine? Roi Blanco
  • 4. What it’s like to be a machine? ↵⏏☐ģ ✜Θ♬♬ţğ√∞§®ÇĤĪ✜★♬☐✓✓ ţğ★✜ ✪✚✜ΔΤΟŨŸÏĞÊϖυτρ℠≠⅛⌫ ≠=⅚©§★✓♪ΒΓΕ℠ ✖Γ♫⅜±⏎↵⏏☐ģğğğμλκσςτ ⏎⌥°¶§ΥΦΦΦ✗✕☐
  • 5. What can we do?
      ▪ Improve Information Retrieval
        › Harder and harder given the same data
          • Exploited term-based relevance models, hyperlink structure and interaction data
          • Combination of features using machine learning
          • Heavy investment in computational power
            – real-time indexing, instant search, datacenters and edge services
      ▪ Improve the Web
        › Make the Web more searchable?
  • 6. The Semantic Web (2001-)
      ▪ Part of Tim Berners-Lee’s original proposal for the Web
      ▪ Beginning of a research community
        › Formal ontology
        › Logical reasoning
        › Agents, web services
      ▪ Rough start in deployment
        › Misplaced expectations
        › Lack of adoption
  • 7. Misplaced expectations?
      ▪ The Semantic Web, May 2001
      ▪ “At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules.”
      ▪ (The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)
  • 8. Lack of adoption
      ▪ Standardization ahead of adoption
        › URI, RDF, RDF/XML, RDFa, JSON-LD, OWL, RIF, SPARQL, OWL-S, POWDER …
      ▪ Chicken and egg problem
        › No users/use cases, hence no data
        › No data, because no users/use cases
      ▪ By 2007, some modest progress
        › Metadata in HTML: microformats
        › Linked Data: simplifying the stack
  • 9. Microsearch internal prototype (2007)
      ▪ Personal and private homepage of the same person (clear from the snippet, but it could also be automatically de-duplicated)
      ▪ Conferences he plans to attend and his vacations from the homepage, plus bio events from LinkedIn
      ▪ Geolocation
  • 10. Yahoo SearchMonkey (2008)
      1. Extract structured data
        › Semantic Web markup
          • Example: <span property="vcard:city">Santa Clara</span> <span property="vcard:region">CA</span>
        › Information Extraction
      2. Presentation
        › Fixed presentation templates
          • One template per object type
        › Applications
          • Third-party modules to display data (SearchMonkey)
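The extraction step on the SearchMonkey slide can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not SearchMonkey's actual pipeline and not a conforming RDFa parser: it simply collects the text of any element carrying a `property` attribute. The names `PropertyExtractor` and `extract_properties` are hypothetical.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, text) pairs from elements with a `property` attribute.

    Simplification: nested annotated elements and prefix resolution
    (e.g. what `vcard:` expands to) are ignored.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []        # extracted (property, text) pairs
        self._current = None   # property of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop

    def handle_data(self, data):
        if self._current and data.strip():
            self.pairs.append((self._current, data.strip()))

    def handle_endtag(self, tag):
        self._current = None


def extract_properties(html):
    """Return all (property, text) pairs found in an HTML fragment."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


snippet = ('<span property="vcard:city">Santa Clara</span> '
           '<span property="vcard:region">CA</span>')
print(extract_properties(snippet))
```

A presentation template for an address object could then be filled from the extracted pairs; a production consumer would use a full RDFa or microdata parser instead.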
  • 11. Effectiveness of enhanced results
      ▪ Explicit user feedback
        › Side-by-side editorial evaluation (A/B testing)
          • Editors are shown a traditional search result and an enhanced result for the same page
          • Users prefer enhanced results in 84% of the cases and traditional results in 3% (N=384)
      ▪ Implicit user feedback
        › Click-through rate analysis
          • Long dwell time limit of 100s (Ciemiewicz et al. 2010)
          • 15% increase in ‘good’ clicks
        › User interaction model
          • Enhanced results lead users to relevant documents (IV) even though they are less likely to be clicked than textual results (III)
          • Enhanced results effectively reduce bad clicks!
      ▪ See
        › Kevin Haas, Peter Mika, Paul Tarjan, Roi Blanco: Enhanced results for web search. SIGIR 2011: 725-734
  • 12. Other applications of enhanced results
      ▪ Google Rich Snippets - June, 2009
        › Faceted search for recipes - Feb, 2011
      ▪ Bing tiles – Feb, 2011
      ▪ Facebook’s Like button and the Open Graph Protocol (2010)
        › Shows up in profiles and news feed
        › Site owners can later reach users who have liked an object
      ▪ Twitter cards (2012)
        › More visual/interactive tweets
  • 13. Other types of applications: vertical search
  • 14. Not just web pages: markup in email
      ▪ Google Now
      ▪ Yahoo Search/Mail
      ▪ Microsoft Cortana
  • 15. Problem!
      ▪ Each of these applications requires different markup
        › Different schemas and syntax
      ▪ What’s a publisher to do?
        › Mark up the same content differently for every consumer
          • Time consuming
          • Error prone
  • 16. schema.org
      ▪ Collaborative effort sponsored by large consumers of Web data
        › Bing, Google, and Yahoo! as initial founders (June, 2011)
        › Yandex joins schema.org in Nov, 2011
      ▪ Agreement on a shared set of schemas for the Web
        › Available at schema.org in HTML and machine readable formats
        › Free to use under W3C Royalty Free terms
  • 21. schema.org structure
      ▪ Classes
        › Each class has a label and descriptions
        › Classes form a class hierarchy
          • Multiple inheritance allowed but rare (a class with two super-classes)
      ▪ Properties
        › Each property has a label and description
        › Properties have domains and ranges, and inverse properties
      ▪ Datatypes
        › Boolean, Date, DateTime etc.
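The structure described on this slide (labelled classes in a hierarchy with occasional multiple inheritance, and properties with domains and ranges) can be modelled in a few lines of Python. This is only an illustrative sketch: `SchemaClass` and `SchemaProperty` are hypothetical names, and the handful of types and the `address` property below are a heavily simplified excerpt rather than the real vocabulary definitions.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaClass:
    label: str
    description: str = ""
    parents: list = field(default_factory=list)  # more than one parent is allowed

    def ancestors(self):
        """Return all super-classes, nearest first, without duplicates."""
        found = []
        for parent in self.parents:
            for cls in [parent] + parent.ancestors():
                if all(cls.label != seen.label for seen in found):
                    found.append(cls)
        return found


@dataclass
class SchemaProperty:
    label: str
    domains: list  # classes the property may be used on
    ranges: list   # names of expected value types

    def applies_to(self, cls):
        """A property applies to its domain classes and all their sub-classes."""
        domain_labels = {d.label for d in self.domains}
        return any(c.label in domain_labels for c in [cls] + cls.ancestors())


# Simplified excerpt: LocalBusiness has two super-classes.
thing = SchemaClass("Thing", "The most generic type of item.")
place = SchemaClass("Place", parents=[thing])
organization = SchemaClass("Organization", parents=[thing])
local_business = SchemaClass("LocalBusiness", parents=[place, organization])

address = SchemaProperty("address", domains=[place], ranges=["PostalAddress", "Text"])

print([c.label for c in local_business.ancestors()])
print(address.applies_to(local_business))
```

Keeping `parents` as a list rather than a single reference is what makes the rare multiple-inheritance case representable without special handling.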
  • 22. schema.org usage in practice
      ▪ Depends on the skillset of the publisher
        › Instances are rarely given an identifier, or identified by the URL of the webpage
        › schema.org consumers (validators etc.) are tolerant to mistakes
          • e.g. accept text even when an object is required
      ▪ Driven by applications
        › Publishers often provide the minimal information required in a particular context
        › Validators (Bing, Google, Yandex) validate different subsets
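One way a consumer might tolerate text where an object is required is to wrap the string in a minimal object of the expected type. The Python sketch below shows that idea only; it is an assumed strategy, not a description of how the Bing, Google, or Yandex validators actually behave, and the function names and the sample `Book` item are hypothetical.

```python
def coerce_to_object(value, expected_type):
    """Wrap a plain string as a minimal object; pass real objects through."""
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        return {"@type": expected_type, "name": value.strip()}
    raise TypeError(f"cannot coerce {type(value).__name__} to {expected_type}")


def normalize(item, object_properties):
    """Return a copy of `item` where each listed property holds an object.

    `object_properties` maps a property name to the type it expects,
    e.g. {"author": "Person"}.
    """
    fixed = dict(item)
    for prop, expected_type in object_properties.items():
        if prop in fixed:
            fixed[prop] = coerce_to_object(fixed[prop], expected_type)
    return fixed


# The publisher supplied the author as text rather than as a Person object.
book = {"@type": "Book", "name": "Example Title", "author": "Jane Doe"}
print(normalize(book, {"author": "Person"}))
```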
  • 23. schema.org statistics
      ▪ R.V. Guha: Light at the end of the tunnel (ISWC 2013 keynote)
        › Over 15% of all pages now have schema.org markup
        › Over 5 million sites, over 25 billion entity references
        › In other words
          • Same order of magnitude as the web
      ▪ See also
        › P. Mika, T. Potter. Metadata Statistics for a Large Web Corpus, LDOW 2012
          • Based on Bing US corpus
          • 31% of webpages, 5% of domains contain some metadata
        › WebDataCommons
          • Based on CommonCrawl Nov 2013
          • 26% of webpages, 14% of domains contain some metadata
  • 24. schema.org process
      ▪ Process
        › Initial release
          • Group of experts harmonizing existing vocabularies
        › Regular updates based on public discussion
          • Fixes
          • Extensions
          • Deprecation – almost never
      ▪ Tooling
        › Website (App Engine)
          • Open Source
        › GitHub
  • 28. Extensions
      ▪ External proposals integrated
        › News (IPTC)
        › e-Commerce (GoodRelations)
        › TV/Radio fixes (BBC/EBU's)
        › Content Accessibility (a11ymetadata.org, IMS)
        › Not-for-profit Offers (BibExtend)
        › Question/Answer (StackExchange, Drupal)
      ▪ Further integration
        › Automotive
        › GS1
      ▪ New extension mechanism
        › Coming soon
  • 29. schema.org and web standards
      ▪ schema.org builds on Semantic Web standards
        › RDFa, JSON-LD, HTML5 microdata
      ▪ Not a standardization effort in the classical sense
        › Continuously evolving ontology
        › Huge scope (‘everything on the Web’)
        › Shallow depths compared to more targeted efforts
      ▪ More specialized discussions typically at more targeted forums
        › e.g. W3C Community Groups
      ▪ Large enumerations and/or rapidly changing knowledge maintained elsewhere
        › e.g. PlaceOfWorship
        › BuddhistTemple, CatholicChurch, Church, HinduTemple, Mosque, Synagogue …
        › Meanwhile over at Wikipedia:
          • https://en.wikipedia.org/wiki/Place_of_worship
          • https://www.wikidata.org/wiki/Q1370598
  • 35. Task completion
      ▪ We would like to help our users in task completion
        › But we have trained our users to talk in nouns
          • Retrieval performance decreases by adding verbs to queries
        › We need to understand what the available actions are
      ▪ Schema.org Actions
        › Describe what actions can be taken on a page/email
        › See blog post and overview article
  • 36. Actions
      ▪ Schema.org v1.2 (April, 2014)
        › See blog post and overview article for detail.
        › and public-vocabs threads for even more details.
  • 38. Actions example
      ▪ Here is a Product and a potential action (Buy):
          {
            "@type": "Product",
            "url": "http://example.com/products/ipod",
            "potentialAction": {
              "@type": "BuyAction",
              "target": {
                "@type": "EntryPoint",
                "urlTemplate": "https://example.com/products/ipod/buy",
                "encodingType": "application/ld+json",
                "contentType": "application/ld+json"
              },
              "result": {
                "@type": "Order",
                "url-output": "required",
                "confirmationNumber-output": "required",
                "orderNumber-output": "required",
                "orderStatus-output": "required"
              }
            }
          }
      ▪ After POSTing the request to the EntryPoint, here is your completed action:
          {
            "@type": "BuyAction",
            "actionStatus": "CompletedActionStatus",
            "object": "https://example.com/products/ipod",
            "result": {
              "@type": "Order",
              "url": "http://example.com/orders/1199334",
              "confirmationNumber": "1ABBCDDF23234",
              "orderNumber": "1199334",
              "orderStatus": "PROCESSING"
            }
          }
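A client consuming the Actions example above might check that a completed action actually carries every property the potential action marked as a required output. The Python sketch below shows one way to do that with the standard `json` module. It does not perform the POST, and the helper names `required_outputs` and `missing_outputs` are hypothetical; only the `-output` suffix convention and the sample values come from the slide.

```python
import json

# Potential action as advertised on the Product (values from the slide).
PRODUCT = json.loads("""
{
  "@type": "Product",
  "url": "http://example.com/products/ipod",
  "potentialAction": {
    "@type": "BuyAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://example.com/products/ipod/buy",
      "encodingType": "application/ld+json",
      "contentType": "application/ld+json"
    },
    "result": {
      "@type": "Order",
      "url-output": "required",
      "confirmationNumber-output": "required",
      "orderNumber-output": "required",
      "orderStatus-output": "required"
    }
  }
}
""")

# Completed action as it would be returned by the EntryPoint.
COMPLETED = json.loads("""
{
  "@type": "BuyAction",
  "actionStatus": "CompletedActionStatus",
  "object": "https://example.com/products/ipod",
  "result": {
    "@type": "Order",
    "url": "http://example.com/orders/1199334",
    "confirmationNumber": "1ABBCDDF23234",
    "orderNumber": "1199334",
    "orderStatus": "PROCESSING"
  }
}
""")


def required_outputs(potential_action):
    """Names of result properties annotated as `<name>-output: required`."""
    suffix = "-output"
    result = potential_action.get("result", {})
    return sorted(key[:-len(suffix)] for key, value in result.items()
                  if key.endswith(suffix) and value == "required")


def missing_outputs(potential_action, completed_action):
    """Required output properties absent from the completed action's result."""
    result = completed_action.get("result", {})
    return [name for name in required_outputs(potential_action)
            if name not in result]


action = PRODUCT["potentialAction"]
print(action["target"]["urlTemplate"])
print(required_outputs(action))
print(missing_outputs(action, COMPLETED))
```

An empty list from `missing_outputs` means the response satisfied everything the potential action promised.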
  • 39. Interactive search results (Yandex Islands)
  • 40. (Possible) example: quick unsubscribe
      ▪ How do I unsubscribe? Not very visible to humans…
  • 41. Q&A
      ▪ Many thanks to
        › The schema.org group and the many contributors to schema.org
        › Dan Brickley
      ▪ Get involved
        › Join the discussion at public-vocabs@w3.org
        › File a bug, fork a schema, track releases on GitHub
      ▪ Contact me
        › pmika@yahoo-inc.com
        › @pmika
        › http://www.slideshare.net/pmika/