The Web Comes Alive with Data!
Schema.org and Structured Data on
the Web: Past, Present, Potential
For years, technologists have been trying to make sense of the vast wealth of insightful data locked in databases and unstructured formats. There is a growing movement to transform the web from one meant solely for human consumption to one accessible to both humans and machines. Schema.org is a coalition of technology companies, including Google, promoting better structured data on the web.


  1. The Web Comes Alive with Data! Schema.org and Structured Data on the Web: Past, Present, Potential
     @jaymyers
     Google DevFest Twin Cities, February 8, 2014
  2. • Early adopter
     • Semantic Web, Linked & Open data enthusiast
     • Speaker
  3. Web of Today
     • 25 million web sites
     • Trillions of web pages
     • 5 billion web pages change every day
     • 1000x more web pages on the “deep web”
  4. [Diagram: Structured data, Transform, User]
  5. [Diagram: Users, Transform, Structured data, Machines]
  6. [Graph: properties of the entity Kim Kardashian]
     birthDate: “1980-10-21”
     jobTitle: “Actress”*
     jobTitle: “Director”*
     provider: “HollywoodLife”
     provider: “TooFab”
     * questionable, but we’ll go with it
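The graph on slide 6 can be read as a set of subject, property, value statements, which is the form a machine consumes. The following is a minimal Python sketch of that idea, not anything shown in the talk. The pairing of each property with each value is an assumption read off the slide labels, and the names `TRIPLES`, `values_of`, and `describe` are hypothetical.

```python
from collections import defaultdict

# (subject, property, value) statements as read off slide 6.
# Assumption: each property label pairs with the value listed next to it.
TRIPLES = [
    ("Kim Kardashian", "birthDate", "1980-10-21"),
    ("Kim Kardashian", "jobTitle", "Actress"),
    ("Kim Kardashian", "jobTitle", "Director"),
    ("Kim Kardashian", "provider", "HollywoodLife"),
    ("Kim Kardashian", "provider", "TooFab"),
]


def values_of(subject, prop, triples=TRIPLES):
    """Return every value stated for one property of one subject."""
    return [o for s, p, o in triples if s == subject and p == prop]


def describe(subject, triples=TRIPLES):
    """Group all statements about a subject by property name."""
    grouped = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            grouped[p].append(o)
    return dict(grouped)
```

A property may carry several values (two `jobTitle` entries here), which is why each property maps to a list rather than a single string.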
  7. Goals
     • Create a web for both humans and machines
     • Entice webmasters to make metadata available through structured HTML
     • Gain access to the meaning of web sites
  8. Early Attempts
     • Meta Content Framework
     • RDF
     • OWL
  9. Semantic Web
     “A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities” - TBL
  10. Microformats (‘03)
      Addresses, geo, blog posts, media (images/video), news, products, recipes, reviews and more!
  11. Microformats example

      <div class="item hproduct">
        <ol>
          <li class="lister vcard"><a class="url fn" href="http://storename.com">Magers and Quinn</a></li>
          <li class="category"><a href="http://storename.com/categories/books">Books</a></li>
        </ol>
        <img src="http://images.storename.com/products/ramsay-fast-food.jpg" class="photo" alt="gordon ramsay fast food book" />
        <p><span class="condition">New:</span> <span class="price">$27.99</span></p>
        <p>Pub price: $35.00</p>
        <p>Hardcover</p>
        <p class="availability">Out of stock</p>
        <h1 class="fn">Gordon Ramsay's Fast Food Recipes from the F Word</h1>
        <p>By Ramsay, Gordon</p>
        <dl class="identifier">
          <dt>ISBN:</dt>
          <dd>1554700647</dd>
        </dl>
        <dl class="identifier">
          <dt>Publisher:</dt>
          <dd>Amer Youth Hostels</dd>
        </dl>
        <h4>Publishers Comments</h4>
        <p class="description">A celebrity host of Hell's Kitchen features more than one hundred accessible recipes that are organized in accordance with everyday needs and special occasions, in a volume that places an emphasis on fast preparation and features complementary tips on stocking a pantry.</p>
      </div>
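Microformats carry meaning in ordinary `class` attributes, so a consumer only needs to look up elements by class name. The sketch below shows one way that could be done with Python's standard `html.parser`. It is an illustration under assumptions, not a real microformats parser: `ClassTextExtractor`, `extract_classes`, and the shortened `SAMPLE` markup are hypothetical, and the code assumes well-formed, properly nested HTML.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked as open.
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}


class ClassTextExtractor(HTMLParser):
    """Collect the text content of elements carrying selected class names."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found = {}   # class name -> list of text values
        self._open = []   # one [tag, matched classes, text parts] per open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        matched = [c for c in classes if c in self.wanted]
        self._open.append([tag, matched, []])

    def handle_data(self, data):
        # Text belongs to every open element that matched a wanted class.
        for _tag, matched, parts in self._open:
            if matched:
                parts.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open or self._open[-1][0] != tag:
            return
        _tag, matched, parts = self._open.pop()
        text = " ".join("".join(parts).split())  # collapse whitespace
        for name in matched:
            self.found.setdefault(name, []).append(text)


def extract_classes(html, wanted):
    parser = ClassTextExtractor(wanted)
    parser.feed(html)
    parser.close()
    return parser.found


# Shortened version of the slide's hproduct markup.
SAMPLE = """
<div class="item hproduct">
  <h1 class="fn">Gordon Ramsay's Fast Food Recipes from the F Word</h1>
  <p><span class="condition">New:</span> <span class="price">$27.99</span></p>
  <p class="availability">Out of stock</p>
</div>
"""

product = extract_classes(SAMPLE, ["fn", "price", "availability"])
```

Because the class names double as styling hooks, nothing in the markup says which vocabulary is in use; the consumer has to know in advance that `fn` and `price` belong to hproduct.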
  12. Ontology Models: FOAF
      [Graph of two people linked by foaf:knows]
      Jay Myers, a foaf:Person
        foaf:nick “Jaydog”
        foaf:mbox <mailto:jmyers1551@gmail.com>
        foaf:homepage <http://jaymmyers.tumblr.com>
      Lloyd Cledwyn, a foaf:Person
        foaf:nick “Professor Lloyd”
        foaf:mbox <mailto:cledwyn@gmail.com>
        foaf:homepage <http://stthomas.edu>
  13. Ontology Models: SKOS
      [Graph of categories linked to a broader concept]
      National Basketball Association teams, a skos:Concept
        skos:prefLabel “National Basketball Association teams”
      Each of the following points to that concept via skos:broader:
        category:Boston_Celtics
        category:Minnesota_Timberwolves
        category:Atlanta_Hawks
        category:National_Basketball_Association_franchise_relocations
        category:Defunct_National_Basketball_Association_teams
  14. Ontology Models: GoodRelations
      [Graph of an offering and the product it includes]
      Nodes: a gr:Offering; Euro Cuisine – 8" Heart-Shape Waffle Maker, a gr:ProductOrService
      gr:includesObject
      gr:description “Make a delicious breakfast treat…”
      gr:hasManufacturer “Euro Cuisine”
      gr:hasMPN “WM520”
      gr:category “Waffle_Makers”
  15. RDFa

      <html xmlns="http://www.w3.org/1999/xhtml"
            xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
            xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
            xmlns:foaf="http://xmlns.com/foaf/0.1/"
            xmlns:gr="http://purl.org/goodrelations/v1#"
            xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
            xmlns:v="http://www.w3.org/2006/vcard/ns#"
            xmlns:r="http://rdf.data-vocabulary.org/#">
        <div class="vcard" typeof="gr:LocationOfSalesOrServiceProvisioning" about="#store_201">
          <h1 id="site_title" property="geo:lat_long" content="29.521643, -98.493599">
            <a href="http://stores.bestbuy.com/201">Best Buy - San Antonio</a>
          </h1>
        </div>
        <div rel="v:adr">
          <p class="geo" typeof="v:Address v:Work">
            <strong><span property="v:street-address">125 Nw Loop 410 Ste 201</span></strong><br />
            <strong>
              <span property="v:locality">San Antonio, </span>
              <span property="v:region">TX</span>
              <span property="v:postal-code">78216</span>
            </strong>
            <br />
            Phone: <span property="v:tel"><span typeof="v:Tel v:Home">888-229-3770</span></span><br />
            Email: <a href="mailto:Keith.Allen2@bestbuy.com" rel="v:email">Keith.Allen2@bestbuy.com</a>
          </p>
          <span rel="v:geo">GEO: <span property="v:latitude">29.521643</span>, <span property="v:longitude">-98.493599</span></span>
        </div>
      </html>
  16. Why?
  17. Circa 2007
  18. Additional content on SERPs
      Data automagically extracted from HTML
  19. Value prop: “Give us your data in a machine-readable format and we’ll make your stuff more attractive in search results”
  20. Results
      • 1000x increase in structured markup
      • Increases in user engagement (click throughs) for SERP objects created from structured markup
      • Small number of interesting applications built on top of structured data
  21. But…
      • Too many choices (syntax, ontology, etc.), fragmented
      • A lot of bad markup – up to 40%
      • Not easy enough for your average “Joe Webmaster”
  22. 2010
  23. schema.org
      • Common vocabularies that search engines can understand
      • Lower the bar for webmasters to publish data on the web
      • Improve user experience through data
  24. Introducing: Microdata

      <div id="pagecontent" itemscope itemtype="http://schema.org/Person">
        <a href="/media/rm974696448/nm2578007?ref_=nm_ov_ph">
          <img id="name-poster" alt="Kim Kardashian Picture" title="Kim Kardashian Picture"
               src="http://ia.media-imdb.com/images/M/MV5BMTc0MjkzOTAxNV5BMl5BanBnXkFtZTcwNTk1NjcyNw@@._V1_SX214_CR0,0,214,317_.jpg"
               itemprop="image"/>
        </a>
        <h1 class="header">
          <span class="itemprop" itemprop="name">Kim Kardashian</span></h1>
        <div class="infobar" id="name-job-categories">
          <span class="itemprop" itemprop="jobTitle">Actress</span>
          <span class="itemprop" itemprop="jobTitle">Producer</span>
        </div>
        <div class="inline" itemprop="description">
          TV star, entrepreneur, fashion designer, and author (New York Times bestseller - "Kardashian Konfidential"), Kim Kardashian first burst onto the scene in 2007, after the premiere of her hit E! Entertainment reality series ...
        </div>
        <time datetime="1980-10-21" itemprop="birthDate">
          <a href="/search/name?birth_monthday=1021&refine=birth_monthday&ref_=nm_ov_bth_monthday">October 21</a>,
          <a href="/search/name?birth_year=1980&ref_=nm_ov_bth_year">1980</a>
        </time>
      </div>
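With microdata, the vocabulary is named explicitly (`itemtype`) and each value is labelled with `itemprop`, so a consumer can recover property/value pairs without guessing at class names. Here is a minimal Python sketch of that extraction using the standard `html.parser`. It is deliberately simplified and is not a conforming microdata parser: it handles a single flat item, ignores nested `itemscope` elements, and assumes well-formed markup. `MicrodataExtractor`, `extract_microdata`, the shortened `SAMPLE`, and the `example.com` image URL are all hypothetical.

```python
from html.parser import HTMLParser

VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}
# Elements whose property value comes from an attribute, not from text.
VALUE_ATTR = {"time": "datetime", "img": "src", "a": "href", "link": "href"}


class MicrodataExtractor(HTMLParser):
    """Collect itemprop values for a single, non-nested microdata item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.props = {}   # property name -> list of values
        self._open = []   # one [tag, itemprop, attribute value, text parts] per element

    def _add(self, name, value):
        self.props.setdefault(name, []).append(value)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs and self.item_type is None:
            self.item_type = attrs.get("itemtype")
        name = attrs.get("itemprop")
        fixed = attrs.get(VALUE_ATTR[tag]) if tag in VALUE_ATTR else None
        if tag in VOID_TAGS:
            if name and fixed is not None:
                self._add(name, fixed)
            return
        self._open.append([tag, name, fixed, []])

    def handle_data(self, data):
        for entry in self._open:
            if entry[1]:
                entry[3].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open or self._open[-1][0] != tag:
            return
        _tag, name, fixed, parts = self._open.pop()
        if name:
            text = " ".join("".join(parts).split())
            self._add(name, fixed if fixed is not None else text)


def extract_microdata(html):
    parser = MicrodataExtractor()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.props


# Shortened version of the slide's markup; the image URL is a placeholder.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <img src="http://example.com/kim.jpg" itemprop="image"/>
  <h1><span itemprop="name">Kim Kardashian</span></h1>
  <span itemprop="jobTitle">Actress</span>
  <span itemprop="jobTitle">Producer</span>
  <time datetime="1980-10-21" itemprop="birthDate">October 21, 1980</time>
</div>
"""

item_type, props = extract_microdata(SAMPLE)
```

Note how the `time` element yields the machine-readable `datetime` attribute rather than the human-readable "October 21, 1980": the same markup serves both audiences.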
  25. Looks Like We’ve Got Something Here!
      • 15% of all sites contain schema.org markup
      • Many major sites
      • Adoption by content systems like Drupal and WordPress
      • Around 1200 object types and growing
      • Significant reduction in error rates
  26. Practical Applications in Search
      Yahoo! Related Entities
  27. Practical Applications in Search
      Yandex Islands
  28. Practical Applications in Search
      Google Knowledge Graph
      Additional content driven by schema.org derived data
  29. Practical Applications in Search
      Google Knowledge Graph
      Additional content driven by schema.org derived data
  30. Other Applications
      Pinterest Rich Pins
  31. Other Applications
      Gmail “Actions in the Inbox”
      • Actions – rent a movie, buy something
      • Orders – post transaction order confirmation, shipping status
      • Reservations – restaurant, travel, tickets
  32. Other Applications
      JSON-LD

      {
        "@context": "http://schema.org",
        "@type": "Person",
        "name": "John Doe",
        "jobTitle": "Technologist",
        "affiliation": "Big Boxen ‘R’ Us",
        "additionalName": "Johnny",
        "url": "http://www.example.com",
        "address": {
          "@type": "PostalAddress",
          "streetAddress": "1234 Freeze Drive",
          "addressLocality": "Icebox",
          "addressRegion": "Minnesota"
        }
      }
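JSON-LD keeps the structured data out of the visible markup entirely; a common convention is to embed it in a `<script type="application/ld+json">` element. The sketch below shows how a page generator might produce such an element from a plain Python dictionary. It is illustrative only: `PERSON` is a trimmed copy of the slide's example, and `to_script_tag` is a hypothetical helper, not part of any library.

```python
import json

# Trimmed copy of the slide's JSON-LD example, with plain ASCII quotes.
PERSON = {
    "@context": "http://schema.org",
    "@type": "Person",
    "name": "John Doe",
    "jobTitle": "Technologist",
    "url": "http://www.example.com",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1234 Freeze Drive",
        "addressLocality": "Icebox",
        "addressRegion": "Minnesota",
    },
}

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = "</script>"


def to_script_tag(data):
    """Serialize a dict as JSON-LD wrapped in a script element."""
    return SCRIPT_OPEN + json.dumps(data, indent=2) + SCRIPT_CLOSE


def from_script_tag(tag):
    """Inverse of to_script_tag, for tags produced by that helper."""
    body = tag[len(SCRIPT_OPEN):-len(SCRIPT_CLOSE)]
    return json.loads(body)


tag = to_script_tag(PERSON)
round_tripped = from_script_tag(tag)
```

Since the payload is ordinary JSON, any JSON parser can read it back; the `@context` and `@type` keys are what tie the values to the schema.org vocabulary.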
  33. Thank you
      @jaymyers
  34. Credits
      Guha, Ramanathan V. “Light at the End of the Tunnel.” 12th International Semantic Web Conference (ISWC), Sydney, NSW, Australia. 23 October 2013. Keynote Address.