The Web Comes Alive with Data! and Structured Data on the Web: Past, Present, Potential


For years, technologists have been trying to make sense of the vast wealth of insightful data we have locked in databases and unstructured formats. There is a growing movement to transform the web from one meant solely for human consumption into one accessible to both humans and machines. is a coalition of technology companies, including Google, promoting better structured data on the web.

  1. 1. The Web Comes Alive with Data! and Structured Data on the Web: Past, Present, Potential @jaymyers Google DevFest Twin Cities February 8, 2014
  2. 2. • Early adopter • Semantic Web, Linked & Open data enthusiast • Speaker
  3. 3. Web of Today • 25 million web sites • Trillions of web pages • 5 billion web pages change every day • 1000x more web pages on the “deep web”
  4. 4. Structured data Transform User
  5. 5. Users Transform Structured data Machines
  6. 6. Kim Kardashian • birthDate: “1980-10-21” • jobTitle: “Actress”*, “Director”* • provider: “HollywoodLife”, “TooFab” * questionable, but we’ll go with it
  7. 7. Goals • Create a web for both humans and machines • Entice webmasters to make metadata available through structured HTML • Gain access to the meaning of web sites
  8. 8. Early Attempts • Meta Content Framework • RDF • OWL
  9. 9. Semantic Web “A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities” - TBL
  10. 10. Microformats (‘03) Addresses, geo, blog posts, media (images/video), news, products, recipes, reviews and more!
  11. 11. Microformats example <div class="item hproduct"> <ol> <li class="lister vcard"><a class="url fn" href="">Magers and Quinn</a></li> <li class="category"><a href="">Books</a></li> </ol> <img src="" class="photo" alt="gordon ramsay fast food book" /> <p><span class="condition">New:</span> <span class="price">$27.99</span></p> <p>Pub price: $35.00</p> <p>Hardcover</p> <p class="availability">Out of stock</p> <h1 class="fn">Gordon Ramsay's Fast Food Recipes from the F Word</h1> <p>By Ramsay, Gordon</p> <dl class="identifier"> <dt>ISBN:</dt> <dd>1554700647</dd> </dl> <dl class="identifier"> <dt>Publisher:</dt> <dd>Amer Youth Hostels</dd> </dl> <h4>Publishers Comments</h4> <p class="description">A celebrity host of Hell's Kitchen features more than one hundred accessible recipes that are organized in accordance with everyday needs and special occasions, in a volume that places an emphasis on fast preparation and features complementary tips on stocking a pantry.</p> </div>
  12. 12. Ontology Models: FOAF • Jay Myers a foaf:Person: foaf:nick “Jaydog”, foaf:mbox <>, foaf:homepage <> • foaf:knows • Lloyd Cledwyn a foaf:Person: foaf:nick “Professor Lloyd”, foaf:mbox <>, foaf:homepage <>
  13. 13. Ontology Models: SKOS • National Basketball Association teams a skos:Concept, skos:prefLabel “National Basketball Association teams” • skos:broader from each of: category:Boston_Celtics, category:Minnesota_Timberwolves, category:National_Basketball_Association_franchise_relocations, category:Atlanta_Hawks, category:Defunct_National_Basketball_Association_teams
  14. 14. Ontology Models: GoodRelations • a gr:Offering linked by gr:includesObject to a gr:ProductOrService (Euro Cuisine – 8" Heart-Shape Waffle Maker) • gr:description “Make a delicious breakfast treat…” • gr:hasManufacturer “Euro Cuisine” • gr:hasMPN “WM520” • gr:category “Waffle_Makers”
  15. 15. RDFa <html xmlns="" xmlns:rdfs="Http://" xmlns:dc="" xmlns:xsd= xmlns:foaf="" xmlns:gr= xmlns:geo= xmlns:v= xmlns:r=> <div class="vcard" typeof="gr:LocationOfSalesOrServiceProvisioning" about="#store_201"> <h1 id="site_title" property="geo:lat_long" content="29.521643, -98.493599"> <a href="">Best Buy - San Antonio</a> </h1> </div> <div rel="v:adr"> <p class="geo" typeof="v:Address v:Work"> <strong><span property="v:street-address">125 Nw Loop 410 Ste 201</span></strong><br /> <strong> <span property="v:locality">San Antonio, </span> <span property="v:region">TX</span> <span property="v:postal-code">78216</span> </strong> <br /> Phone: <span property="v:tel"><span typeof="v:Tel v:Home">888-229-3770</span></span><br /> Email: <a href="" rel="v:email"></a></p> <span rel="v:geo">GEO: <span property="v:latitude">29.521643</span>, <span property="v:longitude">-98.493599</span></span> </div>
  16. 16. Why?
  17. 17. Circa 2007
  18. 18. Additional content on SERPs Data automagically extracted from HTML
  19. 19. Value prop: “Give us your data in a machine-readable format and we’ll make your stuff more attractive in search results”
  20. 20. Results • 1000x increase in structured markup • Increases in user engagement (click throughs) for SERP objects created from structured markup • Small number of interesting applications built on top of structured data
  21. 21. But… • Too many choices (syntax, ontology, etc.), fragmented • A lot of bad markup – up to 40% • Not easy enough for your average “Joe Webmaster”
  22. 22. 2010
  23. 23. • Common vocabularies that search engines can understand • Lower the bar for webmasters to publish data on the web • Improve user experience through data
  24. 24. Introducing: Microdata <div id="pagecontent" itemscope itemtype=""> <a href="/media/rm974696448/nm2578007?ref_=nm_ov_ph"> <img id="name-poster" alt="Kim Kardashian Picture" title="Kim Kardashian Picture" src=", 0,214,317_.jpg" itemprop="image"/> </a> <h1 class="header"> <span class="itemprop" itemprop="name">Kim Kardashian</span></h1> <div class="infobar" id="name-job-categories"> <span class="itemprop" itemprop="jobTitle">Actress</span> <span class="itemprop" itemprop="jobTitle">Producer</span> </div> <div class="inline" itemprop="description"> TV star, entrepreneur, fashion designer, and author (New York Times bestseller - "Kardashian Konfidential"), Kim Kardashian first burst onto the scene in 2007, after the premiere of her hit E! Entertainment reality series ... </div> <time datetime="1980-10-21" itemprop="birthDate"> <a href="/search/name?birth_monthday=1021&refine=birth_monthday&ref_=nm_ov_bth_monthday" >October 21</a>, <a href="/search/name?birth_year=1980&ref_=nm_ov_bth_year" >1980</a> </time> </div>
  25. 25. Looks Like We’ve Got Something Here! • 15% of all sites contain markup • Many major sites • Adoption by content systems like Drupal and WordPress • Around 1200 object types and growing • Significant reduction in error rates
  26. 26. Practical Applications in Search Yahoo! Related Entities
  27. 27. Practical Applications in Search Yandex Islands
  28. 28. Practical Applications in Search Google Knowledge Graph Additional content driven by derived data
  29. 29. Practical Applications in Search Google Knowledge Graph Additional content driven by derived data
  30. 30. Other Applications Pinterest Rich Pins
  31. 31. Other Applications Gmail “Actions in the Inbox” • Actions – rent a movie, buy something • Orders – post-transaction order confirmation, shipping status • Reservations – restaurant, travel, tickets
  32. 32. Other Applications JSON-LD { "@context": "", "@type": "Person", "name": "John Doe", "jobTitle": "Technologist", "affiliation": "Big Boxen ‘R’ Us", "additionalName": "Johnny", "url": "", "address": { "@type": "PostalAddress", "streetAddress": "1234 Freeze Drive", "addressLocality": "Icebox", "addressRegion": "Minnesota" } }
  33. 33. Thank you @jaymyers
  34. 34. Credits Guha, Ramanathan V. “Light at the End of the Tunnel.” 12th International Semantic Web Conference (ISWC), Sydney, NSW, Australia. 23 October 2013. Keynote Address.
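
The entity diagram on slide 6 can be read as a set of subject/property/value statements, which is the data model behind the RDF-style vocabularies on the slides that follow. The Python sketch below is only an illustration of that idea: the `TRIPLES` list and the `match` helper are hypothetical names introduced here, and only the birthDate and jobTitle statements from the slide are included.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, property, value)

# Statements taken from slide 6; the list name is an assumption of this sketch.
TRIPLES: List[Triple] = [
    ("Kim Kardashian", "birthDate", "1980-10-21"),
    ("Kim Kardashian", "jobTitle", "Actress"),
    ("Kim Kardashian", "jobTitle", "Director"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    prop: Optional[str] = None,
    value: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for s, p, v in triples:
        if subject is not None and s != subject:
            continue
        if prop is not None and p != prop:
            continue
        if value is not None and v != value:
            continue
        yield (s, p, v)


# A machine can now answer "what job titles does this entity have?"
job_titles = [v for _, _, v in match(TRIPLES, prop="jobTitle")]
print(job_titles)
```

Because every fact has the same three-part shape, a consumer can query the data without knowing anything about the page layout it came from.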
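
To give a feel for how a consumer might read the microdata markup on slide 24, here is a minimal Python sketch that collects `itemprop` values using only the standard library's `html.parser`. The class `ItempropCollector`, the helper `extract_itemprops`, and the trimmed `SAMPLE` markup are hypothetical and exist only for this illustration. The sketch assumes well-nested HTML and ignores nested `itemscope` items, `itemref`, and URL-valued properties other than `img src`, all of which a real microdata parser would need to handle.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked as open.
VOID_TAGS = {"img", "br", "meta", "link", "input", "hr"}


class ItempropCollector(HTMLParser):
    """Collect (itemprop, value) pairs from a single flat microdata item."""

    def __init__(self):
        super().__init__()
        self.pairs = []   # finished (property, value) pairs
        self._stack = []  # (tag, itemprop or None, text chunks) per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if tag in VOID_TAGS:
            if prop and tag == "img":
                self.pairs.append((prop, attrs.get("src", "")))
            return
        if prop and tag == "time" and "datetime" in attrs:
            # <time> carries its machine-readable value in the attribute.
            self.pairs.append((prop, attrs["datetime"]))
            prop = None
        self._stack.append((tag, prop, []))

    def handle_data(self, data):
        # Text counts toward every open element that declared an itemprop.
        for _tag, prop, chunks in self._stack:
            if prop:
                chunks.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        _open_tag, prop, chunks = self._stack.pop()
        if prop:
            # Collapse runs of whitespace in the collected text.
            self.pairs.append((prop, " ".join("".join(chunks).split())))


def extract_itemprops(html):
    parser = ItempropCollector()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Trimmed-down version of the slide's markup (itemtype omitted on purpose).
SAMPLE = """
<div itemscope>
  <h1><span itemprop="name">Kim Kardashian</span></h1>
  <span itemprop="jobTitle">Actress</span>
  <span itemprop="jobTitle">Producer</span>
  <time datetime="1980-10-21" itemprop="birthDate">October 21, 1980</time>
</div>
"""

for prop, value in extract_itemprops(SAMPLE):
    print(prop, "=", value)
```

Note that the birth date comes from the `datetime` attribute rather than the visible text, which is the point of the markup: humans see "October 21, 1980" while machines get an unambiguous value.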
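
Part of the appeal of the JSON-LD form on slide 32 is that any JSON parser can read it. The Python sketch below parses a shortened copy of that example and lists every object that declares an `@type`. It is plain JSON handling, not full JSON-LD processing: no context expansion is performed, and the `@context` key is left out because its value did not survive on the slide. The `typed_nodes` helper and the `$` label for the root object are hypothetical choices made for this illustration.

```python
import json

# Shortened copy of the slide's example, with "@context" omitted.
DOC = """
{
  "@type": "Person",
  "name": "John Doe",
  "jobTitle": "Technologist",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1234 Freeze Drive",
    "addressLocality": "Icebox",
    "addressRegion": "Minnesota"
  }
}
"""


def typed_nodes(node, path=""):
    """Yield (path, @type) for every JSON object that declares an @type."""
    if isinstance(node, dict):
        if "@type" in node:
            yield (path or "$", node["@type"])
        for key, value in node.items():
            yield from typed_nodes(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from typed_nodes(item, f"{path}[{index}]")


person = json.loads(DOC)
print(list(typed_nodes(person)))
print(person["address"]["addressLocality"])
```

Unlike microdata or RDFa, the data here sits in its own block rather than being woven through the page's HTML, so it can be generated and consumed without touching the presentation markup.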