How Google is using Linked Data today and vision for tomorrow


In this presentation, I will discuss how modern search engines, such as Google, make use of Linked Data spread across Web pages to display Rich Snippets. I will also present an example of the technology and analyze its current uptake.

I will then sketch some ideas on how Rich Snippets could be extended in the future, in particular for multimedia documents.

Published in: Technology, News & Politics
  • Eurecom is a graduate school in the domain of information and communication technology and a research center in communication systems. The Digital Enterprise Research Institute (DERI) is a research institute at the National University of Ireland, Galway. Its focus is research into the Semantic Web and Web Science.
  • Humans are capable of using the Web to carry out tasks such as finding the Irish word for "folder", reserving a library book, and searching for the lowest price for a DVD. However, machines cannot accomplish all of these tasks without human direction, because web pages are designed to be read by people, not machines. The Semantic Web is a vision of information that can be readily interpreted by machines, so that machines can perform more of the tedious work involved in finding, combining, and acting upon information on the Web.
  • Rich Snippets were developed by Google but built on open standards, so they were later adopted by other search engines. Rich Snippets support various markup languages that help identify the information to be presented, such as event or item information.
  • A breadcrumb trail is a set of links (breadcrumbs) that can help a user understand and navigate your site's hierarchy.
  • The components of this URI are first a temporal dimension (t=428,434), which selects seconds 428 to 434 of the whole video, and then a spatial dimension (xywh=150,60,50,70 and xywh=240,50,50,70), which creates two bounding boxes at the x, y parameters with a width w and a height h.
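
The media fragment URI described in the last slide note can be taken apart with a few lines of Python. This is only a minimal sketch: the URI below is a hypothetical example on example.org (the original URI is not fully preserved here), the function name `parse_media_fragment` is made up for illustration, and only the simple `t=start,end` and `xywh=x,y,w,h` forms from the note are handled, not the full media fragment syntax.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_media_fragment(uri):
    """Return (time_range, boxes) parsed from the fragment part of a URI.

    Illustrative helper, not a complete media fragment parser: it only
    understands t=start,end (seconds) and xywh=x,y,w,h (pixels).
    """
    fragment = urlsplit(uri).fragment
    time_range = None
    boxes = []
    for name, value in parse_qsl(fragment):
        if name == "t":
            # Temporal dimension: start and end second of the clip
            start, end = value.split(",")
            time_range = (float(start), float(end))
        elif name == "xywh":
            # Spatial dimension: one bounding box per xywh parameter
            x, y, w, h = (int(part) for part in value.split(","))
            boxes.append({"x": x, "y": y, "w": w, "h": h})
    return time_range, boxes


# Hypothetical URI using the values quoted in the slide note
EXAMPLE_URI = (
    "http://example.org/video.ogv"
    "#t=428,434&xywh=150,60,50,70&xywh=240,50,50,70"
)

time_range, boxes = parse_media_fragment(EXAMPLE_URI)
print(time_range)
print(boxes)
```

With the values from the note, the temporal part selects seconds 428 to 434 and the two `xywh` parameters yield two bounding boxes, which is the information a search engine would need to highlight a person in a video result.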

    1. How Google is using Linked Data Today and Vision For Tomorrow
       Thomas Steiner (Google, Germany), Raphael Troncy (Eurecom, France) and Michael Hausenblas (DERI, Ireland)
       Published at The Future Internet Assembly, Dec 2010, Ghent, Belgium
       Research paper presenter: Vasu Jain
    2. Contents
       • Challenges on the Web in terms of research
       • How Google is using Linked Data to display Rich Snippets
       • Rich Snippets formats and entities supported, and analysis of their usage
       • Visual examples of Rich Snippets
       • RDFa, Microformats and Microdata
       • Presence on the Web and business impact
       • Extension of Rich Snippets in the future, in particular for multimedia content
       • Future Internet architecture and thought experiment of Triple-centric Networking
       • Displaying your website's rich snippets in Google search results
       • Conclusion, references and useful links
       April-2012
    3. Challenges on the Web in terms of research
       • The Web is an important part of the application layer of network architectures. Two major trends are opening huge perspectives and challenges on the Web in terms of research:
         • The Web of Data (also called the Semantic Web)
         • The Social Web (also called Web 2.0)
       • As originally envisioned, "The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries." It is a system that enables machines to "understand" and respond to complex human requests based on their meaning.
       • Web 2.0 applications are a trend in Web development and design that facilitates communication, secure information sharing, interoperability, and collaboration. Web 2.0 basically refers to a dynamic Web that includes open communication, with an emphasis on Web-based communities of users and more open sharing of information.
       • Web 2.0 is "The Web as Platform" and the Semantic Web is "The Web of Meaning".
    4. Shift triggered by these trends
       Fundamental shift triggered by these trends:
       • While the Internet has previously been concerned with sending bits from one host of the network to another, new applications now need to make sense of those bits.
       • In other words, the Internet architecture needs a new layer that takes care of data interoperability, interconnecting pieces of machine-processable data in order to make sense of them.
       Proposals for a new layer:
       • A new data layer in the Open Systems Interconnection (OSI) stack, a so-called "Linked Data Layer", located between the application layer and the presentation layer, that aims to make sense of the data in such a way that it establishes interoperability between different applications.
    5. Snippets in Google Search
       In 1998, Google introduced the snippet, a short description of or excerpt from a website which appears in Google search results. Snippets are created automatically based on the site's content.
    6. Rich Snippets
       • Rich Snippets in Google Search
       In 2009, Google announced Rich Snippets, a new presentation of snippets that applies Google's algorithms to display structured data embedded in the pages returned as search results, with the objective of highlighting the searched-for properties to the user in a visually prominent way. Rich Snippets give users convenient summary information about their search results at a glance. When searching for a product or service, users can easily see reviews and ratings, and when searching for a person, they'll get help distinguishing between people with the same name.
    7. Rich Snippets
       • Rich Snippets formats supported
       A lot of previous work on structured data focused on debates about which encoding format should be accepted. Google later realized that structured data on the Web can and should accommodate multiple encodings, and thus accepted both Microformat encoding and RDFa encoding. The Rich Snippets feature was built on open standards or community-agreed approaches:
       • RDFa (Resource Description Framework – in Attributes)
       • Microformat encoding
       • Microdata encoding
       An additional feature, Rich Snippets in Custom Search, was launched along with the Rich Snippets announcement. It is similar to Yahoo!'s BOSS (Build your Own Search Service) initiative.
    8. Entities in Rich Snippet Encodings
       Entities supported by Google Rich Snippets as of now:
       • Software applications
       • Breadcrumbs
       • Events
       • Music
       • Businesses and Organizations
       • People
       • Products
       • Recipes
       • Review Ratings
       • Reviews
       • Videos: Facebook Share
       (… they promise to get more soon)
    9. Rich Snippets: Reviews
    10. Rich Snippets: People
    11. Rich Snippets: Events
    12. Rich Snippets: Recipes
    13. Rich Snippets - RDFa
       • RDFa (Resource Description Framework – in Attributes)
       RDFa is a way to label content to describe a specific type of information, such as a restaurant review, an event, a person, or a product listing. These information types are called entities or items. Each entity may have a number of properties. For example, the properties of a Person include name, address and email address. In general, RDFa uses simple attributes in XHTML tags (often <span> or <div>) to assign brief and descriptive names to entities and properties.
       • An example (ticket booking for an upcoming event)
       A short HTML block showing an entity. Here, details about the entity can be marked up in the body of a Web page in order to help a machine understand the location, schedule, price or reviews of the event.
    14. Rich Snippets - RDFa example
       Mark-up for an event at a certain business location. From structured mark-up on a Website...
       • typeof="v:Event" indicates that the marked-up content describes an Event (item type).
       • The dimensions that make up the event (description, type, starting time) are described with properties. The property name is prefixed with v:, for example <span property="v:description">.
    15. Rich Snippets - RDFa: ...to a Rich Snippet on Google
       Caution: Google does not display information that isn't visible to the user, such as hidden divs. It can be tempting to add all the content relevant for a rich snippet in one place on the page, mark it up, and then hide the entire block of text using CSS or other techniques. Google will not show content from hidden divs in Rich Snippets.
       Exception: geo information (latitude and longitude of a location) can be included in the HTML markup.
    16. Rich Snippets – Microformat & Microdata
       • Microformats and Microdata are simple conventions used on web pages to describe a specific type of information (an entity), such as a person or a product. Each entity has its own properties.
       • Microformats use the class attribute in HTML tags (often <span> or <div>) to assign brief and descriptive names to entities and their properties.
       • Microdata uses simple attributes in HTML tags (often <span> or <div>) to assign brief and descriptive names to items and properties.
       An example of an HTML block showing basic contact info for a person.
    17. Rich Snippets – Presence on Web
       Statistics with regard to semantic mark-up on the Web in June 2010:
       • A random sample of one million Web pages was harvested in order to compare the use of Microformats and RDFa markup.
       • The authors then examined how much of this mark-up data was actually used for Rich Snippets.
       • Only a tiny fraction of all semantic mark-up present on the Web was used for Rich Snippets at the time of this experiment.
       Pitfalls:
       • Incorrect labeling (e.g. marking up the date of an event as part of the event description).
       • Incorrect inclusion of unrelated words in the structured mark-up (e.g. marking up "written by John Doe" rather than just "John Doe" as the value of the property v:reviewer).
       • Furthermore, the authors observe a general confusion about which parts of a document should be marked up at all. Although some web pages include RDFa event markup, none of them are used by the Rich Snippet technology as of today.
    18. Rich Snippets – Presence on Web
    19. Rich Snippets - Business Impact
       Benefits of Rich Snippets in Google Search:
       • Webmasters: Rich Snippets give webmasters the ability to add useful information to their web search result snippets, helping Google make sense of their bits.
       • Purpose: to provide more information to a user about the content that exists on a page, so they can decide which result is more relevant for their query.
       • Additional traffic to a webpage: with extra information, people tend to rely more on a particular search result with linked data, so an increasing number of impressions is noted on sites with Rich Snippets.
       • Higher click-through rate: a higher click-through rate for pages with Rich Snippets was observed, as shown in a paper by Kavi Goel and Pravir Gupta.
       • Rich snippet markup accurately reflects the primary content of your page. Web sites can suffer significant sales collapses by going down a position in their natural search ranking.
       • It is easy to add simple lines of markup to existing HTML, with no effect on the visual appearance of the webpage.
    20. Vision for Rich Snippets in Future
       • New business-related vocabularies, such as the Tickets Ontology, are expected to see broader and broader usage and implementation. This would allow for comparative Rich Snippets.
       • Using information from the user's social graph, provided the user has granted access to her social graph. This would mean carrying part of the Facebook experience right into the search experience.
    21. Vision for Rich Snippets in Future
       • Even richer snippets, using multimedia semantics to provide richer video search results. We believe that there is high potential for semantically annotated multimedia content to improve content search.
       • We show a mock-up of a person highlighted, which could be based on media fragment URIs. Such a media fragment URI could look like: ,434#xywh=150,60,50,70&xywh=240,50,50,70
    22. Future Internet Architecture
       • Today's Rich Snippets: content is exclusively determined by the information in one particular Web page.
       • Vision of extended Rich Snippets: the vision outlined above features information from more than just one data source. Thus, an information sharing mechanism must be established to combine information coming from various data sources.
       • Content-centric Networking: two notions of packages are involved, Interest and Data packages. Interests get broadcast by consumers, and as soon as a node can satisfy an interest, it responds with the data. Otherwise, it rebroadcasts the interest.
       Advantages over common host-based networking:
       • Data packages are not intended exclusively for the initially interested node, but can be shared between nodes with common interests.
       • Useful when many parties are interested in the same content.
    23. Future Internet Architecture
       • The figure illustrates what the interest and data packages could look like if we applied the principle of Content-centric Networking to Triple-centric Networking.
       • Google is at the early stages of this thought experiment and has not carried out any experimentation to justify its assumption.
    24. Displaying your website's rich snippets in Google search results
       Websites like TripAdvisor, Yelp and Amazon stand out over other search results with their star ratings, thus increasing their click-through rate. To get your website's rich snippets displayed in Google search results:
       1. Mark it up with microformats: mark up your website with extra information for entities like Reviews, People, Products, Businesses, Recipes and Events.
       2. Test to make sure it works: use the Google rich snippets testing tool.
       3. Submit your site to Google.
       Google approves websites that it sees as a reliable source of reviews, that have a substantial amount of reviews, and that are marked up correctly.
    25. Rich snippets Experiment
    26. Conclusion
       • It has become visible that Rich Snippets are a very sensitive element in the Linked Data value chain, due to their high visibility and the confirmed change in user click-through behavior.
       • We have interlinked the social graph of a user with common event-related data in the Linked Data cloud.
       • For the online ticket search example, the decision about which ticket vendors to include in, and which to exclude from, the vendors shown in the Rich Snippets is clearly a crucial one.
       • The suggested addition of a Linked Data layer between the current application and presentation layers could help establish the links between the data providers, and make it easier to make sense of data and present it in an efficient way.
    27. Useful Links
       • Expression-of-interest form for webmasters to indicate their interest in Rich Snippets being shown for their pages.
       • Rich Snippets Testing Tool Beta
    28. References and Useful Links
       • !/tomayac
       • About RDFa
       • About microformat
       • About microdata
    29. Thank you!
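
As a rough illustration of the RDFa event markup described on slides 13 and 14, the following Python sketch reads the item type and the properties out of a small HTML block. Everything concrete here is an assumption made for the example: the sample markup, the event details, and the class name `RdfaPropertyCollector` are hypothetical, and the parser only copes with simple, non-nested markup built from <div> and <span> elements. It is not a full RDFa processor and says nothing about how Google actually extracts Rich Snippets.

```python
from html.parser import HTMLParser


class RdfaPropertyCollector(HTMLParser):
    """Collect typeof and property values from simple RDFa markup.

    Illustrative only: assumes every start tag has a matching end tag
    (no void elements) and that properties are not nested.
    """

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._stack = []  # property name (or None) for each open element

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.item_type is None and "typeof" in attributes:
            self.item_type = attributes["typeof"]
        prop = attributes.get("property")
        if prop and "content" in attributes:
            # A content attribute overrides the visible text
            self.properties[prop] = attributes["content"]
            prop = None
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        prop = self._stack[-1] if self._stack else None
        if prop and text:
            previous = self.properties.get(prop, "")
            self.properties[prop] = (previous + " " + text).strip()


# Hypothetical sample markup in the style described on slide 14
SAMPLE = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Event">
  <span property="v:summary">Example Concert</span>
  <span property="v:description">An evening of live music.</span>
  <span property="v:startDate" content="2012-04-15T19:30">April 15, 7:30 pm</span>
</div>
"""

collector = RdfaPropertyCollector()
collector.feed(SAMPLE)
print(collector.item_type)
print(collector.properties)
```

The sketch keeps the visible text as the property value unless a `content` attribute is present, which mirrors the point on slide 15 that the marked-up information is normally the text the user can see.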