Semantic & Linked Data: “Coming of Age”
Jay Myers

Semantic & Linked Data; coming of age



Semantic Web, Linked Data and their real-world application at Best Buy

Published in: Technology
  • Thank you. Over the last year I have had the distinct pleasure of talking to many people about Semantic Web and Linked Data technologies. Some are more familiar with them than others, but wherever I go I run into people who have had the same experiences hunting for products on the web as I have. Search engines and machines have become the entry point to the web, with 67% of shoppers using the web before ever setting foot in a physical store. For the most part, machines do a pretty good job of helping us find the products we’re looking for, with the caveat that we know exactly what we’re looking for. If we have a model number, brand name or machine part number, chances are pretty good that a search engine or other helper application will get us to the product in a relatively small number of clicks. But not everything we shop for is so cut and dried and strongly defined! I have seen people struggle to find a replacement battery for a two-year-old laptop. I have seen frustrated, disappointed customers return products that were missing an important feature or attribute. I tell my own story of my family’s hunt for a refrigerator, and how frustrated I was at the lack of data and the extreme challenge of finding that appliance. Many of these stories have something in common: they involve a search that is a little more undefined, one that means discovering products based on attributes, features and benefits, where you don’t have a defining product name, manufacturer name or model number to search on. They also leave potential buyers stranded and confused, spending countless hours sifting through the web trying to make sense of massive amounts of product data or, worse, completely abandoning their product search.
  • When we really take a look at the web, it’s no surprise that you and I struggle to find the right product. The web is a massive place, with over 25 billion pages (rest of the stats). To summarize, we live in a web with a lot of stuff.
  • Of course, the size of the web as we know it today will pale in comparison to what it will look like even a few years from now. Let’s take a moment to look at a visual timeline of the growth of the web. Back in 2002 the web consisted of 5 exabytes of data in total. Fast forward just six years, to 2008, and we see that same amount, 5 exabytes, flowing across the web every month. Just two years later, right now in 2010, 21 exabytes of data flow monthly. In 2015 we’ll see 10 zettabytes of digital data online, and in 2020 that number will be 25 zettabytes. For those of you following along at home, a zettabyte is a 1 followed by 21 zeros, counted in bytes! To make sense of all this information we have relied, and will continue to rely, on machines; there’s no way humans can handle and process the massive amounts of data being produced every second of every day. Also to our detriment, much of this valuable data is relatively unstructured or completely inaccessible to machines. Additionally, we’ve spent the past 15 or so years creating a closed, siloed web of documents that has shut off valuable data from the rest of the world. I believe these issues can be addressed using Semantic Web and Linked Data technologies.
  • So I bet you are still asking yourself what semantic web and linked data actually are. To me, semantic web is a way of defining objects and things on the web, providing a tiny bit of structure around the data we have available to us and making that data accessible to machines. In the case of products, it’s being able to create a complete data model of a product: basically a rich virtual representation of a physical object that makes all the attributes and features of the product available for a machine to process. This idea of a “semantic web” isn’t new: Sir Tim Berners-Lee was talking about a linked web of data way back in 1999. Later, he coined the phrase linked data, a technique that links semantic data structures together and creates relationships between things. These relationships can then be explored to gather information and create additional knowledge. Semantically structured and linked data has huge potential, and it’s something I will talk about later in my presentation. Semantic Web technologies have been growing now for many years. As an end user, I see these technologies in two different flavors.
  • The first flavor is machine semantics. Typically these technologies are created just for machine readability and processing, not for humans to consume first hand. There has been a great deal of work done in academia on machine semantics, and the technologies themselves have been around for a while. They consist of standards like RDF/XML, N3, Turtle and N-Triples. We even have a great query language, SPARQL, that allows us to query datasets for rich information and relationships.
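To make the machine-semantics flavor a little more concrete, here is a minimal Python sketch that holds two facts as subject-predicate-object triples and writes them out as N-Triples, the simplest of the formats named above. The store URI under example.com, the store label and the helper names (`Triple`, `to_ntriples`) are invented for illustration; the class `gr:LocationOfSalesOrServiceProvisioning` is taken from the GoodRelations vocabulary as I understand it, and nothing here is claimed to be the markup Best Buy actually published.

```python
from typing import NamedTuple

GR = "http://purl.org/goodrelations/v1#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Triple(NamedTuple):
    subject: str             # always a URI in this sketch
    predicate: str           # always a URI
    obj: str                 # either a URI or a plain string literal
    obj_is_uri: bool = False


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def to_ntriples(triples) -> str:
    """Serialize triples one per line: <subject> <predicate> object ."""
    lines = []
    for t in triples:
        obj = f"<{t.obj}>" if t.obj_is_uri else f'"{escape_literal(t.obj)}"'
        lines.append(f"<{t.subject}> <{t.predicate}> {obj} .")
    return "\n".join(lines)


# Hypothetical store URI and label, for illustration only.
STORE = "http://example.com/stores/0001#store"

triples = [
    Triple(STORE, RDF_TYPE, GR + "LocationOfSalesOrServiceProvisioning", True),
    Triple(STORE, RDFS + "label", 'Example "Main Street" store'),
]

ntriples_text = to_ntriples(triples)
print(ntriples_text)
```

Each printed line is one complete statement, which is what makes these formats easy for machines to split, merge and load, and rather unpleasant for humans to read.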
  • The second flavor of semantic web and linked data is something I refer to as human-readable semantics, one of the most exciting, accessible and fastest-growing parts of the semantic web. So what do I mean by human readable? Human-readable semantic web techniques allow web developers to integrate structured markup and data into our current “human-readable” web. They use standard web pages to deliver data without altering the visual web we have built over the past 15 years and have grown accustomed to. The major markup standards on the human-readable semantic web include RDFa, Microdata and Microformats. All of these markup techniques can be integrated into normal HTML, which is something that millions of people publish every day. I like to think of these technologies as the “Gateway Drug to the Semantic Web”. They enable millions of everyday front-end web developers to populate a rich web of data, more accessible to machines than ever before, while retaining the visual design and structure of a standard web page. In fact, most people who browse semantically enabled web pages probably have no idea that the pages integrate rich machine-readable data behind the scenes. By adopting these open semantic markup standards, we can begin to open up our data to machines and humans, making the entire web one large database or API, which can be beneficial in many ways.
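As a rough illustration of how one page can serve both audiences, the Python sketch below feeds a small RDFa-annotated HTML fragment to the standard library’s `HTMLParser` and pulls out the property/value pairs a machine would see. The product, price and the `PropertyExtractor` class are made up for the example, and the extractor is deliberately naive: it ignores `typeof`, `rel`, prefix resolution and nested elements, all of which a real RDFa processor handles.

```python
from html.parser import HTMLParser

# Hypothetical product markup using GoodRelations terms in RDFa attributes.
PAGE = """
<div xmlns:gr="http://purl.org/goodrelations/v1#" typeof="gr:Offering">
  <h2 property="gr:name">Example 10-inch netbook</h2>
  <div rel="gr:hasPriceSpecification" typeof="gr:UnitPriceSpecification">
    <span property="gr:hasCurrency" content="USD">$</span>
    <span property="gr:hasCurrencyValue" content="249.99">249.99</span>
  </div>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collect (property, value) pairs from RDFa `property` attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open_property = None  # property still waiting for its text
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # An explicit machine value wins over the visible text.
            self.pairs.append((prop, attrs["content"]))
        else:
            self._open_property = prop
            self._text = []

    def handle_data(self, data):
        if self._open_property is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the open property.
        if self._open_property is not None:
            value = "".join(self._text).strip()
            self.pairs.append((self._open_property, value))
            self._open_property = None


extractor = PropertyExtractor()
extractor.feed(PAGE)
print(extractor.pairs)
```

A browser renders the same fragment as a heading and a price; the attributes change nothing a shopper sees, which is the whole appeal of this flavor.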
  • Both machine-readable and human-readable semantic technologies rely on ontologies, usually called vocabularies, to help define objects on the web. There are many out there, with some of the more popular ones being: FOAF (Friend of a Friend), which describes relationships between people; SIOC (pronounced “shock”, Semantically-Interlinked Online Communities), which provides ways for discussion communities to connect with each other; and SKOS (Simple Knowledge Organization System), which describes thesauri, classification schemes and taxonomies. Last but not least, GoodRelations is a vocabulary that has been beneficial to Best Buy. It is a great way to describe products, goods and services on the web. It can describe anything from complex product offers to store opening hours. It is extremely powerful. A common assumption about the semantic web is that these popular vocabularies serve as the authoritative description of an object on the web. This is not the case: it is only natural that everyone sees the world in a slightly different way, so you can always create and publish your own vocabulary, or mix and match popular vocabularies with ones you define yourself to create the right solution for your needs.
  • Now that you have a grasp of the concepts of semantic web and linked data, let’s talk about real-world examples of how Best Buy is using this technology stack. Back in 2008, we went looking for ways to use semantic web technologies to address real-world business needs. At the time, we identified a failing project that would be a perfect candidate for experimenting with markup and data generation. We have a large physical store presence in the US, over 1,100 stores, and there was a business desire to start establishing a presence for each one of them on the web. That makes sense: since a majority of consumers search the web to find products, they would also be hunting for the stores they could visit to touch, look at and feel the products before ultimately purchasing them. So we decided to create blogs for every one of our stores. When you look at these stores, they’re like rich little bundles of data, with some very basic but important properties: attributes like address, phone and geo coordinates. Many of our stores even have in-store events and other interesting data that should be exposed and shared with the world. As an experiment, we decided to use semantic technologies like RDFa and GoodRelations on all of our store blogs to highlight this data and make it more accessible not only to humans, but to machines as well.
  • We asked two employees in every store to blog and to start providing data about their stores using simple web forms. Users can fill in basic attributes like store address, store name and number, and a store photo, or add an event or store announcement. This data is run through WordPress, an open source blogging platform, where it is applied to a template. The result is information-filled blogs that humans can use to find a local store, check on events and generally stay connected to their local store. Plus, the HTML code behind the page is data rich, using RDFa with the GoodRelations vocabulary and a number of other vocabularies to maximize machine readability.
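The form-to-template step described above can be sketched in a few lines. The real site ran on WordPress templates; the Python below only mirrors the idea, and the template, field names (`name`, `street`, `city`, `phone`) and sample values are all hypothetical. The vocabulary terms come from GoodRelations and the W3C vCard namespace as I understand them, so treat the exact markup as an assumption rather than what the store blogs emitted.

```python
from html import escape
from string import Template

# Hypothetical page template: visible store details plus RDFa attributes.
STORE_TEMPLATE = Template("""\
<div xmlns:gr="http://purl.org/goodrelations/v1#"
     xmlns:v="http://www.w3.org/2006/vcard/ns#"
     about="#store" typeof="gr:LocationOfSalesOrServiceProvisioning">
  <h1 property="gr:name">$name</h1>
  <div rel="v:adr">
    <div typeof="v:Address">
      <span property="v:street-address">$street</span>,
      <span property="v:locality">$city</span>
    </div>
  </div>
  <span property="v:tel">$phone</span>
</div>
""")


def render_store(form: dict) -> str:
    """Escape every submitted value, then drop it into the template."""
    safe = {key: escape(str(value)) for key, value in form.items()}
    return STORE_TEMPLATE.substitute(safe)


# What an employee might type into the simple web form (invented values).
form_input = {
    "name": "Example Store #0001",
    "street": "123 Main St & 1st Ave",
    "city": "Springfield",
    "phone": "555-0100",
}

html_page = render_store(form_input)
print(html_page)
```

The employee never sees a namespace or a property name. The template carries the semantics, so every form submission becomes structured data for free.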
  • Using this same toolset, we are addressing the problem of open box and returned items. Most people should be familiar with open box products: fully functioning products with significant markdowns, many of them simple customer returns. The average return rate for electronic devices is 11-20%, at a cost to the CE industry of $13.8 billion (2007). Open box products represent a significant challenge to our stores (some have as many as 500 open box products in store), but we haven’t provided any visibility to these products on the web. We ask our store employees to fill out a simple form, providing the SKU, markdown price, and the reason the item was returned. Then, using WordPress and pinging Remix, our open catalog API, we are able to publish localized, unique, data-rich product detail pages that make these products visible. Again, if you look in the source code, you’ll see these pages are coded using RDFa for better machine readability.
  • Several reasons why I think these two examples are significant: 1. Simple. 2. Organic. 3. Harnessing human-generated input and turning it into machine-readable data; we made semantic web accessible to everyday store employees. 4. Removed valuable store data from silos, publishing it in an open format for anything to consume.
  • We’re not the only ones doing this! There is an incredible amount of work being done by some major players on the web. Yahoo! SearchMonkey started a trend as an early adopter of RDFa and rich markup. Google has recently come on board with Microformats, Microdata and RDFa support in its Rich Snippets initiative.
  • Their support looks something like this. Yahoo! reported a 15% click-through rate on results that are richly annotated with semantic markup.
  • Facebook Open Graph turns your web pages into graph objects. 500 million users. The flow of social information can be beneficial. It maps out all the connections between people and is building out connections between things, which makes it more contextual and relevant, simply by clicking the Like button and through RDFa tags. The Like button is clicked over a billion times a day.
  • I want to switch gears and talk about the other side of semantics: the purely machine-readable side. Data is plentiful for small and large organizations alike, but most companies use just a fraction of the data available to them. Using semantic web and linked data technologies, we can mash different data sources up, begin to ask questions of the data and learn from it. Let me show you how, using Best Buy as a case study.
  • Best Buy is made up of many different parts: stores, employees, products, etc. Take these parts and pieces as a whole, and we have millions, maybe even billions, of data touchpoints out there, where we’re either publishing and displaying data (store information, product information) or receiving and consuming data (forums, ratings and reviews, Twitter, Facebook and other social channels). Businesses can take this wealth of data and use it to improve their business, their bottom line and their customers’ lives.
  • I’ve started to challenge my colleagues on both the business and technical side to look at all these different entities from a data perspective: imagine linking all of them into one huge global graph of data. It adds up incredibly fast. This slide is just a small representation of the vast amount of data we can start using to solve additional issues plaguing the retail industry today.
  • The use of semantic technologies can be a huge benefit to a business of any size, and I have developed a simple formula. Human-readable semantics like RDFa, Microformats and Microdata serve as an external view of business entities on the web; businesses can broadcast and open their data to search engines, parsers and the like to gain visibility. On the other side, we have an abundance of machine-readable data at our disposal (some of it private data, some public data owned by the company, some from external sources), just waiting for us to mash it up and query it. Combine the human-readable semantic web and the machine semantic web and you have a tremendously valuable insight engine, allowing a business to query data and gain insights, or to publish data and insights to the web for consumption by anything.
  • So what do we use this insight engine for? To solve problems. This is one of my favorite quotes. Let’s look at some Best Buy-specific business problems we can start to solve using an insight engine.
  • It’s no secret that margins on many retail products are extremely thin. In order to stay profitable, retailers have to sell other products in addition to the primary product (attach other products). E-commerce sites struggle with basket size. In a recent article, Alexander Gruensteidl talks about “surviving the future of retail”. He finds that retailers aren’t building the types of user experiences that engage customers in a way that promotes deeper product discovery and curiosity. While much of the article focuses on customer experience design, there is a definite tie-in to information and data that cannot be ignored. I believe we can promote deeper discovery of products, goods and services by utilizing linked data.
  • We can start developing deep relationships between primary and secondary products with open and linked data. Take a popular product like a netbook. To drive traffic into the store or web site, retailers will offer primary products at negative margin in hopes of being able to attach secondary products that have positive margin. Many times we do not provide adequate connections from primary to secondary products to promote these attachments. We should be exploring relationships between products to drive sales and increase attachment rates while providing the complete solution to the consumer. We could be providing pathways of exploration, going beyond the expected into products that the consumer may not have explored. Think of it as “degrees of product separation”, going from the obvious to the less expected, connecting all products through data and their inherent relationships. There are several benefits: 1. Offering paths of exploration and choice creates a perceived value. 2. It utilizes the strength of a retailer’s large catalog. 3. It extends the product long tail. 4. It achieves positive business results (offsetting negative margin, increasing attach rates) while helping the consumer with a complete solution and improving the customer’s life with products she or he may not have thought of.
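One way to picture “degrees of product separation” is as a breadth-first walk over a graph of related products. The Python sketch below assumes a tiny hand-made catalog; the products, the relationships between them and the margins attached to each are invented for illustration, and the function names are hypothetical. It is meant as a toy model of the idea, not the approach Best Buy used.

```python
from collections import deque

# Hypothetical catalog: product -> (margin in percent, related products).
CATALOG = {
    "netbook":        (-15, ["netbook sleeve", "wireless mouse"]),
    "netbook sleeve": (49,  ["laptop bag"]),
    "wireless mouse": (31,  ["mouse pad"]),
    "laptop bag":     (40,  []),
    "mouse pad":      (61,  []),
}


def degrees_of_separation(start, max_degree):
    """Breadth-first walk returning {product: hops from start}."""
    degrees = {start: 0}
    queue = deque([start])
    while queue:
        product = queue.popleft()
        if degrees[product] == max_degree:
            continue  # do not explore beyond the requested distance
        for related in CATALOG[product][1]:
            if related not in degrees:
                degrees[related] = degrees[product] + 1
                queue.append(related)
    return degrees


def attach_suggestions(start, max_degree=2):
    """Positive-margin products, nearest first, then highest margin first."""
    degrees = degrees_of_separation(start, max_degree)
    candidates = [
        (degree, -CATALOG[product][0], product)
        for product, degree in degrees.items()
        if product != start and CATALOG[product][0] > 0
    ]
    return [product for _, _, product in sorted(candidates)]


print(attach_suggestions("netbook"))
print(attach_suggestions("netbook", max_degree=1))
```

Raising `max_degree` moves the suggestions from the obvious accessories toward the less expected ones, while the margin ordering keeps the business goal in view at each step.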
  • On the technical side, we can achieve this with a SPARQL query on our global graph of data.
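For readers who have not met SPARQL, the shape of such a query can be approximated in plain Python: match triples against a pattern, filter on the comment text, then look up the price for each hit. The graph, the `ex:` identifiers, the `ex:price` predicate and both function names below are hypothetical, and the case-insensitive substring test is only a loose stand-in for the full-text matching a real triple store would provide.

```python
# Hypothetical graph held as (subject, predicate, object) tuples.
GRAPH = [
    ("ex:sku-1", "rdfs:comment", "Lightweight 10-inch netbook with 6-hour battery"),
    ("ex:sku-1", "ex:price", 249.99),
    ("ex:sku-2", "rdfs:comment", "Neoprene sleeve for any 10-inch netbook"),
    ("ex:sku-2", "ex:price", 19.99),
    ("ex:sku-3", "rdfs:comment", "42-inch LCD TV"),
    ("ex:sku-3", "ex:price", 499.99),
]


def match(graph, subject=None, predicate=None, obj=None):
    """Yield triples that fit the pattern; None acts as a wildcard variable."""
    for s, p, o in graph:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


def products_mentioning(graph, word):
    """Return (uri, formatted price, comment) for comments containing `word`."""
    results = []
    for uri, _, text in match(graph, predicate="rdfs:comment"):
        if word.lower() not in text.lower():
            continue
        # Join on the shared subject, as a second triple pattern would.
        for _, _, price in match(graph, subject=uri, predicate="ex:price"):
            results.append((uri, f"{price:.2f}", text))
    return results


for row in products_mentioning(GRAPH, "netbook"):
    print(row)
```

The point of a query language is that the same pattern-plus-filter idea scales from six tuples to a graph spanning every store, product and customer touchpoint.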
  • An ongoing study by IPG Media Lab reveals that shopper satisfaction at retail stores is declining by up to 15% per year. Customers are more empowered than ever before, and more knowledgeable: they will often have performed more research than store employees and may know more about the product. Customers perceive this lack of knowledge as poor service, which drives people away. The traditional solution to this issue has been more training for our store employees. But take these factors into consideration: 1. Training to create a more informed staff through traditional learning methods drives up costs and squeezes already tight margins. 2. Compound this with the fact that retail employee turnover is extremely high. 3. And the products we sell can change rapidly; old ones go away and new ones appear in a matter of weeks. All of these issues add to a decline in customer service. Not having the same or better tools as customers puts employees at a disadvantage. There’s a solution for this involving data.
  • We can empower our employees with data-driven apps and devices that fit in the palm of their hand. Not only can they assist customers and explore products with them, but these tools can serve as on-the-floor training. Apps that explore a large graph of data can be extremely powerful and allow the employee to fully serve the customer’s needs and, in return, slow the decline in customer service rates. These devices and apps are not only an output; they can also harness real-time sentiment and trend data directly from employees, adding data back to the graph.
  • Our CEO Brian Dunn is driving the idea of the “connected world”, a place where everything and everyone is connected (companies, employees, customers, brands, devices) by a ubiquitous layer of connectivity. Traditional digital marketing and advertising methods will not be enough to fuel a connected world. A connected world relies on open, accessible and queryable data to provide a solid foundation for companies to be everywhere in it. Linked data will connect corporate entities to other entities and fuel valuable insight for customers wherever they are. Data is device, platform and trend agnostic, and it has the power to “future proof” your business. It allows a business to adapt and use gathered insights to navigate many different business scenarios.
  • Thank you!

    1. Semantic & Linked Data: “Coming of Age”. Jay Myers
    2. Web at-a-glance: 25 billion web pages in the indexable web [1]; 1 trillion unique URLs discovered by Google [2]; 109.5 million web sites [3]; 2 billion users; 1000x more sites in the “deep web” [4]. Sources: [1] …, March 2009; [2] Google Official Blog, July 2008; [3] Name Intelligence, May 2009; [4] BrightPlanet, November 2010. Data via Scott Brinker.
    3. Timeline, 2000 to 2020. 2002: 5 exabytes of data online (total); 2008: 5 exabytes of data flow monthly; 2010: 21 exabytes of data flow monthly; 2015: 10 zettabytes digital data online; 2020: 25 zettabytes digital data online
    4. “Sounds cool, but what is Semantic Web and Linked Data?”
    5. Machine Semantics: RDF/XML, N3, Turtle, N-Triples, SPARQL
    6. Human-readable Semantics (<html>): RDFa, Microformats, Microdata
    7. Ontologies
    8.
    9. Simple form / user input; Basic transform engine; Human & machine readable data; RDFa (Human-readable Semantics)
    10. Simple form / user input; Basic transform engine and API; Human & machine readable data; Catalog API + RDFa (Human-readable Semantics)
    11. Results: Openly publishing rich data to the web via employees; Makes every store blog an open data source; Significant rise in organic search traffic (Human-readable Semantics)
    12.
    13.
    14.
    15. Raw Data Is Plentiful (Machine Semantics): 500 million Facebook users [1]; 190 million Twitter users [2]; 65 million tweets per day [3]; 4 million Foursquare users [4]; Customer forums; APIs; Internal sales / customer data; Product data; And more! Sources: [1] Mark Zuckerberg, July 2010; [2] Techcrunch, July 2010; [3] Twitter blog, June 2009; [4] Business Insider, October 2010. Data via Scott Brinker.
    16. Case Study: Best Buy. 1,100+ Stores; 155,000 Employees; 460,000+ Products; 6 Countries; 10 Brands; 1,400 Domains
    17. Best Buy Global Graph (diagram). Hub nodes: Best Buy US, Best Buy UK, Best Buy Canada, Best Buy Mexico, Best Buy China, Best Buy Turkey, Best Buy Mobile, Geek Squad, Magnolia, Pacific Sales, Carphone Warehouse, Five Star, Reward Zone. Linked data sets per market (US, UK, CA, MX, CN, TK) include: Products, Local Stores, Customer Insights, Employee Insights, Site Analytics, Facebook, Twitter annotations (@BestBuy, @twelpforce, @BestBuy UK, @BestBuy CA), BBY Mobile Apps, BBY US Mobile App Data, BBY QR Code Data
    18. Strategic formula: Human-readable Semantics + Machine Semantics = Insight Engine
    19. “Many of our greatest companies did not start because they thought there was a big pot of gold at the end of the rainbow. They started because they thought there was an interesting problem to be solved.” - Tim O’Reilly, Web 2.0 Summit 2008
    20. Problem: Shrinking margins & attach rates. “…e-commerce still lacks browsing and discovery experiences that satisfy curiosity.” - Alexander Gruensteidl, “Four Keys to Surviving the Future of Retail”, 09 September 2010
    21. Create product relationships. Margins shown: 49%, 10%, 17%, 9%, 31%, 49%, 10%, 61%, 19%, -15%, 8%, 25%, 12%, 21%, 40%
    22. SPARQL (Insight Engine): select distinct ?o as ?uri, bif:sprintf("%.2f",?p2) as ?price, ?currency, ?text, ?label, ?thumb, ?ean, ?order_link where { ?s1 rdfs:comment ?text . ?text bif:contains '"Netbook"'.
    23. Problem: declining customer service. “Poor service in the guise of ill-informed store staff creates lack of trust and drives shoppers to look for alternatives.” - Nigel Fenwick, “Industry Innovation: Retail”, Forrester Research, 28 July 2010
    24. SPARQL (Insight Engine): select distinct ?o as ?uri, bif:sprintf("%.2f",?p2) as ?price, ?currency, ?text, ?label, ?thumb, ?ean, ?order_link where { ?s1 rdfs:comment ?text . ?text bif:contains '"LCD TV"'.
    25. Problem: staying connected in the “connected world” (Insight Engine)
    26. Thank you! @jaymyers