Linked Data ROI 20110426


A presentation on the trends in Linked Data, including a discussion of opportunities for return on investment for enterprise deployments.

Published in: Education, Technology, Design

  1. 1. 26 April 2011 David Wood @prototypo
  2. 2. Semantic technologies provide greater computing context. Let’s discuss context. The good news is that we are living in a golden age. The bad news is that many or even most are having a difficult time keeping up. Photo credit:
  3. 3. Timeline: 1970s, a neat little package ($ cat foo.txt | grep blah | sort); 1980s, Client-Server; 1990s, The Early Web. A Golden Age may be identified by its internal rate of change. Note the tendency to centralize first and distribute later. Distribution is harder.
  4. 4. Diagram labels: Universal Client; Ubiquitous, reusable applications; URL; Curation; Universal Connection; Logic and interlinking; Web of Data; Universal Database. The Web is very different. The Web of Data is different yet again, and has larger ramifications.
  5. 5. Chart: Access per 100 population worldwide, 1990-2008 (series: mobile cellular subscriptions, fixed telephone lines, Internet users). The elephant in the room is the mobile market. Growth in new Internet users in the US is flattening (and has been for 5 years). Growth worldwide is still occurring. About 25% of the world’s population uses the Internet. Almost every working adult in the US has a cell phone. Lots of non-working adults have cell phones in the US. Sales are continuing at the rate of over 30M in Q4 2010. 19% of phones in the US are smart phones, which means about 80% are not, but nearly half of new sales are smart phones.
  6. 6. The PC is yesterday’s news. Businesses built on PCs are also yesterday’s news. Photo credit: Sue Waters
  7. 7. You may have bought your last laptops, your last servers. If not, your competitors have. Mobile devices are the new clients. Servers and systems administrators have moved to the cloud. Photo credit: Apple
  8. 8. Our data is still in the dinosaur age. Traditional data is hierarchical or tabular with external schemas, and so are the systems that support it. Photo credit: David Wood, 2009
  9. 9. “We are beginning to routinely deal with vast quantities of data and only through a metadata management strategy can we tackle the quantity of data to look for the nuggets for gold.” Credit: The Economist, Monstrous Amounts of Data, Feb 27th, 2010 Special Report on Managing Information.
  10. 10. We may want to fly immediately to some New World nirvana, but we can’t. Photo credit: David Wood, 2009
  11. 11. At least partially because our tools aren’t mature enough. Some techniques are ready now, though. Compare to the airline industry and its impact on the world (~2.1M people will fly today). Photo credits: David Wood, 2009
  12. 12. Users of RDF today. You may have heard of some of them.
  13. 13. Users of microformats. They can only scale the conversation further using an extensible mechanism (like RDF) to combine microformat techniques. Image: The Conversation Prism by Brian Solis and JESS3
  14. 14. Microsoft has reinvented the RDF wheel in NTFS, SharePoint and other products after pulling out of standards activities due to perceived competition.
  15. 15. “The painter... does not fit the paints to the world. He fits himself to the paint.” -- Paul Klee. Why is RDF important? Because we must fit ourselves to a better paint to deal with information overload and changing business requirements.
  16. 16. R&D is research. Marston Bates said, "Research is the process of going up alleys to see if they are blind." Research is difficult if not impossible to tie to ROI. Geek & Poke comic from
  17. 17. Pie chart (categories: 6 months, 12 months, 18 months, 24 months, more than 24 months; segment values shown: 4%, 17%, 13%, 16%, 49%). However, corporate R&D must, sooner or later, be tied to ROI. The process is generally sooner (< 18 months for IT projects). Source: “Data Center Transformation: Key Implementation Drivers”, Hansa/GCR and HP, Oct. 2008
  18. 18. $50B in revenue (2010), $1.3B net income. 19% of consumer electronics market.
  19. 19. Data marked up: store name, hours, address, phone, geo, ratings, services, events. RDFa added to the sites for 1,100 retail stores, including store-specific blogs.
  20. 20. Why? 58% of Americans research online before they buy. Pew Internet & American Life Project:
  21. 21. “We really didn’t go into it with any expectations. We just wanted to see if it was something we might want to do. That’s why we were caught by surprise by the results… we weren’t really expecting any.” -- Jay Myers, Lead Development Engineer, Best Buy
  22. 22. The impact: 30% increase in organic search results; 15% increase in click-through rate (CTR).
  23. 23. Bubble chart: usage (vertical axis, 30% to 100%) versus marketers reporting “Great” return on investment (horizontal axis, 0% to 60%) for house email, SEO, paid search, banners/buttons, text-link ads, affiliate marketing, behavioral targeting, contextual targeting, rented email lists, rich media/video, and pop-ups/pop-unders. Perhaps we shouldn’t be surprised after all; SEO is known to be effective. The size of the bubble illustrates the relative budget compared to other tactics. Source: Marketing Sherpa and Ad Tech: Year End Surveys, January 2009. See also:
  24. 24. BBC is the largest broadcaster in the world, with 23,000 employees.
  25. 25. A Web presence for each broadcast. 1,000-1,500 programs broadcast per day. BBC Programmes provides for each broadcast: a Web identifier, HTML pages, and machine-readable feeds (RDF/XML, JSON and XML).
  26. 26. "Creating web identifiers for every item the BBC has an interest in, and considering those as aggregations of BBC content about that item, allows us to enable very rich cross-domain user journeys." -- Yves Raimond, BBC. Needed to relate information across media for both users and third-party developers.
  27. 27. A Web presence for each artist. BBC Music is underpinned by the MusicBrainz music database and Wikipedia.
  28. 28. A Web presence for each species (and other biological ranks), habitat and adaptation. Wildlife programmes (clips and episodes) are identified by tagging the clip or episode with the appropriate DBpedia URI.
  29. 29. "The RDF representations of these web identifiers allow developers to use our data to build applications." -- Yves Raimond. The LOD/LED approach allows “different development teams to concentrate on different domains while at the same time benefiting from the activities of the other teams.”
  30. 30. Each HTML page is paired with a machine-readable data representation.
  31. 31. Like HTML and RDF, credit cards have a human-readable side and a machine-readable side.
  32. 32. A government use case that applies to large enterprises, too. Envirofacts is an application built on a data warehouse consisting of data from many relational databases.
  33. 33. A government use case that applies to large enterprises, too. Envirofacts is an application built on a data warehouse consisting of data from many relational databases.
  34. 34. A government use case that applies to large enterprises, too. Envirofacts is an application built on a data warehouse consisting of data from many relational databases.
  35. 35. Migrating data from one relational database to another (e.g. for an upgrade) routinely requires 6 months or more. Combining LOD takes a matter of weeks to re-model the data (4-6 week sprint) and days to reuse the data in applications.
  36. 36. Pitney Bowes has quite a number of facilities in this dataset.
  37. 37. 10-90% failure rates: failure rates: reasons: without coordination:
  38. 38. • Linked Data means: “Cooperation without coordination”. 10-90% failure rates: failure rates: reasons: without coordination:
  39. 39. Scope: Bigger than any other deployed system. Adaptability: Changes piecemeal. Ownership: Nobody owns it.
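
The “cooperation without coordination” idea, and the claim that combining linked data takes far less effort than a relational migration, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by any system named in the slides: the example.org URIs, the property names, and the helper functions merge and describe are all hypothetical, and triples are modeled as plain tuples rather than with an RDF library. Two teams publish data separately; because both use the same identifier for the facility, combining the datasets is a simple set union.

```python
# Hypothetical identifiers and data, invented for illustration only.
Triple = tuple[str, str, str]  # (subject, property, value)

FACILITY = "http://example.org/facility/123"
RELEASE = "http://example.org/release/2010-01"

# Dataset A: published by one team (say, a facilities registry).
registry: set[Triple] = {
    (FACILITY, "name", "Example Plant"),
    (FACILITY, "locatedIn", "http://example.org/place/springfield"),
}

# Dataset B: published independently by another team (say, release reports).
reports: set[Triple] = {
    (FACILITY, "reportedRelease", RELEASE),
    (RELEASE, "chemical", "ammonia"),
}


def merge(*graphs: set[Triple]) -> set[Triple]:
    """Combine graphs by set union; shared URIs link the data together."""
    combined: set[Triple] = set()
    for graph in graphs:
        combined |= graph
    return combined


def describe(graph: set[Triple], subject: str) -> dict[str, list[str]]:
    """Collect every property and value known about one subject."""
    description: dict[str, list[str]] = {}
    for s, p, o in sorted(graph):
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


combined = merge(registry, reports)
facility_view = describe(combined, FACILITY)
```

In this sketch, facility_view holds properties contributed by both datasets, even though neither publisher had to agree on a shared schema beforehand. Only the shared identifier was needed.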