An introduction to Web 3.0, aka the Semantic Web, with notes for PR practitioners

  1. 1. Web 3.0: an introduction + notes for PR practitioners. Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. By Philip Sheldrake, Euler Partners. Kindly sponsored by Brandwatch for Social Data Week, 16-20 September 2013.
  2. 2. 17th September 2013. Web 3.0 aka the Semantic Web: Web 3.0, the semantic web, marks the transition to a web that understands the content that we put there. As public relations is about working towards mutual understanding between stakeholders, practitioners must understand this understanding and begin to work with Web 3.0 technologies. The next 3 slides situate Web 3.0 in terms of public relations. Then the key concepts are discussed over 21 slides. It does get a bit techie, but perhaps no more than this Web 2.0 thing looked a decade ago. I hope you find it illuminating. Personally, I think the semantic web is nothing less than awesome.
  3. 3. Public Relations: Whether or not you subscribe to the definition of public relations in my Share This Too chapter (29 – The Six Influence Flows), you’ll appreciate that mutual understanding is attempted and reputations are formed in part via the digestion of data and information over time. Traditionally, PR practitioners focused on communication with human intermediaries (eg, journalists, analysts). More recently, social media has brought fresh imperative to establishing and maintaining relationships with publics directly too. And now, as those publics begin to adopt Web 3.0 type capabilities to investigate and learn about a field of mutual concern, PR practitioners must ensure that corresponding data and information is available in an open, accessible and standard format. The following short example provides some context. Definition of public relations: “the planned and sustained effort to influence opinion and behaviour, and to be influenced similarly, in order to build mutual understanding and goodwill”.
  4. 4. An Example: Jo’s organization is a leading exponent of a relatively new method of releasing natural gas for extraction called fracking. Her company recognises the controversy surrounding the technique and decides to commission and publish research addressing the concerns. She issues a press release announcing the publication of the corresponding report and website, and engages stakeholders directly via social media. Jo is familiar with the categorisation of paid, owned and earned media, but she is also aware of the rise of machined media: content that is automatically discovered, presented and published by machines for humans. As environment and energy sector journalists and analysts lean more and more on semantic web services, she knows her company’s data and information should be discoverable in a way its competitors have yet to appreciate, giving her company competitive advantage.
  5. 5. An Example: So whereas she might once have been satisfied with these Web 1.0 and Web 2.0 tactics, she knows she would be negligent these days to ignore Web 3.0. She publishes the source research data and the report semantically so that others, and software acting on behalf of others, may machine the data and test the validity of the report’s conclusions for themselves. Moreover, those that analyse the data reciprocate so that Jo’s company can get to grips with the issues others raise quickly and efficiently. Jo will have improved mutual understanding, helping the company strive for consensus. Will your organization and stakeholders, or clients and their stakeholders, secure similar advantage? Read on to find out what Web 3.0 is and how it works. (This is just an example. It should not be interpreted as my having a position on fracking.)
  6. 6. Berlin: What do you think of Berlin? You have no problem with that question. Your amazing mind has already placed it in a context that works for you. You may not therefore appreciate how much your mind has done on autopilot, or indeed how complex the question is devoid of context. So what is your context? If you’re into great 20th Century songwriting, you may be thinking about the works of Irving Berlin. Perhaps you’re a New Wave fan and you’re contemplating the American synth pop band most famous for “Take My Breath Away”. Maybe your appetite for popular culture reached its zenith in the 70s, in which case Bowie’s Berlin Trilogy is front of mind, or Lou Reed’s Berlin album, either because it was regarded by Rolling Stone magazine as “a disaster”, or because it still made Rolling Stone’s Top 500 albums of all time. Go figure.
  7. 7. Berlin: Perhaps you’re thinking in terms of place not music, in which case it’s obvious, right? Well, the capital of Germany goes by the name of course, but so do around twenty places in the US and one in South Africa. So if you were to express an opinion or publish anything about Berlin in a digital format, a format that can be read and machined by, well, machines, then the utility of that machining would be greatly enhanced if it could determine which entity called Berlin you’re actually talking about. This is the domain of Web 3.0.
  8. 8. Semantic Web: Retrospectively, we call the initial manifestation of the web, a web of interlinked documents, Web 1.0. The social web, Web 2.0, is a web of people, or at least a web of documents representing people. Web 3.0 is about the Web itself understanding the meaning of the content and social participation. In the words of Sir Tim Berners-Lee, inventor of the World Wide Web, the Web becomes “a universal medium for the exchange of data, information and knowledge”. Web 3.0 is more accurately called the Semantic Web, although the phrase Web of Data is increasingly popular. The Semantic Web is a collaborative movement led by the international standards body, the World Wide Web Consortium (W3C).
  9. 9. Disambiguation: So, I want to be really clear about what I mean when I write “I love Berlin”. We’ve already seen how that simple phrase is ambiguous without additional clarification. Which Berlin? As we’ll see, the Semantic Web allows data to be described with reference to universally available common vocabularies. I need to reference one of those universal vocabularies to declare which Berlin I’m talking about. In the jargon, I’m disambiguating. I’m talking about the capital of Germany, and the particular vocabulary I need in this instance is the one for geographic names called GeoNames. A vocabulary is structured with Uniform Resource Identifiers, or URIs for short. A URL, Uniform Resource Locator, is a type of URI.
  10. 10. Disambiguation: Now here’s a subtle but important distinction. A webpage about Berlin is not Berlin. Obviously. So there are two URIs relating to Berlin here. The first URI stands for Berlin. We use this URI when we want to refer to the city of Berlin, Germany. The second URI (ending 2950159/berlin.html) is the document with the information GeoNames has about Berlin, Germany. There is an expectation however that a web browser should be able to resolve the first URI. In other words, it should show you some information about Berlin rather than show you round Berlin itself! So for this reason, GeoNames redirects any web browser calling for the first URI to the content available at the second URI.
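The distinction between the URI for the thing and the URI for the document about it can be sketched in a few lines of Python. This is only an illustration and not how GeoNames implements it: the two `example.org` URIs, the `REDIRECTS` and `DOCUMENTS` tables and the `resolve` function are all hypothetical stand-ins.

```python
# Hypothetical URIs: one names the city itself, the other a document about it.
THING_URI = "http://example.org/place/berlin"
DOCUMENT_URI = "http://example.org/place/berlin/about.html"

# A city cannot be sent down the wire, so its URI redirects to a document.
REDIRECTS = {THING_URI: DOCUMENT_URI}

# Documents are what a browser can actually display.
DOCUMENTS = {DOCUMENT_URI: "Berlin, capital of Germany: map, population, alternate names"}


def resolve(uri, max_hops=5):
    """Follow redirects until a document is reached; return (final_uri, content)."""
    for _ in range(max_hops):
        if uri in DOCUMENTS:
            return uri, DOCUMENTS[uri]
        if uri not in REDIRECTS:
            raise KeyError("nothing known about " + uri)
        uri = REDIRECTS[uri]
    raise RuntimeError("too many redirects")


final_uri, content = resolve(THING_URI)
print(final_uri)
print(content)
```

Asking for the thing and asking for the document both end at the same page, yet the two URIs remain distinct names, so data can say "I love Berlin" about the city rather than about a webpage.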
  11. 11. Disambiguation: The webpage describing the ‘resource’ called Berlin is a street map or satellite image of the area featuring: the latitude, longitude and altitude; the population; a link that calls up a list of alternate names and name variants (Berliini in Finnish, Berlijn in Dutch); a link to a ‘geotree’ listing the geographies that make up the place in question and the wider geography in which it’s located; and the hyperlinked text “.rdf”.
  12. 12. RDF: RDF stands for Resource Description Framework, a metadata data model. In plainer English, RDF is a family of standard ways to present data about entities, Berlin in our example. GeoNames provides an RDF link for Berlin. On the next couple of pages I’ve included an excerpt of what you get if you follow that link at the time of writing. As this is likely the first time you’ll see RDF in the raw, I have added some explanatory notes. It’s not as fearsome as it might first appear, but you can see why this stuff is usually the preserve of machines rather than humans. You might also like to know that marketing and PR practitioners won’t really need to see the raw stuff again, so feel free to skip over it!
  13. 13. Explanation: RDF for Berlin, capital of Germany. The description of the entity Berlin, the capital of Germany, with the explanation after each line:
      <rdf:RDF> (the RDF doc starts here)
      <gn:Feature rdf:about=""> (the identifying number)
      <rdfs:isDefinedBy rdf:resource="" /> (the URI for the RDF doc)
      <gn:name>Berlin</gn:name> (the name of the entity)
      …
      <gn:alternateName xml:lang="ga">Beirlín</gn:alternateName> (alternative names for the entity, typically in other languages)
      <gn:alternateName xml:lang="csb">Berlëno</gn:alternateName>
      <gn:alternateName xml:lang="li">Berlien</gn:alternateName>
      <gn:alternateName xml:lang="et">Berliin</gn:alternateName>
      <gn:alternateName xml:lang="fi">Berliini</gn:alternateName>
      <gn:alternateName xml:lang="nl">Berlijn</gn:alternateName>
      …
      <gn:countryCode>DE</gn:countryCode> (the country code)
      <gn:population>3426354</gn:population> (the population)
      <wgs84_pos:lat>52.52437</wgs84_pos:lat> (the latitude)
      <wgs84_pos:long>13.41053</wgs84_pos:long> (the longitude)
      <wgs84_pos:alt>74</wgs84_pos:alt> (the altitude)
  14. 14. Explanation: RDF for Berlin, capital of Germany, cont. The description of the entity Berlin, the capital of Germany, with the explanation after each line:
      <gn:parentFeature rdf:resource=""/> (the entity’s geographic parent, a URI for 4th order administrative area Berlin, Stadt)
      <gn:parentCountry rdf:resource=""/> (the URI for the country, the Federal Republic of Germany)
      <gn:parentADM1 rdf:resource=""/> (the URI for the first order area in which the entity is found, Land Berlin)
      …
      <gn:nearbyFeatures rdf:resource=""/> (a URI for a list of URIs of entities nearby to Berlin)
      <gn:locationMap rdf:resource=""/> (a URI to the entity’s location map)
      <gn:wikipediaArticle rdf:resource=""/> (a list of URIs to Wikipedia articles in various languages)
      <gn:wikipediaArticle rdf:resource=""/>
      …
      <gn:wikipediaArticle rdf:resource=""/>
      …
      <rdfs:seeAlso rdf:resource=""/>
      …
      </rdf:RDF> (the RDF doc ends here)
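To show why this format suits machines, here is a minimal sketch that reads a cut-down copy of the excerpt using only Python's standard library. Some parts are assumptions added for illustration: the `rdf:about` value is a hypothetical `example.org` placeholder (the real URI is not reproduced here), the namespace declarations are added because the excerpt omits them, and `describe_feature` is an invented helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the prefixes used in the excerpt.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "gn": "http://www.geonames.org/ontology#",
    "wgs84_pos": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}
# The xml:lang attribute as ElementTree exposes it.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Cut-down copy of the excerpt. The rdf:about URI is a hypothetical placeholder.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:gn="http://www.geonames.org/ontology#"
    xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <gn:Feature rdf:about="http://example.org/place/berlin">
    <gn:name>Berlin</gn:name>
    <gn:alternateName xml:lang="fi">Berliini</gn:alternateName>
    <gn:alternateName xml:lang="nl">Berlijn</gn:alternateName>
    <gn:countryCode>DE</gn:countryCode>
    <gn:population>3426354</gn:population>
    <wgs84_pos:lat>52.52437</wgs84_pos:lat>
    <wgs84_pos:long>13.41053</wgs84_pos:long>
  </gn:Feature>
</rdf:RDF>"""


def describe_feature(rdf_xml):
    """Pull a few facts out of an RDF/XML description of a place."""
    feature = ET.fromstring(rdf_xml).find("gn:Feature", NS)
    return {
        "uri": feature.get("{" + NS["rdf"] + "}about"),
        "name": feature.findtext("gn:name", namespaces=NS),
        "country": feature.findtext("gn:countryCode", namespaces=NS),
        "population": int(feature.findtext("gn:population", namespaces=NS)),
        "lat": float(feature.findtext("wgs84_pos:lat", namespaces=NS)),
        "long": float(feature.findtext("wgs84_pos:long", namespaces=NS)),
        # Map each language code to the name used in that language.
        "alternate_names": {
            el.get(XML_LANG): el.text
            for el in feature.findall("gn:alternateName", NS)
        },
    }


berlin = describe_feature(RDF_XML)
print(berlin["name"], berlin["country"], berlin["population"])
```

In practice a dedicated RDF library would be the more usual choice than a plain XML parser. The point here is only that every fact in the document is labelled, so software can pick it out without guessing.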
  15. 15. RDF: Now, I’m sure you’ll agree, there’s no confusing the object of my affection. This obviously isn’t the Town of Berlin, Worcester County, Massachusetts, which has its own URI (ending berlin.html), or the ghost town in Nevada, which has another. And I’m obviously not referring to Irving, as much as I might like “Puttin’ on the Ritz”. There is an RDF document for him too.
  16. 16. Linked Data: The Semantic Web includes a vision known as Linked Data. In simple terms it builds on HTTP, RDF and URIs to interlink published structured data, making it more useful and more valuable. Sir Tim Berners-Lee has outlined four principles of the Linked Data approach, effectively: (1) use URIs to identify things; (2) use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people or software acting on someone’s behalf; (3) provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML; (4) include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
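The four principles can be sketched with a toy, in-memory "web". Everything here is hypothetical and purely illustrative: the `example.org` URIs, the `WEB` dictionary standing in for real HTTP lookups, and the `dereference` and `discover` helpers.

```python
# A toy "web": each hypothetical HTTP URI (principles 1 and 2) dereferences
# to a small description that links onward to related URIs.
WEB = {
    "http://example.org/place/berlin": {
        "name": "Berlin",
        "links": {"country": "http://example.org/place/germany"},
    },
    "http://example.org/place/germany": {
        "name": "Germany",
        "links": {"continent": "http://example.org/place/europe"},
    },
    "http://example.org/place/europe": {"name": "Europe", "links": {}},
}


def dereference(uri):
    """Principle 3: looking a URI up returns useful data about the thing."""
    return WEB[uri]


def discover(start_uri):
    """Principle 4: follow the links in the data to find related things."""
    seen, queue = [], [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        queue.extend(dereference(uri)["links"].values())
    return [dereference(uri)["name"] for uri in seen]


print(discover("http://example.org/place/berlin"))
```

Starting from one URI, software reaches related entities it was never told about in advance, which is the sense in which links "improve discovery".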
  17. 17. Bottom-Up: This bottom-up approach demands diligent markup of web content, and for this reason many consider it to be laborious, long-term work. Nevertheless, the BBC is a leading exponent (its brilliant London 2012 Olympics website was semantically powered), the UK government leads the world with the facility, and a project called dbpedia (links to which we’ve seen twice already, if you’ve been watching carefully) endeavours to semantically mark up the entire Wikipedia corpus. NewsML-G2, a news multi-media exchange standard developed by the International Press Telecommunications Council (IPTC), has semantic components. I should add that while publishing semantically marked up content requires a bit more effort, you don’t need to create RDF directly. Appropriately capable content management systems should look after that for you.
  18. 18. Linked Data: This diagram portrays a cloud of datasets that have been published openly in Linked Data format as of September 2011. (Ad hoc updates to the diagram are published online.) Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. License: Attribution-ShareAlike 3.0.
  19. 19. Knowledge Graph: Commercial imperative drives innovations to link things together sooner than a bottom-up-only approach might otherwise achieve, and perhaps it’s no surprise that Google leads the way here. Here’s what the Google blog had to say about the May 2012 launch of its Knowledge Graph*: “Search is a lot about discovery – the basic human need to learn and broaden your horizons. But searching still requires a lot of hard work by you, the user … Take a query like [taj mahal]. For more than four decades, search has essentially been about matching keywords to queries. To a search engine the words [taj mahal] have been just that – two words. “But we all know that [taj mahal] has a much richer meaning. You might think of one of the world’s most beautiful monuments, or a Grammy Award-winning musician, or possibly even a casino in Atlantic City, NJ. Or, depending on when you last ate, the nearest Indian restaurant. It’s why we’ve been working on an intelligent model – in geek-speak, a ‘graph’ – that understands real-world entities and their relationships to one another: things, not strings.” *Introducing Knowledge Graph, Things Not Strings, Google’s blog.
  20. 20. Top-Down: How does Google attempt this? Well, first and foremost it taps into the bottom-up work achieved so far with those public Linked Data datasets, and it is augmenting this with some smart software designed to determine the precise entity described in web content even when it isn’t semantically marked up, what you might call a top-down approach. According to a December 2012 post to Google’s Search Blog, the Knowledge Graph covered 580 million objects (entities) and 18 billion facts and connections at that time*. Fundamentally, Google is intent on “building the next generation of search, which taps into the collective intelligence of the web and understands the world a bit more like people do”, which sounds very similar to the intent we saw earlier for Web 3.0. *Get Smarter Answers from Knowledge Graph from …, Google’s Search Blog.
  21. 21. Knowledge Graph: If you use Google search then you will have seen some immediate manifestations of this work. As of mid-2012, typing a famous person’s name into Google search, for example, usually brings up a collection of images and information about that individual to the right of the main search results. Enter “Berlin” and you’ll see things like a map of the German capital, a description of the city, the population, the weather, the local time, etc. That is, of course, if Google’s algorithms determine you’re looking for information about the German capital rather than another entity of the same name. In mid-2013, Google also introduced the Knowledge Graph powered image carousel above the main search results.
  22. 22. Knowledge Graph: Absent disambiguation, the Knowledge Graph craves context. Take a search for “Soho” for example. As a London resident writing this in London and employing my default browser Firefox, Google Search presents me with Knowledge Graph information about Soho in London. However, if I boot up a different browser and tell Google to ignore my location, I get information about Soho in New York. Interestingly, if I tell Google to ignore my location in my default browser, I get a map for Sokcho-si, Gangwon-do, South Korea. What does this tell us? It tells us that delivering the Knowledge Graph vision is a considerable challenge!
  23. 23. Link it together: Now, if you specify a date you’re going to be in Berlin, and it happens to coincide with my next stay in Berlin, we can rest assured it’s the same place and we can get together for a Berliner Weisse. In fact, if we describe our friendship semantically, and if we describe our predilection for German beer semantically, then our respective semantic calendar services might auto-suggest our getting together at a specific time and place without our having to labour over it. And perhaps a mutual friend has expressed their love of a suitable venue via a service making it semantically understandable. But that’s a fairly trivial example.
  24. 24. Link it together: At the other end of the spectrum, imagine that all the data harvested by cancer research scientists around the world is published semantically. This allows a semantic data scientist to study the studies, merging datasets and finding patterns and reaching conclusions that would otherwise have proved very difficult or impossible. Imagine a ‘semantic data Member of Parliament’, a ‘semantic data town planner’, a ‘semantic data journalist’. To really begin to imagine this, I need you to go online...
  25. 25. Relfinder: dbpedia has made great progress semantically marking up music and movie data, so I’m going to point you to a simple semantic web browser demonstration, called Relfinder, to explore the connections between: (1) U2’s most successful album to date, The Joshua Tree of 1987, and their 2009 album No Line On The Horizon; and (2) the films Letters from Iwo Jima and Million Dollar Baby. (These links work at the time of writing, but Relfinder is a demonstration browser and may have been taken down or hosted elsewhere since. In which case, search for “Relfinder” and see if you can find it and have a play. Click the “Examples” tab to get going quickly.)
  26. 26. More Info. About the Semantic Web: Web 3.0, a 14 min video; About rich snippets and structured data; Google’s structured data testing tool. About Semantic Web technologies and PR: Why you need to use Google Author Rank and social search, a blog post by Stephen Waddington, 25th Oct 2012; The Semantic Web and PR, a blog post by David Phillips, 13th May 2013; The Web this decade and what it means for your organisation, a blog post by Philip Sheldrake, 25th May 2011; The most exciting development in PR since the Cluetrain, a blog post by Philip Sheldrake, 22nd April 2010.
  29. 29. Philip is Managing Partner, Euler Partners. He’s the author of The Business of Influence: Reframing Marketing and PR for the Digital Age (Wiley, 2011) and Attenzi – a social business story (2013). He contributed the digital marketing chapter of The Marketing Century, a book celebrating the centenary of the Chartered Institute of Marketing, two chapters of the best-selling Share This from the Chartered Institute of Public Relations (CIPR), and a chapter in the follow-up Share This Too. He is a Chartered Engineer, and ran Europe’s first email money service and Europe’s first Google Maps mashup. He helped ‘liberate’ UK government flood data in 2007, co-founded the CIPR social media panel and delivered his first presentation on PR and Web 3.0 in 2010, kindly hosted by Stephen Waddington. He’s a director of Tech UK (formerly Intellect). @sheldrake. Euler Partners advises organizations on how to meet the continually shifting expectations and behaviours of customers, employees, partners and shareholders by embracing social business and new ways of influencing and being influenced.
