Web 3.0 emerges…
Jim Hendler
Tetherless World Professor of Computer and Cognitive Science
Assistant Dean of Information Technology and Web Science
Rensselaer Polytechnic Institute
http://www.cs.rpi.edu/~hendler
 
The Semantic Web (ca. 2001) (Berners-Lee, Hendler, Lassila; 2001)
Semantic Web ca. 2010
• Semantic Web finding success even in tough market
  • Lots of small companies in the market: Altova… Zepheira (e.g. C&P, Franz, Intellidimension, Intellisophic, Ontology Works, Siderean, SandPiper, SiberLogic, TopQuadrant …)
  • Web 3.0 new buzzword: Garlik, Twine, Freebase, Bintro, Siri, Talis, …
  • Semantic Search taking off: Powerset bought by Microsoft for over $100,000,000; hakia, bing, T2, tiptop, …
  • Semantic match: classifieds (bintro), clinical studies (TrialX.com)
• Bigger players buying in
  • 2009 announcements at SemTech (June): Google, New York Times, Oracle, IBM, Yahoo, MS Live Labs, Siri, …
  • 2008: Gartner identifies Corporate Semantic Web as one of three "High impact" Web technologies
  • Tool market forming: AllegroGraph, TopBraid, Pellet2, …
  • O’Reilly “Programming the Semantic Web”
• Government projects in and across agencies
  • Recent open data announcements by UK and US
  • UK in linked data format, US 3rd party to linked data (5B triples so far)
  • Projects/demos in EU, Japan, Korea, China, India…
  • SKOS update in govt (and private) libraries
• Several "verticals" heavily using Semantic Web technologies
  • Health Care and Life Sciences
  • Financial services
  • Human Resources
  • Publishing/New Media
  • Sciences other than Life Science: Virtual observatory, Geo ontology, …
Two very different sorts of use cases
• cf. US National Center for Biotechnology Information, "Oncology Metathesaurus"
  • 50,000+ classes, ~8 people supporting full time, monthly updates, mandated for use by NIH-funded cancer researchers
  • OWL DL rigorously followed
  • Provably consistent
• cf. Friend of a Friend (FOAF)
  • 30+ classes, made by Dan Brickley and Libby Miller, maintained by consensus in a small community of developers
  • Violates DL rules (undecidable)
  • Used inconsistently
Widely varying use
• NCBI Oncology Ontology
  • High use in medical community
  • Very "trusted" information (provenance from NCBI)
  • Primarily terminological (relationships between cancer-related concepts), not data-oriented
• FOAF
  • ~60M FOAF people (not necessarily distinct individuals)
  • Used by a number of large providers
  • If you use LiveJournal, you have a FOAF file
  • Also flickr, ecademy, tribe, joost, …
  • And you can export FOAF from Facebook and many other social networking sites
  • Becoming de facto standard for open social networking
Why? NCBI view: Formal properties
• Based on a decidable subset of KR: Description logics
• For which much scaling research has been happening
  • Ca. 2000: 10,000 axioms, no facts, 1 day
  • Ca. 2008: 50,000 axioms, million facts, 10 min.
  • Not just faster computers (but Moore's Law helps), significant research into optimization, "average case"
  • Moving to parallel (Web server)
• With some new ways of linking to larger data sets
  • SHER, IBM, "reduced Abox"
  • OWL-Prime, Oracle, "materialized views"
  • OWL 2 QL (?)
• In this view OWL is a formal knowledge representation standard
The argument for this seems compelling
• When "folksonomy" isn't enough…
• Which one do you want your doctor to use?
But the cost is high
• Formal modeling finds its use cases in verticals and enterprises
  • Where the vocabulary can be controlled
  • Where finding things in the data is important
• Example: Drug discovery from data
  • Model the molecule (site, chemical properties, etc.) as faithfully and expressively as possible
  • Use "Realization" to categorize data assets against the ontology
  • Bad or missed answers are money down the drain
• But the modeling is very expensive and the return on investment must be very high!
  • Which is part of why the "expert systems revolution" wasn't one
  • Became part of the technology tool kit, a useful niche in the programming pantheon, but didn't change the world
The alternative
• OWL is based on RDF, a language designed for the (Semantic) Web
  • Built with Web architecture in mind
  • Exploits Web infrastructure, respects W3C TAG recommendations
  • Internationalization, accessibility, extensibility
• Fits the Web culture
  • Open and extensible, supports communities of interest
  • If you don't like my ontology, extend it, change it, or build your own
• Fits the Web application development paradigm
  • Scales like "databases"
  • With some new ways of linking to formal models
• Heavy use of a small amount of OWL
  • Generally used "like it sounds" not like the formal model
  • Example: "owl:sameAs" debate
• “Linked data” often used to describe this low-semantics Semantic Web (slogan: a little semantics goes a long way)
Semantic Web Applications
• ~2006: Web app developers discover the Semantic Web
• [Diagram labels: RDF Triple Store, Dynamic Content Engine, HTTP, RDF, Web App (w SPARQL), RDF Triple Store, …, HTML]
• 2008 examples include sites from "regular" Web players such as Dow Jones, Reuters and Yahoo!
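To make the triple store / dynamic content engine / HTML pipeline on this slide concrete, here is a minimal sketch in Python. It is not from the talk: the `TripleStore` class, the `render_list` helper and the sample data are all hypothetical, and the single triple-pattern `match` call is only a stand-in for a real SPARQL query against a real store.

```python
from html import escape


class TripleStore:
    """Toy in-memory store; a stand-in for a real RDF triple store."""

    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        # None acts as a wildcard, like a variable in a triple pattern.
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


def render_list(store, predicate):
    """The 'dynamic content engine' step: query the store, emit HTML."""
    rows = store.match(p=predicate)
    items = "".join(
        f"<li>{escape(s)}: {escape(o)}</li>" for s, _, o in rows
    )
    return f"<ul>{items}</ul>"


# Hypothetical sample data, for illustration only.
store = TripleStore()
store.add("ex:jim", "foaf:name", "Jim Hendler")
store.add("ex:jim", "foaf:homepage", "http://www.cs.rpi.edu/~hendler")
page = render_list(store, "foaf:name")
```

The point of the design is that the page template never hard-codes a schema: adding a new predicate to the store requires no change to the rendering code.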
cf. Yahoo mixes RDF with other technologies: at Web scale
Dave Beckett, SemTech 08
http://www.semantic-conference.com/session/733/
Linked Data + Semantics
• "Linked Data" approach finds its use cases in Web Applications (at Web scales)
  • A lot of data, a little semantics
  • Finding anything in the mess can be a win!
• Example
  • Declare simple inferable relationships and apply, at scale, to large, heterogeneous data collections
  • e.g. Use InverseFunctional triangulation to find the entities that can be inferred to be the same
• These are "heuristics": not every answer must be right (qua Google)
  • But remember time = money!
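The inverse-functional "triangulation" mentioned on this slide can be sketched roughly as follows. This is an illustrative Python sketch, not the speaker's implementation: the sample triples and the `same_entity_groups` function are hypothetical, and treating `foaf:mbox` and `foaf:homepage` as the inverse functional properties is an assumption made for the example. Two subjects that share a value for such a property are inferred to denote the same entity, and a union-find structure carries that inference transitively.

```python
from collections import defaultdict

# Hypothetical sample triples (subject, predicate, object); not from the talk.
TRIPLES = [
    ("ex:a", "foaf:mbox", "mailto:jim@example.org"),
    ("ex:a", "foaf:name", "Jim"),
    ("ex:b", "foaf:mbox", "mailto:jim@example.org"),
    ("ex:b", "foaf:homepage", "http://example.org/~jim"),
    ("ex:c", "foaf:homepage", "http://example.org/~jim"),
    ("ex:d", "foaf:mbox", "mailto:ora@example.org"),
]

# Properties treated as inverse functional (an assumption for this sketch).
INVERSE_FUNCTIONAL = {"foaf:mbox", "foaf:homepage"}


def same_entity_groups(triples, ifps):
    """Group subjects that share a value for any inverse functional property."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_subject = {}  # (property, value) -> first subject seen with it
    for s, p, o in triples:
        find(s)  # register every subject, even unmerged ones
        if p in ifps:
            key = (p, o)
            if key in first_subject:
                union(first_subject[key], s)
            else:
                first_subject[key] = s

    groups = defaultdict(set)
    for s in parent:
        groups[find(s)].add(s)
    return sorted(sorted(g) for g in groups.values())


groups = same_entity_groups(TRIPLES, INVERSE_FUNCTIONAL)
```

Here `ex:a` and `ex:b` merge through the shared mailbox, and `ex:c` joins them through the shared homepage, even though `ex:a` and `ex:c` have no property in common. In keeping with the slide's "heuristics" point, a single wrong mailbox in the data would merge two people, which is the price of a cheap rule applied at scale.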
The data is out there
• Example: the linked open data cloud now has tens of billions of triples and is growing rapidly
Government Data on the Web
Moving data.gov to linked data (UK)
• Built around linked data with top-down push from “Number 10”
Moving data.gov to linked data (US)
• Third parties (like RPI) translate the govt data into Sem Web forms and link to sources
• Plans for a semantic.data.gov in OGD implementation plans, but unfunded
Adding Meta-data
Pump through to Google Viz for demos
Data.gov + epa.gov
Adding some Web magic Web Analytics Social Data Networks External Links
Mashup w/Web content
[Diagram labels: Linked Data (RDF, SPARQL); Semantic Web (RDFS, OWL); Web 3.0; Web 2.0]
Web 3.0 extends current Web applications using Semantic Web technologies and graph-based, open data.
Semantic Search
• IEEE Computer, Jan 2010; IEEE Computing Now, Feb 2010 (free)
Semantic Search examples
• T2 (twine.com)
• TipTop (feeltiptop.com)
Web 3.0 examples
• Semantic classified (bintro.com)
Trialx.com
Web 3.0 examples
Web 3.0 examples
• Social database (freebase.com)
Tiptop health
• Ok, so we have a way to go on this one
Web 3.0 excitement (hype?)
• Significant and growing commercial interest…
  • Web: Google, Amazon, Travelocity…
  • Web 2.0: Facebook, Wikipedia, YouTube, Twitter…
  • Web 3.0: the big ones are still out there
1B computers (mostly owned by a few large companies)
3B cell phones (most are Web-enabled)
Sem Web going mobile? (Add social contexts)
Summary
• The Semantic Web is going just fine, thank you
  • People asking “how,” not “why”
• So far the commercial driver has been “weak semantics”
  • In the enterprise
  • Web 3.0 adds semantics as a value add to regular Web functionality: Semantic search, Semantic match, Semantic etc.
• The big one is still out there
What’s next?
Still, like many researchers working in this area, Hendler is already looking beyond emerging Semantic Web strategies and related technologies that are now collectively called Web 3.0. “This stuff is new and exciting,” he says. “But I look at it this way: I started playing with the Semantic Web back in the 1990s. As a researcher, I’m not content to sit around and exploit Web 3.0; my job is to help create Web 4.0.”
“Engineering the Web’s Third Decade,” CACM, March 2010


Editor's Notes

  • #4 Ora
  • #5 Ora
  • #33 And for 4.0, one big change is the move to mobile computing: there are now billions more cellular phones than laptops in the world, and these machines are becoming more and more capable. We need to combine communications and information technology to create the mobile Web.
  • #34 The slogan of Rensselaer is “Why not change the world” – with a combination of Web, Data and Communications technology, that is just what we are doing!