Sd sem weboct252010
Published in: Technology, Education

3 Comments
  • Interesting. Kind of a lot of undefined jargon, though. 'Rich snippets,' e.g. If you know what these are you probably don't need the presentation, and if you don't, the presentation is confusing. LOD. Lots of undefined stuff.
  • Barbara is one of the leaders in the field, so you are viewing the 'latest' with her thoughtful analysis. Don't miss the Semantic Web San Diego 11 November 2010 meeting featuring www.ai-one.com intelligent (semantic) analysis tool. I know I won't.

    Jeffrey Abbott
    HSI & Semantic Web Analyst
  • Brilliant presentation. Barbara is clearly a thought-leader on how semantic technologies will transform the internet by making relevance more precise and less subject to the games played by SEO.

Transcript

  • 1. Leveraging the growth of the Semantic Web - from Semantic SEO to ..... San Diego Semantic Web Meetup, Oct 25, 2010. Barbara Starr. Email: bstarr@Ontologica.us Twitter: @BarbaraStarr
  • 2. So … let us begin to take a look at how the Semantic Web is being used and leveraged in the real world of late (feel free to add: …), and of course, who is using it, how, …
  • 3. Semantic Search/SEO: The major search engines and social networks are currently leveraging Semantic Web technology.
  • 4. What is Semantic Search
      • Semantic Search is basically the notion of improving search by using metadata or searching on that metadata.
      • There are several ways that the search engines on the web may use this to enhance search results.
          – FIND, rather than SEARCH.
      • Searching directly on the metadata can yield specific answers or results, as demonstrated in the following example: Query “Barack Obama Birthday”. Results on
  • 5. Google acquires Metaweb
  • 6. Definitive Answer on Top
  • 7. Bing Definitive Answer. Note: Freebase was part of the Metaweb acquisition by Google.
  • 8. Definitive answer & enhanced display. Bing leveraged this for quite some time.
  • 9. What is Semantic Search (cont)
      • Semantic Search is basically the notion of improving search by using metadata or searching on that metadata.
      • There are several ways that the search engines on the web may use this to enhance search results.
          – FIND, rather than SEARCH.
      • Searching directly on the metadata can yield specific answers or results, as demonstrated in the following example:
      • Ran the query “Barack Obama Birthday” on both Google and Bing. Obtained the following:
          – Answer engines rather than search engines?
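
The "find, rather than search" idea on slides 4 and 9 can be illustrated with a minimal Python sketch. It assumes a toy in-memory list of (subject, predicate, object) triples; the store, the predicate names, and the `find_answer` helper are all hypothetical and are not how Google or Bing implement this.

```python
# Toy metadata store of (subject, predicate, object) triples.
# Every name here is illustrative; this is not a real search-engine API.
TRIPLES = [
    ("Barack Obama", "birthday", "1961-08-04"),
    ("Barack Obama", "type", "Person"),
    ("San Diego", "type", "City"),
]


def find_answer(query, triples=TRIPLES):
    """Return the object of the first triple whose subject and predicate
    both occur in the query text, or None when nothing matches."""
    q = query.lower()
    for subject, predicate, obj in triples:
        if subject.lower() in q and predicate.lower() in q:
            return obj
    return None


print(find_answer("Barack Obama Birthday"))  # prints 1961-08-04
print(find_answer("weather tomorrow"))       # prints None
```

Because the answer comes straight from structured metadata, the caller gets one value back rather than a ranked list of documents to read.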
  • 10. What is Semantic Search (cont)
      • Semantic Search is basically the notion of improving search by using metadata or searching on that metadata.
      • There are several ways that the search engines on the web may use this to enhance search results.
          – FIND, rather than SEARCH.
          – Another aspect of using metadata, such as embedding metadata or semantic markup in web pages, is demonstrated by enhanced displays in search results (e.g. rich snippets in Google). Both Google and Yahoo support enhanced displays for RDFa markup.
  • 11. Rich Snippets
      • Google now supports rich snippets for
          – People
          – Events
          – Businesses and organizations
          – Reviews
          – Recipes
          – Products when related to a review
          – Breadcrumbs
          – Local Search
      http://rdf.data-vocabulary.org/#
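
For readers who have not seen RDFa markup, here is a hedged Python sketch of what a consumer does with it. It assumes a small hand-written HTML fragment that uses the `v:` prefix for the http://rdf.data-vocabulary.org/# vocabulary named on slide 11. The `PropertyCollector` class is hypothetical and deliberately simplified: it ignores nesting, the `content` attribute, and prefix resolution, so it is not a conforming RDFa parser.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, text) pairs from elements that carry an RDFa
    `property` attribute. Simplified illustration, not a full RDFa parser."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # property of the element currently being read

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop

    def handle_data(self, data):
        # Attach the first non-blank text after a property-bearing tag.
        if self._current and data.strip():
            self.pairs.append((self._current, data.strip()))
            self._current = None


# Hand-written sample markup; the person and title are made up.
SAMPLE = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  <span property="v:name">Jane Example</span>,
  <span property="v:title">Analyst</span>
</div>
"""

collector = PropertyCollector()
collector.feed(SAMPLE)
print(collector.pairs)
# prints [('v:name', 'Jane Example'), ('v:title', 'Analyst')]
```

The extracted pairs are the kind of structured data a search engine could use to build an enhanced display; the validators listed on slide 20 are the place to check real markup.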
  • 12. Events
  • 13. Recipes
  • 14. Sept 2, 2010: now see more than twice as many searches with rich snippets in the results in the US, and a four-fold increase globally, compared to one year ago.
  • 15. Single Events - Sept 2, 2010
  • 16. Social Networks
      • While search engines can benefit from access to social networks, social networks can benefit from semantic metadata in web pages.
          – An example is Facebook’s Open Graph Protocol (also supports RDFa), which allows users to share & like objects (such as products) as opposed to web pages. Enables “Semantic Profiling” of the users by Facebook. (Japanese Mixi now using it.)
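
As a concrete illustration of the Open Graph Protocol mentioned on slide 16, the sketch below reads `og:` properties from `<meta>` tags in a page head. The sample page, its URL, and the `OpenGraphParser` class are assumptions made for illustration; only the `<meta property="og:..." content="...">` convention comes from the protocol itself.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Gather Open Graph properties (og:title, og:type, ...) from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith("og:") and "content" in attributes:
            # Store the property without its "og:" prefix.
            self.og[prop[3:]] = attributes["content"]


# Hypothetical product page; the title and URL are made up.
PAGE = """<html><head>
<meta property="og:title" content="Example Comforter Set" />
<meta property="og:type" content="product" />
<meta property="og:url" content="http://www.example.com/product/1" />
</head><body></body></html>"""

parser = OpenGraphParser()
parser.feed(PAGE)
print(parser.og["type"])   # prints product
print(parser.og["title"])  # prints Example Comforter Set
```

The `og:type` value is what lets a "like" attach to an object such as a product rather than to the page as an undifferentiated URL.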
  • 17. Web Benefits / Uses
      • Yahoo stated a 15% increase in CTR as a result of enhanced displays; rich snippets in Google
      • Definitive answers enabled by understanding and leveraging how search engines are searching directly on metadata
      • Semantic Profiling and adoption by social networks
      • Embedding semantic markup in web pages and product pages ultimately makes information “findable” by search engines, enabling them to provide improvements such as definitive answers, enhanced displays, etc.
  • 18. RDFa production
      • Drupal 7 now produces RDFa (previous meetup)
      • Many CMS publishers
  • 19. Consuming RDFa
      • Previously indicated increase of RDFa in general and production of RDFa
      • Available consumers/parsers
          – Sindice (any23)
          – RDFa distiller
      Sindice.com
  • 20. Handy Validators
      • RDFa validators and testers
      • New RDFa Validator: http://check.rdfa.info/
      • Sindice Inspector: http://inspector.sindice.com/
      • Yahoo Objectfinder: http://developer.search.yahoo.com/help/objectfinder
      • Google rich snippets tester: http://www.google.com/webmasters/tools/richsnippets
  • 21. Adopters?
      • UK Government
      • US Government
      • BBC (FIFA World Cup site dynamically generated using linked data)
      • Thomson Reuters
      • Freebase
      • NY Times
      • Best Buy
      • Google (more to follow: http://rdf.data-vocabulary.org/#)
      • Yahoo
      • Facebook
      • Mixi
      • Oracle
      • Overstock
      • Drug research and discovery companies, Pfizer, …
      • Tons more: just look at the diversity in the LOD data cloud (getting there)
  • 22. Spectrum of Applications
      • Semantic wikis (Semantic MediaWiki)
      • Semantics as a Service (e.g. SIRI): interoperability of web services, underlying service ontologies
      • Enterprise data integration (Anzo, …
      • Semantics in publishing
          – Open Calais now has OpenPublish
          – Zemanta, primal pages
          – Drupal and other CMS systems
      • Contextual advertising
      • Sentiment analysis (COGITO)
      • Semantic Search (documents & structured data sources)
      • Semantic social networks
  • 23. LOD Cloud Evolution. The rate of growth has been remarkable. Source maintained by: Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net
  • 24. Oct 2007
  • 25. Nov 2007 (1)
  • 26. Nov 2007 (2)
  • 27. Feb 2008
  • 28. Mar 2008
  • 29. Sept 2008
  • 30. Mar 2009 (1)
  • 31. Mar 2009 (2)
  • 32. March 5, 2009 [LOD cloud diagram, as of March 2009; dataset labels omitted]
  • 33. March 27, 2009 [LOD cloud diagram, as of March 2009; dataset labels omitted]
  • 34. July 14, 2009
  • 35. Sept 22, 2010 [LOD cloud diagram, as of September 2010; dataset labels omitted]
  • 36. LOD cloud, Sept 22, 2010: latest LOD cloud [diagram as of September 2010; dataset labels omitted. Legend: Media, Geographic, Publications, Government, Cross-domain, Life sciences, User-generated content]
  • 37. Leveraging Linked Datasets: Pharmaceutical example
      • There are many ways to leverage existing information and to perform knowledge discovery within it.
      • This example makes use of the AllegroGraph platform and query interface supported by Franz Inc., a Web 3.0 database provider.
      • AllegroGraph can be downloaded from their website at http://www.franz.com
  • 38. Leveraging Linked Datasets: Pharmaceutical example
      • Facilitates information sharing between knowledge bases and between researchers
      • The graphical viewers and browsers provided by Franz enable visualization of relationships between entities (Gruff displays relationships between entities as well as providing a query interface)
  • 39. Life Sciences Example - AllegroGraph
      • Drugs from DrugBank
      • Looked them up in the text of the clinical trials (LinkedCT)
      • Looked up all side effects in SIDER and looked them up in the texts of the clinical trials.
      • Resulted in about a million new triples.
      • Ability to now search for a drug, find all the clinical trials that mention it, and then also find all the side effects mentioned in the same trials.
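
The linking step on slide 39 (matching drug and side-effect labels against clinical-trial text to produce new triples) might look roughly like the Python sketch below. It is an assumption-laden illustration, not the Franz pipeline: the `link_trials` function and the sample trials are hypothetical, the matching is a plain case-insensitive substring test, and the predicate names are borrowed from the query on slide 42.

```python
def link_trials(drugs, side_effects, trials):
    """Return (trial_id, predicate, label) triples for every drug or
    side-effect label that occurs in a trial's text (case-insensitive)."""
    triples = []
    for trial_id, text in trials.items():
        lowered = text.lower()
        for drug in drugs:
            if drug.lower() in lowered:
                triples.append((trial_id, "discusses-drug", drug))
        for effect in side_effects:
            if effect.lower() in lowered:
                triples.append((trial_id, "discusses-side-effect", effect))
    return triples


# Made-up stand-ins for DrugBank labels, SIDER labels, and LinkedCT texts.
drugs = ["Atorvastatin"]
side_effects = ["Type 2 Diabetes", "Nausea"]
trials = {
    "trial:001": "Atorvastatin was given daily; type 2 diabetes incidence was recorded.",
    "trial:002": "A study of aspirin in which nausea was reported.",
}

for triple in link_trials(drugs, side_effects, trials):
    print(triple)
# prints ('trial:001', 'discusses-drug', 'Atorvastatin')
# prints ('trial:001', 'discusses-side-effect', 'Type 2 Diabetes')
# prints ('trial:002', 'discusses-side-effect', 'Nausea')
```

Each match becomes one new triple, which is how a modest number of labels run against a large body of trial text can yield a very large number of new links.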
  • 40. Life Sciences Example - AllegroGraph
  • 41. Life Sciences Example - AllegroGraph. Namely, we took a look at information dealing with drugs, targets, diseases, and side effects, and ran a query to find all clinical trials for Atorvastatin where a side effect of Atorvastatin (or Lipitor) is type 2 diabetes.
  • 42. Life Sciences Example - AllegroGraph
      SPARQL query:
          SELECT ?drug ?sideeffect ?trial WHERE {
            ?drug rdfs:label 'Atorvastatin' .
            ?sideeffect rdfs:label 'Type 2 Diabetes' .
            ?trial franz:discusses-drug ?drug .
            ?trial franz:discusses-side-effect ?sideeffect .
          } LIMIT 10
      Translated into English, the SPARQL query reads: “find every drug, side effect and clinical trial where the label of the drug is Atorvastatin, the label of the side effect is Type 2 Diabetes, and the trial discusses both; restrict output to 10 results.”
      Example by: Jans Aasman, Franz Inc (Web 3.0’s database)
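
To make the join in the slide 42 query easier to follow, here is a small Python sketch that evaluates the same four triple patterns over an in-memory list of triples. It does not use AllegroGraph or any SPARQL engine; the sample data, the identifiers such as `drug:1`, and the `run_query` helper are hypothetical.

```python
# Made-up triples shaped like the data the slide 42 query runs against.
TRIPLES = [
    ("drug:1", "rdfs:label", "Atorvastatin"),
    ("effect:1", "rdfs:label", "Type 2 Diabetes"),
    ("effect:2", "rdfs:label", "Nausea"),
    ("trial:A", "franz:discusses-drug", "drug:1"),
    ("trial:A", "franz:discusses-side-effect", "effect:1"),
    ("trial:B", "franz:discusses-drug", "drug:1"),
    ("trial:B", "franz:discusses-side-effect", "effect:2"),
]


def subjects(triples, predicate, obj):
    """All subjects s for which the triple (s, predicate, obj) exists."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def run_query(triples, drug_label, effect_label, limit=10):
    """Return (drug, side effect, trial) rows where the trial discusses both."""
    rows = []
    for drug in sorted(subjects(triples, "rdfs:label", drug_label)):
        for effect in sorted(subjects(triples, "rdfs:label", effect_label)):
            # A trial qualifies only if it matches both ?trial patterns.
            trials = (subjects(triples, "franz:discusses-drug", drug)
                      & subjects(triples, "franz:discusses-side-effect", effect))
            rows.extend((drug, effect, trial) for trial in sorted(trials))
    return rows[:limit]


print(run_query(TRIPLES, "Atorvastatin", "Type 2 Diabetes"))
# prints [('drug:1', 'effect:1', 'trial:A')]
```

The set intersection plays the role of the shared `?trial` variable: trial:B mentions the drug but a different side effect, so it is excluded.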
  • 43. Life Sciences Example - AllegroGraph
  • 44. Tools for more profitable eCommerce
  • 45. Online Commerce
      • Best Buy and other retailers are using semantic technologies to improve the visibility of products and services, leveraging:
          – GoodRelations ontology for e-commerce
          – RDFa
  • 46. Other major online retailers also leveraging the technology
  • 47. http://www.overstock.com/Home-Garden/Hotel-8-piece-Comforter-Set/367226/product.html
  • 48. Sindice Inspector - .nt format
  • 49. Gruff View
  • 50. Summary
      • Significant adoption in many arenas and by many of the “major players”
      • Growing number of vendors providing services and tools
      • Many open source tools & resources (“RDFizers”, SPARQL endpoints, Sindice, the Semantic Web index)
      • Technology mature enough at this point to provide competitive advantage in many arenas.