Lots of SIOC Data, Now What?



Second Meeting of the IFIP Working Group 12.7 on Social Networking Semantics and Collective Intelligence / VU University, Amsterdam / 26th April 2012

Published in: Education


  1. Lots of SIOC data, now what? (Image from tinyurl.com/siocfrost)
  2. Data silos on the Social Web (Image from pidgintech.com)
  3. Can be linked via semantics
  4. Social Semantic Web
  5. Two-way street: the Semantic Web can help the Social Web, and vice versa
     • The Semantic Web can be used to describe people, content objects and the connections that bind them all together, so that social sites can interoperate via semantics
     • In the other direction, object-centered social websites can serve as rich social data sources for semantic applications
     Image from tinyurl.com/highway2
  6. Semantically-Interlinked Online Communities (SIOC)
     • The goal of the SIOC ontology is to address interoperability issues on the Social Web
       – sioc-project.org
       – W3C Member Submission in 2007
       – SIOC has been adopted in a framework of 100 applications or modules deployed on hundreds of sites
       – Web 2.0, enterprise information integration, HCLS, e-government
     Image from tinyurl.com/friendship2
  7. Some of the SIOC core ontology classes and properties
  8. SIOC food chain
  9. Some applications using SIOC
  10. RDFa in Drupal 7
     • The Drupal CMS is used by 2 percent of all sites
     • The Drupal 7 release has Semantic Web support built in
     • RDFa (SIOC, FOAF, Dublin Core, SKOS) data for blog posts, forums, etc.
     • Video at www.semantic-drupal.com
     Image from tinyurl.com/drupaper
  11. RDFa on newsweek.com
  12. An ontology stack for the Social Semantic Web
  13. Distributed architecture: www.Smob.me
  14. SIOC can be used to...
     • ...provide a layer of RDFa metadata from a social website, e.g. to enhance search results >> superseded by schema.org?
     • ...get a complete representation/XML dump of a social website (export, import)
     • ...be a native format for social websites
     • ...do other stuff; just imagine!
     Image from tinyurl.com/orionw
  15. So… how much SIOC data is out there? (Images, this one and later backgrounds, from publicdomainpictures.net)
  16. Sindice 2011
     • TREC 2011 dataset
     • From 270k SLDs
       – data.sindice.com/trec2011/statistics.html
     • Top 10 classes plus some of the Social Semantic Web ones >>
  17. Sindice 2012: classes
     • Total instances of SIOC classes: 7.7M
       – Up 200k in three months
     • Most occurrences: sioc:Item (2.2M)
       – Followed by: UserAccount (1.6M), MicroblogPost (1.3M), Post (800k), User (700k), Comment (400k)...
       – Note: 1 billion foaf:Person instances!!!
     • Used on most [distinct] sites:
       – Item (7k), UserAccount (7k), Post (3k)...
  18. Sindice 2012: predicates
     • Total instances of SIOC predicates: 22.5M
       – Up 400k in three months
     • Most occurrences: sioc:follows (4.6M)
       – Followed by: topic (4M), account_of (3.5M), has_creator (2.7M), links_to (1.5M), has_discussion (1.3M)...
     • Used on most [distinct] sites:
       – has_creator (8k), num_replies (7k), name (2k), account_of (1.5k), reply_of (1.5k)...
  19. Sindice 2012: namespaces
     • SIOC data is being generated from 10k distinct domains (2k SLDs), plus 2k domains for the SIOC Types module
       – Increasing by about 100 domains a month
       – No doubt helped by Drupal!
     • FOAF data is being generated from 3M distinct domains (100k SLDs)
       – Increasing by over 1000 domains a month
  20. CommonCrawl
     • Muehleisen and Bizer
       – LDOW 2012 @ WWW 2012
     • 1.5 billion web pages
     • 3 billion RDF quads
     • Top 20 RDFa types >>
  22. Make a giant brain(-storm!)
     • Distributed conversation navigator
     • Comment search engine for the A in Q&A
     • Expert-finding applications galore...
       – Be cognisant of the huge growth in Social Semantic Web data being provided by the adopters of schema.org and its new terms
  23. SIOC-related initiatives
  24. Brainstorm ontology
  25. SIOC for e-participation
     • RDF-powered e-participation platform
       – To be based on WordPress + a native RDF store
     • Will allow citizen discussions to be linked to relevant Linked Data from public sites, governments, etc.
       – Adds context
     • Work in progress
       – Galwaytf.com
  26. Citizen sensors
     • Model common sensors on Android phones
     • Attach sensor information to microblog posts automatically or with user approval
       – Using the Twitter annotations format and/or RDF (SSN + SIOC + SIOC Types and Sensors modules):
         sioct:MicroblogPost siocs:has_sensor_data ssn:ObservationValue
  27. Cross-wiki integration using SIOC
  28. Using PPO/PPM to access SIOC + FOAF data
  29. New! Social semantic journalism
  30. John@bresl.in / @johnbreslin / johnbreslin.com (Talk to me!)
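
The class and predicate tallies on slides 17 and 18 can be illustrated on a toy scale. The sketch below is not how Sindice computes its statistics; it only assumes that SIOC data is available as plain (subject, predicate, object) triples. The `example.org` URIs and the helper names `sioc_post`, `to_ntriples` and `count_sioc_terms` are hypothetical, while the SIOC terms used (`Post`, `UserAccount`, `has_creator`, `account_of`, `reply_of`) are the ones named in the slides.

```python
from collections import Counter

# Standard namespace URIs for SIOC, FOAF and rdf:type.
SIOC = "http://rdfs.org/sioc/ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Placeholder base URI for the illustrative resources (an assumption).
BASE = "http://example.org/"


def sioc_post(post_id, account_id, person_id, reply_to=None):
    """Return triples for one post, its creator account and the person behind it."""
    post = BASE + "post/" + post_id
    account = BASE + "user/" + account_id
    person = BASE + "person/" + person_id
    triples = [
        (post, RDF_TYPE, SIOC + "Post"),
        (post, SIOC + "has_creator", account),
        (account, RDF_TYPE, SIOC + "UserAccount"),
        (account, SIOC + "account_of", person),
        (person, RDF_TYPE, FOAF + "Person"),
    ]
    if reply_to is not None:
        triples.append((post, SIOC + "reply_of", BASE + "post/" + reply_to))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines (literals are not handled)."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(set(triples)))


def count_sioc_terms(triples):
    """Count instances per SIOC class and usages per SIOC predicate."""
    unique = set(triples)  # repeated statements about the same resource count once
    classes = Counter(
        o[len(SIOC):] for _, p, o in unique if p == RDF_TYPE and o.startswith(SIOC)
    )
    predicates = Counter(p[len(SIOC):] for _, p, _ in unique if p.startswith(SIOC))
    return classes, predicates


# A three-post thread written by two users.
triples = (
    sioc_post("1", "alice", "alice")
    + sioc_post("2", "bob", "bob", reply_to="1")
    + sioc_post("3", "alice", "alice", reply_to="2")
)
classes, predicates = count_sioc_terms(triples)

print(to_ntriples(triples))
print(classes.most_common())
print(predicates.most_common())
```

Deduplicating the triples before counting matters here because the same user account is described once per post; at web scale, a crawler would also have to decide whether to count per page, per domain or per SLD, which is the distinction the Sindice slides draw.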