Lots of SIOC data, now what?




        image from tinyurl.com/siocfrost
Data silos on the social web




         image from pidgintech.com
Can be linked
via semantics
Social Semantic Web
Two-way street: the semantic web
 can help the social web, and vice versa
• The semantic web can describe people,
   content objects and the connections that
   bind them all together, so that social sites
   can interoperate via semantics
• In the other direction, object-centered
   social websites can serve as rich social
   data sources for semantic applications

              image from tinyurl.com/highway2
Semantically-Interlinked Online
       Communities (SIOC)
• The goal of the SIOC ontology is to address
  interoperability issues on the Social Web
  – sioc-project.org
  – W3C Member Submission in 2007
  – SIOC has been adopted in a framework of
    100 applications or modules deployed on
    hundreds of sites
  – Web 2.0, enterprise information integration, HCLS, e-government
              image from tinyurl.com/friendship2
Some of the SIOC core ontology
   classes and properties
SIOC food chain
Some applications using SIOC
RDFa in Drupal 7
• Drupal CMS used by 2 percent of all sites
• Drupal 7 release has semantic web
   support built in
• RDFa (SIOC, FOAF, Dublin Core, SKOS) data
   for blog posts, forums, etc.
• Video at www.semantic-drupal.com
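
To make the RDFa bullet concrete, here is a minimal sketch of how a blog post could be marked up with SIOC, FOAF and Dublin Core terms. It is hand-written for illustration and is not Drupal 7's actual output; the function name `render_post` and the exact choice of properties are assumptions.

```python
from html import escape

# Namespaces for the vocabularies named on the slide.
PREFIXES = {
    "sioc": "http://rdfs.org/sioc/ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
}


def render_post(url, title, author, body):
    """Return an HTML fragment describing one blog post in RDFa (illustrative only)."""
    prefix_attr = " ".join(f"{p}: {ns}" for p, ns in PREFIXES.items())
    return (
        f'<div prefix="{prefix_attr}" about="{escape(url)}" typeof="sioc:Post">\n'
        f'  <h2 property="dc:title">{escape(title)}</h2>\n'
        # rel on the outer element, typeof on the inner one: the post is
        # linked to an anonymous sioc:UserAccount that carries the name.
        f'  <div rel="sioc:has_creator">\n'
        f'    <span typeof="sioc:UserAccount">\n'
        f'      <span property="foaf:name">{escape(author)}</span>\n'
        f'    </span>\n'
        f'  </div>\n'
        f'  <div property="sioc:content">{escape(body)}</div>\n'
        f'</div>'
    )


if __name__ == "__main__":
    print(render_post("http://example.org/node/1", "Hello SIOC", "alice", "First post."))
```

An RDFa-aware crawler reading this fragment would see a `sioc:Post` with a title, a creator account and content, which is the kind of data counted in the statistics later in the deck.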


              image from tinyurl.com/drupaper
RDFa on newsweek.com
An ontology stack for the social
         semantic web
Distributed
Architecture
               www.Smob.me
SIOC can be used to...
• ...provide a layer of RDFa metadata from
   a social website, e.g. to enhance search
   results >> superseded by schema.org?
• ...get a complete representation/XML
   dump of a social website (export, import)
• ...be a native format for social websites
• ...do other stuff; just imagine!
              image from tinyurl.com/orionw
So…

HOW MUCH SIOC DATA IS OUT THERE?


      images (this one and later backgrounds) from publicdomainpictures.net
Sindice 2011
•  TREC 2011 dataset
•  From 270k SLDs
   –  data.sindice.com/
      trec2011/statistics.html
•  Top 10 classes plus some
    of the social semantic
    web ones >>
Sindice 2012: classes
• Total instances of SIOC classes: 7.7M
  – Up 200k in three months
• Most occurrences: sioc:Item (2.2M)
  – Followed by:
     • UserAccount (1.6M), MicroblogPost (1.3M), Post
       (800k), User (700k), Comment (400k)...
  – Note: 1 billion foaf:Person instances!!!
• Used on most [distinct] sites:
  – Item (7k), UserAccount (7k), Post (3k)...
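
As a quick sanity check on these figures, the sketch below works out each listed class's share of the 7.7M total and how many instances fall outside the six named classes. The counts are the rounded values from the slide, so the results are approximate; the variable names are illustrative.

```python
# Rounded instance counts as quoted on the slide.
TOTAL_INSTANCES = 7_700_000
CLASS_COUNTS = {
    "Item": 2_200_000,
    "UserAccount": 1_600_000,
    "MicroblogPost": 1_300_000,
    "Post": 800_000,
    "User": 700_000,
    "Comment": 400_000,
}

# Percentage of all SIOC class instances accounted for by each class.
shares = {name: 100 * count / TOTAL_INSTANCES for name, count in CLASS_COUNTS.items()}

# Instances belonging to classes the slide does not list (the "...").
unlisted = TOTAL_INSTANCES - sum(CLASS_COUNTS.values())

# Growth over the three months mentioned on the slide, relative to the total.
quarterly_growth_pct = 100 * 200_000 / TOTAL_INSTANCES

if __name__ == "__main__":
    for name, pct in shares.items():
        print(f"{name}: {pct:.1f}%")
    print(f"Unlisted classes: {unlisted:,} instances")
    print(f"Growth in three months: {quarterly_growth_pct:.1f}% of the total")
```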
Sindice 2012: predicates
• Total instances of SIOC predicates: 22.5M
  – Up 400k in three months
• Most occurrences: sioc:follows (4.6M)
  – Followed by:
     • topic (4M), account_of (3.5M), has_creator
        (2.7M), links_to (1.5M), has_discussion (1.3M)...
• Used on most [distinct] sites:
  – has_creator (8k), num_replies (7k), name
    (2k), account_of (1.5k), reply_of (1.5k)...
Sindice 2012: namespaces
• SIOC data is being generated from 10k
   distinct domains (2k SLDs) (plus 2k
   domains for the SIOC Types module)
  – Increasing by about 100 domains a month
  – No doubt helped by Drupal!
• FOAF data is being generated from 3M
   distinct domains (100k SLDs)
  – Increasing by over 1000 domains a month
Common Crawl
•  Mühleisen and Bizer
   –  LDOW 2012 @ WWW 2012
•  1.5 billion web pages
•  3 billion RDF quads
•  Top 20 RDFa types >>
WE HAVE MADE ALL THIS DATA, NOW
DREAM ABOUT THE USEFUL APPLICATIONS!
Make a giant brain(-storm!)
• Distributed conversation navigator
• Comment search engine for the A in Q&A
• Expert finding applications galore
   ...
  – Be cognisant of the huge growth in social
    semantic web data being provided by the
    adopters of schema.org and its new terms
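
One of the ideas above, expert finding, can be sketched very roughly over SIOC triples: count how many posts each creator has written per topic and rank creators by that count. This is a toy heuristic under stated assumptions, not an established method; the function name `experts_by_topic` and the plain-tuple triple format are hypothetical, while `sioc:has_creator` and `sioc:topic` are the predicates listed in the Sindice statistics.

```python
from collections import Counter, defaultdict

HAS_CREATOR = "sioc:has_creator"
TOPIC = "sioc:topic"


def experts_by_topic(triples):
    """Rank creators per topic by number of posts (a deliberately naive score).

    `triples` is an iterable of (subject, predicate, object) strings.
    """
    creator_of = {}
    topics_of = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == HAS_CREATOR:
            creator_of[subject] = obj
        elif predicate == TOPIC:
            topics_of[subject].add(obj)

    counts = defaultdict(Counter)
    for post, topics in topics_of.items():
        creator = creator_of.get(post)
        if creator is None:
            continue  # a post with topics but no known creator cannot be credited
        for topic in topics:
            counts[topic][creator] += 1

    # most_common() returns (creator, post_count) pairs, highest count first.
    return {topic: counter.most_common() for topic, counter in counts.items()}


if __name__ == "__main__":
    sample = [
        ("post1", HAS_CREATOR, "alice"), ("post1", TOPIC, "rdf"),
        ("post2", HAS_CREATOR, "alice"), ("post2", TOPIC, "rdf"),
        ("post3", HAS_CREATOR, "bob"), ("post3", TOPIC, "rdf"),
    ]
    print(experts_by_topic(sample))
```

A real application would need to weigh replies, links and account merging across sites; post counts alone are only a starting point.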
SIOC-related initiatives
Brainstorm ontology
SIOC for e-participation
• RDF-powered e-participation platform
  – To be based on WordPress + native RDF store
• Will allow citizen discussions to be linked
  to relevant linked data from public sites,
  governments, etc.
  – Adds context
• Work in progress
  – Galwaytf.com
Citizen sensors
•  Model common sensors on Android phones
•  Attach sensor information to microblog posts
   automatically or with user approval
   –  Using Twitter annotations format and/or RDF (SSN +
      SIOC + SIOC Types and Sensors modules):
        •  sioct:MicroblogPost siocs:has_sensor_data
            ssn:ObservationValue
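
The triple pattern above can be sketched as a small Turtle serializer for one microblog post with attached readings. Several details are assumptions made for illustration: the `siocs:` namespace URI, the use of `rdfs:label` and `rdf:value` to hold the sensor name and reading, and the function name `post_with_sensor_data`. Readings are assumed to be finite numbers.

```python
import json

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sioc": "http://rdfs.org/sioc/ns#",
    "sioct": "http://rdfs.org/sioc/types#",
    "siocs": "http://rdfs.org/sioc/sensors#",  # assumed URI for the Sensors module
    "ssn": "http://purl.oclc.org/NET/ssnx/ssn#",
}


def literal(text):
    """Quote a string for Turtle; JSON string escaping is close enough for this sketch."""
    return json.dumps(text, ensure_ascii=False)


def post_with_sensor_data(post_uri, text, readings):
    """Serialize one microblog post and its sensor readings as Turtle.

    `readings` maps a sensor name to a numeric value, e.g. {"temperature": 21.5}.
    """
    if not readings:
        raise ValueError("at least one sensor reading is required")
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{post_uri}> a sioct:MicroblogPost ;")
    lines.append(f"    sioc:content {literal(text)} ;")
    # Each reading becomes an anonymous ssn:ObservationValue node.
    values = " ,\n        ".join(
        f"[ a ssn:ObservationValue ; rdfs:label {literal(name)} ; rdf:value {value} ]"
        for name, value in readings.items()
    )
    lines.append(f"    siocs:has_sensor_data\n        {values} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(post_with_sensor_data(
        "http://example.org/post/1",
        "Warm day in Galway",
        {"temperature": 21.5, "humidity": 60},
    ))
```

Whether a reading is attached automatically or only after user approval would be decided before this function is called; the serialization is the same either way.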




Cross-wiki integration using SIOC
Using PPO/PPM to access SIOC
         + FOAF data
New! Social semantic journalism
John@bresl.in / @johnbreslin / johnbreslin.com

TALK TO ME!
