Integrating and Interpreting Social Data from Heterogeneous Sources
  Matthew Rowe, Organisations, Information and Knowledge Group, University of Sheffield
  Suvodeep Mazumdar, Department of Information Studies, University of Sheffield
Outline
  Information overload
  Increase in social data publication
  Interlinking social data
  Metadata Generation
  Integrating Social Data
  Application: Interpreting Social Data
  Cumbrian Floods Use Case
  Interacting with Social Data
  Conclusions
Information Overload
  Masses of social data are published every day
    E.g. 50 million tweets a day (about 600 per second) http://blog.twitter.com
    22 million Facebook users in the UK http://www.clickymedia.co.uk/2009/10/uk-facebook-user-statistics-october-2009/
  Too much information to deal with!
  Social data is multi-faceted:
    Provenance
    Topic
    Geo
  Trend services (e.g. trendistic, blogpulse):
    Focus on majority consensus
    Need to listen in to a specific topic
    Concentrate on a single source/platform
    Do not consider geo facet
Interlinking Social Data
  Consider multi-faceted nature of social data:
    Allows fine-grained analysis
    Show geo-localised social data
    Relevant past social data
  Solution: Interlink social data from heterogeneous sources
    Use semantics!
    Consistent data interpretation
Metadata Generation
  Web 2.0 platforms return data using:
    Proprietary formats
    Heterogeneous data schemas
  Need to link data together from disparate sources
  A social data fragment = a single piece of social data
    E.g. a tweet, an image, a video
  Lift each social data fragment to RDF:
    Create an instance of sioc:Post and itr:LocalizedResource
      Assign it a URI
    Assign the content to the instance (topic)
      Use hashtags of the microblog
    Create an instance of gml:Geometry (geo)
      Capture geo facet
    Assign timestamp of fragment creation (provenance)
      Using dcterms:created
    Assign the fragment to its owner (provenance)
      Create foaf:Person instance
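The three facets named on this slide (topic, geo, provenance) can be held in one small in-memory record before anything is serialised to RDF. The following is a minimal sketch under that assumption: the `Fragment` class, its field names and the `facets_present` helper are hypothetical and are not part of the original talk. The sample values are the ones from the tweet used in the deck.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class Fragment:
    """One social data fragment and its facets (hypothetical model)."""
    uri: str                                            # URI assigned to the fragment
    content: str                                        # text or title of the fragment
    topics: List[str] = field(default_factory=list)     # topic facet (hashtags or tags)
    position: Optional[Tuple[float, float]] = None      # geo facet (latitude, longitude)
    created: Optional[datetime] = None                  # provenance facet: creation time
    creator_uri: Optional[str] = None                   # provenance facet: owner


def facets_present(fragment: Fragment) -> List[str]:
    """Return the names of the facets this fragment actually carries."""
    present = []
    if fragment.topics:
        present.append("topic")
    if fragment.position is not None:
        present.append("geo")
    if fragment.created is not None or fragment.creator_uri is not None:
        present.append("provenance")
    return present


tweet = Fragment(
    uri="http://twitter.com/mattroweshow/9774519667",
    content="Writing up our Geovation work for #lupas2010.",
    topics=["lupas2010"],
    position=(53.3833, -1.4722),
    created=datetime(2010, 2, 28, 12, 22, 47, tzinfo=timezone.utc),
    creator_uri="http://twitter.com/mattroweshow",
)
print(facets_present(tweet))
```

Keeping every facet optional reflects that not every platform response carries a location or a tag list; a fragment without a geo facet can still be interlinked by topic and provenance.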
Metadata Generation (example platform responses)
  Flickr photo:
    <photo id="949406913" media="photo">
      <owner nsid="54948696@N00"/>
      <title>DSC00171.JPG</title>
      <description></description>
      <dates posted="1205398307" taken="2009-01-09 09:16:31" lastupdate="1257421561" />
      <tags>
        <tag id="24539622-2330113101-400" author="54948696@N00" raw="arctic">arctic</tag>
        <tag id="24539622-2330113101-401" author="54948696@N00" raw="monkeys">monkeys</tag>
      </tags>
      <location latitude="53.4813" longitude="-2.2392" place_id="R8vDw_abBpSzUA">
        <locality place_id="R8vDw_abBpSzUA" woeid="27872">Manchester</locality>
        <region place_id="pn4MsiGbBZlXeplyXg" woeid="24554868">England</region>
        <country place_id="DevLebebApj4RVbtaQ" woeid="23424975">United Kingdom</country>
      </location>
    </photo>
  Twitter status:
    <status>
      <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at>
      <id>9774519667</id>
      <text>Writing up our Geovation work for #lupas2010.</text>
      <truncated>false</truncated>
      <in_reply_to_status_id></in_reply_to_status_id>
      <in_reply_to_user_id></in_reply_to_user_id>
      <favorited>false</favorited>
      <in_reply_to_screen_name></in_reply_to_screen_name>
      <geo xmlns:georss="http://www.georss.org/georss">
        <georss:point>53.3833,-1.4722</georss:point>
      </geo>
    </status>
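The Flickr response carries the same three facets as the Twitter one, only under a different schema. A minimal sketch of pulling them out with Python's standard `xml.etree.ElementTree` follows. The function name `flickr_facets` and the returned dictionary keys are hypothetical, and treating the `taken` attribute (rather than `posted`) as the creation timestamp is an assumption, not something the slides state.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Abridged copy of the Flickr photo record shown on the slide.
FLICKR_XML = """<photo id="949406913" media="photo">
  <owner nsid="54948696@N00"/>
  <title>DSC00171.JPG</title>
  <dates posted="1205398307" taken="2009-01-09 09:16:31" lastupdate="1257421561"/>
  <tags>
    <tag id="24539622-2330113101-400" author="54948696@N00" raw="arctic">arctic</tag>
    <tag id="24539622-2330113101-401" author="54948696@N00" raw="monkeys">monkeys</tag>
  </tags>
  <location latitude="53.4813" longitude="-2.2392" place_id="R8vDw_abBpSzUA">
    <locality place_id="R8vDw_abBpSzUA" woeid="27872">Manchester</locality>
  </location>
</photo>"""


def flickr_facets(xml_text):
    """Extract topic, geo and provenance facets from one Flickr photo record."""
    photo = ET.fromstring(xml_text)
    location = photo.find("location")
    position = None
    if location is not None:
        # Geo facet: latitude and longitude attributes of <location>.
        position = (float(location.get("latitude")), float(location.get("longitude")))
    return {
        "id": photo.get("id"),
        "title": photo.findtext("title"),
        # Topic facet: the raw form of every <tag> element.
        "topics": [tag.get("raw") for tag in photo.iter("tag")],
        "position": position,
        # Provenance facet: assumed to be the 'taken' timestamp and the owner id.
        "created": datetime.strptime(photo.find("dates").get("taken"), "%Y-%m-%d %H:%M:%S"),
        "owner": photo.find("owner").get("nsid"),
    }


facets = flickr_facets(FLICKR_XML)
print(facets["topics"], facets["position"])
```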
Metadata Generation (resulting RDF for the Twitter status)
  <http://twitter.com/mattroweshow>
      rdf:type foaf:Person ;
      rdf:type itr:LocalizedResource ;
      foaf:name "Matthew Rowe" ;
      foaf:homepage <http://www.dcs.shef.ac.uk/~mrowe> .
  <http://twitter.com/mattroweshow/9774519667>
      rdf:type sioc:Post ;
      rdf:type itr:LocalizedResource ;
      sioc:content "Writing up our Geovation work for #lupas2010." ;
      dcterms:subject "lupas2010" ;
      dcterms:created "2010-2-28 12:22:47.0" ;
      sioc:hasCreator <http://twitter.com/mattroweshow> ;
      itr:has_Localization _:a2 .
  _:a2
      rdf:type gml:Geometry ;
      gml:pos "53.3833,-1.4722" .
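The lifting steps for the Twitter status can be sketched end to end with the standard library only. This is an illustrative sketch, not the authors' implementation: the function name `lift_status` is hypothetical, prefix declarations are omitted as they are on the slide, literals are not escaped, a `<geo>` element is assumed to be present, and the timestamp is written in ISO 8601 rather than the format shown in the deck.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Abridged copy of the Twitter status shown on the slide.
STATUS_XML = """<status>
  <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at>
  <id>9774519667</id>
  <text>Writing up our Geovation work for #lupas2010.</text>
  <geo xmlns:georss="http://www.georss.org/georss">
    <georss:point>53.3833,-1.4722</georss:point>
  </geo>
</status>"""

GEORSS = "{http://www.georss.org/georss}"


def lift_status(xml_text, screen_name):
    """Serialise one status as Turtle-style text using the terms from the slide."""
    status = ET.fromstring(xml_text)
    text = status.findtext("text")
    # Assign the fragment a URI built from the owner's screen name and the status id.
    post_uri = f"http://twitter.com/{screen_name}/{status.findtext('id')}"
    created = datetime.strptime(status.findtext("created_at"), "%a %b %d %H:%M:%S %z %Y")
    lines = [
        f"<{post_uri}>",
        "    rdf:type sioc:Post ;",
        "    rdf:type itr:LocalizedResource ;",
        f'    sioc:content "{text}" ;',
    ]
    # Topic facet: one dcterms:subject per hashtag in the text.
    for tag in re.findall(r"#(\w+)", text):
        lines.append(f'    dcterms:subject "{tag}" ;')
    # Provenance facet: creation time (ISO 8601 here) and owner.
    lines.append(f'    dcterms:created "{created.isoformat()}" ;')
    lines.append(f"    sioc:hasCreator <http://twitter.com/{screen_name}> ;")
    # Geo facet: a blank node typed gml:Geometry holding the georss point.
    lines.append("    itr:has_Localization _:geo .")
    point = status.findtext(f"geo/{GEORSS}point")
    lines.append(f'_:geo rdf:type gml:Geometry ; gml:pos "{point}" .')
    return "\n".join(lines)


print(lift_status(STATUS_XML, "mattroweshow"))
```

A production lifter would more likely build the graph with an RDF library and let it handle escaping and serialisation; plain string assembly is used here only to keep the sketch dependency-free.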
Integrated Social Data
  Triplify social data from multiple platforms
    Flickr XML response -> RDF
    Picasa XML response -> RDF
  Use common semantics
  Can perform SPARQL queries

  PREFIX dcterms: <http://purl.org/dc/terms/>
  SELECT ?item WHERE {
    ?item dcterms:subject "iranelections" .
    ?item dcterms:created ?date
  } ORDER BY DESC(?date)

  PREFIX dcterms: <http://purl.org/dc/terms/>
  PREFIX itr: <http://www.dcs.shef.ac.uk/~gregoire/interaction/ns#>
  PREFIX gml: <http://www.opengis.net/gml/>
  SELECT DISTINCT ?post ?tag WHERE {
    ?post dcterms:subject ?tag .
    ?post itr:has_Localization ?geo .
    ?geo gml:pos "53.4813,-2.2392"
  }
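To make the second query's join concrete, here is a minimal sketch that evaluates the same three triple patterns over an in-memory list of tuples. It is a stand-in for a SPARQL engine, not one: the `tags_at` function and the shortened subject names (`flickr:949406913`, `twitter:9774519667`) are hypothetical, while the tag and position values come from the two example fragments in the deck.

```python
# Triples as (subject, predicate, object) tuples.
TRIPLES = [
    ("flickr:949406913", "dcterms:subject", "arctic"),
    ("flickr:949406913", "dcterms:subject", "monkeys"),
    ("flickr:949406913", "itr:has_Localization", "_:g1"),
    ("_:g1", "gml:pos", "53.4813,-2.2392"),
    ("twitter:9774519667", "dcterms:subject", "lupas2010"),
    ("twitter:9774519667", "itr:has_Localization", "_:g2"),
    ("_:g2", "gml:pos", "53.3833,-1.4722"),
]


def tags_at(triples, position):
    """Return distinct (post, tag) pairs for posts localised at `position`."""
    # ?geo gml:pos "<position>"
    geometries = {s for s, p, o in triples if p == "gml:pos" and o == position}
    # ?post itr:has_Localization ?geo
    posts = {s for s, p, o in triples if p == "itr:has_Localization" and o in geometries}
    # ?post dcterms:subject ?tag  (the set gives DISTINCT, sorting gives stable output)
    return sorted({(s, o) for s, p, o in triples if p == "dcterms:subject" and s in posts})


print(tags_at(TRIPLES, "53.4813,-2.2392"))
```

Because both platforms were lifted to the same vocabulary, the one function serves Flickr and Twitter fragments alike, which is the point of using common semantics.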
Interpreting Social Data
  Cumbrian Use Case
    UK region suffered worst floods in centuries
    Observe the effects in social data
      Rise in publication
      Fine-grained geocoded social data
  Dataset:
    Microblogs from 200 Cumbrian Twitter users
      Published during 2009
      3513 microblogs
      Produced 475,043 triples
    Images from Flickr taken in Cumbria
      6663 images
      Produced 182,304 triples
Interacting with Social Data
  Built a visualisation application to analyse social data fragments
    http://www.dcs.shef.ac.uk/~suvodeep/ViziSocial
  Filter by date
    Lower slider
  Fine-grained focus
    Zoom in
  Tag cloud
    Shows fragment topics
    Window controls tag cloud topics
  Markers contain number of fragments
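The interaction described on this slide combines a date filter, a map window and per-marker counts. A minimal sketch of that logic follows. It does not reproduce the ViziSocial code: the function names, the dictionary layout and the sample fragments (dates, coordinates and tags) are all hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical sample fragments; not taken from the Cumbrian dataset.
FRAGMENTS = [
    {"date": date(2009, 11, 19), "pos": (54.66, -3.36), "tags": ["floods", "cockermouth"]},
    {"date": date(2009, 11, 20), "pos": (54.66, -3.36), "tags": ["floods"]},
    {"date": date(2009, 11, 20), "pos": (54.33, -2.75), "tags": ["floods", "kendal"]},
    {"date": date(2009, 6, 1), "pos": (54.33, -2.75), "tags": ["walking"]},
]


def in_view(fragment, start, end, south, west, north, east):
    """True if the fragment falls inside both the date range and the map window."""
    lat, lon = fragment["pos"]
    return start <= fragment["date"] <= end and south <= lat <= north and west <= lon <= east


def summarise(fragments, start, end, south, west, north, east):
    """Return (tag cloud counts, fragments per marker position) for the current view."""
    visible = [f for f in fragments if in_view(f, start, end, south, west, north, east)]
    tag_cloud = Counter(tag for f in visible for tag in f["tags"])
    markers = Counter(f["pos"] for f in visible)
    return tag_cloud, markers


# Slider set to November 2009, map window covering both sample locations.
cloud, markers = summarise(FRAGMENTS, date(2009, 11, 1), date(2009, 11, 30), 54.0, -4.0, 55.0, -2.0)
print(cloud.most_common(), dict(markers))
```

Zooming in corresponds to shrinking the bounding box: raising `south` to 54.5 in the call above leaves only the fragments at the first position, and the tag cloud narrows accordingly.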
Conclusions
  Consistent interpretation of social data
    Across heterogeneous sources
  Application
    Allows analyses of social data
      To fine-grained detail
    Utilises multiple facets of social data
      Requires metadata
    Issue of scalability
  Future Work
    Adapting to real time data acquisition
      Focussing on South Yorkshire region at present
    Assess scalability issue
Twitter: @mattroweshow
Web: http://www.dcs.shef.ac.uk/~mrowe
Email: m.rowe@dcs.shef.ac.uk
Questions?

ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 

Recently uploaded (20)

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Integrating and Interpreting Social Data from Heterogeneous Sources

  • 1. Integrating and Interpreting Social Data from Heterogeneous Sources. Matthew Rowe, Organisations, Information and Knowledge Group, University of Sheffield; Suvodeep Mazumdar, Department of Information Studies, University of Sheffield
  • 2. Outline Information overload Increase in social data publication Interlinking social data Metadata Generation Integrating Social Data Application: Interpreting Social Data Cumbrian Floods Use Case Interacting with Social Data Conclusions
  • 3. Information Overload Masses of social data are published every day E.g. 50 million tweets (600 per second) http://blog.twitter.com 22 million Facebook users in the UK http://www.clickymedia.co.uk/2009/10/uk-facebook-user-statistics-october-2009/ Too much information to deal with! Social data is multi-faceted: Provenance Topic Geo Trend services (e.g. Trendistic, Blogpulse): Focus on majority consensus Need to listen in to a specific topic Concentrate on a single source/platform Do not consider geo facet
  • 4.
  • 5.
  • 6. Interlinking Social Data Consider multi-faceted nature of social data: Allows fine-grained analysis Show geo-localised social data Relevant past social data Solution: Interlink social data from heterogeneous sources Use semantics! Consistent data interpretation
  • 7. Metadata Generation Web 2.0 platforms return data using: Proprietary formats; Heterogeneous data schemas Need to link data together from disparate sources A social data fragment = a single piece of social data E.g. A tweet, an image, a video Lift each social data fragment to RDF: Create an instance of sioc:Post and itr:LocalizedResource Assign it a URI Assign the content to the instance (topic) Use hashtags of the microblog Create an instance of gml:Geometry (geo) Capture geo facet Assign timestamp of fragment creation (provenance) Using dc:created Assign the fragment to its owner (provenance) Create foaf:Person instance
  • 8. Metadata Generation Example photo response (Flickr): <photo id="949406913" media="photo"> <owner nsid="54948696@N00"/> <title>DSC00171.JPG</title> <description></description> <dates posted="1205398307" taken="2009-01-09 09:16:31" lastupdate="1257421561" /> <tags> <tag id="24539622-2330113101-400" author="54948696@N00" raw="arctic">arctic</tag> <tag id="24539622-2330113101-401" author="54948696@N00" raw="monkeys">monkeys</tag> </tags> <location latitude="53.4813" longitude="-2.2392" place_id="R8vDw_abBpSzUA"> <locality place_id="R8vDw_abBpSzUA" woeid="27872">Manchester</locality> <region place_id="pn4MsiGbBZlXeplyXg" woeid="24554868">England</region> <country place_id="DevLebebApj4RVbtaQ" woeid="23424975">United Kingdom</country> </location> </photo>
  • 9. Metadata Generation Example microblog response (Twitter): <status> <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at> <id>9774519667</id> <text>Writing up our Geovation work for #lupas2010.</text> <truncated>false</truncated> <in_reply_to_status_id></in_reply_to_status_id> <in_reply_to_user_id></in_reply_to_user_id> <favorited>false</favorited> <in_reply_to_screen_name></in_reply_to_screen_name> <geo xmlns:georss="http://www.georss.org/georss"> <georss:point>53.3833,-1.4722</georss:point> </geo> </status>
  • 10. Metadata Generation Create an instance of sioc:Post/itr:LocalizedResource and assign it a URI: <http://twitter.com/mattroweshow/9774519667> rdf:type sioc:Post ; rdf:type itr:LocalizedResource ;
  • 11. Metadata Generation Assign the content to the instance (topic), using the hashtags of the microblog: sioc:content "Writing up our Geovation work for #lupas2010." ; dcterms:subject "lupas2010" ;
  • 12. Metadata Generation Create an instance of gml:Geometry to capture the geo facet: itr:has_Localization _:a2 . _:a2 rdf:type gml:Geometry ; gml:pos "53.3833,-1.4722" .
  • 13. Metadata Generation Assign the timestamp of fragment creation (provenance): dcterms:created "2010-2-28 12:22:47.0" ;
  • 14. Metadata Generation Assign the fragment to its owner (provenance) by creating a foaf:Person instance. Resulting RDF: <http://twitter.com/mattroweshow> rdf:type foaf:Person ; rdf:type itr:LocalizedResource ; foaf:name "Matthew Rowe" ; foaf:homepage <http://www.dcs.shef.ac.uk/~mrowe> . <http://twitter.com/mattroweshow/9774519667> rdf:type sioc:Post ; rdf:type itr:LocalizedResource ; sioc:content "Writing up our Geovation work for #lupas2010." ; dcterms:subject "lupas2010" ; dcterms:created "2010-2-28 12:22:47.0" ; sioc:hasCreator <http://twitter.com/mattroweshow> ; itr:has_Localization _:a2 . _:a2 rdf:type gml:Geometry ; gml:pos "53.3833,-1.4722" .
  • 15. Integrated Social Data Triplify social data from multiple platforms Flickr XML response -> RDF Picasa XML response -> RDF Use common semantics Can perform SPARQL queries. Query 1: PREFIX dcterms:<http://purl.org/dc/terms/> SELECT ?item WHERE { ?item dcterms:subject "iranelections" . ?item dcterms:created ?date } ORDER BY DESC(?date) Query 2: PREFIX dcterms:<http://purl.org/dc/terms/> PREFIX itr:<http://www.dcs.shef.ac.uk/~gregoire/interaction/ns#> PREFIX gml:<http://www.opengis.net/gml/> SELECT DISTINCT ?post ?tag WHERE { ?post dcterms:subject ?tag . ?post itr:has_Localization ?geo . ?geo gml:pos "53.4813,-2.2392" }
  • 16. Interpreting Social Data Cumbrian Use Case UK region suffered worst floods in centuries Observe the effects in social data Rise in publication Fine-grained geocoded social data Dataset: Microblogs from 200 Cumbrian Twitter users Published during 2009 3513 microblogs Produced 475,043 triples Images from Flickr taken in Cumbria 6663 images Produced 182,304 triples
  • 17. Interacting with Social Data Built a visualisation application to analyse social data fragments http://www.dcs.shef.ac.uk/~suvodeep/ViziSocial Filter by date Lower slider Fine-grained focus Zoom in Tag cloud Shows fragment topics Window controls tag cloud topics Markers contain number of fragments
  • 18. Conclusions Consistent interpretation of social data Across heterogeneous sources Application Allows analyses of social data To fine-grained detail Utilises multiple facets of social data Requires metadata Issue of scalability Future Work Adapting to real time data acquisition Focussing on South Yorkshire region at present Assess scalability issue
  • 19. Twitter: @mattroweshow Web: http://www.dcs.shef.ac.uk/~mrowe Email: m.rowe@dcs.shef.ac.uk Questions?
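
The two SPARQL queries on slide 15 can be read as simple pattern matches over triples. The following is a minimal sketch in plain Python, not the authors' implementation: the sample triples, the `ex:` identifiers and the function names `items_by_topic` and `tags_at` are all hypothetical and exist only to illustrate what each query selects.

```python
# Hypothetical sample triples (subject, predicate, object); not taken from the dataset.
TRIPLES = [
    ("ex:post1", "dcterms:subject", "iranelections"),
    ("ex:post1", "dcterms:created", "2009-06-13T10:00:00"),
    ("ex:post2", "dcterms:subject", "iranelections"),
    ("ex:post2", "dcterms:created", "2009-06-15T08:30:00"),
    ("ex:photo1", "dcterms:subject", "arctic"),
    ("ex:photo1", "dcterms:subject", "monkeys"),
    ("ex:photo1", "itr:has_Localization", "_:g1"),
    ("_:g1", "gml:pos", "53.4813,-2.2392"),
]


def items_by_topic(triples, topic):
    """Mimic query 1: items tagged with `topic`, newest first."""
    tagged = {s for s, p, o in triples if p == "dcterms:subject" and o == topic}
    # ISO 8601 timestamps sort correctly as plain strings.
    dated = [(o, s) for s, p, o in triples if p == "dcterms:created" and s in tagged]
    return [s for _, s in sorted(dated, reverse=True)]


def tags_at(triples, pos):
    """Mimic query 2: distinct (post, tag) pairs for posts located at `pos`."""
    geos = {s for s, p, o in triples if p == "gml:pos" and o == pos}
    posts = {s for s, p, o in triples if p == "itr:has_Localization" and o in geos}
    return sorted({(s, o) for s, p, o in triples if p == "dcterms:subject" and s in posts})


print(items_by_topic(TRIPLES, "iranelections"))
print(tags_at(TRIPLES, "53.4813,-2.2392"))
```

Because every platform is lifted to the same vocabulary, neither function needs to know whether a fragment came from Twitter or Flickr, which is the point of using common semantics. A real deployment would presumably run the SPARQL queries against a triple store rather than scan a list.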

Editor's Notes

  1. Trend services: Trendistic (Twitter only); Blogpulse (blogosphere).
  2. Web 2.0 platforms provide data in proprietary formats: XML according to bespoke schemas. Lift to RDF using consistent semantics.
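
The lifting step described in the notes can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `lift_status`, the blank node label `_:geo` and the ISO 8601 timestamp format are assumptions (the slides show a different timestamp rendering), while the XML and the property names are those shown on the slides.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Twitter status from the slides, trimmed to the fields the lifting uses.
STATUS_XML = """<status>
  <created_at>Sun Feb 28 12:22:47 +0000 2010</created_at>
  <id>9774519667</id>
  <text>Writing up our Geovation work for #lupas2010.</text>
  <geo xmlns:georss="http://www.georss.org/georss">
    <georss:point>53.3833,-1.4722</georss:point>
  </geo>
</status>"""


def lift_status(xml_text, owner_uri):
    """Return (subject, predicate, object) triples for one tweet.

    Hypothetical helper: literals are quoted naively, so text containing
    double quotes would need escaping in a real implementation.
    """
    status = ET.fromstring(xml_text)
    post = "<%s/%s>" % (owner_uri, status.findtext("id"))
    text = status.findtext("text")
    created = datetime.strptime(
        status.findtext("created_at"), "%a %b %d %H:%M:%S %z %Y"
    )
    triples = [
        (post, "rdf:type", "sioc:Post"),
        (post, "rdf:type", "itr:LocalizedResource"),
        (post, "sioc:content", '"%s"' % text),
        # Assumption: ISO 8601 is used here instead of the slide's format.
        (post, "dcterms:created", '"%s"' % created.isoformat()),
        (post, "sioc:hasCreator", "<%s>" % owner_uri),
    ]
    # Topic facet: one dcterms:subject per hashtag.
    for tag in re.findall(r"#(\w+)", text):
        triples.append((post, "dcterms:subject", '"%s"' % tag))
    # Geo facet: only present when the tweet is geocoded.
    point = status.findtext("geo/{http://www.georss.org/georss}point")
    if point:
        triples.append((post, "itr:has_Localization", "_:geo"))
        triples.append(("_:geo", "rdf:type", "gml:Geometry"))
        triples.append(("_:geo", "gml:pos", '"%s"' % point))
    return triples


for s, p, o in lift_status(STATUS_XML, "http://twitter.com/mattroweshow"):
    print(s, p, o, ".")
```

A Flickr photo would follow the same pattern with a different parser front end: tags map to `dcterms:subject` and the `latitude`/`longitude` attributes map to `gml:pos`, so both sources end up in one schema.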