Mapping Tweets to Conference Talks: A
Goldmine for Semantics
Milan Stankovic, Hypios, Paris-Sorbonne, FR & Matthew Rowe, KMI, Open University, UK
On Conference We Tweet
Is there a Correspondence?
Why?
[Diagram: user made tweet; tweet is about talk; talk has topic (Topic 1, Topic 2, Topic 3). Questions raised on the diagram: user interest in the talk's topics? Users whose tweets are about the same talk were at the same talk?]
Potential Benefits
• Digital memory
• Conference feedback
– number of tweets for a talk
– conversational aspects
– sentiment analysis
• User profiling and expert finding
• Trending topics
Rich Activity Twitter Event Data
• We take Twitter archives from
TwapperKeeper
• We enrich Tweets with relevant DBPedia
concepts using Zemanta
• We rely on existing Linked Data about talks to
perform the mappings.
ESWC Dataset
• Collected during the Extended Semantic Web
Conference 2010
– Any tweets tagged with “eswc”
• 1082 tweets
• 213 tweets enriched with concepts
Aligning Tweets with Talks
• Goal: Label tweets with talks
• Method:
– Induce a labelling function to perform alignment
– Labelled data = events from Web of Data
– Unlabelled data = tweets
Labelled data: $\{(x_i, y_i)\}_{i=1}^{L}$
Unlabelled data: $\{x_i\}_{i=1}^{U}$
Labelling function: $f : X \rightarrow Y$
Aligning Tweets with Talks
1. Feature Extraction:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix swrc: <http://swrc.ontoware.org/ontology#> .
@prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> .
@prefix dog: <http://data.semanticweb.org> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
<http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ;
dc:subject "Knowledge Acquisition" ;
dc:subject "Semantic Analysis" ;
dc:subject "Social Web" ;
dc:subject "Microblogs" ;
dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ;
swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ;
swrc:author <http://data.semanticweb.org/person/claudia-wagner> .
<http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ;
foaf:name "Claudia Wagner" ;
swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ;
foaf:based_near <http://dbpedia.org/resource/Austria> .
Aligning Tweets with Talks
1. Feature Extraction: F1 - Immediate Resource
Leaves
Knowledge Acquisition Semantic
Analysis Social Web Microblogs
Exploring the Wisdom of the Tweets:
Knowledge Acquisition from Social
Awareness Streams Although one might
argue that little wisdom can be
conveyed in messages of 140
http://data.semanticweb.org/person/claudia-wagner
Aligning Tweets with Talks
1. Feature Extraction: F2 – 1-step Resource
Leaves
http://data.semanticweb.org/person/claudia-wagner Claudia Wagner
http://data.semanticweb.org/organization/joanneum-research
http://dbpedia.org/resource/Austria
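The F1 and F2 extraction steps can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the `Triple` type, the function names, and the choice to skip `rdf:type` values are assumptions made to match the leaves shown on the slides, and the triples are a hand-copied subset of the example resource.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    s: str          # subject URI
    p: str          # predicate (prefixed name)
    o: str          # object: URI or literal text
    o_is_uri: bool  # True when the object is a resource


RDF_TYPE = "rdf:type"
PAPER = "http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23"
AUTHOR = "http://data.semanticweb.org/person/claudia-wagner"
ORG = "http://data.semanticweb.org/organization/joanneum-research"
AUSTRIA = "http://dbpedia.org/resource/Austria"

# Hand-copied subset of the example description of the paper.
TRIPLES = [
    Triple(PAPER, RDF_TYPE, "swrc:InProceedings", True),
    Triple(PAPER, "dc:subject", "Knowledge Acquisition", False),
    Triple(PAPER, "dc:subject", "Microblogs", False),
    Triple(PAPER, "swrc:author", AUTHOR, True),
    Triple(AUTHOR, RDF_TYPE, "foaf:Person", True),
    Triple(AUTHOR, "foaf:name", "Claudia Wagner", False),
    Triple(AUTHOR, "swrc:affiliation", ORG, True),
    Triple(AUTHOR, "foaf:based_near", AUSTRIA, True),
]


def immediate_leaves(resource, triples):
    """F1 (assumed reading): objects attached directly to `resource`."""
    return [t.o for t in triples if t.s == resource and t.p != RDF_TYPE]


def one_step_leaves(resource, triples):
    """F2 (assumed reading): each linked resource plus its own leaves."""
    leaves = []
    for t in triples:
        if t.s == resource and t.o_is_uri and t.p != RDF_TYPE:
            leaves.append(t.o)
            leaves.extend(immediate_leaves(t.o, triples))
    return leaves


f1 = immediate_leaves(PAPER, TRIPLES)
f2 = one_step_leaves(PAPER, TRIPLES)
```

Here `f1` holds the paper's subjects and the author URI, while `f2` holds the author URI followed by the author's name, affiliation, and location.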
Aligning Tweets with Talks
1. Feature Extraction: F3 – DBPedia Concepts
http://dbpedia.org/resource/Twitter
http://dbpedia.org/resource/Social_Web
http://dbpedia.org/resource/Microblogs
Aligning Tweets with Talks
2. Feature Vector Composition
Knowledge Acquisition Semantic
Analysis Social Web Microblogs
Exploring the Wisdom of the Tweets:
Knowledge Acquisition from Social
Awareness Streams Although one might
argue that little wisdom can be
conveyed in messages of 140
http://data.semanticweb.org/person/claudia-wagner
Tokens: knowledge, acquisition, semantic, analysis, social, web, microblogs, exploring, wisdom, tweets, knowledge, acquisition, social, awareness, streams, wisdom, messages
Indexer
knowledge 2
acquisition 2
semantic 1
analysis 1
social 2
web 1
microblogs 1
exploring 1
wisdom 2
tweets 1
awareness 1
streams 1
messages 1
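A rough sketch of the indexing step: tokenise the extracted leaves, drop common words, and count term occurrences. The stop-word list and the regular expression are illustrative assumptions; the slides do not say which indexer or stop-word list was used.

```python
import re
from collections import Counter

# Hypothetical stop-word list, chosen only for this example.
STOP_WORDS = {"the", "of", "from", "in", "can", "be", "that", "one", "might", "although"}


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def feature_vector(text):
    """Map each remaining term to its number of occurrences."""
    return Counter(tokenize(text))


text = (
    "Knowledge Acquisition Semantic Analysis Social Web Microblogs "
    "Exploring the Wisdom of the Tweets: Knowledge Acquisition from "
    "Social Awareness Streams"
)
vec = feature_vector(text)
```

Tweets can be passed through the same function, so that tweets and events end up in one shared term space.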
Aligning Tweets with Talks
3. Inducing the Labelling Function
– Both tweets and events are provided as feature
vectors
– Induce a labelling function:
Choose the most likely event (y) given the tweet (x)
$f : X \rightarrow Y$
Aligning Tweets with Talks
3. Inducing the Labelling Function: Proximity-
based Clustering
– Build a centroid vector for each event
• From event feature vectors
– Compare each tweet vector with each centroid
• Choose event (y) which is closest
$y = \arg\min_{y \in Y} d(x, \mu_y)$
$manhat(x, \mu) = \sum_{i=1}^{n} |x_i - \mu_i|$
$eucl(x, \mu) = \sqrt{\sum_{i=1}^{n} (x_i - \mu_i)^2}$
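The proximity-based labelling can be sketched as below, assuming dense vectors of equal length and a centroid computed as the per-dimension mean of an event's feature vectors. The event names and numbers are made up for illustration.

```python
import math


def manhattan(x, mu):
    """Sum of absolute per-dimension differences."""
    return sum(abs(a - b) for a, b in zip(x, mu))


def euclidean(x, mu):
    """Square root of the summed squared per-dimension differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)))


def centroid(vectors):
    """Per-dimension mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def label(x, centroids, dist=euclidean):
    """Return the event whose centroid is closest to tweet vector x."""
    return min(centroids, key=lambda y: dist(x, centroids[y]))


# Hypothetical feature vectors for two events.
event_vectors = {
    "talk_a": [[2, 0, 1], [2, 0, 3]],
    "talk_b": [[0, 3, 0]],
}
centroids = {y: centroid(vs) for y, vs in event_vectors.items()}
tweet = [1, 0, 1]
```

Passing `dist=manhattan` or `dist=euclidean` to `label` switches between the two distance measures.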
Aligning Tweets with Talks
3. Inducing the Labelling Function: Naive Bayes
Classification
– Assigns the most probable event label given the tweet
features
– Using Bayes Theorem, we write this as:
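$y = \arg\max_{y \in Y} P(y \mid x_1, x_2, \ldots, x_n)$
$y = \arg\max_{y \in Y} \frac{P(x_1, x_2, \ldots, x_n \mid y) \, P(y)}{P(x_1, x_2, \ldots, x_n)}$
$y = \arg\max_{y \in Y} P(x_1, x_2, \ldots, x_n \mid y) \, P(y)$
$y = \arg\max_{y \in Y} P(y) \prod_{i} P(x_i \mid y)$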
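A minimal sketch of the final decision rule, computed in log space. The multinomial term model and add-one smoothing are assumptions introduced here so that unseen terms do not zero out the product; the slides do not specify how $P(x_i \mid y)$ was estimated, and the training examples are invented.

```python
import math
from collections import Counter


def train(labelled):
    """labelled: list of (tokens, event_label) pairs."""
    priors = Counter()
    term_counts = {}
    vocab = set()
    for tokens, y in labelled:
        priors[y] += 1
        term_counts.setdefault(y, Counter()).update(tokens)
        vocab.update(tokens)
    return priors, term_counts, vocab


def classify(tokens, priors, term_counts, vocab):
    """Return argmax_y of log P(y) + sum_i log P(x_i | y)."""
    total = sum(priors.values())
    best, best_score = None, -math.inf
    for y in priors:
        # Add-one (Laplace) smoothing: an assumption of this sketch.
        denom = sum(term_counts[y].values()) + len(vocab)
        score = math.log(priors[y] / total)
        for t in tokens:
            score += math.log((term_counts[y][t] + 1) / denom)
        if score > best_score:
            best, best_score = y, score
    return best


# Hypothetical labelled events.
labelled = [
    (["semantic", "web", "tweets"], "talk_a"),
    (["ontology", "reasoning"], "talk_b"),
]
model = train(labelled)
predicted = classify(["tweets", "web"], *model)
```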
Experiments
• Dataset
– Corpus of Tweets collected during ESWC 2010
• Gold Standard Construction
– Used 3 raters to label a portion of tweet corpus
• 200 tweets labelled
– Measured inter-rater agreement
• Using the Kappa statistic
– Initial Agreement was too low: 0.328
– Utilised Delphi method to improve agreement
– Second round of labelling produced: 0.820
Experiments
• Evaluation Measures
– Precision: proportion of event tweets correctly
labelled
– Recall: proportion of tweets successfully
returned for an event
– F-measure: harmonic mean of precision and
recall
• Placed emphasis on precision over recall
$f\text{-}measure = \frac{(1 + \beta^2) \times P \times R}{\beta^2 \times P + R}$
$\beta \in \{0.2, 0.5, 1\}$
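The weighted F-measure can be computed as in this short sketch. The function name and the zero-division guard are additions for illustration; the precision and recall values are made up.

```python
def f_measure(precision, recall, beta):
    """Weighted harmonic mean of precision and recall; beta < 1 favours precision."""
    if precision == 0 and recall == 0:
        return 0.0  # guard added for this sketch
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical precision and recall, scored at the three beta settings.
scores = {beta: f_measure(0.8, 0.4, beta) for beta in (0.2, 0.5, 1)}
```

With precision higher than recall, the score rises as beta shrinks, which is how the smaller beta values place the emphasis on precision.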
Results
Imagine…
Imagine user profiling
ESWC dataset, user Matthew Rowe
Imagine conference feedback
ESWC dataset
directly from Tweets
from mappings (Talks)
We Challenge You
We Challenge You!
• Beat us in mappings!
• We provide the human-generated gold
standard mappings
• Can you find a more precise way to do tweet-
talk mappings?
• Can you find other uses? Let us know!
We Challenge You!
• you can find the gold standard data here :
http://research.hypios.com/?page_id=131
• you can find all the data (and automated
mappings) here:
http://data.hypios.com/tweets/sparql
We Challenge You!
http://data.hypios.com/tweets/sparql
SELECT ?tweet ?talk WHERE {
?tweet <http://linkedevents.org/ontology/illustrate> ?talk.
}
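One way to send this query is an HTTP GET with the query text in a `query` parameter, as the SPARQL protocol allows. The sketch below only builds the request URL; whether the endpoint is still reachable and which result formats it returns are not confirmed by the slides, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://data.hypios.com/tweets/sparql"

QUERY = """SELECT ?tweet ?talk WHERE {
  ?tweet <http://linkedevents.org/ontology/illustrate> ?talk .
}"""


def build_request_url(endpoint, query):
    """Append the URL-encoded query as the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url)`.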
brought to you by
milan.stankovic@hypios.com & M.C.Rowe@open.ac.uk
November 2010, Shanghaï, China

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Mapping Tweets to Conference Talks: A Goldmine for Semantics

  • 1. Mapping Tweets to Conference Talks: A Goldmine for Semantics Milan Stankovic, Hypios, Paris-Sorbonne, FR & Matthew Rowe, KMI, Open University, UK
  • 3. Is there a Correspondence?
  • 5. Why? [Diagram: a user made a tweet; the tweet is about a talk; the talk has topics (Topic 1, Topic 2, Topic 3).]
  • 6. Why? [Same diagram, with the added question: does the user have an interest in the talk's topics?]
  • 7. Why? [Diagram: two users each made a tweet that is about the same talk; question: were they at the same talk?]
  • 8. Potential Benefits • Digital memory • Conference feedback – number of tweets for a talk – conversational aspects – sentiment analysis • User profiling and expert finding • Trending topics
  • 9. Rich Activity Twitter Event Data • We take Twitter archives from TwapperKeeper • We enrich Tweets with relevant DBPedia concepts using Zemanta • We rely on existing Linked Data about talks to perform the mappings.
  • 10. ESWC Dataset • Collected during the Extended Semantic Web Conference 2010 – Any tweets tagged with “eswc” • 1082 tweets • 213 tweets enriched with concepts
  • 11. Aligning Tweets with Talks • Goal: Label tweets with talks • Method: – Induce a labelling function $f : X \to Y$ to perform alignment – Labelled data = events from Web of Data, $\{(x_i, y_i)\}_{i=1}^{L}$ – Unlabelled data = tweets, $\{x_i\}_{i=1}^{U}$
  • 12. Aligning Tweets with Talks 1. Feature Extraction: @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . @prefix swrc: <http://swrc.ontoware.org/ontology#> . @prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> . @prefix dog: <http://data.semanticweb.org> . @prefix dc: <http://purl.org/dc/elements/1.1/> . <http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ; dc:subject "Knowledge Acquisition" ; dc:subject "Semantic Analysis" ; dc:subject "Social Web" ; dc:subject "Microblogs" ; dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ; swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ; swrc:author <http://data.semanticweb.org/person/claudia-wagner> . <http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ; foaf:name "Claudia Wagner" ; swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ; foaf:based_near <http://dbpedia.org/resource/Austria> .
  • 13. Aligning Tweets with Talks 1. Feature Extraction: F1 - Immediate Resource Leaves [RDF description as on slide 12; highlighted values:] Knowledge Acquisition; Semantic Analysis; Social Web; Microblogs; Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams; Although one might argue that little wisdom can be conveyed in messages of 140 ...; http://data.semanticweb.org/person/claudia-wagner
  • 14. Aligning Tweets with Talks 1. Feature Extraction: F2 – 1-step Resource Leaves [RDF description as on slide 12; highlighted values:] http://data.semanticweb.org/person/claudia-wagner; Claudia Wagner; http://data.semanticweb.org/organization/joanneum-research; http://dbpedia.org/resource/Austria
  • 15. Aligning Tweets with Talks 1. Feature Extraction: F3 – DBPedia Concepts [RDF description as on slide 12; extracted concepts:] http://dbpedia.org/resource/Twitter; http://dbpedia.org/resource/Social_Web; http://dbpedia.org/resource/Microblogs
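
The F1 and F2 feature sets on slides 13 and 14 can be read as a short graph walk. The following is only a sketch under assumptions: the triples are a hand-copied, in-memory version of the slide-12 description, the function names are hypothetical, and `rdf:type` values are skipped because the slides do not list them among the leaves.

```python
# Hand-copied subset of the RDF on slide 12, as (subject, predicate, object) tuples.
PAPER = "http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23"
AUTHOR = "http://data.semanticweb.org/person/claudia-wagner"
ORG = "http://data.semanticweb.org/organization/joanneum-research"
AUSTRIA = "http://dbpedia.org/resource/Austria"

TRIPLES = [
    (PAPER, "rdf:type", "swrc:InProceedings"),
    (PAPER, "dc:subject", "Knowledge Acquisition"),
    (PAPER, "dc:subject", "Semantic Analysis"),
    (PAPER, "dc:subject", "Social Web"),
    (PAPER, "dc:subject", "Microblogs"),
    (PAPER, "swrc:author", AUTHOR),
    (AUTHOR, "rdf:type", "foaf:Person"),
    (AUTHOR, "foaf:name", "Claudia Wagner"),
    (AUTHOR, "swrc:affiliation", ORG),
    (AUTHOR, "foaf:based_near", AUSTRIA),
]


def immediate_leaves(triples, resource):
    """F1 (assumed reading): objects attached directly to the resource."""
    return [o for s, p, o in triples if s == resource and p != "rdf:type"]


def one_step_leaves(triples, resource):
    """F2 (assumed reading): objects one hop further, via linked resources."""
    subjects = {s for s, _, _ in triples}
    leaves = []
    for neighbour in immediate_leaves(triples, resource):
        if neighbour in subjects:  # only resources that are described themselves
            leaves.extend(immediate_leaves(triples, neighbour))
    return leaves


f1 = immediate_leaves(TRIPLES, PAPER)
f2 = one_step_leaves(TRIPLES, PAPER)
```

Here `f1` holds the four subjects plus the author URI, and `f2` holds the author's name, affiliation and location, matching the values highlighted on the slides (the title and abstract are left out of the sample data for brevity).
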
  • 16. Aligning Tweets with Talks 2. Feature Vector Composition [Input, the F1 leaves:] Knowledge Acquisition; Semantic Analysis; Social Web; Microblogs; Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams; Although one might argue that little wisdom can be conveyed in messages of 140 ...; http://data.semanticweb.org/person/claudia-wagner [Tokens after lower-casing and filtering:] knowledge acquisition semantic analysis social web microblogs exploring wisdom tweets knowledge acquisition social awareness streams wisdom messages [Indexer output, term and frequency:] knowledge 2, acquisition 2, semantic 1, analysis 1, social 2, web 1, microblogs 1, exploring 1, wisdom 2, tweets 1, awareness 1, streams 1, messages 1
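
The indexing step on slide 16 amounts to tokenising the leaves, dropping uninformative words and counting what remains. A minimal sketch, assuming a small hand-picked stop-word list (the slides do not say which list or indexer was actually used):

```python
import re
from collections import Counter

# Assumed stop-word list, for illustration only.
STOPWORDS = {"the", "of", "from", "in", "that", "can", "be", "one"}


def term_frequencies(leaves, stopwords=STOPWORDS):
    """Lower-case each leaf, keep alphabetic tokens, and count non-stop-words."""
    counts = Counter()
    for leaf in leaves:
        for token in re.findall(r"[a-z]+", leaf.lower()):
            if token not in stopwords:
                counts[token] += 1
    return counts


leaves = [
    "Knowledge Acquisition",
    "Social Web",
    "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams",
]
vector = term_frequencies(leaves)
```

With these three leaves, `vector["knowledge"]`, `vector["acquisition"]` and `vector["social"]` are each 2, in line with the counts shown on the slide.
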
  • 17. Aligning Tweets with Talks 3. Inducing the Labelling Function – Both tweets and events are provided as feature vectors – Induce a labelling function $f : X \to Y$: Choose the most likely event (y) given the tweet (x)
  • 18. Aligning Tweets with Talks 3. Inducing the Labelling Function: Proximity-based Clustering – Build a centroid vector for each event • From event feature vectors – Compare each tweet vector with each centroid • Choose event (y) which is closest: $\hat{y} = \arg\min_{y \in Y} d(x, \mu_y)$, with $\mathrm{manhat}(x, \mu) = \sum_{i=1}^{n} |x_i - \mu_i|$ and $\mathrm{eucl}(x, \mu) = \sqrt{\sum_{i=1}^{n} (x_i - \mu_i)^2}$
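
The nearest-centroid rule on slide 18 can be sketched in a few lines. The vectors and event names below are made up for illustration, and all vectors are assumed to share one fixed term order:

```python
import math


def centroid(vectors):
    """Component-wise mean of an event's feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def manhattan(x, mu):
    return sum(abs(a - b) for a, b in zip(x, mu))


def euclidean(x, mu):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)))


def nearest_event(x, centroids, distance=euclidean):
    """Return the event label whose centroid is closest to tweet vector x."""
    return min(centroids, key=lambda y: distance(x, centroids[y]))


# Hypothetical event vectors over three terms.
centroids = {
    "talk_a": centroid([[2, 0, 1], [4, 0, 1]]),  # centroid is [3, 0, 1]
    "talk_b": centroid([[0, 3, 0]]),
}
tweet = [1, 0, 1]
label_eucl = nearest_event(tweet, centroids, euclidean)
label_manh = nearest_event(tweet, centroids, manhattan)
```

Both distance functions pick `talk_a` for this tweet; on real data the two measures can disagree, which is presumably why the slides consider both.
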
  • 19. Aligning Tweets with Talks 3. Inducing the Labelling Function: Naive Bayes Classification – Assigns the most probable event label given the tweet features: $\hat{y} = \arg\max_{y \in Y} P(y \mid x_1, x_2, \ldots, x_n)$ – Using Bayes' Theorem, we write this as: $\hat{y} = \arg\max_{y \in Y} \frac{P(x_1, x_2, \ldots, x_n \mid y)\, P(y)}{P(x_1, x_2, \ldots, x_n)} = \arg\max_{y \in Y} P(x_1, x_2, \ldots, x_n \mid y)\, P(y) = \arg\max_{y \in Y} P(y) \prod_{i} P(x_i \mid y)$
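
The final form of the Naive Bayes rule on slide 19 translates fairly directly into code. This sketch makes several assumptions the slides do not state: a uniform prior over events, add-one (Laplace) smoothing, log-probabilities to avoid underflow, and tweet terms outside the vocabulary being ignored. The event names and counts are invented.

```python
import math
from collections import Counter


def train(event_terms):
    """event_terms maps an event label to a Counter of term frequencies."""
    vocab = set().union(*event_terms.values())
    log_prior = math.log(1 / len(event_terms))  # assumed uniform P(y)
    model = {}
    for event, counts in event_terms.items():
        total = sum(counts.values())
        # Add-one smoothing so unseen terms never get probability zero.
        model[event] = {
            term: math.log((counts[term] + 1) / (total + len(vocab)))
            for term in vocab
        }
    return model, log_prior


def classify(tweet_tokens, model, log_prior):
    """argmax over y of log P(y) + sum_i log P(x_i | y)."""
    def score(event):
        return log_prior + sum(
            model[event][t] for t in tweet_tokens if t in model[event]
        )
    return max(model, key=score)


events = {
    "talk_a": Counter({"twitter": 3, "microblogs": 2}),
    "talk_b": Counter({"ontology": 4, "reasoning": 1}),
}
model, log_prior = train(events)
label = classify(["twitter", "microblogs", "lunch"], model, log_prior)
```

The token `lunch` is not in either event's vocabulary, so it contributes nothing to either score, and the tweet is assigned to `talk_a`.
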
  • 20. Experiments • Dataset – Corpus of Tweets collected during ESWC 2010 • Gold Standard Construction – Used 3 raters to label a portion of the tweet corpus • 200 tweets labelled – Measured inter-rater agreement • Using the Kappa statistic – Initial agreement was too low: 0.328 – Utilised the Delphi method to improve agreement – Second round of labelling produced: 0.820
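
Slide 20 reports a Kappa statistic for three raters without naming the variant. One common choice for more than two raters is Fleiss' kappa, sketched below as an assumption rather than as the authors' actual computation; the ratings are toy data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """ratings: one list per item, holding the label each rater assigned.

    Every item must have the same number of raters. Undefined (division by
    zero) when all raters use a single label for every item.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of agreeing rater pairs for this item.
        agreement_sum += (
            sum(c * c for c in counts.values()) - n_raters
        ) / (n_raters * (n_raters - 1))
    observed = agreement_sum / n_items
    expected = sum(
        (total / (n_items * n_raters)) ** 2 for total in category_totals.values()
    )
    return (observed - expected) / (1 - expected)


perfect = fleiss_kappa([["a", "a", "a"], ["b", "b", "b"]])
partial = fleiss_kappa([["a", "a", "b"], ["b", "b", "b"]])
```

Full agreement gives 1.0, while a single dissenting label on one of the two items drops the value to 0.25, which shows how sensitive the statistic is on small samples.
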
  • 21. Experiments • Evaluation Measures – Precision: proportion of event tweets correctly labelled – Recall: proportion of an event's tweets successfully returned – F-measure: Harmonic mean of precision and recall • Placed emphasis on precision over recall: $F_\beta = \frac{(1 + \beta^2) \times P \times R}{\beta^2 \times P + R}$, $\beta \in \{0.2, 0.5, 1\}$
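
To see how the three β values on slide 21 shift the balance towards precision, the F-measure can be computed for a fixed precision and recall. The values 0.8 and 0.4 are arbitrary illustration numbers, not results from the talk.

```python
def f_measure(precision, recall, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); 0.0 if both are zero."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical precision and recall, evaluated at the three betas on the slide.
scores = {beta: f_measure(0.8, 0.4, beta) for beta in (0.2, 0.5, 1)}
```

Because precision is the higher of the two inputs here, the score rises as β shrinks: a smaller β gives recall less weight.
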
  • 24. Imagine user profiling ESWC dataset, user Matthew Rowe
  • 25. Imagine conference feedback ESWC dataset directly from Tweets from mappings (Talks)
  • 27. We Challenge You! • Beat us in mappings! • We provide the human-generated gold standard mappings • Can you find a more precise way to do tweet-talk mappings? • Can you find other uses? Let us know!
  • 28. We Challenge You! • You can find the gold standard data here: http://research.hypios.com/?page_id=131 • You can find all the data (and automated mappings) here: http://data.hypios.com/tweets/sparql
  • 29. We Challenge You! http://data.hypios.com/tweets/sparql SELECT ?tweet ?talk WHERE { ?tweet <http://linkedevents.org/ontology/illustrate> ?talk. }
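
The query on slide 29 could be sent to the endpoint as a standard SPARQL Protocol GET request, where the query text travels in the `query` parameter. The sketch below only builds the request URL; it does not contact the server, and whether the endpoint is still reachable is not something the slides can confirm.

```python
from urllib.parse import urlencode

ENDPOINT = "http://data.hypios.com/tweets/sparql"

QUERY = (
    "SELECT ?tweet ?talk WHERE { "
    "?tweet <http://linkedevents.org/ontology/illustrate> ?talk. }"
)

# SPARQL Protocol: the query string is passed URL-encoded as 'query'.
request_url = ENDPOINT + "?" + urlencode({"query": QUERY})
```

Passing `request_url` to any HTTP client (for example `urllib.request.urlopen`) would then issue the query, assuming the service is online.
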
  • 30. brought to you by milan.stankovic@hypios.com & M.C.Rowe@open.ac.uk November 2010, Shanghaï, China