Can Twitter & Co. Save Lives?
Nattiya Kanhabua, Avaré Stewart, Sara Romano
Ernesto Diaz-Aviles, Wolf Siberski, and Wolfgang Nejdl
L3S Research Center / Leibniz Universität Hannover, Germany
Research Seminar @MPII, Saarbrücken
22 October 2013
Motivation
• Numerous works use Twitter to infer the existence
and magnitude of real-world events in real time
– Earthquake [Sakaki et al., 2010]
– Predicting financial time series [Ruiz et al., 2012]
– Influenza epidemics [Culotta, 2010; Lampos et al.,
2011; Paul et al., 2011]
Early Warnings
Health-related tweets
• User status updates or news related to
public health are common on Twitter
– I have the mumps...am I alone?
– my baby girl has a Gastroenteritis so great!! Please
do not give it to meee
– #Cholera breaks out in #Dadaab refugee camp in
#Kenya http://t.co/....
– As many as 16 people have been found infected with
Anthrax in Shahjadpur upazila of the Sirajganj district
in Bangladesh.
Twitter vs. Official Source
Basic Approach
[Kanhabua et al., CIKM’12]
M-Eco System
Medical Ecosystem: Personalized Event-based Surveillance
http://www.meco-project.eu/
Data Collection
• Official outbreak reports
– ~3,000 ProMED-mail reports from 2011
– WHO reports have very small coverage
• Twitter data
– ~1,200 health-related terms (i.e., infectious
diseases, their synonyms, pathogens and symptoms)
– Over 112 million tweets from 2011
• Series of NLP tools including
– OpenNLP (tokenization, sentence splitting, POS
tagging)
– OpenCalais (named entity recognition)
– HeidelTime (temporal expression extraction)
Ground Truths
[Kanhabua et al., TAIA’12]
Event Extraction
• An event is a sentence containing two entities
– (1) medical condition and (2) geographic expression
– A minimum requirement by domain experts
• A victim and the time of an event can be identified
from the sentence itself, or its surrounding context
• Output: a set of event candidates
Reported by World Health Organization (WHO) on
29 July 2012 about an ongoing Ebola outbreak
in Uganda since the beginning of July 2012
[Kanhabua et al., TAIA’12]
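The event definition on this slide (a sentence that mentions both a medical condition and a geographic expression) can be illustrated with a minimal sketch. It is not the authors' pipeline: the two term sets, the regex-based sentence splitter, and the names `EventCandidate` and `extract_event_candidates` are hypothetical stand-ins for the health-term list and the NLP tools named under Data Collection. Victim and time extraction are left out.

```python
import re
from dataclasses import dataclass

# Hypothetical toy gazetteers, for illustration only.
MEDICAL_CONDITIONS = {"ebola", "cholera", "anthrax", "mumps"}
LOCATIONS = {"uganda", "kenya", "bangladesh", "dadaab"}


@dataclass(frozen=True)
class EventCandidate:
    condition: str
    location: str
    sentence: str


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_event_candidates(text):
    """Return one candidate per (condition, location) pair found in a sentence."""
    candidates = []
    for sentence in split_sentences(text):
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        conditions = sorted(tokens & MEDICAL_CONDITIONS)
        locations = sorted(tokens & LOCATIONS)
        # A sentence qualifies only if it contains both entity types.
        for condition in conditions:
            for location in locations:
                candidates.append(EventCandidate(condition, location, sentence))
    return candidates
```

A sentence that names only a disease, such as "I have the mumps.", yields no candidate under this rule, which matches the minimum requirement stated above.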
Message Filtering: Challenges
• Ambiguity
– having several meanings
– used in different contexts
• Incompleteness
– missing or under-reported events
– data processing errors
Category: Example tweet
– Literature: A two hour train journey, Love In the Time of Cholera ...
– Music: Dengue Fever’s “Uku,” Mixed by Paul Dreux Smith Universal Audio...
– Marketing: Exclusive distributor of high quality #HIV/AIDS Blood & Urine and #Hepatitis #Self -testers.
– General: Identification of genotype 4 Hepatitis E virus binding proteins on swine liver cells: Hepatitis E virus...
– Negative: i dont have sniffles and no real coughing..well its coughing but not like an influenza cough.
– Joke: Thought I had Bieber Fever. Ends up I just had a combo of the mumps, mono, measles & the hershey squ...
Challenge I. Noisy/evolving
• Evolving data
– Relevant features change over time
Approach for Noisy Data
• MedISys¹
– providing a list of negative keywords created
by medical experts
• Urban Dictionary²
– a Web-based dictionary of slang, ethnic
culture words or phrases
¹ http://medusa.jrc.it/medisys/homeedition/en/home.html
² http://www.urbandictionary.com/
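One way to picture how a negative-keyword list removes noisy tweets is the sketch below. It assumes the simplest possible rule: drop a tweet if any of its tokens appears in the negative list. The keyword set is a hypothetical placeholder, not the MedISys or Urban Dictionary vocabulary, and the slides do not say that the actual filter works this way.

```python
import re

# Hypothetical negative keywords, for illustration only.
NEGATIVE_KEYWORDS = {"bieber", "album", "novel", "song"}


def tokenize(tweet):
    # Lowercase the tweet and keep alphanumeric tokens and hashtags.
    return set(re.findall(r"[a-z0-9#]+", tweet.lower()))


def is_noisy(tweet, negative_keywords=NEGATIVE_KEYWORDS):
    # A tweet counts as noisy if it shares at least one token with the list.
    return not tokenize(tweet).isdisjoint(negative_keywords)


def filter_tweets(tweets, negative_keywords=NEGATIVE_KEYWORDS):
    # Keep only tweets with no negative keyword.
    return [t for t in tweets if not is_noisy(t, negative_keywords)]
```

With this toy list, "Thought I had Bieber Fever." is dropped while "I have the mumps...am I alone?" is kept. A static list like this cannot follow features that change over time, which is the second part of the challenge.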
[Kanhabua and Nejdl, WOW’13]
Approach for Feature Changes
Signal Generation: Challenges
• Temporal Dynamics
– seasonal infectious diseases
– rare and spontaneous outbreaks
• Location Dynamics
– frequency and duration
– levels of prevalence or severity
[Rortais et al., 2010 in Journal of Food Research International]
[Emch et al., 2008 in International Journal of Health Geographics]
Outbreak Categorization
How to generate a reliable
signal for low aggregate counts?
Approach
[Kanhabua and Nejdl, WOW’13]
Temporal Diversity
• Refined Jaccard Index (RDJ-index)
– average Jaccard similarity of all object pairs
• Note: lower RDJ corresponds to higher diversity
• Problem: “All-Pair comparison”
• Solution: Estimation algorithms with probabilistic
error bound guarantees
[Deng et al., CIKM’
$RDJ = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} JS(O_i, O_j)$
Jaccard similarity: $JS(O_i, O_j) = \frac{|O_i \cap O_j|}{|O_i \cup O_j|}$
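The RDJ definition on this slide (the average Jaccard similarity over all object pairs) can be computed directly, as in the sketch below, which also shows why the all-pair comparison becomes a problem as n grows. The function names are hypothetical. `rdj_sampled` is plain uniform pair sampling, included only to illustrate the idea of estimating the average; it is not the estimation algorithm of Deng et al. and carries no formal error bound here. Treating two empty sets as identical is an assumption of this sketch.

```python
import itertools
import random


def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two sets."""
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty sets are identical
    return len(a & b) / len(union)


def rdj_exact(objects):
    """Average Jaccard similarity over all n(n-1)/2 pairs (quadratic in n)."""
    n = len(objects)
    if n < 2:
        raise ValueError("RDJ needs at least two objects")
    total = sum(jaccard(a, b) for a, b in itertools.combinations(objects, 2))
    return 2.0 * total / (n * (n - 1))


def rdj_sampled(objects, num_pairs, seed=0):
    """Monte Carlo estimate of RDJ from uniformly sampled distinct pairs."""
    n = len(objects)
    if n < 2:
        raise ValueError("RDJ needs at least two objects")
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_pairs):
        i, j = rng.sample(range(n), 2)  # two distinct indices
        total += jaccard(objects[i], objects[j])
    return total / num_pairs
```

Each object would be the term set (or entity set) of one tweet in a time window. A lower value means the tweets overlap less, that is, the window is more diverse.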
(1) Top-k terms
(2) Entities
Threat Assessment: Challenge
• Users are overwhelmed by the large number of tweets
Approach
• Personalized Tweet Ranking for Epidemic
Intelligence
– Learning to rank and recommender systems
– User's context as implicit criteria for recommendation
[Diaz-Aviles et al., ICWSM’12, Diaz-Aviles et al., WWW’
Signal Search Prototype
Conclusion
• Can Twitter & Co. Save Lives?
– On a global level, we were able to generate
signals earlier than official reporting
mechanisms.
– The ultimate answer depends on how a health
organization uses and reacts to the information
provided by our system.
Future Work
• Real-Time Analysis of Big and Fast
Social Web Streams
– Scalable, efficient methods for filtering and
generating signals in real-time
– Effective methods for aggregating and
visualizing information in a meaningful way
Thank you!
kanhabua@L3S.de
References
• [Culotta, 2010] A. Culotta. Towards detecting influenza epidemics by analyzing twitter
messages. In Proceedings of the First Workshop on Social Media Analytics (SOMA’2010), 2010.
• [Diaz-Aviles et al., 2012a] E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl.
Towards personalized learning to rank for epidemic intelligence based on social media streams.
In Proceedings of the 21st World Wide Web Conference (WWW’2012), 2012.
• [Diaz-Aviles et al., 2012b] E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl.
Epidemic intelligence for the crowd, by the crowd. In Proceedings of International AAAI
Conference on Weblogs and Social Media (ICWSM’2012), 2012.
• [Kanhabua et al., 2012a] N. Kanhabua, S. Romano, and A. Stewart. Identifying Relevant
Temporal Expressions for Real-world Events. In SIGIR 2012 Workshop on Time-aware
Information Access (TAIA’2012), 2012.
• [Kanhabua et al., 2012b] N. Kanhabua, S. Romano, A. Stewart, and W. Nejdl. Supporting
Temporal Analytics for Health Related Events in Microblogs. In Proceedings of CIKM’2012, 2012.
• [Kanhabua and Nejdl 2013] N. Kanhabua and W. Nejdl. Understanding the Diversity of Tweets
in the Time of Outbreaks. In Proceedings of the First International Web Observatory Workshop
(WOW’2013) at WWW’2013, 2013.
• [Lampos et al., 2011] V. Lampos and N. Cristianini. Nowcasting events from the social web with
statistical learning. ACM TIST, 3, 2011.
• [Paul et al., 2011] M. J. Paul and M. Dredze. You are what you tweet: Analyzing twitter for public
health. In Proceedings of International AAAI Conference on Weblogs and Social Media
(ICWSM’2011), 2011.
• [Ruiz et al., 2012] E. J. Ruiz, V. Hristidis, C. Castillo, A. Gionis, and A. Jaimes. Correlating
financial time series with micro-blogging activity. In Proceedings of WSDM’2012, 2012.
• [Sakaki et al., 2010] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes twitter users:
real-time event detection by social sensors. In Proceedings of WWW’2010, 2010.

More Related Content

Viewers also liked

Preservation and Forgetting: Friends or Foes?
Preservation and Forgetting: Friends or Foes?Preservation and Forgetting: Friends or Foes?
Preservation and Forgetting: Friends or Foes?
Nattiya Kanhabua
 
Dynamics of Web: Analysis and Implications from Search Perspective
Dynamics of Web: Analysis and Implications from Search  PerspectiveDynamics of Web: Analysis and Implications from Search  Perspective
Dynamics of Web: Analysis and Implications from Search Perspective
Nattiya Kanhabua
 
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Nattiya Kanhabua
 
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Leveraging Dynamic Query Subtopics for Time-aware Search Result DiversificationLeveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Nattiya Kanhabua
 
Ranking Related News Predictions
Ranking Related News PredictionsRanking Related News Predictions
Ranking Related News Predictions
Nattiya Kanhabua
 
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Nattiya Kanhabua
 
Temporal summarization of event related updates
Temporal summarization of event related updatesTemporal summarization of event related updates
Temporal summarization of event related updates
Nattiya Kanhabua
 
What Triggers Human Remembering of Events? A Large-Scale Analysis of Catalyst...
What Triggers Human Remembering of Events? A Large-Scale Analysis of Catalyst...What Triggers Human Remembering of Events? A Large-Scale Analysis of Catalyst...
What Triggers Human Remembering of Events? A Large-Scale Analysis of Catalyst...
Nattiya Kanhabua
 
Understanding the Diversity of Tweets in the Time of Outbreaks
Understanding the Diversity of Tweets in the Time of OutbreaksUnderstanding the Diversity of Tweets in the Time of Outbreaks
Understanding the Diversity of Tweets in the Time of Outbreaks
Nattiya Kanhabua
 
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Nattiya Kanhabua
 
Determining Time of Queries for Re-ranking Search Results
Determining Time of Queries for Re-ranking Search ResultsDetermining Time of Queries for Re-ranking Search Results
Determining Time of Queries for Re-ranking Search Results
Nattiya Kanhabua
 
Exploiting Time-based Synonyms in Searching Document Archives
Exploiting Time-based Synonyms in Searching Document ArchivesExploiting Time-based Synonyms in Searching Document Archives
Exploiting Time-based Synonyms in Searching Document Archives
Nattiya Kanhabua
 
Searching the Temporal Web: Challenges and Current Approaches
Searching the Temporal Web: Challenges and Current ApproachesSearching the Temporal Web: Challenges and Current Approaches
Searching the Temporal Web: Challenges and Current Approaches
Nattiya Kanhabua
 
Time-aware Approaches to Information Retrieval
Time-aware Approaches to Information RetrievalTime-aware Approaches to Information Retrieval
Time-aware Approaches to Information Retrieval
Nattiya Kanhabua
 
Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving Data
Nattiya Kanhabua
 
Why Is It Difficult to Detect Outbreaks in Twitter?
Why Is It Difficult to Detect Outbreaks in Twitter?Why Is It Difficult to Detect Outbreaks in Twitter?
Why Is It Difficult to Detect Outbreaks in Twitter?
Nattiya Kanhabua
 
Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...
Nattiya Kanhabua
 
Temporal Web Dynamics and Implications for Information Retrieval
Temporal Web Dynamics and Implications for Information RetrievalTemporal Web Dynamics and Implications for Information Retrieval
Temporal Web Dynamics and Implications for Information Retrieval
Nattiya Kanhabua
 
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Nattiya Kanhabua
 

Viewers also liked (19)

Preservation and Forgetting: Friends or Foes?
Preservation and Forgetting: Friends or Foes?Preservation and Forgetting: Friends or Foes?
Preservation and Forgetting: Friends or Foes?
 
Dynamics of Web: Analysis and Implications from Search Perspective
Dynamics of Web: Analysis and Implications from Search  PerspectiveDynamics of Web: Analysis and Implications from Search  Perspective
Dynamics of Web: Analysis and Implications from Search Perspective
 
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
 
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Leveraging Dynamic Query Subtopics for Time-aware Search Result DiversificationLeveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
 
Ranking Related News Predictions
Ranking Related News PredictionsRanking Related News Predictions
Ranking Related News Predictions
 
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
 
Temporal summarization of event related updates
Temporal summarization of event related updatesTemporal summarization of event related updates
Temporal summarization of event related updates
 
What Triggers Human Remembering of Events? A Large-Scale Analysis of Catalyst...
What Triggers Human Remembering of Events? A Large-Scale Analysis of Catalyst...What Triggers Human Remembering of Events? A Large-Scale Analysis of Catalyst...
What Triggers Human Remembering of Events? A Large-Scale Analysis of Catalyst...
 
Understanding the Diversity of Tweets in the Time of Outbreaks
Understanding the Diversity of Tweets in the Time of OutbreaksUnderstanding the Diversity of Tweets in the Time of Outbreaks
Understanding the Diversity of Tweets in the Time of Outbreaks
 
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
 
Determining Time of Queries for Re-ranking Search Results
Determining Time of Queries for Re-ranking Search ResultsDetermining Time of Queries for Re-ranking Search Results
Determining Time of Queries for Re-ranking Search Results
 
Exploiting Time-based Synonyms in Searching Document Archives
Exploiting Time-based Synonyms in Searching Document ArchivesExploiting Time-based Synonyms in Searching Document Archives
Exploiting Time-based Synonyms in Searching Document Archives
 
Searching the Temporal Web: Challenges and Current Approaches
Searching the Temporal Web: Challenges and Current ApproachesSearching the Temporal Web: Challenges and Current Approaches
Searching the Temporal Web: Challenges and Current Approaches
 
Time-aware Approaches to Information Retrieval
Time-aware Approaches to Information RetrievalTime-aware Approaches to Information Retrieval
Time-aware Approaches to Information Retrieval
 
Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving Data
 
Why Is It Difficult to Detect Outbreaks in Twitter?
Why Is It Difficult to Detect Outbreaks in Twitter?Why Is It Difficult to Detect Outbreaks in Twitter?
Why Is It Difficult to Detect Outbreaks in Twitter?
 
Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...
 
Temporal Web Dynamics and Implications for Information Retrieval
Temporal Web Dynamics and Implications for Information RetrievalTemporal Web Dynamics and Implications for Information Retrieval
Temporal Web Dynamics and Implications for Information Retrieval
 
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
 

Similar to Can Twitter & Co. Save Lives?

Ebola response in Liberia: A step towards real-time epidemic science
Ebola response in Liberia: A step towards real-time epidemic scienceEbola response in Liberia: A step towards real-time epidemic science
Ebola response in Liberia: A step towards real-time epidemic science
Biocomplexity Institute of Virginia Tech
 
I so p 9.10.2017
I so p 9.10.2017I so p 9.10.2017
I so p 9.10.2017
Nigel Collier
 
ENSULIB webinar Series November2022 Neighborhood Science.pdf
ENSULIB webinar Series November2022 Neighborhood Science.pdfENSULIB webinar Series November2022 Neighborhood Science.pdf
ENSULIB webinar Series November2022 Neighborhood Science.pdf
Environment, Sustainability and Libraries Section IFLA
 
Dengue Transmission and Risk Factors in Dhaka, Bangladesh
Dengue Transmission and Risk Factors in Dhaka, Bangladesh Dengue Transmission and Risk Factors in Dhaka, Bangladesh
Dengue Transmission and Risk Factors in Dhaka, Bangladesh
Global Risk Forum GRFDavos
 
APHA Presentation: Using Predictive Analytics for West Nile Disease Prevention
APHA Presentation: Using Predictive Analytics for West Nile Disease PreventionAPHA Presentation: Using Predictive Analytics for West Nile Disease Prevention
APHA Presentation: Using Predictive Analytics for West Nile Disease Prevention
Raed Mansour
 
High throughput analysis and alerting of disease outbreaks from the grey lite...
High throughput analysis and alerting of disease outbreaks from the grey lite...High throughput analysis and alerting of disease outbreaks from the grey lite...
High throughput analysis and alerting of disease outbreaks from the grey lite...
Nigel Collier
 
Informatics for Disease Surveillance – New Technologies
Informatics for Disease Surveillance – New TechnologiesInformatics for Disease Surveillance – New Technologies
Informatics for Disease Surveillance – New Technologies
Dr Wasim Ahmed
 
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Kno.e.sis Approach to Impactful Research & Training for Exceptional CareersKno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Amit Sheth
 
Harnessing the Power of Infectious Disease Information with a Relational Data...
Harnessing the Power of Infectious Disease Information with a Relational Data...Harnessing the Power of Infectious Disease Information with a Relational Data...
Harnessing the Power of Infectious Disease Information with a Relational Data...
Jay Brown
 
Carl koppeschaar: Disease Radar: Measuring and Forecasting the Spread of Infe...
Carl koppeschaar: Disease Radar: Measuring and Forecasting the Spread of Infe...Carl koppeschaar: Disease Radar: Measuring and Forecasting the Spread of Infe...
Carl koppeschaar: Disease Radar: Measuring and Forecasting the Spread of Infe...
Flávio Codeço Coelho
 
Adaptive governance - global networks (Victor Galaz)
Adaptive governance - global networks (Victor Galaz)Adaptive governance - global networks (Victor Galaz)
Adaptive governance - global networks (Victor Galaz)
Victor Galaz
 
Global Trends in Use of IT for Efficient Public Health Care
Global Trends in Use of IT for Efficient Public Health CareGlobal Trends in Use of IT for Efficient Public Health Care
Global Trends in Use of IT for Efficient Public Health Care
Biplav Srivastava
 
Classroom without Walls
Classroom without WallsClassroom without Walls
mQoL-Lab : Living Lab Infrastructure
mQoL-Lab : Living Lab InfrastructuremQoL-Lab : Living Lab Infrastructure
mQoL-Lab : Living Lab Infrastructure
Katarzyna Wac & The QoL Lab
 
Nudging0and0ha0m0behavioral0science
Nudging0and0ha0m0behavioral0scienceNudging0and0ha0m0behavioral0science
Nudging0and0ha0m0behavioral0scienceKibuuka Fahad
 
original.ppteefefedfeddfeddedfwqdfqdqdqdqddfqafwq
original.ppteefefedfeddfeddedfwqdfqdqdqdqddfqafwqoriginal.ppteefefedfeddfeddedfwqdfqdqdqdqddfqafwq
original.ppteefefedfeddfeddedfwqdfqdqdqdqddfqafwq
SamKuruvilla5
 
Acegid Presentation - Health - Cotonou
Acegid Presentation - Health - CotonouAcegid Presentation - Health - Cotonou
Acegid Presentation - Health - Cotonou
Association of African Univerisites
 
Digital Scholarship: Enlightenment or Devastated Landscape?
Digital Scholarship: Enlightenment or Devastated Landscape? Digital Scholarship: Enlightenment or Devastated Landscape?
Digital Scholarship: Enlightenment or Devastated Landscape?
TheContentMine
 
Digital Scholarship
Digital ScholarshipDigital Scholarship
Digital Scholarship
petermurrayrust
 
Changing the World in Healthcare, Education, and Energy through Science, Tech...
Changing the World in Healthcare, Education, and Energy through Science, Tech...Changing the World in Healthcare, Education, and Energy through Science, Tech...
Changing the World in Healthcare, Education, and Energy through Science, Tech...
Mohamed Labadi
 

Similar to Can Twitter & Co. Save Lives? (20)

Ebola response in Liberia: A step towards real-time epidemic science
Ebola response in Liberia: A step towards real-time epidemic scienceEbola response in Liberia: A step towards real-time epidemic science
Ebola response in Liberia: A step towards real-time epidemic science
 
I so p 9.10.2017
I so p 9.10.2017I so p 9.10.2017
I so p 9.10.2017
 
ENSULIB webinar Series November2022 Neighborhood Science.pdf
ENSULIB webinar Series November2022 Neighborhood Science.pdfENSULIB webinar Series November2022 Neighborhood Science.pdf
ENSULIB webinar Series November2022 Neighborhood Science.pdf
 
Dengue Transmission and Risk Factors in Dhaka, Bangladesh
Dengue Transmission and Risk Factors in Dhaka, Bangladesh Dengue Transmission and Risk Factors in Dhaka, Bangladesh
Dengue Transmission and Risk Factors in Dhaka, Bangladesh
 
APHA Presentation: Using Predictive Analytics for West Nile Disease Prevention
APHA Presentation: Using Predictive Analytics for West Nile Disease PreventionAPHA Presentation: Using Predictive Analytics for West Nile Disease Prevention
APHA Presentation: Using Predictive Analytics for West Nile Disease Prevention
 
High throughput analysis and alerting of disease outbreaks from the grey lite...
High throughput analysis and alerting of disease outbreaks from the grey lite...High throughput analysis and alerting of disease outbreaks from the grey lite...
High throughput analysis and alerting of disease outbreaks from the grey lite...
 
Informatics for Disease Surveillance – New Technologies
Informatics for Disease Surveillance – New TechnologiesInformatics for Disease Surveillance – New Technologies
Informatics for Disease Surveillance – New Technologies
 
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Kno.e.sis Approach to Impactful Research & Training for Exceptional CareersKno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers
 
Harnessing the Power of Infectious Disease Information with a Relational Data...
Harnessing the Power of Infectious Disease Information with a Relational Data...Harnessing the Power of Infectious Disease Information with a Relational Data...
Harnessing the Power of Infectious Disease Information with a Relational Data...
 
Carl koppeschaar: Disease Radar: Measuring and Forecasting the Spread of Infe...
Carl koppeschaar: Disease Radar: Measuring and Forecasting the Spread of Infe...Carl koppeschaar: Disease Radar: Measuring and Forecasting the Spread of Infe...
Carl koppeschaar: Disease Radar: Measuring and Forecasting the Spread of Infe...
 
Adaptive governance - global networks (Victor Galaz)
Adaptive governance - global networks (Victor Galaz)Adaptive governance - global networks (Victor Galaz)
Adaptive governance - global networks (Victor Galaz)
 
Global Trends in Use of IT for Efficient Public Health Care
Global Trends in Use of IT for Efficient Public Health CareGlobal Trends in Use of IT for Efficient Public Health Care
Global Trends in Use of IT for Efficient Public Health Care
 
Classroom without Walls
Classroom without WallsClassroom without Walls
Classroom without Walls
 
mQoL-Lab : Living Lab Infrastructure
mQoL-Lab : Living Lab InfrastructuremQoL-Lab : Living Lab Infrastructure
mQoL-Lab : Living Lab Infrastructure
 
Nudging0and0ha0m0behavioral0science
Nudging0and0ha0m0behavioral0scienceNudging0and0ha0m0behavioral0science
Nudging0and0ha0m0behavioral0science
 
original.ppteefefedfeddfeddedfwqdfqdqdqdqddfqafwq
original.ppteefefedfeddfeddedfwqdfqdqdqdqddfqafwqoriginal.ppteefefedfeddfeddedfwqdfqdqdqdqddfqafwq
original.ppteefefedfeddfeddedfwqdfqdqdqdqddfqafwq
 
Acegid Presentation - Health - Cotonou
Acegid Presentation - Health - CotonouAcegid Presentation - Health - Cotonou
Acegid Presentation - Health - Cotonou
 
Digital Scholarship: Enlightenment or Devastated Landscape?
Digital Scholarship: Enlightenment or Devastated Landscape? Digital Scholarship: Enlightenment or Devastated Landscape?
Digital Scholarship: Enlightenment or Devastated Landscape?
 
Digital Scholarship
Digital ScholarshipDigital Scholarship
Digital Scholarship
 
Changing the World in Healthcare, Education, and Energy through Science, Tech...
Changing the World in Healthcare, Education, and Energy through Science, Tech...Changing the World in Healthcare, Education, and Energy through Science, Tech...
Changing the World in Healthcare, Education, and Energy through Science, Tech...
 

Recently uploaded

Obesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditionsObesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditions
Faculty of Medicine And Health Sciences
 
Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...
Sebastiano Panichella
 
Tom tresser burning issue.pptx My Burning issue
Tom tresser burning issue.pptx My Burning issueTom tresser burning issue.pptx My Burning issue
Tom tresser burning issue.pptx My Burning issue
amekonnen
 
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
Dutch Power
 
Acorn Recovery: Restore IT infra within minutes
Acorn Recovery: Restore IT infra within minutesAcorn Recovery: Restore IT infra within minutes
Acorn Recovery: Restore IT infra within minutes
IP ServerOne
 
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...

Can Twitter & Co. Save Lives?

  • 1. Can Twitter & Co. Save Lives? Nattiya Kanhabua, Avaré Stewart, Sara Romano, Ernesto Diaz-Aviles, Wolf Siberski, and Wolfgang Nejdl. L3S Research Center / Leibniz Universität Hannover, Germany. Research Seminar @MPII, Saarbrücken, 22 October 2013
  • 2. Motivation • Numerous works use Twitter to infer the existence and magnitude of real-world events in real-time – Earthquake [Sakaki et al., 2010] – Predicting financial time series [Ruiz et al., 2012] – Influenza epidemics [Culotta, 2010; Lampos et al., 2011; Paul et al., 2011]
  • 4. Health related tweets • User status updates or news related to public health are common in Twitter – I have the mumps...am I alone? – my baby girl has a Gastroenteritis so great!! Please do not give it to meee – #Cholera breaks out in #Dadaab refugee camp in #Kenya http://t.co/.... – As many as 16 people have been found infected with Anthrax in Shahjadpur upazila of the Sirajganj district in Bangladesh.
  • 6. Basic Approach [Kanhabua et al., CIKM’12]
  • 8. M-Eco System Medical Ecosystem: Personalized Event-based Surveillance http://www.meco-project.eu/
  • 9. Data Collection • Official outbreak reports – ~3,000 ProMED-mail reports from 2011 – WHO reports have very small coverage • Twitter data – ~1,200 health-related terms (i.e., infectious diseases, their synonyms, pathogens and symptoms) – Over 112 million tweets from 2011 • Series of NLP tools including – OpenNLP (tokenization, sentence splitting, POS tagging) – OpenCalais (named entity recognition) – HeidelTime (temporal expression extraction)
  • 10. Ground Truths [Kanhabua et al., TAIA’ 12]
  • 11. Event Extraction • An event is a sentence containing two entities – (1) medical condition and (2) geographic expression – A minimum requirement by domain experts • A victim and the time of an event can be identified from the sentence itself, or its surrounding context • Output: a set of event candidates Reported by World Health Organization (WHO) on 29 July 2012 about an ongoing Ebola outbreak in Uganda since the beginning of July 2012 [Kanhabua et al., TAIA’ 12]
  • 12. Message Filtering: Challenges • Ambiguity – having several meanings – used in different contexts • Incompleteness – missing or under-reported events – data processing errors
  • 13. Message Filtering: Challenges, example tweets by category (Category: Example tweet)
      – Literature: A two hour train journey, Love In the Time of Cholera ...
      – Music: Dengue Fever’s “Uku,” Mixed by Paul Dreux Smith Universal Audio...
      – Marketing: Exclusive distributor of high quality #HIV/AIDS Blood & Urine and #Hepatitis #Self -testers.
      – General: Identification of genotype 4 Hepatitis E virus binding proteins on swine liver cells: Hepatitis E virus...
      – Negative: i dont have sniffles and no real coughing..well its coughing but not like an influenza cough.
      – Joke: Thought I had Bieber Fever. Ends up I just had a combo of the mumps, mono, measles & the hershey squ...
  • 14. Challenge I. Noisy/evolving • Evolving data – Relevant features change over time
  • 16. Approach for Noisy Data • MedISys [1] – providing a list of negative keywords created by medical experts • Urban Dictionary [2] – a Web-based dictionary of slang, ethnic culture words or phrases [1] http://medusa.jrc.it/medisys/homeedition/en/home.html [2] http://www.urbandictionary.com/
  • 18. [Kanhabua and Nejdl, WOW’ 13]
  • 21. Signal Generation: Challenges • Temporal Dynamics – seasonal infectious diseases – rare and spontaneous outbreaks • Location Dynamics – frequency and duration – levels of prevalence or severity
  • 22. Signal Generation: Challenges • Temporal Dynamics – seasonal infectious diseases – rare and spontaneous outbreaks • Location Dynamics – frequency and duration – levels of prevalence or severity [Rortais et al., 2010 in Journal of Food Research International]
  • 24. Signal Generation: Challenges • Temporal Dynamics – seasonal infectious diseases – rare and spontaneous outbreaks • Location Dynamics – frequency and duration – levels of prevalence or severity [Emch et al., 2008 in International Journal of Health Geographics]
  • 26. Outbreak Categorization How to generate a reliable signal for low aggregate counts?
  • 28. Temporal Diversity • Refined Jaccard Index (RDJ-index) – average Jaccard similarity of all object pairs: RDJ = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} JS(O_i, O_j), where JS(O_i, O_j) = |O_i \cap O_j| / |O_i \cup O_j| is the Jaccard similarity • Note: lower RDJ corresponds to higher diversity • Problem: “All-Pair comparison” • Solution: Estimation algorithms with probabilistic error bound guarantees [Deng et al., CIKM’
  • 29. Temporal Diversity: the RDJ-index of slide 28, computed over (1) Top-k terms (2) Entities
  • 30. Threat Assessment: Challenge • Overwhelmed by the large number of tweets
  • 31. Approach • Personalized Tweet Ranking for Epidemic Intelligence – Learning to rank and recommender systems – User's context as implicit criteria for recommendation [Diaz-Aviles et al., ICWSM’ 12, Diaz-Aviles et al., WWW’
  • 34. Conclusion • Can Twitter & Co. Save Lives? – On a global level, we were able to generate signals earlier than official reporting mechanisms. – The ultimate answer depends on: how a health organization will use and react to information provided by our system.
  • 35. Future Work • Real-Time Analysis of Big and Fast Social Web Streams – Scalable, efficient methods for filtering and generating signals in real-time – Effective methods for aggregating and visualizing information in a meaningful way
  • 37. References
      • [Culotta, 2010] A. Culotta. Towards detecting influenza epidemics by analyzing twitter messages. In Proceedings of the First Workshop on Social Media Analytics (SOMA’2010), 2010.
      • [Diaz-Aviles et al., 2012a] E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl. Towards personalized learning to rank for epidemic intelligence based on social media streams. In Proceedings of the 21st World Wide Web Conference (WWW’2012), 2012.
      • [Diaz-Aviles et al., 2012b] E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl. Epidemic intelligence for the crowd, by the crowd. In Proceedings of International AAAI Conference on Weblogs and Social Media (ICWSM’2012), 2012.
      • [Kanhabua et al., 2012a] N. Kanhabua, S. Romano, and A. Stewart. Identifying Relevant Temporal Expressions for Real-world Events. In SIGIR 2012 Workshop on Time-aware Information Access (TAIA’2012), 2012.
      • [Kanhabua et al., 2012b] N. Kanhabua, S. Romano, A. Stewart, and W. Nejdl. Supporting Temporal Analytics for Health Related Events in Microblogs. In Proceedings of CIKM’2012, 2012.
      • [Kanhabua and Nejdl, 2013] N. Kanhabua and W. Nejdl. Understanding the Diversity of Tweets in the Time of Outbreaks. In Proceedings of the First International Web Observatory Workshop (WOW’2013) at WWW’2013, 2013.
      • [Lampos et al., 2011] V. Lampos and N. Cristianini. Nowcasting events from the social web with statistical learning. ACM TIST, 3, 2011.
      • [Paul et al., 2011] M. J. Paul and M. Dredze. You are what you tweet: Analyzing twitter for public health. In Proceedings of International AAAI Conference on Weblogs and Social Media (ICWSM’2011), 2011.
      • [Ruiz et al., 2012] E. J. Ruiz, V. Hristidis, C. Castillo, A. Gionis, and A. Jaimes. Correlating financial time series with micro-blogging activity. In Proceedings of WSDM’2012, 2012.
      • [Sakaki et al., 2010] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes twitter users: real-time event detection by social sensors. In Proceedings of WWW’2010, 2010.
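
The RDJ-index on slide 28 is defined as the average Jaccard similarity over all object pairs. As a minimal sketch of that definition, the exact all-pairs computation could look like the following. The function names and the toy term sets are hypothetical, and this is the naive quadratic computation that the slide calls the "All-Pair comparison" problem, not the estimation algorithms cited on the slide.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rdj_index(objects):
    """Average Jaccard similarity over all unordered pairs of objects.

    Lower values mean the objects share less content, i.e. higher diversity.
    """
    n = len(objects)
    if n < 2:
        raise ValueError("need at least two objects")
    total = sum(jaccard(a, b) for a, b in combinations(objects, 2))
    return 2.0 * total / (n * (n - 1))


# Hypothetical tweets, each represented as a set of terms.
tweets = [
    {"ehec", "outbreak", "germany"},
    {"ehec", "outbreak", "hamburg"},
    {"cholera", "haiti"},
]
print(rdj_index(tweets))
```

The cost grows quadratically with the number of objects, which is why the slide points to estimation algorithms with error bounds for large tweet collections.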

Editor's Notes

  1. To exploit this timeliness potential, we present an event-based Epidemic Intelligence (EI) system. EI has emerged as a type of intelligence gathering aimed at detecting events of interest to public health from unstructured text on the Web. In the medical domain, there has been a surge in detecting health-related tweets for early warning, which allows a rapid response from authorities [Diaz-Aviles et al., 2012].
  2. Note that there are existing EI systems, such as the BioCaster Global Health Monitor or HealthMap. However, they differ from our proposed system in the level of analysis, information sources, language coverage and visualization. Frequencies of cases reported to RKI and the number of tweets mentioning the name of the disease (EHEC) have a Pearson correlation coefficient of 0.864. Monitoring Twitter allowed M-Eco to generate the first signals on Friday, May 20th, 2011.
  3. We study and propose solutions to three main research challenges in gathering epidemic intelligence from social media streams: 1) dynamic classification to enable message filtering; 2) producing reliable warning signals (temporal anomalies) based on observed term frequency changes in these messages, using biosurveillance algorithms; and 3) providing suitable information and recommendations to domain experts, for better assessment of the potential outbreak threats associated with the generated signals.
      Part I. Ground truth creation
        – Official outbreak reports: World Health Organization, ProMED-mail
      Part II. Creating Twitter time series
        – medical condition: disease name, synonyms, pathogens, symptoms
        – location: geographic expressions, geo-location, or user profile; 3 levels: country, continent, latitude
  4. M-Eco strives to detect a large variety of infectious diseases, so we make use of a list of 1,258 terms consisting of infectious diseases, their synonyms, pathogens and symptoms, which are provided by the domain experts in two languages, namely English and German, for an initial filtering step. All documents and tweets are annotated with locations, medical conditions and temporal expressions using a series of language processing tools, including OpenNLP for tokenization, sentence splitting and part-of-speech tagging, and HeidelTime [34] for temporal expression extraction.
  5. Hash-tags co-occurring with #EHEC between May 23 and June 19, 2011, the main period of the outbreak. The hash-tags are classified as entities of type Medical Condition, Location, or Complementary Context; hash-tags outside these categories are discarded.
  7. Our approach builds upon [18] and extends it by: 1) incorporating the use of an orthogonal vector, which is learned by a Support Vector Machine (SVM), as a description of the feature change; and 2) computing a novelty score that lets the system identify those tweets that contribute to the feature change, so that their true labels can be obtained.
  8. In order to detect outbreak events for early warning, we exploit different state-of-the-art biosurveillance algorithms as anomaly detectors in disease-related Twitter messages: C1, C2, C3, F-Statistic (FS), Exponentially Weighted Moving Average (EWMA) and Farrington (FA) [Basseville, 1993; Farrington, 1996]. Traditional biosurveillance systems usually exploit information from official sources, e.g., laboratory results, mortality rates, or the number of reported patients suffering from a disease outbreak. In recent years, researchers in the medical domain have begun to leverage real-time, social Web data, such as tweets.
  10. Identified topics show similar trends during the known time periods of real-world outbreaks. Diversity reflects how the language (i.e., terms and locations) is used differently. Div(entity) correlates highly with topic dynamics for some diseases, i.e., mumps, ebola, botulism and ehec. Div(term) shows correlation with topic dynamics for cholera, anthrax and rubella.
  11. Algorithms: SampleDJ, TrackDJ (claims and proofs in [Deng et al., 2012])
  13. (1) Rely upon abundant user interactions and/or the availability of explicit feedback (e.g., ratings, likes, dislikes). (2) Within M-Eco, we use the tweets from signals in developing techniques to provide a personalized short list of tweets that meets the context of the investigation. In this section, we review one of them, namely Personalized Tweet Ranking for Epidemic Intelligence (PTR4EI) [13, 14], and discuss the evaluation conducted during a major EHEC outbreak in Germany. Here t is a discrete time interval, MCu the set of medical conditions, and Lu the set of locations of user interest. More precisely, we expand the user's context, Cu, using latent topics computed with LDA [5] on: 1) an indexed collection of tweets for epidemic intelligence; and 2) the hash-tags that co-occur with this context.
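
Note 8 names C1 and EWMA among the anomaly detectors applied to disease-related tweet counts. The sketch below illustrates the general idea of both on a daily count series. It is not the M-Eco implementation: the 7-day baseline, the threshold of 3 standard deviations, the smoothing weight `lam`, the `min_sigma` floor and the sample counts are all assumptions chosen for illustration, following how these rules are commonly described.

```python
from statistics import mean, stdev


def c1_alerts(counts, baseline=7, threshold=3.0, min_sigma=1.0):
    """C1-style rule: flag day t when counts[t] exceeds the mean of the
    previous `baseline` days by more than `threshold` standard deviations.
    `min_sigma` (an assumption) avoids division by zero on flat baselines."""
    alerts = []
    for t in range(baseline, len(counts)):
        window = counts[t - baseline:t]
        mu = mean(window)
        sigma = max(stdev(window), min_sigma)
        if (counts[t] - mu) / sigma > threshold:
            alerts.append(t)
    return alerts


def ewma_alerts(counts, baseline=7, lam=0.3, width=3.0, min_sigma=1.0):
    """EWMA control chart: smooth the series with weight `lam` and flag days
    where the smoothed value exceeds an upper limit derived from the mean and
    standard deviation of the first `baseline` days (asymptotic limit)."""
    base = counts[:baseline]
    mu = mean(base)
    sigma = max(stdev(base), min_sigma)
    limit = mu + width * sigma * (lam / (2.0 - lam)) ** 0.5
    z = mu  # start the smoothed statistic at the baseline mean
    alerts = []
    for t in range(baseline, len(counts)):
        z = lam * counts[t] + (1.0 - lam) * z
        if z > limit:
            alerts.append(t)
    return alerts


# Hypothetical daily tweet counts for one disease term; days 8 and 9 spike.
daily_counts = [2, 3, 2, 4, 3, 2, 3, 3, 25, 30, 4]
print("C1-style alerts on days:", c1_alerts(daily_counts))
print("EWMA alerts on days:", ewma_alerts(daily_counts))
```

On this toy series the two rules behave differently after the first spike: the sliding baseline of the C1-style rule absorbs the spike and raises its own threshold, while the EWMA statistic stays elevated for several days. Differences of this kind are one reason to compare several detectors on the same series.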