Executive Summary: Natural Language Processing of Twitter #swineflu (H1N1) Posts using Semantic MEDLINE Prototype

Natural Language Processing of Twitter #swineflu Posts using the Semantic MEDLINE Prototype at the National Library of Medicine, National Institutes of Health, U.S. Dept. of Health and Human Services

Usage Rights

CC Attribution License

  • Influenza A(H1N1)
    Executive Summary: Natural Language Processing of Twitter #swineflu Posts using the Semantic MEDLINE Prototype
    Dr. Alla Keselman, Dr. Thomas Rindflesch, David Hale
    National Library of Medicine, National Institutes of Health, Department of Health and Human Services
    May 2009
  • CDCemergency page on Twitter showing posts during the initial H1N1 outbreak, April 2009 (http://twitter.com/CDCemergency)
  • H1N1 information via Twitter: Communication issues
      • Information receivers
          – Information overload
              • >12,000 #swineflu (H1N1) posts/hour at peak
          – Signal:Noise ratio
              • Quality?
              • Authority?
                  – Twitter accounts impersonating CDC
      • Information providers
          – Effective information provision
          – Biosurveillance
  • (un)Controlled Vocabulary
      • Folksonomy
      • Hashtags (#)
      • Grammar
      • Abbreviations
          – SRSLY IMO ROI 4 RT? YMMV
      • High context
  • Examples of #swineflu Tweets
  • Acquisition Challenges
      • Twitter timeline
          – Storage requirements
          – Privacy
      • Twitter API
          – Limited search functionality
              • Temporal and range limitations
                  – Range definition limited to midnight
                  – 1500 posts from limit
  • Semantic MEDLINE Prototype
      • Summarizes MEDLINE citations returned by PubMed search
      • Natural Language Processing (MetaMap, SemRep) used to analyze salient content in titles and abstracts
      • Information presented in graph that has links to the MEDLINE text processed
      • Visualize relationships, such as:
          – A is a process of B
          – X treats Y
  • Semantic MEDLINE Prototype Search page (http://skr3.nlm.nih.gov/SemMedDemo/). Breast Cancer is highlighted in a list of available MEDLINE searches.
  • Semantic MEDLINE Prototype Search page (http://skr3.nlm.nih.gov/SemMedDemo/). Summarize page for term: breast cancer. “Malignant neoplasm of breast” is highlighted in a list of topics on which to summarize.
  • Semantic MEDLINE Prototype Search page (http://skr3.nlm.nih.gov/SemMedDemo/). Semantic MEDLINE Visualization. The term “Malignant neoplasm of breast” is in the center of the page. Dark blue arrows point into the center showing that terms such as “Endocrine therapy” and “Operative Surgical Procedures” TREATS the center term. Medications such as “trastuzumab” and “Tamoxifen” point to the center term. A brown arrow points from the center to the term “human” showing that the center term is a PROCESS OF “human.”
  • Semantic processing of #swineflu Tweets
      • Sample: 1267 Tweets
          – Afternoon of April 27, 2009
      • No adjustments made to NLP software (MetaMap, SemRep)
          – No additional vocabulary, abbreviations, etc.
  • Preliminary Processing of #swineflu Tweets. Sample page from SemRep report, showing SemRep processing, Concepts, and Filter concepts.
  • Preliminary Processing of #swineflu Tweets. Sample page from SemRep report, showing Filter concepts by semantic type, Predications, and PROCESS_OF terms.
  • Concepts in Tweets Isolated by Semantic Processing
      • Disease: influenza
      • Disease symptom: coughing
      • Geographic area: Mexico
      • Animal: family suidae
      • Health care organization: Centers for Disease Control and Prevention (U.S.)
      • Medical device: mask
  • Next Steps
      • Processing of larger dataset
          – Include non-H1N1-related Tweets
      • Additional vocabulary
          – Folksonomy, abbreviations, etc.
      • Visualization of semantic processing results
  • Opportunities
      • Biosurveillance
      • Monitoring of wide-spread sentiment
      • Targeted information provision
          – Respond to misinformation trends
      • Evaluation of accuracy/authenticity
  • Links
      • Semantic MEDLINE Prototype
          – http://skr3.nlm.nih.gov/SemMedDemo/
      • Semantic Medline: Multi-Document Summarization and Visualization
          – http://www.nlm.nih.gov/pubs/techbull/mj07/theater_ppt/semantic.ppt
      • National Library of Medicine
          – http://www.nlm.nih.gov
      • National Institutes of Health
          – http://nih.gov
      • Department of Health and Human Services
          – http://hhs.gov
  • Contacts
      • Dr. Alla Keselman: keselmana AT mail DOT nlm DOT nih DOT gov
      • Dr. Thomas Rindflesch: trindflesch AT mail DOT nih DOT gov
      • David Hale: david DOT hale AT nih DOT gov
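
The relationships described on the Semantic MEDLINE slides above ("X treats Y", "A is a process of B") can be thought of as subject-predicate-object triples. The following is a minimal sketch of that idea and not the SemRep output format: the `Predication` type and the `incoming` helper are hypothetical names, and the triples are transcribed from the breast cancer visualization described above.

```python
from collections import defaultdict
from typing import NamedTuple


class Predication(NamedTuple):
    """Hypothetical container for one subject-predicate-object relation."""
    subject: str
    predicate: str
    obj: str


# Relations transcribed from the visualization slide described above.
PREDICATIONS = [
    Predication("Endocrine therapy", "TREATS", "Malignant neoplasm of breast"),
    Predication("Operative Surgical Procedures", "TREATS", "Malignant neoplasm of breast"),
    Predication("Malignant neoplasm of breast", "PROCESS_OF", "human"),
]


def incoming(predications, target):
    """Group the subjects that point at `target`, keyed by predicate."""
    grouped = defaultdict(list)
    for p in predications:
        if p.obj == target:
            grouped[p.predicate].append(p.subject)
    return dict(grouped)


print(incoming(PREDICATIONS, "Malignant neoplasm of breast"))
```

Grouping by predicate mirrors the way the visualization draws differently colored arrows into the center term; a graph library could replace the dictionary if layout or traversal were needed.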

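To illustrate why the limits on the "Acquisition Challenges" slide matter, the arithmetic below combines the 1500-post figure with the peak rate of more than 12,000 #swineflu posts per hour quoted on the communication issues slide. It is a rough sketch: it reads "1500 posts from limit" as a cap of 1500 posts per search (an interpretation, not something the slides spell out), assumes a constant posting rate, and uses a hypothetical function name.

```python
def minutes_covered(post_limit: int, posts_per_hour: float) -> float:
    """Minutes of traffic that fit under `post_limit` at a constant rate."""
    return post_limit / posts_per_hour * 60


# Figures from the slides: 1500-post limit, >12,000 posts/hour at peak.
print(minutes_covered(1500, 12_000))
```

Under these assumptions a single 1500-post retrieval spans about 7.5 minutes of peak traffic, and less whenever the rate exceeds 12,000 posts per hour.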
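
The "Next Steps" slide proposes additional vocabulary for folksonomy and abbreviations. One simple way to approximate that, shown here as a sketch rather than the authors' method, is to expand known abbreviations and strip the `#` from hashtags before text reaches an NLP tool. The lookup table and function name are hypothetical, and the expansions are common readings of abbreviations listed on the vocabulary slide, not definitions taken from the slides.

```python
import re

# Hypothetical lookup table; expansions are common readings, not from the slides.
ABBREVIATIONS = {
    "SRSLY": "seriously",
    "IMO": "in my opinion",
    "RT": "retweet",
    "YMMV": "your mileage may vary",
}


def normalize_tweet(text: str) -> str:
    """Expand known abbreviations and turn '#tag' into 'tag'."""
    # Drop the leading '#' so the hashtag is treated as an ordinary word.
    text = re.sub(r"#(\w+)", r"\1", text)
    # Replace whole whitespace-separated tokens that match the table.
    words = [ABBREVIATIONS.get(w.upper(), w) for w in text.split()]
    return " ".join(words)


print(normalize_tweet("SRSLY worried about #swineflu IMO"))
```

This token-level lookup ignores punctuation attached to a word (for example "IMO,") and cannot resolve context-dependent shorthand, which is where the "high context" problem noted on the vocabulary slide would still apply.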