EpiSPIDER is a web-based, event-based surveillance system that aggregates and visualizes unstructured outbreak reports from online sources. It was created in 2005 using open-source tools to address the need for early detection of emerging infectious diseases. EpiSPIDER extracts key information from reports using natural language processing and geotags events on interactive maps. It faces ongoing challenges in adapting to the changing nature of web data and services and in linking unstructured information to ontologies. Future steps include incorporating standardized ontologies to enable data sharing across surveillance systems and implementing event-based approaches at the national level.
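The extract-and-geotag step described above can be illustrated with a minimal sketch. This is not EpiSPIDER's actual pipeline: simple keyword matching stands in for natural language processing, and the disease list, the gazetteer, its approximate coordinates, and the names Event and extract_event are all hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy gazetteer: place name -> approximate (lat, lon).
# A real system would query a geocoding service or a full gazetteer.
GAZETTEER = {
    "mozambique": (-18.67, 35.53),
    "indonesia": (-0.79, 113.92),
    "egypt": (26.82, 30.80),
}

# Hypothetical list of disease terms to look for in a report.
DISEASES = ("avian influenza", "cholera", "measles")


@dataclass
class Event:
    disease: str
    place: str
    lat: float
    lon: float


def extract_event(report: str) -> Optional[Event]:
    """Return a geotagged event if the report names a known disease and place."""
    text = report.lower()
    # First disease term that appears anywhere in the report text.
    disease = next((d for d in DISEASES if d in text), None)
    # First gazetteer entry that appears as a whole word.
    place = next(
        (p for p in GAZETTEER if re.search(rf"\b{re.escape(p)}\b", text)),
        None,
    )
    if disease is None or place is None:
        return None
    lat, lon = GAZETTEER[place]
    return Event(disease, place.title(), lat, lon)
```

Calling extract_event on a sentence such as "Cholera outbreak reported in Mozambique" yields an Event whose coordinates could be plotted on a map; reports that mention no known disease or place return None and would need manual review.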
Presentation_Kerr - Using Innovations and Partnerships in Digital Technologi... (CORE Group)
Zenysis integrates data from various sources like health management information systems, finance data, and survey data into a single platform to allow for advanced analytics. It has worked with institutions serving 1.7 billion people globally. Zenysis can triangulate different types of data, like comparing health and supply chain data, to remedy inefficiencies and anticipate issues. During humanitarian crises, Zenysis rapidly integrates multiple siloed data sources to facilitate analysis and help direct emergency response efforts. For example, after Cyclone Idai in Mozambique, Zenysis integrated surveillance, vaccination, and other data to help organizations respond to health threats and target their interventions.
Riff: A Social Network and Collaborative Platform for Public Health Disease S... (Taha Kass-Hout, MD, MS)
A hybrid (event-based and indicator-based) platform designed to streamline the collaboration between domain experts and machine learning algorithms for detection, prediction and response to health-related events (such as disease outbreaks or pandemics). The platform helps synthesize health-related event indicators from a wide variety of information sources (structured and unstructured) into a consolidated picture for analysis, maintenance of “community-wide coherence”, and collaboration processes. The platform offers features to detect anomalies, visualize clusters of potential events, predict the rate and spread of a disease outbreak and provide decision makers with tools, methodologies and processes to investigate the event.
This document discusses and compares monitoring and surveillance in veterinary epidemiology. It defines surveillance as a more intensive form of monitoring that involves the gathering, analysis, and dissemination of disease data to support control actions. The key differences provided are that surveillance requires professional analysis and judgment to make recommendations, has formulated standards, and can differentiate between acceptable and unacceptable changes in disease status. Various types of surveillance systems and their uses in disease control planning and evaluation are also outlined.
Surveillance involves the systematic collection, analysis, and use of health data for decision-making. It serves as an early warning system and monitors the impact of interventions. There are different types of surveillance including community-based, hospital-based, and active/passive surveillance. Community-based surveillance engages community members to detect and report health events. Hospital-based surveillance relies on regular reporting from hospitals. Active surveillance actively seeks out cases, while passive surveillance waits for cases to be reported. The appropriate surveillance method depends on the context and challenges.
This document discusses different types of surveillance including electronic, computer, audio, visual, and biometric surveillance. It provides examples of various surveillance methods such as electronic article surveillance, social network analysis, wiretapping, red light cameras, and gait analysis. The document also discusses debates around surveillance powers and technologies used by law enforcement.
1) The document discusses surveillance in public health and describes its key components and purposes. Surveillance involves the systematic collection, analysis, and interpretation of health data to provide information for action.
2) An effective surveillance system is simple, flexible, timely, and produces high-quality data. It addresses an important public health problem and accomplishes its objectives of understanding disease trends, detecting outbreaks, and evaluating control measures.
3) The document outlines how to establish a surveillance system, including selecting priority diseases, defining standard case definitions, and developing regular reporting and data dissemination processes. Both passive and active surveillance methods are described.
A Web Based Tool For the Detection and Analysis of Avian Influenza Outbreaks ... (Ian Turton)
The document describes a web-based tool developed by Ian Turton and Andrew Murdoch to automatically detect and map avian influenza outbreaks reported in internet news sources. The tool collects news feeds, extracts location information from articles using named entity recognition and geocoding, stores the data in a spatial database, and serves interactive maps through a web client. While the tool was able to map some outbreaks reported in the news, it also mapped many irrelevant articles and requires improvements to better filter content and enhance the user experience.
The document discusses integrating air quality and pollution data from different sources using standards-based networking approaches. It describes the DataFed system, which allows non-intrusive integration of diverse data types from local, regional and global sources through web services and reusable components. The summary highlights that DataFed has been applied to EPA policy and science needs but more collaboration is still needed to fully connect heterogeneous data sources and enable new insights.
InSTEDD: Collaboration in Disease Surveillance & Response (InSTEDD)
The document discusses collaboration in disease surveillance and response. It describes InSTEDD's hybrid approach to disease surveillance which combines various data sources to identify health risks. It also discusses tools developed by InSTEDD like GeoChat and Mesh4x that enable real-time information sharing and collaboration between organizations responding to disease outbreaks. The document emphasizes that collaboration is critical for effective outbreak containment and humanitarian response.
A Cloud-Based Prototype Implementation of a Disease Outbreak Notification System (IJCSEA Journal)
This paper describes the design, prototype implementation and performance characteristics of a Disease Outbreak Notification System (DONS). The prototype was implemented in a hybrid cloud environment as an online/real-time system. It detects potential outbreaks of both listed and unknown diseases. It uses data mining techniques to choose the correct algorithm to detect outbreaks of unknown diseases. Our experiments showed that the proposed system has a very high accuracy rate in choosing the correct detection algorithm. To the best of our knowledge, DONS is the first of its kind to detect outbreaks of unknown diseases using data mining techniques.
Big Data Fusion for eHealth and Ambient Assisted Living Cloud Applications (Accelerate Project)
The document describes a cloud-based system for monitoring senior citizens' healthcare using data from heterogeneous sensors. It discusses existing ambient assisted living solutions, challenges with data fusion from multiple sources, and proposed hardware and software components. The system would gather data from various physiological, environmental, and lifestyle sensors. Software components include a big data fusion platform running on a distributed cloud architecture. The goal is to enable applications for monitoring health conditions and daily activities.
Data Synchronization of Epi Info™ Using a Mesh4X Adapter: Presentation at the AMIA 2009 Annual Symposium-Demonstrations: Management of Populations.
Disclaimer: Any views or opinions expressed by the speaker do not necessarily represent the views of the CDC, HHS, or any other entity of the United States government. Furthermore, the use of any product names, trade names, images, or commercial sources is for identification purposes only, and does not imply endorsement or government sanction by the U.S. Department of Health and Human Services.
This document outlines a project to map avian influenza outbreak data from RSS feeds onto an interactive map and timeline using open source tools. The goals are to visualize the geographic and temporal spread of H5N1 outbreaks among humans and birds. While initial tasks were completed, such as displaying historical WHO data, further work is needed to filter data and integrate additional mapping capabilities.
This document summarizes the transition from clinical information systems to health grids and the future of health research infrastructure. It discusses trends like rising populations in Asia, increasing resource scarcity, and the need for multidisciplinary and open collaboration. Health grids are presented as enabling virtual collaborations across institutions. Key areas like medical imaging, computational models, and genomic medicine are highlighted. Adoption challenges and requirements like reliable, usable infrastructure are also summarized.
The document discusses the adoption of semantic web technologies. It notes that while tools and specifications have matured, applications are now coming to the forefront. It provides examples of semantic web deployments in various domains like digital libraries, eGovernment, healthcare and by major companies. Real-world applications include data integration, intelligent portals and knowledge management systems. Overall adoption is increasing but skills and training remain obstacles.
ESIP Federation: Using social networks and social media to connect communitie... (Erin Robinson)
ESIP Federation is a consortium of over 120 organizations that collects, interprets, and develops applications for Earth observation information. ESIP uses social networks and social media to connect communities of practice related to Earth science data. This includes using tools like wikis, social media platforms, and teleconferencing to facilitate collaboration between geographically distributed groups. ESIP provides a neutral platform to leverage members' expertise for innovation and to make Earth data more accessible and usable to various stakeholders.
The document describes the development of the Open Drug Discovery Teams (ODDT) mobile app, which aims to facilitate collaboration in drug discovery. The app aggregates open science data from sources like Twitter on topics related to rare and neglected diseases. It provides a magazine-style interface for browsing recent posts. The app and its backend were developed iteratively, with input from researchers during testing. The app harvests tweets with specific hashtags and allows users to endorse or reject posts. It can visualize chemical structures and tables linked from tweets. The goal is to connect researchers and data to help accelerate open drug discovery.
Global Pulse aims to close the information gap between when a crisis occurs and when actionable data is available to help decision makers. It does this by harnessing innovation to monitor vulnerable populations in real-time using new sources of data like social media, mobile phones, and sensors. Global Pulse develops tools like its open collaboration platform to facilitate data sharing and analysis across organizations and helps establish innovation labs around the world to better understand and respond to community needs during crises.
Invited talk "Open Data as a driver of Society 5.0: how you and your scientif... (Anastasija Nikiforova)
This presentation was prepared for my talk on openness (open data and open science) in the context of Society 5.0 at the International Conference and Expo on Nanotechnology and Nanomaterials. It was very pleasant to receive an invitation to deliver a talk on my recently published article Smarter Open Government Data for Society 5.0: Are Your Open Data Smart Enough? (Sensors 2021, 21(15), 5204), which I titled “Open Data as a driver of Society 5.0: how you and your scientific outputs can contribute to the development of the Super Smart Society and transformation into Smart Living?”. The paper was briefly discussed in my previous post, so here are just a few words on this talk and the overall experience.
Infrastructures Supporting Inter-disciplinary Research - Exemplars from the UK (NeISSProject)
Infrastructures Supporting Inter-disciplinary Research - Exemplars from the UK. Talk given by Richard Sinnott at Urban Research Infrastructure Network Workshops, Melbourne, Brisbane, Sydney, September 2010.
Supporting epidemic intelligence, personalised and public health with advance... (Joao Pita Costa)
Today, our everyday access to technology permits health monitoring that can complement the traditional methods in Healthcare and Public Health. In this paper, we present some of this available technology, with a particular focus on disease detection, topological data analysis, and media monitoring tools, made available by the AILAB at the JSI and the ISI Foundation. This technology is ready to be adapted to research and commercial problems in the context of health systems.
High throughput mining of the scholarly literature; talk at NIH (petermurrayrust)
Elsevier stopped Chris Hartgerink, a statistician, from bulk-downloading research papers from ScienceDirect for content mining aimed at detecting potentially problematic research findings. He had legal access through his university's subscription and intended only to extract facts, not to redistribute full papers. He had downloaded around 30GB of data over 10 days to mine the psychology literature for test results, figures, tables and other information reported in papers. Hartgerink's research aims to investigate unreliable findings that can harm policy and research progress through an innovative content mining method.
Topic:
Effective Visualizations that will aid in minimizing the spread of infectious diseases
Group members:
Lamar Munoz, Michael Brockenbrough, Neisha Sadhnani
1. The document introduces mashups and discusses examples of mashups in different domains like health, libraries, and search.
2. It defines a mashup as a combination of two or more applications that creates a new application. Examples provided include WalkScore, media watch on climate change, and health mashups that combine data from different sources.
3. The document also discusses potential mashups in libraries and search and asks what users would like to mash up to make tasks easier, faster, and better. It concludes by briefly mentioning browser extensions and add-ons.
2003-12-02 Environmental Information Systems for Monitoring, Assessment, and ... (Rudolf Husar)
The document discusses environmental information systems for monitoring, assessment, and decision-making. It covers topics like spatial analysis, web-based information systems, sensor webs, spatial interpolation techniques, integrating satellite and surface monitoring data, and developing interoperable environmental information systems. The goal is to improve access to and use of environmental data for applications like air quality mapping and monitoring networks.
Unidata provides data services, tools, and cyberinfrastructure to advance Earth system science and broaden participation in the geosciences. It was created through a grassroots effort and is funded by the NSF and UCAR. Unidata's work is driven by science, education, technology, and social needs. It provides real-time data from various sources to over 260 sites worldwide and develops standards like netCDF and services like THREDDS to facilitate data sharing and access. Unidata is working to broaden its community through international collaborations and empowering users around the world.
2. Presentation Outline
What is EpiSPIDER?
Why was EpiSPIDER built?
What is event-based surveillance?
How was EpiSPIDER built?
The EpiSPIDER “Information Ecosystem”
Evolution of EpiSPIDER
How has EpiSPIDER been used?
What are the challenges in implementing EpiSPIDER?
Overall challenges in event-based surveillance
Next steps
Summary
3. What is EpiSPIDER?
The acronym stands for Semantic Processing and
Integration of Distributed Electronic Resources
for Epidemics and disasters
Key words
Semantic processing
Integration of distributed electronic resources
• “Mashup”
• Visualization
4. Why was EpiSPIDER built?
2005: Request from ProMED Mail to represent
their emerging infectious disease reports in time
and space and provide RSS feeds to their
members
2006: Growth beyond ProMED Mail and Google
Maps
2009 and beyond: Leveraging linked data to
reduce information overload
5. Why was EpiSPIDER built?
Early response to disease outbreaks is a public health priority
Emerging infectious diseases may not be part of routine public
health reporting in many countries
We can potentially leverage non-traditional sources of data to
provide practitioners with early warning
Specifically, leverage Internet killer applications to collect and
exchange health event information
Extracting and visualizing event information from unstructured
data can be done using computer algorithms such as NLP and text
mining (80% of health information remains locked in free text)
The Role of Information Technology and Surveillance Systems in Bioterrorism Readiness.
Bioterrorism and Health System Preparedness, Issue Brief No. 5. AHRQ Publication No. 05-0072,
March 2005. Agency for Healthcare Research and Quality, Rockville, MD.
http://www.ahrq.gov/news/ulp/btbriefs/btbrief5.htm
6. What is event-based surveillance?
WHO DEFINITION
Definition: The organized and rapid capture of information about events that
are a potential risk to public health
Can be rumors and other ad-hoc reports transmitted through formal channels
(i.e. established routine reporting systems) and informal channels (i.e. media,
health workers and nongovernmental organizations reports), including:
Events related to the occurrence of disease in humans, such as clustered cases of a
disease or syndromes, unusual disease patterns or unexpected deaths as
recognized by health workers and other key informants in the country; and
Events related to potential exposure for humans, such as events related to diseases
and deaths in animals, contaminated food products or water, and environmental
hazards including chemical and radio-nuclear events.
Information received through event-based surveillance should be rapidly
assessed for the risk the event poses to public health and responded to
appropriately
Source: WHO, A guide to establishing event-based surveillance, 2008. URL:
http://www.wpro.who.int/internet/resources.ashx/CSR/Publications/eventbasedsurv.pdf
7. Role of event-based surveillance in
national surveillance system (WHO)
Source: WHO, A guide to establishing event-based surveillance, 2008. URL:
http://www.wpro.who.int/internet/resources.ashx/CSR/Publications/eventbasedsurv.pdf
Indicator-based Surveillance
Routine reporting of cases of disease,
including
•Notifiable disease surveillance system
•Sentinel surveillance
•Laboratory-based surveillance
Commonly
•Health care facility based
•Weekly, monthly reporting
Event-based Surveillance
Rapid detection, reporting,
confirmation, assessment of public
health events including
•Clusters of disease
•Rumors of unexplained deaths
Commonly
•Immediate reporting
Response
Linked to surveillance
National and subnational capacity to respond to alerts
8. Role of event-based surveillance in national surveillance (ECDC)
[Figure: epidemic intelligence framework; layout not recoverable from the transcript]
Labels recovered from the figure:
Surveillance systems: indicator-based component (Data); event-based component, event monitoring (Events)
Process steps: Capture, Filter, Validate; Collect, Analyse, Interpret; Signal; Assess; Public health alert; Investigate; Control measures
Disseminate:
Confidential: EWRS
Restricted access: network inquiries, ECDC threat bulletin
Public: Eurosurveillance, press release, web site
Paquet C, et al. Epidemic intelligence: A new framework for strengthening disease surveillance in Europe. Euro Surveill.
2006;11(12): 212-4. URL: http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=665
9. Major challenges in developing automated
event-based surveillance systems
Can event-based surveillance systems be
automated?
Major challenges:
Describing what information can be extracted from
event reports
Identifying methods to extract desired information
Identifying methods to convert unstructured to
structured data
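The third challenge can be illustrated with a minimal sketch. It assumes a hypothetical headline layout of the form "Disease (serial): Country (Region)"; the pattern, the field names and the sample headline are illustrative and are not taken from EpiSPIDER.

```python
import re
from typing import Optional

# Hypothetical headline layout: "<disease> (<serial>): <country> (<region>)".
# Real report titles vary widely; this pattern is an assumption for illustration.
HEADLINE = re.compile(
    r"^(?P<disease>[^(:]+?)\s*(?:\((?P<serial>\d+)\))?:\s*"
    r"(?P<country>[^(]+?)\s*(?:\((?P<region>[^)]+)\))?\s*$"
)


def parse_headline(title: str) -> Optional[dict]:
    """Turn a free-text headline into a structured record, or None if it does not match."""
    match = HEADLINE.match(title.strip())
    if match is None:
        return None
    # Drop optional fields (serial, region) that were not present in the headline.
    return {key: value for key, value in match.groupdict().items() if value}


record = parse_headline("Avian influenza (12): Indonesia (West Java)")
print(record)
```

Headlines that do not fit the assumed layout return None, which leaves room for a fallback such as an NLP service.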
10. How was EpiSPIDER built?
Began as a fellowship project in 2005 with Dr. Raoul
Kamadjeu
On a “shoestring budget,” utilizing Open-Source
software and freely available web services and data
sources
Linux, Apache, MySQL and PHP (LAMP)
Initially Scalable Vector Graphics then Yahoo Maps and
Google Maps
Existing RSS feeds and unstructured web content
Custom-developed NLP later replaced with
OpenCalais NLP web service
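Although the system itself was built on LAMP, the idea of consuming existing RSS feeds can be sketched in a few lines of Python using only the standard library. The feed content below is an invented inline sample, not a real ProMED Mail feed, and the field names are assumptions.

```python
import xml.etree.ElementTree as ET

# Inline sample so the sketch runs offline; a live system would fetch the feed over HTTP.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example outbreak feed</title>
    <item>
      <title>Cholera: Haiti</title>
      <link>http://example.org/reports/1</link>
      <pubDate>Mon, 01 Nov 2010 12:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Measles: Malawi</title>
      <link>http://example.org/reports/2</link>
    </item>
  </channel>
</rss>"""


def read_items(rss_text: str) -> list[dict]:
    """Return one dict per <item>, tolerating items that lack optional elements."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate"),  # None when the element is absent
        })
    return items


items = read_items(SAMPLE_RSS)
print(items)
```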
11. The Ecosystem
Definition: Any natural unit or entity including living and non-living
parts that interact to produce a stable system through cyclic exchange
of materials [NASA Earth Observatory Glossary].
The concept can be applied to Internet-based applications that function as
information-consuming or information-producing “organisms” that
interact with each other in an interdependent way through exchange of
information.
This information “ecosystem” has:
Producers of data
Transformers of data
Consumers of data
http://earthobservatory.nasa.gov/Glossary/?mode=all
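One way to picture the producer, transformer and consumer roles is as a chain of generators. This is a toy sketch under stated assumptions: the function names, the keyword list and the sample records are all hypothetical and stand in for real feeds and web services.

```python
from typing import Iterable, Iterator


def producer() -> Iterator[dict]:
    """Stand-in for a data producer, such as a feed of raw reports."""
    yield {"text": "Cholera cases reported in Haiti"}
    yield {"text": "Weather update"}


def tag_diseases(records: Iterable[dict]) -> Iterator[dict]:
    """Stand-in transformer: annotate each record using a toy keyword lookup."""
    keywords = ("cholera", "measles", "influenza")  # illustrative list only
    for record in records:
        found = [k for k in keywords if k in record["text"].lower()]
        yield {**record, "diseases": found}


def keep_events(records: Iterable[dict]) -> Iterator[dict]:
    """Second transformer: drop records with no disease annotation."""
    for record in records:
        if record["diseases"]:
            yield record


def consumer(records: Iterable[dict]) -> list[str]:
    """Stand-in consumer: render annotated records as display strings."""
    return [f'{", ".join(r["diseases"])}: {r["text"]}' for r in records]


output = consumer(keep_events(tag_diseases(producer())))
print(output)
```

Each stage only depends on the shape of the records it receives, which mirrors the interdependence through exchange of information described above.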
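12. Graphical depiction of “ecosystem”
[Figure: diagram of the EpiSPIDER information ecosystem; layout not recoverable from the transcript]
Labels recovered from the figure:
Roles: Producers, Transformers, Consumers
Sources and services: EpiSPIDER, Yahoo Pipes, ProMED Mail, UNDP, CIA, WAHID, Google News, Moreover, Reuters, WHO, GDACS, Twitter, OpenCalais, Alchemy, UMLSKS, uClassifier, Geonames, Google Translate, Yahoo Maps, Wikipedia, Exhibit, Google Maps, Dapper, Mobile Provider
Content and interface labels: Unstructured Text, Faceted Browsing
Formats and protocols: RSS, GeoRSS, KML, JSON, RDF, XML, SOAP, REST, SMTP, SMS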
13. EpiSPIDER Web Services
CATEGORIES BY TASK
Task category | Services
Information retrieval | Search engines, RSS feeds, raw HTML sources
Information extraction | Dapper, Yahoo Pipes, Alchemy
Language identification | Alchemy, Twitter, uClassifier
Language translation | Google Translate
Keyword extraction | Alchemy
Named entity recognition | OpenCalais, Alchemy
Text classification | uClassifier
Visualization | SIMILE Exhibit, Google Visualization API, Google Maps
Georeferencing | Google Maps, Yahoo Maps, Geonames, Twitter, OpenCalais, Alchemy
Concept annotation | UMLS Knowledge Source Server
14. Technology Adoption Timeline
2005
Data sources: RSS feeds (2), unstructured content (1)
Visualization tools: Scalable Vector Graphics, JPGraph
Web services: Yahoo Maps, askMEDLINE
Products: RSS feeds, visualizations
2006
Data sources: RSS feeds (4), unstructured content (3), email
Visualization tools: Google Maps, Yahoo Maps, JPGraph
Web services: Yahoo Maps, Google Maps, askMEDLINE, Geonames
Products: RSS feeds, visualizations
2007
Data sources: RSS feeds (8), unstructured content (4), email
Visualization tools: SIMILE Exhibit, AJAX visualization tools
Web services: Yahoo Maps, Google Maps, askMEDLINE, Geonames, Wikipedia
Products: RSS and GeoRSS feeds, KML feeds, SMS, visualizations, custom products
2008
Data sources: RSS feeds (8), unstructured content (4), email (server)
Visualization tools: SIMILE Exhibit, AJAX visualization tools, Google Earth
Web services: Yahoo Maps, Google Maps, Google Visualization API (1), askMEDLINE, Geonames, Wikipedia, UMLSKS, OpenCalais, Yahoo Pipes, Dapper
Products: RSS and GeoRSS feeds, KML feeds, SMS, visualizations, custom products
2009
Data sources: RSS feeds (9), unstructured content (6), Linked Data, email (server), social networks (Twitter)
Visualization tools: SIMILE Exhibit, AJAX visualization tools, Google Earth, Wordle
Web services: Yahoo Maps, Google Maps, Google Translate, Google Visualization API (3), askMEDLINE, Geonames, Wikipedia, UMLSKS, OpenCalais, Yahoo Pipes, Dapper, uClassifier, Alchemy, Twitter, URL services
Products: RSS and GeoRSS feeds, KML feeds, SMS, visualizations, custom products
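KML feeds are listed among the products from 2007 onward. The following sketch shows how a geotagged event might be serialized as a KML Placemark with the Python standard library. The event dictionary, its keys and the approximate coordinates are hypothetical; only the KML 2.2 namespace and the longitude,latitude ordering come from the KML standard.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def to_kml(events: list[dict]) -> str:
    """Serialize events (hypothetical keys: title, lat, lon) as a KML document."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for event in events:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = event["title"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML orders coordinates as longitude,latitude.
        coordinates = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        coordinates.text = f'{event["lon"]},{event["lat"]}'
    return ET.tostring(kml, encoding="unicode")


# Approximate, illustrative coordinates.
kml_text = to_kml([{"title": "Cholera: Haiti", "lat": 18.54, "lon": -72.34}])
print(kml_text)
```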
24. How has EpiSPIDER been used?
Access by type (most to least)
RSS
Exhibit
KML
Access by organization
Government agencies
Academic institutions
Research organizations
Health departments
Access by individuals
25. Challenges in implementing EpiSPIDER
Changing nature of data
Emergent nature of web services
Understanding and developing connections with
complex APIs
Information extraction and data linking
challenges
Service delivery expansion increases resource
demands
26. Changing nature of web data
CHALLENGES IN IMPLEMENTING EPISPIDER
Challenges with underlying HTML structure
Non-standard HTML use prevents effective parsing of
content
Need to map data to shared terminologies and
ontologies and knowledge metadata
For better integration into an information ecosystem,
system needs to let other “organisms” know what
information it needs and what type of information it
produces
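As a rough illustration of coping with non-standard HTML, the sketch below uses Python's event-driven `html.parser`, which does not require well-formed markup. The class name and the messy sample page are invented for this example, and it is not the parsing approach used by EpiSPIDER.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style and tolerating unclosed tags."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Unclosed <p> and <b> tags plus an inline script, as often found in scraped pages.
messy = "<p>Cholera outbreak<br>in <b>Haiti<p>Officials report 12 cases<script>var x=1;</script>"
print(extract_text(messy))
```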
27. Emergent nature of web services
Adapting to changing interfaces
Must go beyond “taping” applications together manually
- need for automated “duct tape” adjustments
Difficult for some interfaces (non-SOAP)
Feed URL changes
Have to subscribe to multiple mailing lists
Changes in data structure of service response
Service may have new data elements
Example: new Twitter geolocation elements
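One defensive pattern for changing response structures is to require only a small core of fields and to carry anything unexpected along rather than failing. The sketch below assumes a JSON response; the field names (`id`, `text`, `location`) are hypothetical and do not reflect any particular service's API.

```python
import json

REQUIRED = ("id", "text")  # hypothetical core fields


def read_status(payload: str) -> dict:
    """Parse a service response, keeping known fields and preserving unknown ones."""
    data = json.loads(payload)
    missing = [key for key in REQUIRED if key not in data]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    known = {
        "id": data["id"],
        "text": data["text"],
        "location": data.get("location"),  # optional: absent in older responses
    }
    # New, unrecognized elements are kept aside instead of breaking the parser.
    extras = {key: value for key, value in data.items() if key not in known}
    return {**known, "extras": extras}


old_style = read_status('{"id": 1, "text": "flu cases rising"}')
new_style = read_status('{"id": 2, "text": "flu", "location": [18.5, -72.3], "lang": "en"}')
print(old_style, new_style)
```

Logging the contents of `extras` would give early notice that a service has added data elements.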
28. Understanding complex APIs
APIs are in continuous development
Complexity increasing
Knowledge base rapidly expanding
Example:
OpenCalais and Alchemy - addition of named entities
and relationships and linked data (Wikipedia,
Freebase) for disambiguation
Promising developments
Number of APIs in different task categories increasing
29. Information extraction and data linking challenges
Named entity recognition and disambiguation
Named entity recognition by web services of emerging
diseases may lag behind and provide non-specific
references
Example: H1N1 may just be tagged as “influenza”
(nonspecific)
Missing piece: UMLS Knowledge Source Server
named-entity extraction and concept annotation
web service
Currently a standalone download: MetaMap Transfer
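Until a named-entity service recognizes an emerging disease by its specific name, a local lookup can supplement its output. This is a minimal sketch: the term table, the labels and the function name are assumptions for illustration, and it does not call OpenCalais, Alchemy or UMLS.

```python
import re

# Illustrative table mapping specific mentions to a more specific label.
SPECIFIC_TERMS = {
    r"\bH1N1\b": "influenza A (H1N1)",
    r"\bH5N1\b": "influenza A (H5N1)",
}


def refine_entities(text: str, service_entities: list[str]) -> list[str]:
    """Append specific labels that a generic tagger did not return."""
    refined = list(service_entities)
    for pattern, label in SPECIFIC_TERMS.items():
        if re.search(pattern, text, flags=re.IGNORECASE) and label not in refined:
            refined.append(label)
    return refined


# "influenza" stands in for the nonspecific tag returned by a web service.
entities = refine_entities("New H1N1 cases confirmed", ["influenza"])
print(entities)
```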
30. Service delivery increases resource demands
Managing contention for scarce computing
resources
How to process huge amounts of information
without crashing the server
Automated responses to certain parameters –
feedback loop
Avoiding process collisions
Alerting mechanisms
How to send alerts when the server is about to crash
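Two of these concerns, avoiding process collisions and deciding when to alert, can be sketched with a lock file and a load threshold. The lock path, the threshold value and the function names are hypothetical; a production system would also need to handle stale locks left behind by a crash.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path: str):
    """Yield True if this run created the lock file, False if another run holds it."""
    try:
        # O_EXCL makes creation fail when the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        os.write(fd, str(os.getpid()).encode())
        yield True
    finally:
        os.close(fd)
        os.remove(lock_path)


def should_alert(load_per_cpu: float, threshold: float = 0.9) -> bool:
    """Decide whether to alert; the 0.9 threshold is an arbitrary example value."""
    return load_per_cpu >= threshold


lock = os.path.join(tempfile.mkdtemp(), "harvest.lock")
with single_instance(lock) as first:
    # A second attempt while the first holds the lock is refused.
    with single_instance(lock) as second:
        outcome = (first, second)
print(outcome)
```

On Unix-like systems the load figure could come from `os.getloadavg()` divided by `os.cpu_count()`.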
31. Overall challenges in event-based surveillance for
public health threats
Increasing dependence on and need for development of
semantic tools to:
Identify emerging outbreaks
Assign outbreak severity
Track escalation/decline, social disruption and government
response over time
Promoting semantic data sharing among similar systems
Shared terminologies
Ontologies
Knowledge metadata
Chute C. Biosurveillance, Classification, and Semantic Health Technologies (editorial), J Am Med Inform Assoc.
2008;15:172–173.
32. Advantages of web services
Main advantages
Outsource complex tasks to agents who can devote
resources and economies of scale to deliver high
quality, reliable service and outputs
Promote use of standards for information exchange
Other advantages
Develop and reuse standard tools for processing
unstructured information
33. What could be next steps?
Critical
Incorporating an ontology for event-based surveillance and mapping the knowledge base to it,
to enable sharing of data across event-based surveillance systems
Implementing event-based surveillance systems at national level to enable
targeted, distributed collection of event-based data
Exposing underlying database as Resource Description Framework (RDF) or other
standards-based data
Collaboration across event-based surveillance systems to enable system-to-system
interoperability
Non-critical
Continue to explore new data sources
Annotated view of news articles
Providing citizen reporting and participatory information processing interfaces to
end-users
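To make the RDF step concrete, the sketch below writes one event record as N-Triples using plain string formatting. The `example.org` namespace, the vocabulary terms and the record fields are placeholders; choosing a shared ontology is exactly the open step listed above.

```python
BASE = "http://example.org/event/"    # hypothetical namespace for event URIs
VOCAB = "http://example.org/vocab#"   # hypothetical vocabulary
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value: str) -> str:
    """Quote a string as an N-Triples literal, escaping backslash, quote and newline."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def event_to_ntriples(event: dict) -> str:
    """Serialize one event (hypothetical keys: id, disease, country) as N-Triples."""
    subject = f"<{BASE}{event['id']}>"
    lines = [f"{subject} <{RDF_TYPE}> <{VOCAB}Event> ."]
    for key in ("disease", "country"):
        lines.append(f"{subject} <{VOCAB}{key}> {literal(event[key])} .")
    return "\n".join(lines)


triples = event_to_ntriples({"id": 42, "disease": "cholera", "country": "Haiti"})
print(triples)
```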
34. Summary
Inflection point in evolution of web services just
“around the corner”
Challenges remain in:
Automation and integration of web services in event-
based surveillance systems
Integrating event-based surveillance in national
surveillance systems (local public health context)
Enabling sharing of data across event-based
surveillance systems
35. Acknowledgements
NCIRD: Raoul Kamadjeu
NLM: Paul Fontelo, Fang Liu, Olivier Bodenreider
ProMED Mail: Larry Madoff, Marjorie Pollack,
Alison Bodenheimer, Drew Tenenholz
The findings and conclusions in this report are those of the author(s) and
do not necessarily represent the official position of the Centers for
Disease Control and Prevention