The National Institute of Environmental Health Sciences (NIEHS) supported a Children's Health Exposure Analysis Repository (CHEAR) program that needed to integrate data across exposure science and health. We led the data science effort of this program and designed the CHEAR ontology to support data integration and to leverage a wide range of existing ontologies and vocabularies. We are refactoring the ontology to support human health (instead of just child health) and broadening it to support a wide range of environmental health sciences applications.
Big Data: Big Opportunities or Big Trouble? Shea Swauger
Big data is changing how research is being conducted and allowing new kinds of questions to be asked. Meanwhile, data management has enabled a rapid increase in the dissemination and preservation of research products, and many funding agencies, like the National Science Foundation and the National Institutes of Health, now require data management plans in their grant applications. The combination of big data applications and data management processes has created new opportunities and pitfalls for researchers. In the past year, prominent scientists, including the Director of the NIH, have suggested that inappropriate methodology for data acquisition, analysis and storage has led to a gap in the translation of basic research findings to clinical cures. In this session we will track data through all research stages and describe best practices and university resources available to faculty grappling with these important issues.
Presentation about OHSL's new initiative, Mycroft Cognitive Assistant®, which is intended to streamline the operational aspects of research using IBM Watson cognitive computing capabilities.
Alzheimer's disease (AD) is recognized as a public health crisis worldwide (IADRP, 2013). AD is a complex neurodegenerative disease and the leading cause of dementia among elderly people (Evans et al., 1989). Currently, there are approximately five million AD cases in the United States and about 35 million cases worldwide (Alzheimer's Disease International, 2009).
The focus of this case study is on the Uniform Data Set (UDS), a longitudinal database on Alzheimer’s patients, and the 29 Alzheimer’s Disease Centers (ADC) that submit their data to the UDS and actively collaborate in the ongoing maintenance, development and research utilization of the database. The ADCs are based in major medical institutions across the United States. They have a multi-decade track record of collaborative research and a networked and virtual approach to the scientific study of AD. The central coordinating mechanism for the ADCs and the UDS is the National Alzheimer’s Coordinating Center (NACC), which is located at the University of Washington. The NACC coordinates data collection and supports collaborative research among the ADCs.
What are today's challenges of big medical data, and how can we turn this immense volume of data into potential, e.g. for precision medicine? Get insights into application examples that incorporate big medical data and into how in-memory database technology can enable its instantaneous analysis.
Health Datapalooza IV: June 3rd-4th, 2013
Health Data Consortium Affiliates Apps Demos
Moderator: Sunnie Southern, Founder and Chief Executive Officer, Viable Synergy, LLC; Ohio Health Data Affiliate
MEDgle’s graph-based big health analytics engine and platform provides real-time diagnostic, predictive, and prescriptive analytics for individuals and populations.
Presenter: Ash Damle
Attivio Customer Success Story - Durkheim Project Search & Discovery Attivio
The statistics are alarming: suicide rates among U.S. veterans are almost double those of the general U.S. adult population. Reducing the incidence of suicide among U.S. veterans has proven to be a complex and challenging battle; no initiative or program to date has worked to reverse this trend. Fortunately, there is a new ally in veterans’ suicide prevention: predictive analytics technology.
Attivio Customer Success Story - Durkheim Project Attivio
Attivio plays a key role in powering Patterns and Predictions’ real-time predictive analytics solution to identify mental health risk factors, including suicide risk, among U.S. veterans.
Augmenting Healthcare by Supporting General Practitioners and Disclosing Hea... Robin De Croon
Slides used during my public PhD defence at KU Leuven on June 23, 2017.
This PhD explores, designs, develops and evaluates a suite of information visualization tools for understanding, exploring, explaining and disclosing health information. This toolset is aimed at both general practitioners and patients and is driven by three underlying research goals: augmenting traditional practitioners’ workflows, boosting patient empowerment, and investigating novel opportunities in devices for supporting communication and collaboration between practitioners and patients.
Cemal H. Guvercin MedicReS 5th World Congress MedicReS
Ethical Issues in Artificial Intelligence Applied to Medicine. Presentation to MedicReS 5th World Congress on October 19,25,2015 in New York by Cemal H. Guvercin, MD, PhD
Augmented Personalized Health: using AI techniques on semantically integrated... Amit Sheth
Keynote @ 2018 AAAI Joint Workshop on Health Intelligence (W3PHIAI 2018), 2 February 2018, New Orleans, LA [Video: https://youtu.be/GujvoWRa0O8]
Related article: https://ieeexplore.ieee.org/document/8355891/
Abstract
Healthcare as we know it is in the process of going through a massive change - from episodic to continuous, from disease-focused to wellness and quality of life focused, from clinic centric to anywhere a patient is, from clinician controlled to patient empowered, and from being driven by limited data to 360-degree, multimodal personal-public-population physical-cyber-social big data-driven. While the ability to create and capture data is already here, the upcoming innovations will be in converting this big data into smart data through contextual and personalized processing such that patients and clinicians can make better decisions and take timely actions for augmented personalized health. In this talk, we will discuss how use of AI techniques on semantically integrated patient-generated health data (PGHD), environmental data, clinical data, and public social data is exploited to achieve a range of augmented health management strategies that include self-monitoring, self-appraisal, self-management, intervention, and Disease Progression Tracking and Prediction. We will review examples and outcomes from a number of applications, some involving patient evaluations, including asthma in children, bariatric surgery/obesity, mental health/depression, that are part of the Kno.e.sis kHealth personalized digital health initiative.
Background: http://bit.ly/k-APH, http://bit.ly/kAsthma, http://j.mp/PARCtalk
Wide adoption of smartphones and availability of low-cost sensors has resulted in seamless and continuous monitoring of physiology, environment, and public health notifications. However, personalized digital health and patient empowerment can become a reality only if the complex multisensory and multimodal data is processed within the patient context. Contextual processing of patient data along with personalized medical knowledge can lead to actionable information for better and timely decisions. We present a system called kHealth capable of aggregating multisensory and multimodal data from sensors (passive sensing) and answers to questionnaire (active sensing) from patients with asthma. We present our preliminary data analysis comprising data collected from real patients highlighting the challenges in deploying such an application. The results show strong promise to derive actionable information using a combination of physiological indicators from active and passive sensors that can help doctors determine more precisely the cause, severity, and control level of asthma. Information synthesized from kHealth can be used to alert patients and caregivers for seeking timely clinical assistance to better manage asthma and improve their quality of life.
Paper: http://www.knoesis.org/library/resource.php?id=2153
Citation:
Pramod Anantharam, Tanvi Banerjee, Amit Sheth, Krishnaprasad Thirunarayan, Surendra Marupudi, Vaikunth Sridharan, Shalini G. Forbis, Knowledge-driven Personalized Contextual mHealth Service for Asthma Management in Children, IEEE 4th International Conference on Mobile Services, June 27 - July 2, 2015, New York, USA.
Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma... Asma Ben Abacha
"Multimodal Question Answering in the Medical Domain". Invited talk at the Language Technologies Institute (LTI), Carnegie Mellon University (CMU).
Dr. Asma Ben Abacha.
April 24, 2020.
IBM Watson Health: How cognitive technologies have begun transforming clinica... Maged N. Kamel Boulos
Cite as: Kamel Boulos MN. IBM Watson Health: how cognitive technologies have begun transforming clinical medicine and healthcare (Oral session IV – Patient safety tools, Thursday 19 May 2016, 15:45-16:45, Hotel Puijonsarvi, Kuopio). In: Proceedings of the 4th Nordic Conference on Research in Patient Safety and Quality in Healthcare (NSQH2016), Kuopio, Finland, 18-20 May 2016 (organised by University of Eastern Finland), p.29. URL: http://www.uef.fi/NSQH2016 (In: Nykanen I (ed.). The 4th Nordic Conference on Research in Patient Safety and Quality in Healthcare. Kuopio, Finland, May 18-20, 2016. Program and Abstracts. Publications of the University of Eastern Finland. Report and Studies in Health Sciences 21. 2016, p.29 (of 119 p.). ISBN: 978-952-61-2130-7 (nid.), ISSNL: 1798-5722, ISSN: 1798-5730.)
IBM Watson health: how cognitive technologies have begun transforming clinical medicine and healthcare
Maged N Kamel Boulos
ABSTRACT
Background: IBM Watson Health (http://www.ibm.com/smarterplanet/us/en/ibmwatson/health/) belongs to a new generation of smart cognitive computing technologies (a type of artificial intelligence) that are poised to transform the way healthcare is delivered, and to vastly improve clinical outcomes, quality of care and patient safety.
Objectives: Our goal was to collect and document the huge potential of a range of emerging and exemplary uses of IBM Watson in healthcare in both developed and developing country settings.
Methods: A survey of current peer reviewed and grey literature has been conducted, looking for reports and case studies involving the use of IBM Watson in different health and healthcare applications.
Results, conclusions and clinical implications: With its ability to make sense of unstructured medical information by analysing the meaning and context of natural language, and uncovering important knowledge buried within large volumes of data and information, including medical images, IBM Watson is exceptionally well suited for clinical and healthcare decision support, where there are often elements of ambiguity and uncertainty. It has been (or is currently being) successfully deployed in many developed countries in the West, as well as in developing countries, such as India and South Africa. IBM Watson unlocks a complex case by acquiring information from multiple sources, e.g., accessing the electronic patient record, then parsing all related medical evidence at up to 60 million pages per second. After processing all of this information, Watson offers relevant and prioritised suggestions to the decision-maker, e.g., helping clinicians identify the best diagnosis and treatment options in complex oncology cases, and providing hospital managers with new operational insights. The ultimate goals are to reduce cost, medical errors, mortality rates, and help improve patients' quality of life.
Medical Question Answering: Dealing with the complexity and specificity of co... Asma Ben Abacha
"Medical Question Answering: Dealing with the complexity and specificity of consumer health questions and visual questions". Invited talk at the Allen Institute for AI (AI2), Seattle, Washington.
Dr. Asma Ben Abacha.
November 12, 2019.
Insights from the Organization of International Challenges on Artificial Inte... Asma Ben Abacha
"Insights from the Organization of International Challenges on Artificial Intelligence in Medical Question Answering". Invited talk at the SciNLP (Natural Language Processing and Data Mining for Scientific Text) Workshop.
Dr. Asma Ben Abacha.
June 24, 2020.
Big data and health sciences: Machine learning in chronic illness by Huiyu Deng Data Con LA
Abstract: Big data has become a hot topic in recent years. It promotes understanding of how data can be exploited and guides decision-making in many sectors. The health science field is also being shaped by big data applications. Our study group from the Department of Preventive Medicine of the Keck School of Medicine of the University of Southern California aims to build a big data architecture that combines and analyzes data about people from different sources and provides health-related assessments back to them. Specifically, ecological momentary assessments (EMAs), electronic medical records (EMRs), and real-time air quality monitor data of children with a pre-existing asthma diagnosis are collected and fed into the machine learning models. An asthma exacerbation alert is generated and delivered to the children before the exacerbation happens. The machine learning model was built and tested in a similar study. The study population consists of children from a cohort of the prospective, population-based Children's Health Study followed from 2003-2012 in 13 Southern California communities. Potential risk factors were grouped into five broad categories: sociodemographic factors, indoor/home exposures, traffic/air pollution exposures, symptoms/medication use, and asthma/allergy status. The outcome of interest, assessed via annual questionnaire, was the presence of bronchitic symptoms over the prior 12 months. A gradient boosting model (GBM) was trained on data consisting of one observation per participant in a random study year, for a randomly selected half of the study participants. The model was validated using hold-out test data obtained in two complementary approaches: (within-participant) a random (later) year in the same participants and (across-participant) a random year in participants not included in the training data.
The predictive ability of risk factor groupings was evaluated using the area under the receiver operating characteristic curve (AUC) and accuracy. The predictive ability of individual risk factors was evaluated using the relative variable importance. The predictor-outcome relationship was visualized using partial dependency plots. Interaction effects were identified using the H-statistic. The gradient boosting model offers a novel approach to better understand predictive factors for chronic upper respiratory illness such as bronchitic symptoms.
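The validation design and AUC metric described in this abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the study's code: the function names (`split_participants`, `pick_years`, `auc`) are hypothetical, the gradient boosting model itself is omitted, and the choice to draw the training year from all but the last available year (so that a later hold-out year exists) is an assumption made here for the example.

```python
import random


def split_participants(participant_ids, seed=0):
    """Randomly assign half of the participants to training.

    The remaining half serves as the across-participant hold-out set.
    """
    ids = sorted(participant_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    half = len(ids) // 2
    return set(ids[:half]), set(ids[half:])


def pick_years(years, rng):
    """Pick one training year and one later hold-out year for a participant.

    Assumption for this sketch: the training year is drawn from all but the
    last available year, so a later within-participant year always exists
    when the participant has more than one year of data.
    """
    years = sorted(years)
    if len(years) < 2:
        return years[0], None  # no later year available for hold-out
    train_year = rng.choice(years[:-1])
    later_years = [y for y in years if y > train_year]
    return train_year, rng.choice(later_years)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example receives
    a higher score than a randomly chosen negative one (ties count as half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In use, a model would be fit on the training participants' selected years, and `auc` would then be computed separately on the within-participant and across-participant hold-out sets; the pairwise form is quadratic in the number of examples, which is acceptable for a sketch but not for large cohorts.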
Univ of Miami CTSI: Citizen science seminar; Oct 2014 Richard Bookman
The University of Miami's Clinical & Translational Science Institute runs a seminar course for MS students.
This talk surveys 8 citizen science projects, reviews NIH's current activities, and identifies issues for attention, particularly with ethical, legal and social implications.
Research Data Management: Part 1, Principles & Responsibilities AmyLN
This two-part course is a collaboration between CU Libraries/Information Services and the Office of Research Compliance & Training. The purpose of this course is to familiarize you with the various aspects of research data management (RDM).
Part 1: Why RDM is both recommended and required
What research data are
Who is responsible for RDM
Part 2:
When RDM activities occur
How you can carry out RDM activities
Slide presentation from a webinar that preceded the annual Health Datapalooza in Washington, DC. PCORI was pleased to participate in the latest installment in the Health Data Consortium and PricewaterhouseCoopers (PwC) Innovators in Health Data Series, a webinar featuring PCORI Executive Director Joe Selby, MD, MPH; NIH Director and PCORI Board of Governors member Francis Collins, MD, PhD; and Philip Bourne, PhD, NIH’s Associate Director for Data Science.
From Research to Practice - New Models for Data-sharing and Collaboration to ... Health Data Consortium
Watch the webinar here: http://encore.meetingbridge.com/MB005418/140528/
Webinar transcript: http://hdc.membershipsoftware.org/Files/webinars/HDC-PwC%20NIH%20&%20PCORI%20Webinar%20Transcript%205_28_14.pdf
Patient-Centered Outcomes Research Institute (PCORI) Executive Director Joe Selby, MD, MPH; National Institutes of Health (NIH) Director and PCORI Board of Governors member Francis Collins, MD, PhD; and NIH Associate Director for Data Science Philip Bourne, PhD discussed new and emerging trends in big data for health, including:
- How researchers, patients, clinicians, and others are forging new models for data-sharing.
- Leveraging the quantity, variety, and analytic potential of health-related data for research and practice.
- Addressing patients’ perspectives, needs, and concerns in creating new opportunities for innovation and translational science.
- Exciting initiatives such as PCORnet, the National Patient-Centered Clinical Research Network initiative that PCORI is now helping to develop, and related open data and technology efforts such as the NIH Health Systems Collaboratory and Big Data to Knowledge (BD2K) initiative.
Discover more health data resources on our website at http://www.healthdataconsortium.org/
Early Childhood Development: Science, Practice, and Research (CORE Group)
Fall Global Health Practitioner Conference 2017
Joy Noel Baumgartner, Leslie Chingang, John Hembling, Maureen Kapiyo, Alfonso Rosales, Elena McEwan
Funding agencies are instituting requirements for data management and sharing as a condition of receiving research funds. This presentation addresses why researchers should care about research data management, what libraries have to do with it, and a case study of what one research specialist at the University of Colorado Anschutz Medical Campus is doing in this area.
Expert Panel on Data Challenges in Translational Research (Eagle Genomics)
A panel of experts, including Alexandre Passioukov, VP Translational Medicine at Pierre Fabre; Xose Fernandez, Chief Data Officer at Institut Curie; and Abel Ureta-Vidal, CEO at Eagle Genomics, share their first-hand experience of enabling translational research in pharmaceutical and biomedical organisations, and discuss the challenges around the establishment of streamlined, seamless data handling and governance to accelerate innovation.
Modern Genomics in conjunction with Patients IP on the value chain (John Peter Mary Wubbe)
Perhaps it is now time to move past the classical dichotomy of privacy or data utility and to seize the possibilities of emerging health technologies, processes, and projects. Far from being harmful, patient-controlled health records, in collaboration with emerging technologies that work within a defined, overseen, and legislated regulatory framework, will facilitate innumerable benefits.
There are a number of forthcoming statistical solutions that would permit assembly of data sets at a patient level with limited risk to privacy and also delineate the cumbersome contentions of data ownership and access.
Digital Access to the World's Literature: A Blueprint to Integrate Evidence w... (Elaine Martin)
Lamar Soutter Library Director Elaine Martin and Consultant Karen Dahlen introduce a digital public health library initiative that supports national and state public health departments. Success stories and next steps to build a sustainable digital library model for all public health departments are covered.
The Learning Health System: Thinking and Acting Across Scales (Philip Payne)
A Learning Health System (LHS) can be defined as an environment in which knowledge generation processes are embedded into daily clinical practice in order to continually improve the quality, safety, and outcomes of healthcare delivery. While still largely an aspirational goal, the promise of the LHS is a future in which every patient encounter is an opportunity to learn and improve that patient’s care, as well as the care their family and broader community receives. The foundation for building such an LHS can and should be the Electronic Health Record (EHR), which provides the basis for the comprehensive instrumentation and measurement of clinical phenotypes, as well as a means of delivering new evidence at the patient- and population levels. In this presentation, we will explore the ways in which such EHR-derived phenotypes can be combined with complementary data across a spectrum from biomolecules to population level trends, to both generate insights and deliver such knowledge in the right time, place, and format, ultimately improving clinical outcomes and value.
Keynote presentation for the International Semantic Web Conference in Athens, Greece, on November 9, 2023. The talk addresses the generative AI explosion and its potential impacts on the Semantic Web and Knowledge Graph communities, and suggests that it may, in fact, spark a research renaissance.
Abstract:
We are living in an age of rapidly advancing technology. History may view this period as one in which generative artificial intelligence is seen as reshaping the landscape and narrative of many technology-based fields of research and application. Times of disruption often present both opportunities and challenges. We will discuss some areas that may be ripe for consideration in the field of Semantic Web research and semantically-enabled applications. Semantic Web research has historically focused on representation and reasoning and enabling interoperability of data and vocabularies. At the core are ontologies along with ontology-enabled (or ontology-compatible) knowledge stores such as knowledge graphs. Ontologies are often manually constructed using a process that (1) identifies existing best practice ontologies (and vocabularies) and (2) generates a plan for how to leverage these ontologies by aligning and augmenting them as needed to address requirements. While semi-automated techniques may help, there is typically a significant portion of the work that is often best done by humans with domain and ontology expertise. This is an opportune time to rethink how the field generates, evolves, maintains, and evaluates ontologies. We consider how hybrid approaches, i.e., those that leverage generative AI components along with more traditional knowledge representation and reasoning approaches, can create improved processes. The effort to build a robust ontology that meets a use case can be large. Ontologies are not static, however, and they need to evolve along with knowledge evolution and expanded usage. There is potential for hybrid approaches to help identify gaps in ontologies and/or refine content. Further, ontologies need to be documented with term definitions and their provenance. Opportunities exist to consider semi-automated techniques for some types of documentation, provenance, and decision rationale capture for annotating ontologies.
The area of human-AI collaboration for population and verification presents a wide range of areas of research collaboration and impact. Ontologies need to be populated with class and relationship content. Knowledge graphs and other knowledge stores need to be populated with instance data in order to be used for question answering and reasoning. Population of large knowledge graphs can be time consuming. Generative AI holds the promise to create candidate knowledge graphs that are compatible with the ontology schema. The knowledge graph should contain provenance information identifying how the content was populated and its source, and its correctness and currency should be checked. A human-AI assistant approach is presented.
Keynote presentation for Mobilizing Computable Biomedical Knowledge Conference 2021. Looking in particular at emerging trends of cognitive assistants, personal health knowledge graphs, and meta descriptions for knowledge resources. Examples taken from RPI-IBM project on Health Empowerment by Analysis, Learning, and Semantics and NIEHS project with RPI-MSSM-Columbia on Human Health Exposure Analysis Repository Data Center.
Opening Keynote for Taxonomy Bootcamp. Co-located with Knowledge Management World 2018.
Abstract: Taxonomies and ontologies are seeing a resurgence of interest and usage as Big Data proliferates, machine learning advances, and integration of data becomes more paramount. The previous models of labor-intensive, centralized vocabulary construction and maintenance do not mesh well in today’s interdisciplinary world. Learn about how information professionals can play a starring role in this new world. McGuinness gives a real-world view of building and maintaining large collaborative, interdisciplinary vocabularies along with the data repositories and services they empower, such as the National Institute of Environmental Health Sciences’ Children’s Health Exposure Analysis Resource.
http://www.taxonomybootcamp.com/2018/Schedule.aspx
Ontologies are seeing a resurgence of interest and usage as big data proliferates, machine learning advances, and integration of data becomes more paramount. The previous models of sometimes labor-intensive, centralized ontology construction and maintenance do not mesh well in today’s interdisciplinary world that is in the midst of a big data, information extraction, and machine learning explosion. In this talk, we will discuss a model of building and maintaining large collaborative, interdisciplinary ontologies along with the data repositories and data services that they empower. We will also introduce the National Institute of Environmental Health Sciences’ Children’s Health Exposure Analysis Resource and describe how we used our methodology to assemble the broad interdisciplinary ontology that covers exposure science and health and integrates with numerous long-standing, well-used ontologies. We will also describe how this ontology powers an integrated data resource and provide some examples of how it can be used and re-used for interdisciplinary work. If time permits, we will also describe how the methodology and the integrated ontology have been and are being used in other interdisciplinary health and wellness settings.
Ontologies For the Modern Age - McGuinness' Keynote at ISWC 2017 (Deborah McGuinness)
Ontologies are seeing a resurgence of interest and usage as big data proliferates, machine learning advances, and integration of data becomes more paramount. The previous models of sometimes labor-intensive, centralized ontology construction and maintenance do not mesh well in today’s interdisciplinary world that is in the midst of a big data, information extraction, and machine learning explosion. In this talk, we will provide some historical perspective on ontologies and their usage, and discuss a model of building and maintaining large collaborative, interdisciplinary ontologies along with the data repositories and data services that they empower. We will give a few examples of heterogeneous semantic data resources made more interconnected and more powerful by ontology-supported infrastructures, discuss a vision for ontology-enabled future research and provide some examples in a large health empowerment joint effort between RPI and IBM Watson Health.
Automating Semantic Metadata Collection in the Field with Mobile Application (Deborah McGuinness)
Presentation at Mobile Deployment of Semantic Technologies Workshop at the International Semantic Web Conference. Abstract: In the past few decades, the field of ecology has grown from a collection of disparate researchers who collected data on their local phenomenon by hand, to large ecosystems-oriented projects partially fueled by automated sensor networks and a diversity of models and experiments. These modern projects rely on sharing and integrating data to answer questions of increasing scale and complexity. Interpreting and sharing the big data sets generated by these projects relies on information about how the data was collected and what the data is about, typically stored as metadata. Metadata ensures that the data can be interpreted and shared accurately and efficiently. Traditional paper-based metadata collection methods are slow, error-prone, and non-standardized, making data sharing difficult and inefficient. Semantic technologies offer opportunities for better data management in ecology, but also may pose a challenging learning curve to already busy researchers. This paper presents a mobile application for recording semantic metadata about sensor network deployments and experimental settings in real time, in the field, and without expecting prior knowledge of semantics from the users. This application enables more efficient and less error-prone in-situ metadata collection, and generates structured and shareable metadata.
Linked Data and Semantic Technologies can support a next generation of science. This talk shows examples of discovery, access, integration, analysis, and shows directions towards prediction and vision.
This talk introduces Linked Data and Semantic Web by using two examples - population sciences grid and semantAqua - a semantically enabled environmental monitoring. It shows a few tools and the semantic methodology and opens discussion for LOD and team science
The Semantic Travel Concierge - a vision of the potential of semantic technologies for the travel industry. Deborah L. McGuinness Keynote at the Opentravel Alliance Advisory Forum - Miami, Fla, April 11, 2012.
Invited talk for the Square Kilometer Array meeting in Wellington New Zealand in Sept 2011 on Semantic eScience and Semantically enabled Virtual Observatories along with directions
My keynote at the Ontologies Come of Age workshop at the International Semantic Web Conference in Bonn Germany. This workshop was named after a paper I wrote about a decade ago.
Ontologies for the Real World by Deborah L. McGuinness. Invited talk for the 2011 Future Worlds Microsoft Faculty Summit in the Semantic Knowledge for Commodity Computing.
Towards an Environmental Health Sciences Ontology: CHEAR to HHEAR and Beyond
1. Towards an Environmental Health Sciences Ontology: CHEAR to HHEAR and Beyond
Deborah L. McGuinness
Tetherless World Senior Constellation Chair
Professor of Computer, Cognitive, and Web Science
Director RPI Web Science Research Center
RPI Institute for Data Exploration and Application Health Informatics Lead
dlm@cs.rpi.edu , @dlmcguinness
Computable Exposures Workshop September 9, 2019
2. CHEAR Ontology Effort
The Children’s Health Exposure Analysis Resource, or CHEAR, is a program funded by the National Institute of Environmental Health Sciences to advance understanding about how the environment impacts children’s health and development over the course of a lifetime.
https://chearprogram.org/
Children’s Health Exposure Analysis Resource (CHEAR)
McGuinness 9/9/19 Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
CHEAR is composed of three components:
• A National Exposure Assessment Laboratory Network, providing both targeted and untargeted environmental exposure and biological response analyses in human samples
• A Data Repository, Analysis, and Science Center, providing statistical services, a data repository, and data standards for integration and sharing
• A Coordinating Center, connecting the research community to CHEAR resources
3. CHEAR Ontology Effort
The NIEHS is establishing an infrastructure, the Human Health Exposure Analysis Resource (HHEAR), as a continuation of the Children's Health Exposure Analysis Resource (CHEAR). The goal of this consortium is to provide the research community access to laboratory and statistical analyses to add or expand the inclusion of environmental exposures in their research and to make that data publicly available as a means to improve our knowledge of the comprehensive effects of environmental exposures on human health throughout the life course.
Human Health Exposure Analysis Resource (HHEAR): Data Repository, Analysis and Science Center
https://grants.nih.gov/grants/guide/rfa-files/rfa-es-18-014.html
Human Health Exposure Analysis Resource (HHEAR)
4. CHEAR Ontology Effort
Goal: Encode terminology currently needed by the CHEAR Data Center Portal, and publish an open source extensible ontology integrating general exposure science and health, leveraging best in class terminologies.
Enabling Findable, Accessible, Interoperable, Reusable Data and Services to support data analysis and interdisciplinary research
Ontologies encode terms and their interrelationships, providing a foundation for understanding interoperability and reusability (I and R in terms of FAIR)
Ontology-enabled infrastructures (Knowledge Graphs and Ontology-enabled search services) also provide support for finding and accessing relevant content (the F and A in FAIR)
Child Health Exposure Analysis Resource Ontology
Stingone, Mervish, Kovatch, McGuinness, Gennings, Teitelbaum. Big and Disparate Data: Considerations for Pediatric Consortia. Current Opinion in Pediatrics. 29(2):231-239, April 2017. doi: 10.1097/MOP.0000000000000467. PMID: 28134706
5. Ontology Development Process
[Process diagram; box labels as they appear on the slide]
Inputs: Use Cases; Existing Ontologies & Vocabularies; Expert Interviews
Labkey, Ontology Fragments
Ontology Curation (ongoing)
Reviewers & Curators
* Ontology Development Team
* Domain collaborators
* Invited experts
* "Consumers" (data analysts)
* Semi-automated critiques
Knowledge Graph Integration
* Linking data and metadata content to domain terms
* Linking workflows based on semantic descriptions
Repository Integration
* Source Datasets
* Analytics source code
* Results
* Publications
Knowledge-Enhanced Search: Finding what is there that might be of use
Semantic Extract, Transform, Load (SETLr)
Expert Guidance
Sources
* Data Reporting Templates
* Data Dictionaries / Codebooks
* Foundational Ontologies/Vocabularies
Interactive Ontology Browser with Annotation
Generated Ontology
* domain concepts
* authoritative vocabularies
* vetted definitions
* supporting citations
Extracted Vocabularies, …
Erickson, McGuinness, McCusker, Chastain, Pinheiro, Rashid, Liang, Liu, Stingone, …
Exemplified by CHEAR (https://chearprogram.org/)
6. Ontology Foundations
Use Cases help scope and prioritize
Key Components
• Summary
• Usage Scenario
• Flow of Events
• Activity Diagram
• Competency Questions
• Resources
• See examples at Ontology Engineering: https://docs.google.com/document/d/1A2w-xoN5aRwlSoCTEtDsWjs2caYDRD5bANif6icDS6k/edit?usp=sharing
Starting with the Use Case
7. Ontology Foundations
Imported Ontologies:
● Semanticscience Integrated Ontology (SIO)
● PROV-O
● Units Ontology
● Human-Aware Science Ontology (HAScO)
● Virtual Solar Terrestrial Observatory (Instruments) (VSTO-I)
● Environment Ontology (ENVO)
● …
Minimum Information to Reference an External Ontology Term (MIREOT)-ed Ontologies:
● Chemical Entities of Biological Interest (ChEBI)
● Statistics Ontology (STAT-O)
● PubChem
● UBERON (Anatomy)
● Disease Ontology (DO)
● UniProt (Proteins)
● Cogat (Cognitive Measures)
● ExO
● RefMet, …
Annotations:
● Simple Knowledge Organization System (SKOS)
● Dublin Core (DC) Terms
CHEAR Ontology Foundations and Reuse
McCusker, Rashid, Liang, Liu, Chastain, Pinheiro, Stingone, McGuinness. Broad, Interdisciplinary Science In Tela: An Exposure and Child Health Ontology. In Proceedings of Web Science, 2017. Troy, NY. 349-357.
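The MIREOT approach named on this slide references a single external term with a small set of fields instead of importing the whole source ontology. A minimal Python sketch, assuming the three commonly cited fields (source ontology IRI, source term IRI, and the superclass the term is placed under locally). The class `MireotReference`, the method `to_triples`, the choice of `rdfs:isDefinedBy` to record the source, and every example.org IRI are hypothetical illustrations, not the CHEAR ontology's actual encoding.

```python
from dataclasses import dataclass

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


@dataclass(frozen=True)
class MireotReference:
    """Minimal information kept about one externally defined term."""
    source_ontology: str    # IRI of the ontology the term comes from
    source_term: str        # IRI of the term itself
    target_superclass: str  # where the term is placed in the local hierarchy
    label: str = ""         # optional human-readable label

    def to_triples(self):
        """Return the reference as (subject, predicate, object) tuples."""
        triples = [
            (self.source_term, RDFS + "subClassOf", self.target_superclass),
            # Assumption: rdfs:isDefinedBy records which ontology the term came from.
            (self.source_term, RDFS + "isDefinedBy", self.source_ontology),
        ]
        if self.label:
            triples.append((self.source_term, RDFS + "label", self.label))
        return triples


# Hypothetical IRIs, for illustration only.
arsenic = MireotReference(
    source_ontology="http://example.org/external-chemistry-ontology",
    source_term="http://example.org/external-chemistry-ontology#Arsenic",
    target_superclass="http://example.org/local-ontology#Analyte",
    label="arsenic",
)

for triple in arsenic.to_triples():
    print(triple)
```

Keeping only these fields means the local ontology stays small while each reused term remains traceable to its authoritative source.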
8. Mapping Data to Meaning: Semantic Data Dictionaries
Rashid, Chastain, Stingone, McGuinness, McCusker. The Semantic Data Dictionary Approach to Data Annotation and Integration. Enabling Open Semantic Science, in Proc of the International Web Science Conference Knowledge Graphs Workshop Oct 21, 2017
9. Content Pipeline
Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01 and IBM AI Horizons Network
• Increasing levels of automation
• Reproducible, traceable transformations from diverse sources
• Transformation to rigorously modeled knowledge
• Versioning supported by provenance
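The idea of reproducible, traceable transformations with provenance-backed versioning can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CHEAR pipeline itself: the function names (`sha256_of`, `transform_with_provenance`, `grams_to_kilograms`), the record fields, and the file name `study_a.csv` are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(obj) -> str:
    """Stable content hash of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def transform_with_provenance(rows, transform, source_name, version):
    """Apply `transform` to a copy of each row and describe what was done."""
    output = [transform(dict(row)) for row in rows]  # copy, so input stays intact
    record = {
        "source": source_name,
        "transform": transform.__name__,
        "pipeline_version": version,
        "input_sha256": sha256_of(rows),
        "output_sha256": sha256_of(output),
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
    return output, record


def grams_to_kilograms(row):
    """Hypothetical transformation: convert birth weight from g to kg."""
    row["birth_weight_kg"] = row.pop("birth_weight_g") / 1000
    return row


rows = [{"subject": "S1", "birth_weight_g": 3200}]
output, provenance = transform_with_provenance(
    rows, grams_to_kilograms, source_name="study_a.csv", version="0.1"
)
print(output)
print(provenance["transform"], provenance["input_sha256"][:12])
```

Because the hashes depend only on content, rerunning the same transformation on the same input yields the same output hash, which is what makes a run checkable after the fact.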
10.
• Ontology support for mapping and integration (e.g., education level)
• Ontology informs decisions about variables that may be combined, serve as a proxy, or be used to derive desired info (e.g., birth outcomes)
• Ontology integrity constraints may help flag errors (e.g., APGAR > 10)
• Ontology helps expose implicit information and find links
Fenton Z-Score: Sex, Birth weight, Gest Age
Mother’s Highest Education Level | Val
Did not attend school | 0
Elementary school | 1
Technical post-primary | 2
Middle school | 3
Technical post-middle school | 4
Highschool or junior college | 5
Technical post-junior college | 6
College | 7
Graduate | 8
Doesn’t know | 9
Mother Education | Val
Less than High School | 0
High School Graduate or More | 1
Support Browsing, Searching, Pooling, Deriving Values, Verification, …
McGuinness, McCusker, Pinheiro, Stingone, et. al. Funding: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
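The two education codings and the APGAR check on this slide can be illustrated with a short Python sketch. It rests on assumptions that the slide does not state: that codes 5 through 8 of the detailed scale correspond to "High School Graduate or More", that code 9 ("Doesn’t know") is treated as missing, and that a valid APGAR score lies between 0 and 10. The function and field names are hypothetical.

```python
from typing import Optional

HIGH_SCHOOL_CODE = 5  # assumption: "Highschool or junior college" is the cut point
UNKNOWN_CODE = 9      # "Doesn't know" in the detailed coding


def harmonize_mother_education(code: int) -> Optional[int]:
    """Map the 0-9 detailed coding onto the 0/1 coding; None means missing."""
    if code == UNKNOWN_CODE:
        return None
    if not 0 <= code <= 8:
        raise ValueError(f"unexpected education code: {code}")
    return 1 if code >= HIGH_SCHOOL_CODE else 0


def apgar_violations(records):
    """Return records whose APGAR score falls outside the 0-10 range."""
    return [r for r in records if not 0 <= r["apgar"] <= 10]


records = [
    {"subject": "S1", "mother_education": 3, "apgar": 9},
    {"subject": "S2", "mother_education": 7, "apgar": 12},  # flagged below
    {"subject": "S3", "mother_education": 9, "apgar": 8},
]

for r in records:
    print(r["subject"], harmonize_mother_education(r["mother_education"]))
print([r["subject"] for r in apgar_violations(records)])
```

In an ontology-backed setting the cut point and the valid range would be read from the term definitions rather than hard-coded, which is what lets the same check run across pooled studies.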
13. Automatic ingest, access control, data governance, precision download, …
Supports Search: study, data sample, subject, ...
Enables smart queries e.g., find
Child: BirthWt, Gender, Gestational Age at Birth
Mother: Age, BMI “early in pregnancy based on inclusion criterion for the particular study”, Parity, Education
Metals: As, Cd, Mn, Mo, Pb
Ontology-Enabled CHEAR Human Aware Data Acquisition Framework
McGuinness 12/18
Pinheiro, Santos, Liang, Liu, Rashid, McGuinness, Bax. HADatAc: A Framework for Scientific Data Integration using Ontologies. Intl Semantic Web Conference, Monterey, CA, 2018.
Sample Question: Gennings
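A query like the one above depends on each study's local column names being annotated with shared concepts. The Python sketch below shows the lookup idea with an in-memory dictionary. It is not the HADatAc API: the study names, column names, concept labels, and the function `find_columns` are all hypothetical.

```python
# Hypothetical annotations: local column name -> shared concept, per study.
STUDY_VARIABLES = {
    "study_1": {
        "bw_grams": "BirthWeight",
        "sex": "Gender",
        "ga_wk": "GestationalAgeAtBirth",
        "pb_blood": "Lead",
    },
    "study_2": {
        "birthwt": "BirthWeight",
        "child_sex": "Gender",
        "as_urine": "Arsenic",
    },
}


def find_columns(required_concepts):
    """Return, per study, the local columns covering every requested concept.

    Studies that lack any requested concept are left out of the result.
    """
    result = {}
    for study, columns in STUDY_VARIABLES.items():
        by_concept = {concept: column for column, concept in columns.items()}
        if all(concept in by_concept for concept in required_concepts):
            result[study] = [by_concept[concept] for concept in required_concepts]
    return result


print(find_columns(["BirthWeight", "Gender"]))
print(find_columns(["BirthWeight", "Lead"]))
```

The caller asks for concepts, never for column names, so the same request works across studies that named their variables differently.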
14. Ontology-Enabled Study Search / Precision Data Search and Download
Blood Biomarkers for Children’s Health (Study 1)
Institution:
Principal Investigator(s):
Number of Subjects:
Number of Samples:
Study Description:
Keywords:
Urine Biomarkers for Children’s Health (Study 2)
Institution:
Principal Investigator(s):
Number of Subjects:
Number of Samples:
Study Description:
Keywords:
Metabolomic Biomarkers for Children’s Health (Study 3)
Institution:
Principal Investigator(s):
Number of Subjects:
Number of Samples:
Study Description:
Keywords:
McGuinness, Pinheiro et al, 9/19 Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
15. Ontology Evolution Strategy
[Workflow diagram; box labels as they appear on the slide]
• Identify terms that can be mapped to existing ontology
• Identify terms to be added to ontology
• Describe new terms w/ definitions and location within existing ontology
• Mappings (e.g. variable names) incorporated into knowledge graph
• Data into knowledge graph after embargo period
• Incorporate new terms into existing ontology
• Review and revise updates with stakeholders
• Data Structures & Standards Working Group
• Compile new terms across multiple studies (e.g. Quarterly)
• Data Center
• New version Ontology
McGuinness with Stingone Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
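The first two boxes of the strategy above (terms that can be mapped versus terms to be added) amount to a triage step. A minimal sketch, assuming exact label matching after normalizing case and whitespace; in practice this decision is curated by domain and semantics experts, and the `ex:` identifiers and variable names below are made up for illustration.

```python
def triage_terms(study_variables, ontology_labels):
    """Split study variables into those whose description matches an
    existing ontology label and those that need a new term.

    study_variables: {variable name: human-readable description}
    ontology_labels: {term identifier: label}
    """
    index = {label.strip().lower(): term for term, label in ontology_labels.items()}
    mapped, to_add = {}, []
    for variable, description in study_variables.items():
        term = index.get(description.strip().lower())
        if term is None:
            to_add.append(variable)
        else:
            mapped[variable] = term
    return mapped, sorted(to_add)


# Hypothetical ontology fragment and study codebook.
ontology = {"ex:BirthWeight": "Birth weight", "ex:Parity": "Parity"}
codebook = {"bw_g": "birth weight", "par": "Parity ", "cot_ng": "Cotinine level"}

mapped, to_add = triage_terms(codebook, ontology)
```

The `mapped` dictionary corresponds to mappings that can go straight into the knowledge graph, while `to_add` is the list that would be compiled across studies and reviewed before a new ontology version.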
16. Evolution Plans
McGuinness Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
• HHEAR to start soon (Sept 1)
• HHEAR needs to be backwards compatible with CHEAR
• Much of CHEAR is not CHEAR- or HHEAR-specific and is in fact reused by other health informatics efforts
• Plans to extract a “reusable core” aimed at environmental health sciences needs
17. Status
McGuinness et al, 9/9/19 Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
• CHEAR available:
• http://www.hadatac.org/chear-ontology/
• http://www.hadatac.org/ont/chear/
• https://bioportal.bioontology.org/ontologies/CHEAR
• One more release expected (before HHEAR)
• The CHEAR namespace will continue to work; however, we will encourage using the HHEAR namespace
• Refactoring coming
18. Some TakeAways
McGuinness Partially supported by: NIH/NIEHS 0255-0236-4609 / 1U2CES026555-01
• Large Community Effort
• Vocabulary designed initially by semantics experts
• Processes co-designed by domain experts and semantics experts
• Vocabulary evolution managed by the community
• Leverages MANY community vocabularies
• Has evolving prioritization AND criteria for choice
• Supports a wide range of services that we claim would not be supported without an interlinked, human-supervised vocabulary
19. Community Conversation
McGuinness 9/19
• We would like users, collaborators, requirements providers, etc.
• We aim to follow the principles in the Ontology Engineering book
• Questions? dlm@cs.rpi.edu
Thanks to many: RPI Tetherless World team particularly
McCusker, Erickson, Hendler, Pinheiro, Rashid, Liang, Liu,
Chastain, Chari; RPI: Bennett, Dyson, Seneviratne; Mount Sinai
particularly Teitelbaum, Stingone, Mervish, Gennings, Kovatch;
IBM, particularly Das, Chen, Brown, ….
Funding: NIEHS 0255-0236-4609 / 1U2CES026555-01, DARPA
HR0011-16-2-0030, IBM-RPI HEALS AI Horizons Network, NSF
ACI-1640840, National Spectral Consortium, …
21. Discussion
• Taxonomies/Ontologies enable FAIR (Findable, Accessible, Interoperable, Reusable) Data Resources
• Use ontology-enabled architectures
• Do NOT build taxonomies/ontologies from scratch
• Selectively and thoughtfully reuse existing best practice ontologies/vocabularies
• Leverage others' mappings and selection criteria where possible
• Engage experts in choosing ontology portions and in designing the knowledge architecture
• Ecosystems and diverse teams are critical for success; community-driven and community-maintained ontology-based systems are the future
• Flexible, Provenance, and Certainty-Aware Knowledge Graphs are also the future
22. What is an Ontology?
An ontology specifies a rich description of the
• Terminology, concepts, nomenclature
• Relationships among concepts and individuals
• Sentences distinguishing concepts, refining definitions & relationships
relevant to a particular domain or area of interest.
* Based on AAAI ‘99 Ontologies Panel: McGuinness, Welty, Uschold, Gruninger, Lehmann
McGuinness 6/7/2017
• "Pull" for Ontologies. Invited talk. Semantics for the Web. Dagstuhl, Germany, 2000.
• Ontologies Come of Age. Fensel, Hendler, Lieberman, Wahlster, eds. Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press, 2003.
McGuinness 12/18
23. RPI knowledge graph technology curates from diverse sources
Sources: Text Mining, Linked Data, Biomedical Databases, Omics and Epidemiology Datasets, File uploads of any kind
• Fully automated
• Reproducible and traceable from diverse data types: Tabular CSV & XLS, XML, JSON, HTML, etc.
• Transformed into rigorously modeled knowledge: meaning as structure, global IDs, foundational ontologies
• Tracked and versioned using provenance standards
McGuinness 12/18 with McCusker et al. partially supported through IBM AI Horizons Network
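To make "transformed into modeled knowledge" and "tracked using provenance standards" concrete, here is a small standard-library sketch that turns CSV rows into subject-predicate-object triples and records the source file with the W3C PROV `wasDerivedFrom` property. It is not the Whyis API or the RPI pipeline; the `http://example.org/` base, the function name, and the sample data are assumptions made for illustration.

```python
import csv
import io

# Real W3C PROV-O property IRI.
PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"
# ASSUMPTION: placeholder namespace for minting global identifiers.
BASE = "http://example.org/"


def rows_to_triples(csv_text, source_iri, id_column):
    """Convert each CSV row into triples about one subject, plus one
    provenance triple linking that subject to the source file."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + "subject/" + row[id_column]
        triples.append((subject, PROV_WAS_DERIVED_FROM, source_iri))
        for column, value in row.items():
            # Skip the identifier itself and empty cells.
            if column != id_column and value != "":
                triples.append((subject, BASE + "var/" + column, value))
    return triples


sample = "id,birth_weight\nS1,3200\nS2,\n"
triples = rows_to_triples(sample, BASE + "file/study_a.csv", "id")
```

Each subject gets a global identifier and a pointer back to the file it came from, so any later value in the graph can be traced to its upload.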
24. Whyis Knowledge Graph Framework Current Usage
McCusker, McGuinness 12/18
– HEALS Project: clinical guidelines, cancer restaging, etc.
– Nanomaterials “genome” – NanoMine to MetaMine
– Radio Spectrum Policies
– Biology Knowledge Graph
– Knowledge Graph Catalog
– … Your use here