Realising the potential of Health Data Science: opportunities and challenges ...
Paolo Missier
A guest lecture given to a group of healthcare professionals as part of an Information Management course at Newcastle University, on working with healthcare data to generate disease risk prediction models.
A Data-centric perspective on Data-driven healthcare: a short overview
Paolo Missier
A brief intro to the data challenges associated with working with healthcare data, with a few examples, both from the literature and our own, of traditional approaches (Latent Class Analysis, Topic Modelling) and a perspective on language-based modelling for Electronic Health Records (EHR).
Probably more references than actual content in here!
Augmented Personalized Health: using AI techniques on semantically integrated...
Amit Sheth
Keynote @ 2018 AAAI Joint Workshop on Health Intelligence (W3PHIAI 2018), 2 February 2018, New Orleans, LA [Video: https://youtu.be/GujvoWRa0O8]
Related article: https://ieeexplore.ieee.org/document/8355891/
Abstract
Healthcare as we know it is in the process of going through a massive change - from episodic to continuous, from disease-focused to wellness and quality of life focused, from clinic centric to anywhere a patient is, from clinician controlled to patient empowered, and from being driven by limited data to 360-degree, multimodal personal-public-population physical-cyber-social big data-driven. While the ability to create and capture data is already here, the upcoming innovations will be in converting this big data into smart data through contextual and personalized processing such that patients and clinicians can make better decisions and take timely actions for augmented personalized health. In this talk, we will discuss how use of AI techniques on semantically integrated patient-generated health data (PGHD), environmental data, clinical data, and public social data is exploited to achieve a range of augmented health management strategies that include self-monitoring, self-appraisal, self-management, intervention, and Disease Progression Tracking and Prediction. We will review examples and outcomes from a number of applications, some involving patient evaluations, including asthma in children, bariatric surgery/obesity, mental health/depression, that are part of the Kno.e.sis kHealth personalized digital health initiative.
Background: http://bit.ly/k-APH, http://bit.ly/kAsthma, http://j.mp/PARCtalk
The Learning Health System: Thinking and Acting Across Scales
Philip Payne
A Learning Health System (LHS) can be defined as an environment in which knowledge generation processes are embedded into daily clinical practice in order to continually improve the quality, safety, and outcomes of healthcare delivery. While still largely an aspirational goal, the promise of the LHS is a future in which every patient encounter is an opportunity to learn and improve that patient’s care, as well as the care their family and broader community receives. The foundation for building such an LHS can and should be the Electronic Health Record (EHR), which provides the basis for the comprehensive instrumentation and measurement of clinical phenotypes, as well as a means of delivering new evidence at the patient- and population levels. In this presentation, we will explore the ways in which such EHR-derived phenotypes can be combined with complementary data across a spectrum from biomolecules to population level trends, to both generate insights and deliver such knowledge in the right time, place, and format, ultimately improving clinical outcomes and value.
Improving health care outcomes with responsible data science
Wessel Kraaij
Keynote presentation by Wessel Kraaij at the Dutch pattern recognition and image processing society (NVPBV), 29/5/2018, Eindhoven.
This talk discusses:
1. Trends in health care and responsible data science, and their intersection.
2. Secure federated analytics on distributed data repositories.
3. Generating clinically relevant hypotheses from patient forum discussions.
Precision and Participatory Medicine - Medinfo 2015 Panel on big data. Includes the proposal to use the term Expotype to characterise the Exposome of an individual. Electronic expotyping would refer to the automatic construction of individual expotypes from electronic clinical records and other sources of environmental risk factor and exposure data.
Talk entitled "From the Virtual Human to a Digital Me" presented at the Virtual Physiological Human 2012 Conference held at IET Savoy, Savoy Place, London, 18-20 September 2012.
Slide presentation from a webinar that preceded the annual Health Datapalooza in Washington DC. PCORI was pleased to participate in the latest installment in the Health Data Consortium and PricewaterhouseCoopers (PwC) Innovators in Health Data Series, a webinar featuring PCORI Executive Director Joe Selby, MD, MPH; NIH Director and PCORI Board of Governors member Francis Collins, MD, PhD; and Philip Bourne, PhD, NIH’s Associate Director for Data Science.
From Research to Practice - New Models for Data-sharing and Collaboration to ...
Health Data Consortium
Watch the webinar here: http://encore.meetingbridge.com/MB005418/140528/
Webinar transcript: http://hdc.membershipsoftware.org/Files/webinars/HDC-PwC%20NIH%20&%20PCORI%20Webinar%20Transcript%205_28_14.pdf
Patient-Centered Outcomes Research Institute (PCORI) Executive Director Joe Selby, MD, MPH; National Institutes of Health (NIH) Director and PCORI Board of Governors member Francis Collins, MD, PhD; and NIH Associate Director for Data Science Philip Bourne, PhD discussed new and emerging trends in big data for health, including:
- How researchers, patients, clinicians, and others are forging new models for data-sharing.
- Leveraging the quantity, variety, and analytic potential of health-related data for research and practice.
- Addressing patients’ perspectives, needs, and concerns in creating new opportunities for innovation and translational science.
- Exciting initiatives such as PCORnet, the National Patient-Centered Clinical Research Network initiative that PCORI is now helping to develop, and related open data and technology efforts such as the NIH Health Systems Collaboratory and the Big Data to Knowledge (BD2K) initiative.
Discover more health data resources on our website at http://www.healthdataconsortium.org/
Expert Panel on Data Challenges in Translational Research
Eagle Genomics
A panel of experts, including Alexandre Passioukov, VP Translational Medicine at Pierre Fabre; Xose Fernandez, Chief Data Officer at Institut Curie; and Abel Ureta-Vidal, CEO at Eagle Genomics, share their first-hand experience of enabling translational research in pharmaceutical and biomedical organisations, and discuss the challenges of establishing streamlined, seamless data handling and governance to accelerate innovation.
Josephine Briggs, MD
Director
National Center for Complementary and Alternative Medicine
National Institutes of Health
Opening Keynote "Research in an IT Connected World: Building Better Partnerships – NIH and Health Care Systems"
The era of ‘Big Data’ has arrived for biomedical research, bringing with it immense challenges as well as spectacular opportunities. NIH is establishing major programs with the potential to transform the future of US biomedical research by building the capacities necessary for these challenges. These programs will strengthen research partnerships with health care systems and the IT networks that support them.
The Big Data to Knowledge (BD2K) initiative, to be launched in 2014, will implement a set of recommendations from the Data and Informatics Working Group to the Advisory Committee to the Director. Investments are planned to meet scientific needs to manage and utilize large complex datasets, including strengthening training, and investing in improved analysis methods and software development and dissemination. NIH is also evaluating strengthening data and software sharing policies, and the potential creation of catalogs of research data, and data/metadata standards.
The Common Fund’s Health Care Systems (HCS) Research Collaboratory program has the goal to strengthen the national capacity to implement cost-effective large-scale research studies by engaging major health care delivery organizations as research partners. The aim of the program is to provide a framework of implementation methods and best practices that will enable the participation of many health care systems in clinical research. Research conducted in partnership with health care systems is essential to strengthen the relevance of research results to health practice. Seven demonstration projects, currently in a feasibility phase, are developing detailed methods to implement rigorous randomized studies of questions of major public health impact. These studies, and the IT infrastructure that will make them possible, will be described in detail.
Presentation “Harnessing EHRs and Health IT to Achieve Population Health”
Jonathan Weiner, DrPH
Professor, Department of Health Policy and Management
Director of Center for Population Health IT
Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
Professor Weiner’s presentation will focus on how electronic health records and other e-health tools can be harnessed to move beyond providing medical care for a single patient episode towards the achievement of “population health.” This provocative presentation will offer new conceptual paradigms and will review “big data” opportunities and challenges. The emphasis of the talk will be on how population focused care transformation can be brought about through the integration and application of e-health/EHR systems and claims/MIS systems. The talk will offer examples of analytic tools and methods designed to increase the effectiveness, efficiency and equity of care provided at a geographic community level and to “populations” of consumers enrolled in health plans, ACOs and other integrated delivery systems.
Key goals of presentation:
∙ To offer frameworks and paradigms to better understand how EHRs and other HIT can improve population health
∙ To outline opportunities and challenges for communities, ACOs and other integrated delivery systems
∙ To offer some case studies on the application of health IT to population health
To learn more visit:
https://insidescientific.com/webinar/cutting-edge-conversations-fighting-neurodegenerative-diseases/
Evelyn Pyper, MPH discusses how a patient-centered approach to real-world data collection and evidence generation can transform research in neurodegeneration. Neurodegenerative diseases often affect both motor and cognitive function, produce emotional and social changes, and require significant caregiver support, all while stretching across a fragmented healthcare ecosystem. Participatory research that directly obtains patient consent, empowers patients, and simplifies the task of linking multiple data sources, can lead to a more comprehensive capture of medical histories. This presentation briefly explores ways in which patient-centered research can improve understanding of disease diagnoses, symptomatology, and progression.
Are we ready for disruption in Translational Research through Digital Medicine?
Ashish Atreja, MD, MPH
This is the slide deck that was presented at Translational Science 2016. It touches upon evidence generation as one of the most desired but expensive processes in medical science, and provides examples of how social media, medical apps, and the quantified-self movement are leading to patient-generated data that can disrupt the evidence generation process.
Design and Development of a Provenance Capture Platform for Data Science
Paolo Missier
A talk given at the DATAPLAT workshop, co-located with the IEEE ICDE conference (May 2024, Utrecht, NL).
Data Provenance for Data Science is our attempt to provide a foundation to add explainability to data-centric AI.
It is a prototype, with lots of work still to do.
Similar to Delivering on the promise of data-driven healthcare: trade-offs, challenges, and research perspective
Towards explanations for Data-Centric AI using provenance records
Paolo Missier
In this presentation, given to graduate students at Università Roma Tre, Italy, we suggest that concepts well-known in Data Provenance can be exploited to provide explanations in the context of data-centric AI processes. Through use cases (incremental data cleaning, training set pruning), we build up increasingly complex provenance patterns, culminating in an open question:
how to describe "why" a specific data item has been manipulated as part of data processing, when such processing may consist of a complex data transformation algorithm.
Interpretable and robust hospital readmission predictions from Electronic Hea...
Paolo Missier
A talk given at the BDA4HM workshop, IEEE BigData conference, Dec. 2023.
Please see the paper here:
https://drive.google.com/file/d/1vN08G0FWxOSH1Yeak5AX6a0sr5-EBbAt/view
Data-centric AI and the convergence of data and model engineering: opportunit...
Paolo Missier
A keynote talk given at the IDEAL 2023 conference (Evora, Portugal, Nov 23, 2023).
Abstract.
The past few years have seen the emergence of what the AI community calls "Data-centric AI", namely the recognition that some of the limiting factors in AI performance are in fact in the data used for training the models, as much as in the expressiveness and complexity of the models themselves. One analogy is that of a powerful engine that will only run as fast as the quality of the fuel allows. A plethora of recent literature has started to explore the connection between data and models in depth, along with startups that offer "data engineering for AI" services. Some concepts are well-known to the data engineering community, including incremental data cleaning, multi-source integration, or data bias control; others are more specific to AI applications, for instance the realisation that some samples in the training space are "easier to learn from" than others. In this "position talk" I will suggest that, from an infrastructure perspective, there is an opportunity to efficiently support patterns of complex pipelines where data and model improvements are entangled in a series of iterations. I will focus in particular on end-to-end tracking of data and model versions, as a way to support MLDev and MLOps engineers as they navigate through a complex decision space.
Tracking trajectories of multiple long-term conditions using dynamic patient...
Paolo Missier
Momentum has been growing in research to better understand the dynamics of multiple long-term conditions-multimorbidity (MLTC-M), defined as the co-occurrence of two or more long-term or chronic conditions within an individual. Several research efforts make use of Electronic Health Records (EHR), which represent patients' medical histories. These range from discovering patterns of multimorbidity, namely by clustering diseases based on their co-occurrence in EHRs, to using EHRs to predict the next disease or other specific outcomes. One problem with the former approach is that it discards important temporal information on the co-occurrence, while the latter requires "big" data volumes that are not always available from routinely collected EHRs, limiting the robustness of the resulting models. In this paper we take an intermediate approach, where initially we use about 143,000 EHRs from UK Biobank to perform time-independent clustering using topic modelling, and Latent Dirichlet Allocation specifically. We then propose a metric to measure how strongly a patient is "attracted" into any given cluster at any point through their medical history. By tracking how such gravitational pull changes over time, we may then be able to narrow the scope for potential interventions and preventative measures to specific clusters, without having to resort to full-fledged predictive modelling. In this preliminary work we show exemplars of these dynamic associations, which suggest that further exploration may lead to actionable insights into patients' medical trajectories. On behalf of the AI-MULTIPLY consortium. Funded by NIHR AIM Development grant to AI-MULTIPLY.
Digital biomarkers for preventive personalised healthcare
Paolo Missier
A talk given to the Alan Turing Institute, UK, Oct 2021, reporting on the preliminary results and ongoing research in our lab, on self-monitoring using accelerometers for healthcare applications
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Paolo Missier
A talk given at the VLDB 2021 conference, August 2021, presenting our paper:
Capturing and Querying Fine-grained Provenance of Preprocessing Pipelines in Data Science. Chapman, A., Missier, P., Simonelli, G., & Torlone, R. PVLDB, 14(4):507–520, January, 2021.
http://doi.org/10.14778/3436905.3436911
Decentralized, Trust-less Marketplace for Brokered IoT Data Trading using Blo...
Paolo Missier
A talk given at the 2nd IEEE Blockchain conference, Atlanta, US, July 2019.
Here is the paper: http://homepages.cs.ncl.ac.uk/paolo.missier/doc/Decentralised_Marketplace_USA_Conference___Accepted_Version_.pdf
Navigating Challenges: Mental Health, Legislation, and the Prison System in B...
Guillermo Rivera
This conference will delve into the intricate intersections between mental health, legal frameworks, and the prison system in Bolivia. It aims to provide a comprehensive overview of the current challenges faced by mental health professionals working within the legislative and correctional landscapes. Topics of discussion will include the prevalence and impact of mental health issues among the incarcerated population, the effectiveness of existing mental health policies and legislation, and potential reforms to enhance the mental health support system within prisons.
CHAPTER 1 SEMESTER V PREVENTIVE-PEDIATRICS.pdf
Sachin Sharma
This content provides an overview of preventive pediatrics. It defines preventive pediatrics as preventing disease and promoting children's physical, mental, and social well-being to achieve positive health. It discusses antenatal, postnatal, and social preventive pediatrics. It also covers various child health programs like immunization, breastfeeding, ICDS, and the roles of organizations like WHO, UNICEF, and nurses in preventive pediatrics.
One of the most developed cities of India, Chennai is the capital of Tamil Nadu, and many people from different parts of India come here to earn their bread and butter. Being a metropolis, the city is filled with towering buildings and beaches, but the sad part, as with almost every Indian city
ICH Guidelines for Pharmacovigilance.pdf
NEHA GUPTA
The "ICH Guidelines for Pharmacovigilance" PDF provides a comprehensive overview of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) guidelines related to pharmacovigilance. These guidelines aim to ensure that drugs are safe and effective for patients by monitoring and assessing adverse effects, ensuring proper reporting systems, and improving risk management practices. The document is essential for professionals in the pharmaceutical industry, regulatory authorities, and healthcare providers, offering detailed procedures and standards for pharmacovigilance activities to enhance drug safety and protect public health.
R3 Stem Cells and Kidney Repair A New Horizon in Nephrology.pptx
R3 Stem Cell
"R3 Stem Cells and Kidney Repair: A New Horizon in Nephrology" explores groundbreaking advancements in the use of R3 stem cells for kidney disease treatment. This insightful piece delves into the potential of these cells to regenerate damaged kidney tissue, offering new hope for patients and reshaping the future of nephrology.
Antibiotic Stewardship by Anushri Srivastava.pptxAnushriSrivastav
Stewardship is the act of taking good care of something.
Antimicrobial stewardship is a coordinated program that promotes the appropriate use of antimicrobials (including antibiotics), improves patient outcomes, reduces microbial resistance, and decreases the spread of infections caused by multidrug-resistant organisms.
WHO launched the Global Antimicrobial Resistance and Use Surveillance System (GLASS) in 2015 to fill knowledge gaps and inform strategies at all levels.
ACCORDING TO apic.org,
Antimicrobial stewardship is a coordinated program that promotes the appropriate use of antimicrobials (including antibiotics), improves patient outcomes, reduces microbial resistance, and decreases the spread of infections caused by multidrug-resistant organisms.
ACCORDING TO pewtrusts.org,
Antibiotic stewardship refers to efforts in doctors’ offices, hospitals, long term care facilities, and other health care settings to ensure that antibiotics are used only when necessary and appropriate
According to WHO,
Antimicrobial stewardship is a systematic approach to educate and support health care professionals to follow evidence-based guidelines for prescribing and administering antimicrobials
In 1996, John McGowan and Dale Gerding first applied the term antimicrobial stewardship, where they suggested a causal association between antimicrobial agent use and resistance. They also focused on the urgency of large-scale controlled trials of antimicrobial-use regulation employing sophisticated epidemiologic methods, molecular typing, and precise resistance mechanism analysis.
Antimicrobial Stewardship(AMS) refers to the optimal selection, dosing, and duration of antimicrobial treatment resulting in the best clinical outcome with minimal side effects to the patients and minimal impact on subsequent resistance.
According to the 2019 report, in the US, more than 2.8 million antibiotic-resistant infections occur each year, and more than 35000 people die. In addition to this, it also mentioned that 223,900 cases of Clostridoides difficile occurred in 2017, of which 12800 people died. The report did not include viruses or parasites
VISION
Being proactive
Supporting optimal animal and human health
Exploring ways to reduce overall use of antimicrobials
Using the drugs that prevent and treat disease by killing microscopic organisms in a responsible way
GOAL
to prevent the generation and spread of antimicrobial resistance (AMR). Doing so will preserve the effectiveness of these drugs in animals and humans for years to come.
being to preserve human and animal health and the effectiveness of antimicrobial medications.
to implement a multidisciplinary approach in assembling a stewardship team to include an infectious disease physician, a clinical pharmacist with infectious diseases training, infection preventionist, and a close collaboration with the staff in the clinical microbiology laboratory
to prevent antimicrobial overuse, misuse and abuse.
to minimize the developme
Explore our infographic on 'Essential Metrics for Palliative Care Management' which highlights key performance indicators crucial for enhancing the quality and efficiency of palliative care services.
This visual guide breaks down important metrics across four categories: Patient-Centered Metrics, Care Efficiency Metrics, Quality of Life Metrics, and Staff Metrics. Each section is designed to help healthcare professionals monitor and improve care delivery for patients facing serious illnesses. Understand how to implement these metrics in your palliative care practices for better outcomes and higher satisfaction levels.
Delivering on the promise of data-driven healthcare: trade-offs, challenges, and research perspectives
1. Paolo Missier
School of Computing
Newcastle University, UK
Comsys 2022
IIT Ropar, India
(online presentation)
Delivering on the promise of data-driven healthcare:
trade-offs, challenges, and research perspectives
2. 2
Outline
• AI for HealthCare: a convergence of needs and opportunities
• A complex multifaceted landscape
• Data engineering for healthcare data: intrinsic and translational requirements
• Extracting actionable knowledge from EHRs
• Recent work
• Some Challenges
3. 3
The promise of data-driven medicine and healthcare
Predictive, Preventative, Personalised, Participatory: a systems biology perspective on the future of
medicine and health care
Hood L, Heath JR, Phelps ME, Lin B. Systems biology and new technologies enable predictive and preventative medicine. Science. 2004;306(5696):640–643.
Hood L, Balling R, Auffray C. Revolutionizing medicine in the 21st century through systems approaches. Biotechnol J. 2012;7(8):992–1001. Provides an overview of the science and
technological foundations of predictive, preventive, personalized and participatory healthcare
Flores M, Glusman G, Brogaard K, Price ND, Hood L. P4 medicine: how systems medicine will transform the healthcare sector and society. Per Med. 2013;10(6):565-576. doi:
10.2217/pme.13.57. PMID: 25342952; PMCID: PMC4204402.
Schmidt, Charlie. ‘Leroy Hood Looks Forward to P4 Medicine: Predictive, Personalized, Preventive, and Participatory’. JNCI Journal of the National Cancer Institute 106, no. 12
(December 2014): dju416–dju416. https://doi.org/10.1093/jnci/dju416.
[1] Sagner, M, A McNeil, P Puska, and R Arena. ‘The P4 Health Spectrum – A Predictive, Preventive, Personalized and Participatory Continuum for Promoting Healthspan’.
Progress in Cardiovascular Diseases 59, no. 5 (2017): 506–21. https://doi.org/10.1016/j.pcad.2016.08.002.
A new approach in medicine that is predictive, preventive, personalized and participatory, which we
label here as “P4” holds great promise to reduce the burden of chronic diseases by harnessing
technology and an increasingly better understanding of environment-biology interactions, evidence-
based interventions and the underlying mechanisms of chronic diseases. [1]
4. 5
Five pillars of P4 medicine
Pillar 1
■ Cutting-edge technologies for generating data regarding multiple dimensions of each person's experience of
health and disease.
Pillar 2
■ A digital infrastructure linking participating discovery science and clinical institutions, as well as
patients/consumers.
Pillar 3
■ Personalized data clouds providing information about multiple dimensions of each individual's unique dynamic
experience of health and disease ranging from the molecular to the social. These data will include genetic and
phenotypic characteristics, medical history, demographics and other sociometrics.
Pillar 4
■ New analytic techniques and technologies for deriving actionable knowledge from the data.
Pillar 5
■ Systems biology models for understanding the unique health status of each individual in terms of dynamic
network states that can be manipulated by cost-effective strategies
Source: [1]
5. 6
Outline
• AI for HealthCare: a convergence of needs and opportunities
• A complex multifaceted landscape
• Data engineering for healthcare data: intrinsic and translational requirements
• Extracting actionable knowledge from EHRs
• Recent work
• Some Challenges
6. 7
A convergence of needs and opportunities
Converging on P4 data-driven healthcare:
• Personal self-monitoring devices: medical grade, consumer grade
• (Big) Health Data: bigger == more useful?
• Health Data Science and Engineering: Operations Research; ML, AI methods; scalable computing
• Governance, consent, secure data access: privacy (eg GDPR); opt-in vs opt-out; Trusted Research Environments
8. 9
Outline
• AI for HealthCare: a convergence of needs and opportunities
• A complex multifaceted landscape
• Data engineering for healthcare data: intrinsic and translational requirements
• Extracting actionable knowledge from EHRs
• Recent work
• Some Challenges
9. 10
Understanding the facets of Health data
• Which data types? Clinical; lifestyle, social
• Where do datasets come from? Prospective vs retrospective
• How much do they cost? Acquisition; curation, annotation
• How large? Small vs Big Health Data
• Who can use it and how? Governance; protection
Data Science and Engineering → Benefits to patients
10. 11
I. Which data? Capturing individuals’ complexity
Primary care records:
- Clinical tests / GP notes, diagnoses / Prescriptions
Secondary care records:
- hospital admission / diagnoses / operations / prescriptions
Multi-omics data:
- genotypes, exomes, genomes.
- Transcriptomics, proteomics
Digital Health:
- Data streams from wearable and environment sensors,
self-monitoring
Socio-demographics:
- Area of residence, family, social deprivation
12. 13
II. Prospective vs retrospective datasets
Prospective: defined for research purposes
✓ Stable and predictable
✓ Follow protocol
✓ Research ready
✓ Potentially well-curated
✓ Bias known a priori
✗ Expensive
✗ Not very reusable
✗ Scarce
Example: UK Biobank
- 500,000 volunteer participants
- General health information
- Genotypes and whole genomes
- Selected internal organ imaging study (100K)
- Bias: 40+ years, geographic / social bias
Retrospective: typically operational data
- Potentially more reusable
- Natural bias (reflects natural cohort locality)
✗ Generally not research ready
✗ Require data engineering
Example: Clinical Practice Research Datalink
- Data collected from UK GP practices
- 60+ million patients
- (also prospective)
13. 14
Example: LITMUS
Retrospective + prospective data collection project
• EU IMI2 project
• Collecting data across Centres (EU + USA) on Non-Alcoholic Fatty
Liver Disease (NAFLD) and NASH (liver steatosis, fibrosis, cirrhosis)
https://litmus-project.eu/litmus-partners/
Phase 1a:
- Retrospective data collected from hospital datasets
- Around 10,000 patients
- Varying degrees of quality / completeness
- Central curation required
Phase 1b:
- Prospective data from active recruitment
- Around 2,000 patients
- Omics data more abundant
14. 15
Discovery + validation experimental design
Actual design will depend on dataset characteristics
Example: “Explore the relationship between social deprivation and mortality rate in the MLTC population”
An ideal study will include both prospective and retrospective datasets:
UK Biobank → machine-learning friendly modelling → discovery of candidate associations
Regional dataset → validation dataset
UK Biobank:
- Nationwide data
- 140K MLTC participants
- Complete set of multiple deprivation indicators available (*)
Regional UK dataset:
- 50K actual patients
- Data availability depending on operational systems
- Likely data quality problems (incomplete, incorrect)
- Bias: geographic location → natural distribution of social deprivation
(*) Townsend deprivation index at recruitment, Index of Multiple Deprivation (England, Scotland, Wales), plus education score and other socio-demographic indices. Distribution across the population is documented on the UKBB site.
15. 16
III - Cost of health data
Retrospective: integration/harmonisation, curation, cleaning
Prospective: cost of cohort recruitment, data collection, data processing
Acquisition + processing cost by data type, from low to high:
- Routinely collected clinical variables (GP test)
- Tests requiring specialist labs; proteomics; genotyping (a few genes)
- Whole exome sequencing
- Whole genome sequencing
19. 20
Challenge: making the best of expensive features
FS1: core features, available on the entire cohort (RS1 + RS2)
FS2: extended features, only available on a subset (RS1)
Training set 1: (RS1+RS2, FS1)
Training set 2: (RS1, FS1+FS2)
How do we leverage a model learnt using Training set 1
to improve a model learnt from Training set 2?
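One simple baseline for this question, offered only as a sketch and not as the approach taken in the talk, is to feed the score of the core-feature model into the extended-feature model as an extra input (a basic form of stacking). The synthetic cohort, the tiny logistic regression and all names below are assumptions made for illustration; whether this helps on real data is an empirical question.

```python
import math
import random

def predict(w, x):
    """Logistic model: w[0] is the bias, w[1:] are the feature weights."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, epochs=300, lr=0.1):
    """Plain batch gradient descent for logistic regression."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

# Hypothetical cohort: f1 stands for FS1 (cheap), f2 for FS2 (expensive).
rng = random.Random(1)
cohort = []
for _ in range(300):
    f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
    label = 1 if f1 + f2 + rng.gauss(0, 0.5) > 0 else 0
    cohort.append((f1, f2, label))

rs1 = cohort[:60]  # assumed: only these records have FS2 measured

# Training set 1: all records (RS1 + RS2), core features FS1 only.
w_core = fit_logistic([[f1] for f1, _, _ in cohort], [y for _, _, y in cohort])

# Training set 2: RS1 only, FS1 + FS2, plus the core model's score as an input.
X2 = [[f1, f2, predict(w_core, [f1])] for f1, f2, _ in rs1]
w_stacked = fit_logistic(X2, [y for _, _, y in rs1])
```

In a real study the core model's scores for RS1 would normally be produced out-of-fold, since here RS1 is also part of Training set 1.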
20. 21
Cost of data: the imbalance problem
Example: physical monitoring
- Everyday fitness data: low cost, high abundance
- Pathological conditions (eg cognitive impairment): high cost, low abundance
Consequence: class imbalance in classification tasks
21. 22
IV - Size: Big Data for Health Care
Genomics for
personalized medicine
Article Source: Big Data: Astronomical or Genomical?
Stephens ZD, Lee SY, Faghri F, Campbell RH, Zhai C, et al. (2015) Big Data: Astronomical or Genomical?. PLOS Biology 13(7):
e1002195. https://doi.org/10.1371/journal.pbio.1002195
22. 25
Size: Do I always need the full granularity?
Dataset size = <Data point size> x <Number of data points>
how much information do I lose by downsampling?
Genomics:
1 exome: >1 Billion data points (base pairs) x N exomes. No downsampling
Medical imaging:
1 image = N pixels
Some downsampling may be acceptable
Sensor data (eg accelerometers)
- Do I need 10Hz or 100 Hz?
- Typically very noisy
Feature engineering vs representation learning
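To make the size formula and the 10 Hz vs 100 Hz question concrete, here is a minimal sketch that downsamples a trace by block averaging and compares storage size. The synthetic trace, the 8-byte sample size and the function names are assumptions for illustration; block averaging is only one of several possible downsampling schemes.

```python
from statistics import mean

def downsample_mean(samples, src_hz, dst_hz):
    """Reduce the sampling rate by averaging consecutive blocks of samples."""
    if src_hz % dst_hz != 0:
        raise ValueError("src_hz must be an integer multiple of dst_hz")
    factor = src_hz // dst_hz
    # Drop an incomplete trailing block so every output value
    # summarises exactly `factor` input samples.
    usable = len(samples) - len(samples) % factor
    return [mean(samples[i:i + factor]) for i in range(0, usable, factor)]

def dataset_size(point_size_bytes, n_points):
    """Dataset size = <data point size> x <number of data points>."""
    return point_size_bytes * n_points

# One minute of a hypothetical single-axis accelerometer trace at 100 Hz.
trace_100hz = [float(i % 100) for i in range(100 * 60)]
trace_10hz = downsample_mean(trace_100hz, src_hz=100, dst_hz=10)

size_100hz = dataset_size(8, len(trace_100hz))  # assuming 8 bytes per sample
size_10hz = dataset_size(8, len(trace_10hz))
```

The tenfold saving is guaranteed; how much predictive signal is lost depends on the task and has to be checked empirically.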
23. 27
Case study: Using activity trackers to predict Type-2 Diabetes
Objective: To determine the extent to which accelerometer traces can be used to distinguish individuals with
Type-2 Diabetes (T2D) from normoglycaemic controls, and to quantify their limitations.
Lam, B; Catt, M; Cassidy, S; Bacardit, J; Darke, P; Butterfield, S; Alshabrawy, O; Trenell, M; and Missier, P, Using wearable activity trackers
to predict Type-2 Diabetes: A machine learning-based cross-sectional study of the UK Biobank accelerometer cohort. JMIR Diabetes.
January 2021. http://doi.org/10.2196/23364
Pipeline: feature extraction → clustering → classification
24. 28
Cohort selection from UK Biobank:
- All UK Biobank participants: 502,664
- Filter: accelerometry study? 103,712
- Split criterion: Type 2 Diabetes?
  - T2D: 4,076 in total (at baseline: 2,755; through EHR analysis: 1,321)
  - Non-diabetes: 99,636
- Filter: EHR data available? 19,852
- Filter: QC on activity traces. Positives: 3,103
- Physical impairment analysis: severe impairment 1,666; no impairment 8,463
- Comparisons: T2D vs Norm-0, T2D vs Norm-1
Your (biomedical) dataset may not be as big as it looks
A great UG project!
25. 29
V - Data governance issues: the emerging UK landscape
https://www.goldacrereview.org/
Build a small number of Trusted Research Environments, avoiding duplication
Promote culture of reuse of code (curation pipelines, analytics)
- “Reproducible Analytical Pipelines”, a set of best practices
- Promote high quality, shared, reviewable, re-usable, well-documented code for
standardized data curation and analysis
- Promote transparency, avoid black box analysis
Adopt single governance rules for integrated data access
- Rationalise approvals: create one map of all approval processes
Build appropriate capabilities:
- Train academic researchers and NHS analysts in computational data science
techniques
26. 30
Outline
• AI for HealthCare: a convergence of needs and opportunities
• A complex multifaceted landscape
• Data engineering for healthcare data: intrinsic and translational requirements
• Extracting actionable knowledge from EHRs
• Recent work
• Some Challenges
27. 31
Data engineering for healthcare data
My Smart Age with HIV (MySAwH)
- International multi-center prospective study
- Aimed at studying and monitoring healthy aging in People Living with HIV (PLWH)
- Data from routine clinical assessments and innovative PROs, collected through mobile and wearable devices
Retrospective studies
- Focus on the hospital resource management and clinical decision-making problems that emerged during the COVID-19
pandemic
Mandreoli, Federica, Davide Ferrari, Veronica Guidetti, Federico Motta, and Paolo Missier. ‘Real-World Data Mining Meets Clinical Practice: Research
Challenges and Perspective’. Frontiers in Big Data 5 (2022). https://doi.org/10.3389/fdata.2022.1021621.
Ferrari, D., Guaraldi, G., Mandreoli, F., Martoglia, R., Milić, J., and Missier, P. (2020a). “Data-driven vs. knowledge-driven inference of health outcomes in the
ageing population: a case study,” in Proceedings of the Workshops of the EDBT-ICDT Joint Conference, Vol. 2578.
Ferrari, D., Mandreoli, F., Guaraldi, G., Milić, J., and Missier, P. (2020b). “Predicting respiratory failure in patients with COVID-19 pneumonia: a case study from
Northern Italy,” in Proceedings of the 1st International Advances in Artificial Intelligence for Healthcare Workshop, Vol. 2820, (Santiago de Compostela), 32–38.
Ferrari, D., Milić, J., Mussini, C., Mandreoli, F., Missier, P., Guaraldi, G., et al. (2020c). Machine learning in predicting respiratory failure in patients with COVID-19
pneumonia–Challenges, strengths, and opportunities in a global health emergency. PLoS ONE 15:e239172. doi: 10.1371/journal.pone.0239172
Mandreoli, F., Motta, F., and Missier, P. (2021). “An HMM-ensemble approach to predict severity progression of ICU treatment for hospitalized COVID-19
patients,” in 20th IEEE International Conference on Machine Learning and Applications (Pasadena, CA), 1299–1306. doi: 10.1109/ICMLA52953.2021.00211
28. 32
Issues requiring Data Engineering
Recurring data issues
Data-driven, AI-based clinical practice: experiences, challenges, and research directions
DATA SPARSITY AND SCARCITY
• EHR: irregular collections of time series
• Imputation is not always possible
DATA IMBALANCE
• Predicting rare events can be a priority
• No downsampling option
DATA INCONSISTENCY AND INSTABILITY
• Retrospective data are often a source of inconsistency and their schemas are unstable
NOT ALL ERRORS ARE EQUALLY WRONG
• In high-stakes domains, a bias towards one type of error is sometimes preferable
HUMAN-IN-THE-LOOP
• Explanations engender trust in the models
• Trust should include not only the clinician but also the patient
30. 34
Sparsity/ scarcity, imbalance
Classifiers are not resilient to class imbalance:
- Models will be biased towards predicting the majority class regardless of the input features
- They will struggle to generalise correctly on the minority class
- In clinical datasets, data scarcity/sparsity often conspires with data imbalance
- Imbalance is very common in medical datasets
Typical mitigation:
- Downsample the majority class → lose training examples
- Upsample the minority class, eg with SMOTE (Synthetic Minority Oversampling Technique)
When modelling processes, these mitigations do not work
We used Hidden Markov Models (HMMs) to predict oxygen-therapy state transitions
However, intubation is an infrequent state (and so is “death”)
This made it difficult to accurately learn probability distributions.
Mandreoli, Motta and Missier (2021) propose a novel, generic ensemble technique to mitigate the imbalance problem in HMMs
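For the static (non-process) case, the two typical mitigations can be sketched as below. This is a simplified, SMOTE-style interpolation written for illustration, not the reference SMOTE implementation, and the two-dimensional synthetic data and function names are assumptions. As the slide notes, these mitigations do not carry over to process models such as HMMs.

```python
import random

def undersample(majority, n, rng):
    """Random downsampling: keep only n majority examples (the rest are lost)."""
    return rng.sample(majority, n)

def smote_like(minority, n_new, k, rng):
    """SMOTE-style upsampling: interpolate between a minority point and one
    of its k nearest minority neighbours (squared Euclidean distance)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

# Hypothetical data: 200 healthy controls vs 10 cases, two features each.
rng = random.Random(0)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(10)]

# Option 1: downsample the majority class (190 training examples are discarded).
balanced_down = undersample(majority, len(minority), rng) + minority

# Option 2: upsample the minority class with 190 synthetic cases.
synthetic_cases = smote_like(minority, 190, k=3, rng=rng)
balanced_up = majority + minority + synthetic_cases
```

Every synthetic case lies on a segment between two real cases, so with only a handful of seed points the added examples contribute little new information.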
31. 35
Instability
Retrospective studies are often unstable:
Data acquisition and management practices may change over time, following changes in
- Clinical practices
- Public policy
- Hospital resources
- Data collection technologies
- In our COVID dataset clinical tests vary daily depending on the patient’s
condition
- Scientific evidence for the need of certain tests changed rapidly
- Example: new biomarkers like interleukin-6 were introduced in “mid flight”
- Thus earlier study datasets completely miss this variable
32. 36
Translational challenge: Not all errors are equally wrong
- In high-stakes domains, prediction errors are not symmetric:
- Typically, underestimating risk is less desirable than overestimating it
- Standard model performance metrics (eg AUC, F1 etc) fail to capture this distinction
Cost-sensitive learning (cf eg [1,2,3])
- Introduce an explicit penalty for mis-classifying samples
- Note that cost-sensitive methods can sometimes deal with imbalanced datasets without
altering the original data distribution [4]
[1] Lomax, S., and Vadera, S. (2013). A survey of cost-sensitive decision tree induction algorithms. ACM Comput. Surveys 45, 1–35. doi: 10.1145/2431211.2431215
[2] Wang, H., Cui, Z., Chen, Y., Avidan, M., Abdallah, A. B., and Kronzer, A. (2018). Predicting hospital readmission via cost-sensitive deep learning. ACM Trans.
Comput. Biol. Bioinformatics 15, 1968–1978. doi: 10.1109/TCBB.2018.2827029
[3] Freitas, A., Costa-Pereira, A., and Brazdil, P. (2007). “Cost-sensitive decision trees applied to medical data,” in Data Warehousing and Knowledge Discovery
(Regensburg), 303–312. doi: 10.1007/978-3-540-74553-2_28
[4] Mienye, I. D., and Sun, Y. (2021). Performance analysis of cost-sensitive learning methods with application to imbalanced medical data. Inform. Med. Unlock.
25:100690. doi: 10.1016/j.imu.2021.100690
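The asymmetry between error types can be illustrated with a decision threshold derived from explicit costs. This is a minimal sketch rather than any of the cited methods: it assumes calibrated risk estimates and zero cost for correct decisions, in which case the expected-cost-minimising rule is to flag a patient when p >= C_FP / (C_FP + C_FN). The costs, risks and outcomes below are hypothetical.

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability threshold that minimises expected cost when correct
    decisions cost nothing: flag as high risk when p >= C_FP / (C_FP + C_FN)."""
    return cost_fp / (cost_fp + cost_fn)

def total_cost(probs, labels, threshold, cost_fp, cost_fn):
    """Sum the misclassification costs incurred at a given threshold."""
    cost = 0.0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 0:
            cost += cost_fp      # false alarm: risk overestimated
        elif predicted == 0 and y == 1:
            cost += cost_fn      # missed case: risk underestimated
    return cost

# Hypothetical predicted risks and true outcomes (1 = adverse event).
probs = [0.05, 0.15, 0.30, 0.45, 0.60, 0.90]
labels = [0, 0, 1, 0, 1, 1]
COST_FP, COST_FN = 1.0, 5.0  # assumed: a missed case is 5x worse than a false alarm

threshold = cost_sensitive_threshold(COST_FP, COST_FN)
cost_default = total_cost(probs, labels, 0.5, COST_FP, COST_FN)
cost_adjusted = total_cost(probs, labels, threshold, COST_FP, COST_FN)
```

On this toy data the default 0.5 threshold misses one case (cost 5.0), while the cost-derived threshold of about 0.17 trades it for one false alarm (cost 1.0). A threshold-independent metric such as AUC would not register the difference.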
33. 37
Translational challenge: human-in-the-loop AI
• Essential in medical AI
• Evidence of performance is not enough
• Black-box AI not acceptable in clinical practice
“Explanation gap”:
From technical explanations:
• Non-linear [1] and Deep Learning [2] models
• Shapley values [3]
• Interpretable ML [4,5]
To expert involvement in the learning process:
- By accepting/rejecting predictions
- By expressing preference for a given error type
Causal Machine Learning (CML) [6,7]:
- Visualisation and reasoning over complex clinical scenarios
- Counterfactuals, what-if scenarios
Also importantly:
Patient and Public Involvement (PPI) is essential in publicly funded clinical research
34. 38
References on explainability and human-in-the-loop
[1] Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., et al. (2020). From local explanations to global
understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67. doi: 10.1038/s42256-019-0138-9
[2] Singh, A., Sengupta, S., and Lakshminarayanan, V. (2020). Explainable deep learning models in medical image analysis. J. Imaging
6:52. doi: 10.3390/jimaging6060052
[3] Lundberg, S. M., and Lee, S. (2017). “A unified approach to interpreting model predictions,” in Advances in Neural Information
Processing Systems, Vol. 30. (Long Beach, CA).
[4] Ahmad, M. A., Eckert, C., and Teredesai, A. (2018). “Interpretable machine learning in healthcare,” in Proceedings of the ACM
International Conference on Bioinformatics, Computational Biology, and Health Informatics (Washington, DC), 559–560. doi:
10.1145/3233547.3233667
[5] Abdullah, T. A. A., Zahid, M. S. M., and Ali, W. (2021). A review of interpretable ML in healthcare: taxonomy, applications,
challenges, and future directions. Symmetry 13:2439. doi: 10.3390/sym13122439
[6] Oneto, L., and Chiappa, S. (2020). “Fairness in machine learning,” in Recent Trends in Learning From Data: Tutorials from the INNS
Big Data and Deep Learning Conference (Sestri Levante, Genova), 155–196. doi: 10.1007/978-3-030-43883-8_7
[7] Sanchez, P., Voisey, J. P., Xia, T., Watson, H. I., O’Neil, A. Q., and Tsaftaris, S. A. (2022). Causal machine learning for healthcare and
precision medicine. R. Soc. Open Sci. 9:220638. doi: 10.1098/rsos.220638
35. 39
Outline
• AI for HealthCare: a convergence of needs and opportunities
• A complex multifaceted landscape
• Data engineering for healthcare data: intrinsic and translational requirements
• Extracting actionable knowledge from EHRs
• Recent work
• Some Challenges
36. 40
EHR data: Traditional statistics and machine learning methods
Discovering patterns of multimorbid conditions and relationships between multimorbidities,
Socio-Demographic Factors, Health-Related Quality of Life, mortality
• See Systematic review [1]
• Clustering methods but differing in proximity measures. Also, patient clusters vs disease
clusters?
• Other specific methods:
• Latent Class Analysis, Cox Regression models, SVM, Cox proportional hazard model,
Random Forests [2,3,4,5,6]
• Multilevel logistic regression for longitudinal analysis [7]
[1] Ng, Shu Kay, Richard Tawiah, Michael Sawyer, and Paul Scuffham. ‘Patterns of Multimorbid Health Conditions: A Systematic Review of Analytical Methods and
Comparison Analysis.’ International Journal of Epidemiology 47, no. 5 (1 October 2018): 1687–1704. https://doi.org/10.1093/ije/dyy134.
37. 41
EHR data: Traditional methods – references
[2] Larsen, Finn Breinholt, Marie Hauge Pedersen, Karina Friis, Charlotte Glümer, and Mathias Lasgaard. ‘A Latent Class Analysis of
Multimorbidity and the Relationship to Socio-Demographic Factors and Health-Related Quality of Life. A National Population-Based Study
of 162,283 Danish Adults.’ PloS One 12, no. 1 (2017): e0169426. https://doi.org/10.1371/journal.pone.0169426.
[3] Jani, Bhautesh Dinesh, Peter Hanlon, Barbara I. Nicholl, Ross McQueenie, Katie I. Gallacher, Duncan Lee, and Frances S. Mair. ‘Relationship
between Multimorbidity, Demographic Factors and Mortality: Findings from the UK Biobank Cohort’. BMC Medicine 17, no. 1 (10 April 2019):
74. https://doi.org/10.1186/s12916-019-1305-x.
[4] Whitson, Heather E., Kimberly S. Johnson, Richard Sloane, Christine T. Cigolle, Carl F. Pieper, Lawrence Landerman, and Susan N. Hastings.
‘Identifying Patterns of Multimorbidity in Older Americans: Application of Latent Class Analysis.’ Journal of the American Geriatrics Society 64,
no. 8 (August 2016): 1668–73. https://doi.org/10.1111/jgs.14201.
[5] Zemedikun, Dawit T., Laura J. Gray, Kamlesh Khunti, Melanie J. Davies, and Nafeesa N. Dhalwani. ‘Patterns of Multimorbidity in Middle-
Aged and Older Adults: An Analysis of the UK Biobank Data.’ Mayo Clinic Proceedings 93, no. 7 (July 2018): 857–66.
https://doi.org/10.1016/j.mayocp.2018.02.012.
[6] Zhu, Yajing, Duncan Edwards, Jonathan Mant, Rupert A. Payne, and Steven Kiddle. ‘Characteristics, Service Use and Mortality of Clusters of
Multimorbid Patients in England: A Population-Based Study.’ BMC Medicine 18, no. 1 (10 April 2020): 78. https://doi.org/10.1186/s12916-020-01543-8.
[7] Ashworth, Mark, Stevo Durbaba, David Whitney, James Crompton, Michael Wright, and Hiten Dodhia. ‘Journey to Multimorbidity:
Longitudinal Analysis Exploring Cardiovascular Risk Factors and Sociodemographic Determinants in an Urban Setting.’ BMJ Open 9, no. 12 (23
December 2019): e031649. https://doi.org/10.1136/bmjopen-2019-031649.
38. 42
EHR data: Deep Learning
Recent survey on DNN methods for EHR-based modelling [1]:
- DNNs fully exploit the longitudinal nature of EHRs
- Useful to predict outcomes where patient history is a relevant predictor
Target outcomes:
- Future disease given medical history
- Unplanned readmission
- Disease progression
- Specific complication, eg heart failure, cataract
- Patient mortality
State-of-the-art methods (< 2020):
- eNRBM [2]
- Deep Patient [3]
- Deepr [4]
- RETAIN [5]
Summary of results:
- The methods are competitive
- Achieving AUC > 0.8 on each of the outcomes above
[1] Ayala Solares, Jose Roberto, Francesca Elisa Diletta Raimondi, Yajie Zhu, Fatemeh Rahimian, Dexter Canoy, Jenny Tran, Ana Catarina Pinho Gomes, et al. ‘Deep
Learning for Electronic Health Records: A Comparative Review of Multiple Deep Neural Architectures’. Journal of Biomedical Informatics 101 (1 January 2020):
103337. https://doi.org/10.1016/j.jbi.2019.103337.
39. 43
EHR data: Deep Learning -- References
[2] T. Tran, T. D. Nguyen, D. Phung, S. Venkatesh, Learning vector representation of medical objects via EMR-driven nonnegative restricted
Boltzmann machines (eNRBM), Journal of Biomedical Informatics 54 (2015) 96–105. doi:10.1016/j.jbi.2015.01.012.
[3] Miotto R, Li L, Kidd BA, Dudley JT. Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health
Records. Sci Rep. 2016 May 17;6:26094. doi: 10.1038/srep26094. PMID: 27185194; PMCID: PMC4869115.
[4] P. Nguyen, T. Tran, N. Wickramasinghe, S. Venkatesh, Deepr: A convolutional net for medical records, IEEE Journal of Biomedical and Health
Informatics 21 (1) (2017) 22–30. doi:10.1109/JBHI.2016.2633963.
[5] E. Choi, M. T. Bahadori, J. Sun, J. Kulas, A. Schuetz, W. Stewart, RETAIN: An Interpretable Predictive Model for Healthcare using Reverse
Time Attention Mechanism, in: Advances in Neural Information Processing Systems, 2016, pp. 3504–3512.
41. 47
IEEE BigData 2022
Can we model the likelihood of next disease?
Experimental models exist for
- Modelling disease progression [1]
- Discovering clinical pathway patterns [2]
- Predicting next disease(s) [3]
However, they are not very robust, nor actually deployed in practice
[1] Wang, Xiang, David Sontag, and Fei Wang. ‘Unsupervised Learning of Disease Progression Models’. In Proceedings of the 20th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, 85–94. KDD ’14. New York, NY, USA, 2014 https://doi.org/10.1145/2623330.2623754.
[2] Huang, Zhengxing, Wei Dong, Lei Ji, Chenxi Gan, Xudong Lu, and Huilong Duan. ‘Discovery of Clinical Pathway Patterns from Event Logs Using Probabilistic Topic
Models’. Journal of Biomedical Informatics 47 (1 February 2014): 39–57. https://doi.org/10.1016/j.jbi.2013.09.003.
[3] Men, Lu, Noyan Ilk, Xinlin Tang, and Yuan Liu. ‘Multi-Disease Prediction Using LSTM Recurrent Neural Networks’. Expert Systems with Applications 177 (1
September 2021): 114905. https://doi.org/10.1016/j.eswa.2021.114905.
Our goal: use Electronic Health Records (diagnosis event logs) to predict patients’
long-term associations to a specific disease cluster
42. 48
Research hypothesis
It is possible to identify clusters of diseases that:
1. Are described using disease terms that are familiar to health domain experts
2. Are clinically significant based on expert validation
3. Admit a quantitative association of individual patients with each of the clusters
What we hope to find:
1. A significant majority of patients are stable relative to the clustering
2. Stability emerges early in their medical history (*)
(*) limited to LTCs
43. 49
IEEE
BigData
2022
Contributions
• We use Topic Modelling as a form of semantic clustering
• Topics are defined by ranked lists of disease terms
• We define a cluster’s gravitational pull: patients are differently attracted by each
cluster at different points in time
• We propose a quantitative measure of stability with respect to clusters over time
• We study how stability increases as timelines progress
45. 51
Challenge: making the best of expensive features
FS1: core features, available on the entire cohort (RS1 + RS2)
FS2: extended features, only available on a subset (RS1)
Training set 1: (RS1+RS2, FS1)
Training set 2: (RS1, FS1+FS2)
How do we leverage a model learnt using Training set 1
to improve a model learnt from Training set 2?
46. 52
Challenge: synthetic data generation for specialized data types
Self-monitoring data contain potentially useful signal to anticipate specific conditions
- But data are heavily imbalanced towards healthy controls
- Case data points are harder to collect
Can we use the available “seed” true data points to generate new synthetic and plausible ones?
Specifically: physical activity data → general problem of time-series data generation
Challenge:
Existing GAN / TimeGAN approaches insufficient
- Hard to scale
- Require very strong signal
47. 53
Key messages
• The weaknesses are in the data not in the models!
• The need for data integration + curation + engineering dominates the need for size
• Investments driven by “health crisis”
• Mental (dementia, Parkinson’s)
• Physical: multimorbidity in older population
• Focus on EHR:
• Good advances in using AI to draw insights from EHR, but data quality is a big barrier
AI for HealthCare: great opportunities for impactful research,
but many challenges remain
Editor's Notes
Mention “Reproducible Analytical Pipelines” (RAP)
NHS data in the UK are a prime example of retrospective data. In principle they are accessible for research, but:
- There are governance issues
- They require coding and integration
Figure 2a shows that the total volume of data in genomics is considerably smaller than the data generated by earth science [26], but orders of magnitude larger than the social sciences. The data growth trend in genomics, however, is greater than in other disciplines. In fact, some researchers have suggested that if the genomics data generation growth trend remains constant, genomics will soon generate more data than applications such as social media, earth sciences, and astronomy [27].
In Fig. 2b, we compare genomics to other data-driven disciplines in the biological sciences. This analysis clearly shows that the large amount of early biological data was not in genomics, but rather in macromolecular structure. Only in 2001, for example, did the number of datasets in genomics finally surpass protein-structure data. More recently, new trends have emerged with the rapidly increasing amount of electron microscopy data, due to the advent of cryo-electron microscopy, and of mass-spectrometry-based proteomics data. Perhaps these trends will shift the balance of biomedical data science in the future.
Vision is generating actionable knowledge from big health data.
What sort of clinical questions are we considering?