This document summarizes a presentation on opportunities and challenges for applying health data science and AI in healthcare. It discusses the potential of predictive, preventative, personalized and participatory (P4) approaches using large health datasets. However, it notes major challenges including data sparsity, imbalance, inconsistency and high costs. Case studies on liver disease and COVID datasets demonstrate issues that require data engineering. Explanations and human oversight are also key to adopting AI in clinical practice. Overall, the document outlines a complex landscape and the need for better data science methods to realize the promise of data-driven healthcare.
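As a rough illustration of two of the challenges named above (sparsity and imbalance), the sketch below profiles a small table of records. The field names (`alt`, `bmi`, `outcome`) and the values are invented for the example and do not come from the presentation's case studies; `profile` is a hypothetical helper, not part of any library.

```python
from collections import Counter

def profile(records, label_key):
    """Return per-field missingness rates and the class imbalance ratio."""
    fields = {k for r in records for k in r if k != label_key}
    n = len(records)
    # Fraction of records in which each field was never recorded (None).
    missing = {
        f: sum(1 for r in records if r.get(f) is None) / n
        for f in sorted(fields)
    }
    # Ratio between the most and the least frequent outcome class.
    counts = Counter(r[label_key] for r in records)
    imbalance = max(counts.values()) / min(counts.values())
    return missing, imbalance

# Toy, made-up records: None marks a value that was never recorded.
records = [
    {"alt": 31.0, "bmi": None, "outcome": 0},
    {"alt": None, "bmi": 27.5, "outcome": 0},
    {"alt": 48.0, "bmi": None, "outcome": 0},
    {"alt": 95.0, "bmi": 33.1, "outcome": 1},
]
missing, imbalance = profile(records, "outcome")
print(missing)
print(imbalance)
```

A profile like this is only a first look; it says nothing about inconsistency between sources, which usually needs domain-specific checks.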
A Data-centric perspective on Data-driven healthcare: a short overview - Paolo Missier
A brief intro to the data challenges associated with working with health care data, with a few examples, both from the literature and our own, of traditional approaches (Latent Class Analysis, Topic Modelling) and a perspective on language-based modelling for Electronic Health Records (EHR).
Probably more references than actual content in here!
HEALTH PREDICTION ANALYSIS USING DATA MINING - Ashish Salve
Data mining techniques are used for a variety of applications. In the healthcare industry, data mining plays an important
role in predicting diseases. Detecting a disease usually requires a number of tests on the patient, but data mining
techniques can reduce the number of tests needed, which saves time and improves performance.
This report analyses data mining techniques that can be used for predicting different types of diseases, and reviews
research papers that mainly concentrate on predicting various diseases.
Improving health care outcomes with responsible data science - Wessel Kraaij
Keynote presentation by Wessel Kraaij at the Dutch pattern recognition and image processing society (NVPBV), 29/5/2018, Eindhoven.
This talk discusses
1. trends in health care and responsible data science and their intersection
2. Secure federated analytics on distributed data repositories
3. Generating clinically relevant hypotheses from patient forum discussions.
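The second topic, federated analytics, can be illustrated with a very small sketch: each site computes a local summary and only that summary is pooled, so patient-level values never leave the site. This shows the aggregation idea only and is not the method presented in the talk; it includes none of the security measures a real deployment would need (encryption, disclosure control for small counts). The site data and function names are assumptions made for the example.

```python
def local_summary(values):
    """Computed inside each site; only these two numbers leave it."""
    return {"n": len(values), "total": sum(values)}

def federated_mean(summaries):
    """Pool per-site counts and totals into one overall mean."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / n

# Made-up systolic blood pressure readings held at two separate sites.
site_a = [120, 130, 125]
site_b = [140, 150]

summaries = [local_summary(site_a), local_summary(site_b)]
pooled_mean = federated_mean(summaries)
print(pooled_mean)
```

The pooled result equals the mean that would be obtained by merging the raw values, which is what makes this kind of statistic easy to federate.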
The Learning Health System: Thinking and Acting Across Scales - Philip Payne
A Learning Health System (LHS) can be defined as an environment in which knowledge generation processes are embedded into daily clinical practice in order to continually improve the quality, safety, and outcomes of healthcare delivery. While still largely an aspirational goal, the promise of the LHS is a future in which every patient encounter is an opportunity to learn and improve that patient’s care, as well as the care their family and broader community receives. The foundation for building such an LHS can and should be the Electronic Health Record (EHR), which provides the basis for the comprehensive instrumentation and measurement of clinical phenotypes, as well as a means of delivering new evidence at the patient- and population levels. In this presentation, we will explore the ways in which such EHR-derived phenotypes can be combined with complementary data across a spectrum from biomolecules to population level trends, to both generate insights and deliver such knowledge in the right time, place, and format, ultimately improving clinical outcomes and value.
CORD Rare Drug Conference, June 8 - 9, 2022
Opportunities and Challenges for Data Management Real-World Data and Real-World Evidence
• Patient support programs: Sandra Anderson, Innomar Strategies
• AI for Data Management and Enhancement: Aaron Leibtag, Pentavere
• Patient Support and RWE: Laurie Lambert, CADTH
Precision and Participatory Medicine - Medinfo 2015 Panel on big data. Includes the proposal to use the term Expotype to characterise the Exposome of an individual. Electronic expotyping would refer to the automatic construction of individual expotypes from electronic clinical records and other sources of environmental risk factor and exposure data.
Leveraging Data Analysis for Advancements in Healthcare and Medical Research.pdf - Soumodeep Nanee Kundu
Data analysis in healthcare encompasses a wide range of applications, all geared toward improving patient care and well-being. It begins with the collection of diverse healthcare data, which includes electronic health records, medical imaging, genomic data, wearable device data, and more. These data sources provide a rich tapestry of information that can be analysed to unlock valuable insights and drive healthcare advancements.
One of the primary areas where data analysis is a game-changer is in clinical decision-making. Through the utilization of data-driven algorithms, healthcare professionals are empowered to make informed decisions regarding patient diagnosis, treatment plans, and prognosis. Clinical Decision Support Systems (CDSS), powered by data analysis, provide real-time guidance based on evidence-based medical knowledge, assisting physicians in choosing the most appropriate treatments and interventions. This not only enhances patient care but also reduces medical errors and ensures that treatment decisions are aligned with the most current medical research.
Data analysis is also instrumental in early disease identification and monitoring. Machine learning models, for example, can predict the onset of diseases like diabetes, Alzheimer's, and cardiovascular conditions by analysing patient data. This early detection capability enables healthcare providers to intervene proactively, potentially preventing or mitigating the severity of these conditions. This aspect of data analysis significantly contributes to the shift from reactive to proactive healthcare, improving patient outcomes and reducing healthcare costs.
Epidemiology and public health are areas where data analysis plays a vital role. The analysis of healthcare data is essential for tracking and predicting disease outbreaks, which is especially critical in the context of infectious diseases and bioterrorism preparedness. Real-time analysis of health data can offer early warning signs of emerging epidemics, allowing authorities to take timely preventive measures and allocate resources efficiently.
Augmented Personalized Health: using AI techniques on semantically integrated... - Amit Sheth
Keynote @ 2018 AAAI Joint Workshop on Health Intelligence (W3PHIAI 2018), 2 February 2018, New Orleans, LA [Video: https://youtu.be/GujvoWRa0O8]
Related article: https://ieeexplore.ieee.org/document/8355891/
Abstract
Healthcare as we know it is in the process of going through a massive change - from episodic to continuous, from disease-focused to wellness and quality of life focused, from clinic centric to anywhere a patient is, from clinician controlled to patient empowered, and from being driven by limited data to 360-degree, multimodal personal-public-population physical-cyber-social big data-driven. While the ability to create and capture data is already here, the upcoming innovations will be in converting this big data into smart data through contextual and personalized processing such that patients and clinicians can make better decisions and take timely actions for augmented personalized health. In this talk, we will discuss how use of AI techniques on semantically integrated patient-generated health data (PGHD), environmental data, clinical data, and public social data is exploited to achieve a range of augmented health management strategies that include self-monitoring, self-appraisal, self-management, intervention, and Disease Progression Tracking and Prediction. We will review examples and outcomes from a number of applications, some involving patient evaluations, including asthma in children, bariatric surgery/obesity, mental health/depression, that are part of the Kno.e.sis kHealth personalized digital health initiative.
Background: http://bit.ly/k-APH, http://bit.ly/kAsthma, http://j.mp/PARCtalk
"Data Science in Healthcare" by Sergio Consoli, Diego Reforgiato Recupero, and Milan Petkovic is an insightful guide that delves into the intersection of data science and healthcare. As a first-year student in Pharmaceutical Management, I found this book to be a valuable resource for understanding how data-driven approaches are transforming the healthcare industry, offering fresh perspectives and practical insights for future professionals like myself.
Standardization and wider use of Electronic Health Records (EHR) create opportunities for
better understanding patterns of illness and care within and across medical systems. In healthcare
systems, hidden event signatures support decisions on a patient's diagnosis, prognosis, and
management. The temporal history of event codes embedded in patients' records can be mined for
frequently occurring sequences of event codes across patients. A framework exists that enables the
representation, retrieval, and mining of high-order latent event structure and relationships within
single and multiple event sequences. Large databases hold a wealth of hidden information, and
different data mining techniques can be used to retrieve it. This paper presents a classifier approach
for the detection of diabetes and shows how Naive Bayes can be used for classification. In this
system, medical data is grouped into five categories, namely low, average, high, very high, and
critical, and treatment is given according to the predicted category. The system predicts the class
label of an unknown sample, so it performs two basic functions: classification (training) and
prediction (testing). The algorithm and the database used affect the accuracy of the system. It can
answer complex queries for diagnosing diabetes and thus assist healthcare practitioners in making
intelligent clinical decisions, which traditional decision support systems cannot.
Over the last decade, many information visualization techniques have been developed to support
the exploration of large data sets, and various interactive visual data mining tools are available
for visual data analysis. It is possible to perform clinical assessment for visual interactive
knowledge discovery in large electronic health record databases. In this paper, we propose that it
is possible to develop a data visualization tool for interactive knowledge discovery.
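The training and prediction steps described in this abstract can be sketched as a small categorical Naive Bayes classifier. This is a minimal illustration under stated assumptions, not the paper's implementation: the two features (a glucose band and a BMI band), the toy rows, and the use of add-one (Laplace) smoothing are all choices made for the example, and only three of the five categories appear in the toy data.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, y)][v] += 1
            vocab[i].add(v)
    return class_counts, value_counts, vocab

def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, cy in class_counts.items():
        score = math.log(cy / total)  # log prior
        for i, v in enumerate(row):
            # Add-one smoothing so unseen values do not zero out a class.
            seen = value_counts[(i, y)][v]
            score += math.log((seen + 1) / (cy + len(vocab[i])))
        if score > best_score:
            best, best_score = y, score
    return best

# Hypothetical training data: (glucose band, BMI band) -> risk category.
rows = [
    ("normal", "normal"), ("normal", "high"),
    ("high", "high"), ("high", "high"),
    ("very_high", "high"), ("very_high", "normal"),
]
labels = ["low", "low", "high", "high", "critical", "critical"]

model = train(rows, labels)
print(predict(model, ("high", "high")))
print(predict(model, ("very_high", "normal")))
```

With so few rows the output only demonstrates the mechanics; as the abstract notes, accuracy depends on the algorithm and on the database used.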
IJERA (International Journal of Engineering Research and Applications) is an international online, ... peer-reviewed journal. For more details or to submit your article, please visit www.ijera.com
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
2016 Data Commons and Data Science Workshop June 7th and June 8th 2016. Genomic Data Commons, FAIR, NCI and making data more findable, publicly accessible, interoperable (machine readable), reusable and support recognition and attribution
Talk entitled "from the Virtual Human to a Digital Me" presented at the Virtual Physiological Human 2012 Conference held at IET Savoy, Savoy Place, London, 18-20 September 2012.
Design and Development of a Provenance Capture Platform for Data Science - Paolo Missier
A talk given at the DATAPLAT workshop, co-located with the IEEE ICDE conference (May 2024, Utrecht, NL).
Data Provenance for Data Science is our attempt to provide a foundation to add explainability to data-centric AI.
It is a prototype, with lots of work still to do.
Towards explanations for Data-Centric AI using provenance records - Paolo Missier
In this presentation, given to graduate students at Universita' RomaTre, Italy, we suggest that concepts well-known in Data Provenance can be exploited to provide explanations in the context of data-centric AI processes. Through use cases (incremental data cleaning, training set pruning), we build up increasingly complex provenance patterns, culminating in an open question:
how to describe "why" a specific data item has been manipulated as part of data processing, when such processing may consist of a complex data transformation algorithm.
Interpretable and robust hospital readmission predictions from Electronic Hea... - Paolo Missier
A talk given at the BDA4HM workshop, IEEE BigData conference, Dec. 2023
please see paper here:
https://drive.google.com/file/d/1vN08G0FWxOSH1Yeak5AX6a0sr5-EBbAt/view
Data-centric AI and the convergence of data and model engineering: opportunit... - Paolo Missier
A keynote talk given to the IDEAL 2023 conference (Evora, Portugal Nov 23, 2023).
Abstract.
The past few years have seen the emergence of what the AI community calls "Data-centric AI", namely the recognition that some of the limiting factors in AI performance are in fact in the data used for training the models, as much as in the expressiveness and complexity of the models themselves. One analogy is that of a powerful engine that will only run as fast as the quality of the fuel allows. A plethora of recent literature has started to explore the connection between data and models in depth, along with startups that offer "data engineering for AI" services. Some concepts are well-known to the data engineering community, including incremental data cleaning, multi-source integration, or data bias control; others are more specific to AI applications, for instance the realisation that some samples in the training space are "easier to learn from" than others. In this "position talk" I will suggest that, from an infrastructure perspective, there is an opportunity to efficiently support patterns of complex pipelines where data and model improvements are entangled in a series of iterations. I will focus in particular on end-to-end tracking of data and model versions, as a way to support MLDev and MLOps engineers as they navigate through a complex decision space.
Tracking trajectories of multiple long-term conditions using dynamic patient... - Paolo Missier
Momentum has been growing into research to better understand the dynamics of multiple long-term conditions-multimorbidity (MLTC-M), defined as the co-occurrence of two or more long-term or chronic conditions within an individual. Several research efforts make use of Electronic Health Records (EHR), which represent patients' medical histories. These range from discovering patterns of multimorbidity, namely by clustering diseases based on their co-occurrence in EHRs, to using EHRs to predict the next disease or other specific outcomes. One problem with the former approach is that it discards important temporal information on the co-occurrence, while the latter requires "big" data volumes that are not always available from routinely collected EHRs, limiting the robustness of the resulting models. In this paper we take an intermediate approach, where initially we use about 143,000 EHRs from UK Biobank to perform time-independent clustering using topic modelling, and Latent Dirichlet Allocation specifically. We then propose a metric to measure how strongly a patient is "attracted" into any given cluster at any point through their medical history. By tracking how such gravitational pull changes over time, we may then be able to narrow the scope for potential interventions and preventative measures to specific clusters, without having to resort to full-fledged predictive modelling. In this preliminary work we show exemplars of these dynamic associations, which suggest that further exploration may lead to actionable insights into patients' medical trajectories. On behalf of the AI-MULTIPLY consortium. Funded by NIHR AIM Development grant to AI-MULTIPLY.
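The abstract does not define the "attraction" metric, so the sketch below uses a simple stand-in to show the general idea of tracking a patient's association with fixed clusters over time: after each new diagnosis, each cluster's share of the cumulative topic weight is recomputed. The topic names, the weights (which stand in for the per-topic distributions an LDA model would learn), and the `pull_over_time` function are all hypothetical and are not taken from the paper.

```python
# Hypothetical topic-condition weights, standing in for the per-topic
# distributions an LDA model would learn; the numbers are invented.
TOPICS = {
    "cardiometabolic": {"hypertension": 0.5, "diabetes": 0.4, "asthma": 0.1},
    "respiratory": {"hypertension": 0.1, "diabetes": 0.1, "asthma": 0.8},
}

def pull_over_time(history):
    """history: list of (age, condition) pairs sorted by age.

    Returns [(age, {topic: share})], where share is the topic's fraction
    of the cumulative weight of all conditions recorded up to that age.
    """
    totals = {t: 0.0 for t in TOPICS}
    trajectory = []
    for age, condition in history:
        for topic, weights in TOPICS.items():
            totals[topic] += weights.get(condition, 0.0)
        norm = sum(totals.values())
        shares = {t: (v / norm if norm else 0.0) for t, v in totals.items()}
        trajectory.append((age, shares))
    return trajectory

# Made-up patient history: age at first record of each condition.
history = [(45, "asthma"), (52, "hypertension"), (58, "diabetes")]
trajectory = pull_over_time(history)
for age, shares in trajectory:
    print(age, {t: round(s, 3) for t, s in shares.items()})
```

In this toy history the patient starts close to the respiratory cluster and drifts towards the cardiometabolic one, which is the kind of shift the abstract suggests could help narrow the scope for interventions.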
Digital biomarkers for preventive personalised healthcare - Paolo Missier
A talk given to the Alan Turing Institute, UK, Oct 2021, reporting on the preliminary results and ongoing research in our lab, on self-monitoring using accelerometers for healthcare applications
Capturing and querying fine-grained provenance of preprocessing pipelines in ... - Paolo Missier
A talk given at the VLDB 2021 conference, August 2021, presenting our paper:
Capturing and Querying Fine-grained Provenance of Preprocessing Pipelines in Data Science. Chapman, A., Missier, P., Simonelli, G., & Torlone, R. PVLDB, 14(4):507–520, January, 2021.
http://doi.org/10.14778/3436905.3436911
Decentralized, Trust-less Marketplace for Brokered IoT Data Trading using Blo... - Paolo Missier
A talk given at the 2nd IEEE Blockchain conference, Atlanta, US, July 2019.
here is the paper: http://homepages.cs.ncl.ac.uk/paolo.missier/doc/Decentralised_Marketplace_USA_Conference___Accepted_Version_.pdf
CHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdf - Sachin Sharma
Pediatric nurses play a vital role in the health and well-being of children. Their responsibilities are wide-ranging, and their objectives can be categorized into several key areas:
1. Direct Patient Care:
Objective: Provide comprehensive and compassionate care to infants, children, and adolescents in various healthcare settings (hospitals, clinics, etc.).
This includes tasks like:
Monitoring vital signs and physical condition.
Administering medications and treatments.
Performing procedures as directed by doctors.
Assisting with daily living activities (bathing, feeding).
Providing emotional support and pain management.
2. Health Promotion and Education:
Objective: Promote healthy behaviors and educate children, families, and communities about preventive healthcare.
This includes tasks like:
Administering vaccinations.
Providing education on nutrition, hygiene, and development.
Offering breastfeeding and childbirth support.
Counseling families on safety and injury prevention.
3. Collaboration and Advocacy:
Objective: Collaborate effectively with doctors, social workers, therapists, and other healthcare professionals to ensure coordinated care for children.
Objective: Advocate for the rights and best interests of their patients, especially when children cannot speak for themselves.
This includes tasks like:
Communicating effectively with healthcare teams.
Identifying and addressing potential risks to child welfare.
Educating families about their child's condition and treatment options.
4. Professional Development and Research:
Objective: Stay up-to-date on the latest advancements in pediatric healthcare through continuing education and research.
Objective: Contribute to improving the quality of care for children by participating in research initiatives.
This includes tasks like:
Attending workshops and conferences on pediatric nursing.
Participating in clinical trials related to child health.
Implementing evidence-based practices into their daily routines.
By fulfilling these objectives, pediatric nurses play a crucial role in ensuring the optimal health and well-being of children throughout all stages of their development.
Antibiotic Stewardship by Anushri Srivastava.pptx - AnushriSrivastav
Stewardship is the act of taking good care of something.
Antimicrobial stewardship is a coordinated program that promotes the appropriate use of antimicrobials (including antibiotics), improves patient outcomes, reduces microbial resistance, and decreases the spread of infections caused by multidrug-resistant organisms.
WHO launched the Global Antimicrobial Resistance and Use Surveillance System (GLASS) in 2015 to fill knowledge gaps and inform strategies at all levels.
According to apic.org,
Antimicrobial stewardship is a coordinated program that promotes the appropriate use of antimicrobials (including antibiotics), improves patient outcomes, reduces microbial resistance, and decreases the spread of infections caused by multidrug-resistant organisms.
According to pewtrusts.org,
Antibiotic stewardship refers to efforts in doctors’ offices, hospitals, long term care facilities, and other health care settings to ensure that antibiotics are used only when necessary and appropriate.
According to WHO,
Antimicrobial stewardship is a systematic approach to educate and support health care professionals to follow evidence-based guidelines for prescribing and administering antimicrobials.
In 1996, John McGowan and Dale Gerding first applied the term antimicrobial stewardship, where they suggested a causal association between antimicrobial agent use and resistance. They also focused on the urgency of large-scale controlled trials of antimicrobial-use regulation employing sophisticated epidemiologic methods, molecular typing, and precise resistance mechanism analysis.
Antimicrobial Stewardship(AMS) refers to the optimal selection, dosing, and duration of antimicrobial treatment resulting in the best clinical outcome with minimal side effects to the patients and minimal impact on subsequent resistance.
According to the 2019 report, in the US, more than 2.8 million antibiotic-resistant infections occur each year, and more than 35000 people die. In addition to this, it also mentioned that 223,900 cases of Clostridoides difficile occurred in 2017, of which 12800 people died. The report did not include viruses or parasites
VISION
Being proactive
Supporting optimal animal and human health
Exploring ways to reduce overall use of antimicrobials
Using the drugs that prevent and treat disease by killing microscopic organisms in a responsible way
GOAL
to prevent the generation and spread of antimicrobial resistance (AMR). Doing so will preserve the effectiveness of these drugs in animals and humans for years to come.
being to preserve human and animal health and the effectiveness of antimicrobial medications.
to implement a multidisciplinary approach in assembling a stewardship team to include an infectious disease physician, a clinical pharmacist with infectious diseases training, infection preventionist, and a close collaboration with the staff in the clinical microbiology laboratory
to prevent antimicrobial overuse, misuse and abuse.
to minimize the developme
How many patients does case series should have In comparison to case reports.pdfpubrica101
Pubrica’s team of researchers and writers create scientific and medical research articles, which may be important resources for authors and practitioners. Pubrica medical writers assist you in creating and revising the introduction by alerting the reader to gaps in the chosen study subject. Our professionals understand the order in which the hypothesis topic is followed by the broad subject, the issue, and the backdrop.
https://pubrica.com/academy/case-study-or-series/how-many-patients-does-case-series-should-have-in-comparison-to-case-reports/
CHAPTER 1 SEMESTER V PREVENTIVE-PEDIATRICS.pdfSachin Sharma
This content provides an overview of preventive pediatrics. It defines preventive pediatrics as preventing disease and promoting children's physical, mental, and social well-being to achieve positive health. It discusses antenatal, postnatal, and social preventive pediatrics. It also covers various child health programs like immunization, breastfeeding, ICDS, and the roles of organizations like WHO, UNICEF, and nurses in preventive pediatrics.
The Importance of Community Nursing Care.pdfAD Healthcare
NDIS and Community 24/7 Nursing Care is a specific type of support that may be provided under the NDIS for individuals with complex medical needs who require ongoing nursing care in a community setting, such as their home or a supported accommodation facility.
Deep Leg Vein Thrombosis (DVT): Meaning, Causes, Symptoms, Treatment, and Mor...The Lifesciences Magazine
Deep Leg Vein Thrombosis occurs when a blood clot forms in one or more of the deep veins in the legs. These clots can impede blood flow, leading to severe complications.
Medical Technology Tackles New Health Care Demand - Research Report - March 2...pchutichetpong
M Capital Group (“MCG”) predicts that with, against, despite, and even without the global pandemic, the medical technology (MedTech) industry shows signs of continuous healthy growth, driven by smaller, faster, and cheaper devices, growing demand for home-based applications, technological innovation, strategic acquisitions, investments, and SPAC listings. MCG predicts that this should reflects itself in annual growth of over 6%, well beyond 2028.
According to Chris Mouchabhani, Managing Partner at M Capital Group, “Despite all economic scenarios that one may consider, beyond overall economic shocks, medical technology should remain one of the most promising and robust sectors over the short to medium term and well beyond 2028.”
There is a movement towards home-based care for the elderly, next generation scanning and MRI devices, wearable technology, artificial intelligence incorporation, and online connectivity. Experts also see a focus on predictive, preventive, personalized, participatory, and precision medicine, with rising levels of integration of home care and technological innovation.
The average cost of treatment has been rising across the board, creating additional financial burdens to governments, healthcare providers and insurance companies. According to MCG, cost-per-inpatient-stay in the United States alone rose on average annually by over 13% between 2014 to 2021, leading MedTech to focus research efforts on optimized medical equipment at lower price points, whilst emphasizing portability and ease of use. Namely, 46% of the 1,008 medical technology companies in the 2021 MedTech Innovator (“MTI”) database are focusing on prevention, wellness, detection, or diagnosis, signaling a clear push for preventive care to also tackle costs.
In addition, there has also been a lasting impact on consumer and medical demand for home care, supported by the pandemic. Lockdowns, closure of care facilities, and healthcare systems subjected to capacity pressure, accelerated demand away from traditional inpatient care. Now, outpatient care solutions are driving industry production, with nearly 70% of recent diagnostics start-up companies producing products in areas such as ambulatory clinics, at-home care, and self-administered diagnostics.
Health Education on prevention of hypertensionRadhika kulvi
Hypertension is a chronic condition of concern due to its role in the causation of coronary heart diseases. Hypertension is a worldwide epidemic and important risk factor for coronary artery disease, stroke and renal diseases. Blood pressure is the force exerted by the blood against the walls of the blood vessels and is sufficient to maintain tissue perfusion during activity and rest. Hypertension is sustained elevation of BP. In adults, HTN exists when systolic blood pressure is equal to or greater than 140mmHg or diastolic BP is equal to or greater than 90mmHg. The
Defecation
Normal defecation begins with movement in the left colon, moving stool toward the anus. When stool reaches the rectum, the distention causes relaxation of the internal sphincter and an awareness of the need to defecate. At the time of defecation, the external sphincter relaxes, and abdominal muscles contract, increasing intrarectal pressure and forcing the stool out
The Valsalva maneuver exerts pressure to expel faeces through a voluntary contraction of the abdominal muscles while maintaining forced expiration against a closed airway. Patients with cardiovascular disease, glaucoma, increased intracranial pressure, or a new surgical wound are at greater risk for cardiac dysrhythmias and elevated blood pressure with the Valsalva maneuver and need to avoid straining to pass the stool.
Normal defecation is painless, resulting in passage of soft, formed stool
CONSTIPATION
Constipation is a symptom, not a disease. Improper diet, reduced fluid intake, lack of exercise, and certain medications can cause constipation. For example, patients receiving opiates for pain after surgery often require a stool softener or laxative to prevent constipation. The signs of constipation include infrequent bowel movements (less than every 3 days), difficulty passing stools, excessive straining, inability to defecate at will, and hard feaces
IMPACTION
Fecal impaction results from unrelieved constipation. It is a collection of hardened feces wedged in the rectum that a person cannot expel. In cases of severe impaction the mass extends up into the sigmoid colon.
DIARRHEA
Diarrhea is an increase in the number of stools and the passage of liquid, unformed feces. It is associated with disorders affecting digestion, absorption, and secretion in the GI tract. Intestinal contents pass through the small and large intestine too quickly to allow for the usual absorption of fluid and nutrients. Irritation within the colon results in increased mucus secretion. As a result, feces become watery, and the patient is unable to control the urge to defecate. Normally an anal bag is safe and effective in long-term treatment of patients with fecal incontinence at home, in hospice, or in the hospital. Fecal incontinence is expensive and a potentially dangerous condition in terms of contamination and risk of skin ulceration
HEMORRHOIDS
Hemorrhoids are dilated, engorged veins in the lining of the rectum. They are either external or internal.
FLATULENCE
As gas accumulates in the lumen of the intestines, the bowel wall stretches and distends (flatulence). It is a common cause of abdominal fullness, pain, and cramping. Normally intestinal gas escapes through the mouth (belching) or the anus (passing of flatus)
FECAL INCONTINENCE
Fecal incontinence is the inability to control passage of feces and gas from the anus. Incontinence harms a patient’s body image
PREPARATION AND GIVING OF LAXATIVESACCORDING TO POTTER AND PERRY,
An enema is the instillation of a solution into the rectum and sig
Realising the potential of Health Data Science: opportunities and challenges to practical adoption
1. Professor Paolo Missier
School of Computing
Newcastle University
October 2023
Realising the potential of Health Data Science:
opportunities and challenges to practical adoption
2. The promise of data-driven medicine and healthcare
Predictive, Preventative, Personalised, Participatory: a systems biology perspective on the future of medicine and health care
Hood L, Heath JR, Phelps ME, Lin B. Systems biology and new technologies enable predictive and preventative medicine. Science. 2004;306(5696):640–643.
Hood L, Balling R, Auffray C. Revolutionizing medicine in the 21st century through systems approaches. Biotechnol J. 2012;7(8):992–1001. Provides an overview of the science and technological foundations of predictive, preventive, personalized and participatory healthcare.
Flores M, Glusman G, Brogaard K, Price ND, Hood L. P4 medicine: how systems medicine will transform the healthcare sector and society. Per Med. 2013;10(6):565-576. doi: 10.2217/pme.13.57. PMID: 25342952; PMCID: PMC4204402.
Schmidt, Charlie. ‘Leroy Hood Looks Forward to P4 Medicine: Predictive, Personalized, Preventive, and Participatory’. JNCI Journal of the National Cancer Institute 106, no. 12 (December 2014): dju416–dju416. https://doi.org/10.1093/jnci/dju416.
[1] Sagner, M, A McNeil, P Puska, and R Arena. ‘The P4 Health Spectrum – A Predictive, Preventive, Personalized and Participatory Continuum for Promoting Healthspan’. Progress in Cardiovascular Diseases 59, no. 5 (2017): 506–21. https://doi.org/10.1016/j.pcad.2016.08.002.
A new approach in medicine that is predictive, preventive, personalized and participatory, which we label here as “P4” holds great promise to reduce the burden of chronic diseases by harnessing technology and an increasingly better understanding of environment-biology interactions, evidence-based interventions and the underlying mechanisms of chronic diseases. [1]
3. Data about us
Sagner, M, A McNeil, P Puska, and R Arena. ‘The P4 Health Spectrum – A Predictive, Preventive, Personalized and Participatory Continuum for Promoting Healthspan’. Progress in Cardiovascular Diseases 59, no. 5 (2017): 506–21. https://doi.org/10.1016/j.pcad.2016.08.002.
4. Outline
• AI for HealthCare: a convergence of needs and opportunities
• A complex multifaceted landscape
• Challenges, opportunities, state of the art through two first-hand case studies
• Costs and Challenges throughout the data value chain
5. Understanding the facets of Health data
• Which data types?
- Clinical
- Lifestyle, social
• Where do datasets come from?
- Prospective vs retrospective
• How much do they cost?
- Acquisition
- Curation, annotation
• How large?
- Small vs Big Health Data
• Who can use it and how?
- Governance
- Protection
Data Science and Engineering: benefits to patients
6. Which data? Capturing individuals’ complexity
Primary care records:
- Clinical tests / GP notes, diagnoses / Prescriptions
Secondary care records:
- hospital admission / diagnoses / operations / prescriptions
Multi-omics data:
- genotypes, exomes, genomes
- Transcriptomics, proteomics
Digital Health:
- Data streams from wearable and environment sensors, self-monitoring
Socio-demographics:
- Area of residence, family, social deprivation
8. CPRD (Clinical Practice Research Datalink)
Data access fee for research ~£60K (non-commercial license)
Population makeup:
over 2,000 primary care practices
60 million patients (18m registered active patients)
at least 20 years of follow-up for 25% of the patients
Core dataset:
Demographics
Diagnoses and symptoms
Drug exposures
Vaccination history
Laboratory tests
Referrals to hospital and specialist care
Data linkages:
Hospital care (A&E; Inpatient; Outpatient; Imaging)
Death registry
Cancer registry and treatment
Mental health services
Socio-economic measures
9. A convergence of needs and opportunities
P4 Data-driven Healthcare, at the convergence of:
• Personal self-monitoring devices
- Medical grade
- Consumer grade
• (Big) Health Data
- Bigger == more useful?
• Health Data Science and Engineering
- Operations Research
- ML, AI Methods
- Scalable computing
• Governance, consent, secure data access
- Privacy (eg GDPR)
- Opt-in vs opt-out
- Trusted Research Environments
11. A complex health data science landscape for translational research
Pipeline stages: Data ingestion, Data preparation / engineering, Descriptive analytics, Pattern discovery, Predictions
Tasks and methods:
• Data ingestion: protocol design, dataset search and selection (prospective / retrospective), data integration
• Data preparation / engineering: data cleaning, data standardisation, data augmentation (annotation amplification, synthetic data)
• Descriptive analytics: population characterization, subgroups identification (patient subtyping, disease subtyping), using ”group by”, clustering, Latent Class Analysis
• Pattern discovery and predictions: risk prediction, next disease prediction, {bio, digital} markers discovery, other outcomes, using process modelling, HMM, established ML, Deep NN, Generative AI (eg BEHRT)
Challenges (data and methods):
• Cross-source integration across types: clinical/EHR/Omics/sensors
• Understanding data semantics
• Data and annotation scarcity
• Managing the quality/quantity/cost envelope
• Bias control
• Data noise
• Advancing the methods: “Better data science for better science”
Challenges (architectures):
• Data governance, computational scalability, Safe Data Environments
• End-to-end explainability, provenance engineering, demonstrating the benefits
• Reproducible Analytics Pipelines (RAP)
12. II. Prospective vs retrospective datasets
Prospective: defined for research purposes
✓ Stable and predictable
✓ Follow protocol
✓ Research ready
✓ Potentially well-curated
✓ Bias known a priori
✗ Expensive
✗ Not very reusable
✗ Scarce
Example: UK Biobank
- 500,000 volunteer participants
- General health information
- Genotypes and whole genomes
- Selected internal organ imaging study (100K)
- Bias: 40+ years, geographic / social bias
Retrospective: typically operational data
- Potentially more reusable
- Natural bias (reflects natural cohort locality)
✗ Generally not research ready
✗ Require data engineering
Example: Clinical Practice Research Datalink
- Data collected from UK GP practices
- 60+ million patients
- (also prospective)
13. Cost of health data
Retrospective: integration/harmonisation, curation, cleaning
Prospective: cost of cohort recruitment, data collection, data processing
Acquisition + processing cost by data type, from low to high:
- Routinely collected clinical variables (GP test)
- Tests requiring specialist labs; proteomics; genotyping (a few genes)
- Whole exome sequencing
- Whole genome sequencing
14. Case study: LITMUS
Retrospective data collected from hospital datasets (N ≅ 10K)
Prospective data from active recruitment (N ≅ 2K)
- Routine clinical tests
- Omics (genotypes, transcriptomes, proteomes)
- Biopsies provide label annotations
• EU IMI2 project
• Non-Alcoholic Fatty Liver Disease (NAFLD / steatosis) and NASH (fibrosis, cirrhosis)
https://litmus-project.eu/litmus-partners/
Main contributor: Matt McTeer, PhD student
From multivariate linear regression to non-linear combinations of markers
18. LITMUS
The landscape of slide 11, annotated for LITMUS:
- Tasks and methods: statistical modelling, multivariate regression
- “Long and thin” vs “short and broad” training sets
- Feature completeness vs importance, imputation
- Binary classifiers across multiple feature sets
19. Issues requiring Data Engineering
Recurring data issues
Data–driven, AI–based clinical practice: experiences, challenges, and research directions
DATA SPARSITY AND SCARCITY
• EHR: irregular collections of time series
• Imputation is not always possible
DATA IMBALANCE
• Predicting rare events can be a priority
• No downsampling option
DATA INCONSISTENCY and INSTABILITY
• Retrospective data are often a source of inconsistency and their schemas are unstable
NOT ALL ERRORS ARE EQUALLY WRONG
• In high-stakes domains, a bias towards one type of error is sometimes preferable
HUMAN-IN-THE-LOOP
• Explanations engender trust in the models
• Trust should include not only the clinician but also the patient
20. Sparsity / scarcity, imbalance
Classifiers are not resilient to class imbalance:
- Models will be biased towards predicting the majority class regardless of the input features
- Will struggle to generalise correctly on the minority class
- In clinical datasets, data scarcity/sparsity often conspires with data imbalance
- Imbalance is very common in medical datasets
Typical mitigation:
- Downsample the majority class (loses training examples)
- Upsample the minority class, eg SMOTE (Synthetic Minority Oversampling Technique)
When modelling processes, these mitigations do not work
We used Hidden Markov Models (HMMs) to predict oxygen-therapy state transitions
However, intubation is an infrequent state (and so is “death”)
This made it difficult to accurately learn probability distributions.
[1] proposes a novel, generic ensemble technique to mitigate the imbalance problem in HMM
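The upsampling mitigation mentioned on this slide can be illustrated with a minimal, SMOTE-style sketch. It is not the reference SMOTE implementation and not the method used in the study: the function name, the toy feature rows and the parameters `k` and `seed` are assumptions made for illustration. Each synthetic row is a random point on the segment between a minority-class row and one of its nearest minority-class neighbours.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority rows by interpolating between a row
    and one of its k nearest minority-class neighbours (needs >= 2 rows)."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority rows
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical toy data: two features per patient, three minority-class rows
minority_rows = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
new_rows = smote_like_oversample(minority_rows, n_new=4)
print(len(new_rows))  # 4
```

Interpolation of this kind treats rows as independent points, which is one reason it does not carry over directly to process models such as HMMs, where the order of states matters.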
21. Instability
Retrospective studies are often unstable:
Data acquisition and management practices may change over time, following changes in
- Clinical practices
- Public policy
- Hospital resources
- Data collection technologies
- In our COVID dataset clinical tests vary daily depending on the patient’s condition
- Scientific evidence for the need of certain tests changed rapidly
- Example: new biomarkers like interleukin-6 were introduced in “mid flight”
- Thus earlier study datasets completely miss this variable
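One simple way to surface this kind of instability is to record when each variable first appears in the data. The sketch below is illustrative only: the record layout (a date paired with the set of variables measured that day), the variable names and the dates are assumptions, not taken from the COVID dataset.

```python
from datetime import date

def first_seen(records):
    """Map each variable name to the earliest date on which it was recorded."""
    earliest = {}
    for day, variables in records:
        for name in variables:
            if name not in earliest or day < earliest[name]:
                earliest[name] = day
    return earliest

def late_variables(records):
    """Variables whose first recording is later than the study's first day."""
    study_start = min(day for day, _ in records)
    return sorted(n for n, d in first_seen(records).items() if d > study_start)

# Hypothetical daily test panels; dates and variable names are made up
records = [
    (date(2020, 3, 1), {"crp", "lymphocytes"}),
    (date(2020, 3, 2), {"crp"}),
    (date(2020, 4, 15), {"crp", "il6"}),
]
print(late_variables(records))  # ['il6']
```

Variables flagged this way are missing by design in earlier records, so they call for a different treatment from values that are missing at random.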
22. Translational challenge: Not all errors are equally wrong
- In high-stakes domains, prediction errors are not symmetric:
- Typically, underestimating risk is less desirable than overestimating it
- Standard model performance metrics (eg AUC, F1 etc) fail to capture this distinction
Cost-sensitive learning (cf eg [1,2,3])
- Introduce an explicit penalty for misclassifying samples
- Note that cost-sensitive methods can sometimes deal with imbalanced datasets without altering the original data distribution [4]
[1] Lomax, S., and Vadera, S. (2013). A survey of cost-sensitive decision tree induction algorithms. ACM Comput. Surveys 45, 1–35. doi: 10.1145/2431211.2431215
[2] Wang, H., Cui, Z., Chen, Y., Avidan, M., Abdallah, A. B., and Kronzer, A. (2018). Predicting hospital readmission via cost-sensitive deep learning. ACM Trans. Comput. Biol. Bioinformatics 15, 1968–1978. doi: 10.1109/TCBB.2018.2827029
[3] Freitas, A., Costa-Pereira, A., and Brazdil, P. (2007). “Cost-sensitive decision trees applied to medical data,” in Data Warehousing and Knowledge Discovery (Regensburg), 303–312. doi: 10.1007/978-3-540-74553-2_28
[4] Mienye, I. D., and Sun, Y. (2021). Performance analysis of cost-sensitive learning methods with application to imbalanced medical data. Inform. Med. Unlock. 25:100690. doi: 10.1016/j.imu.2021.100690
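The idea of an explicit misclassification penalty can be sketched without any particular learning algorithm, by choosing the decision threshold that minimises total cost on labelled data. This is a simplified stand-in for the cost-sensitive methods cited above, not a reproduction of any of them; the labels, risk scores and cost values are made up for illustration.

```python
def expected_cost(y_true, risk, threshold, cost_fn, cost_fp):
    """Total cost when predicting positive for every risk >= threshold."""
    cost = 0.0
    for y, p in zip(y_true, risk):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 0:
            cost += cost_fn  # missed case: risk was underestimated
        elif y == 0 and predicted == 1:
            cost += cost_fp  # false alarm: risk was overestimated
    return cost

def best_threshold(y_true, risk, cost_fn, cost_fp):
    """Candidate threshold with the lowest total cost (1.1 = never positive)."""
    candidates = sorted(set(risk)) + [1.1]
    return min(candidates,
               key=lambda t: expected_cost(y_true, risk, t, cost_fn, cost_fp))

# Hypothetical labels and predicted risks for seven patients
y_true = [0, 0, 1, 0, 0, 1, 1]
risk = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9]
print(best_threshold(y_true, risk, cost_fn=1, cost_fp=1))  # 0.7
print(best_threshold(y_true, risk, cost_fn=5, cost_fp=1))  # 0.3
```

With symmetric costs the threshold settles at 0.7 and one true case is missed. Making a missed case five times as costly as a false alarm moves it down to 0.3, trading two false alarms for no missed cases.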
23. Translational challenge: human-in-the-loop AI
• Essential in medical AI
• Evidence of performance is not enough
• Black-box AI not acceptable in clinical practice
“Explanation gap”:
From technical explanations:
• non-linear [1] and Deep Learning [2] models
• Shapley values [3]
• Interpretable ML [4,5]
To expert involvement in the learning process:
- By accepting/rejecting predictions
- By expressing preference for a given error type
Also importantly:
Patient and Public Involvement (PPI) is essential in publicly funded clinical research
Causal Machine Learning (CML) [6,7]:
- Visualisation and reasoning over complex clinical scenarios
- Counterfactuals, what-if scenarios
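As a concrete illustration of the Shapley values listed among the technical explanations on this slide, the sketch below computes them exactly for a tiny model by averaging each feature's marginal contribution over all feature orderings. The toy risk function and feature names are hypothetical. Exact enumeration only scales to a handful of features, so practical tools rely on approximations.

```python
from itertools import permutations

def shapley_values(value, features):
    """Exact Shapley values: average marginal contribution of each feature
    over all orderings. `value` maps a frozenset of features to a number."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = frozenset()
        for f in order:
            with_f = present | {f}
            totals[f] += value(with_f) - value(present)
            present = with_f
    return {f: t / len(orderings) for f, t in totals.items()}

# Hypothetical risk score: additive effects plus an age x diabetes interaction
def toy_risk(present):
    score = 0.0
    if "age" in present:
        score += 0.2
    if "diabetes" in present:
        score += 0.1
    if {"age", "diabetes"} <= present:
        score += 0.1
    return score

phi = shapley_values(toy_risk, ["age", "diabetes", "bmi"])
print({name: round(v, 3) for name, v in phi.items()})
```

The interaction term is split equally between the two features involved, the unused feature receives zero, and the values sum to the score of the full feature set.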
24. Multimorbidities and disease prediction (IEEE BigData 2022)
Multiple Long-Term Conditions, defined as [1,2]:
• Two/Four or more long-term (chronic) conditions
A Long Term Condition (LTC) is a condition that cannot, at present, be cured but is controlled by medication and/or other treatment/therapies (*)
Significant research investment by NIHR, the core translational medicine funder in the UK
The number of people with multiple LTCs in the UK is set to rise to 2.9 million in 2018 from 1.9 million in 2008.
(*) NHS and UK Dept. of Health, Long Term Conditions Compendium of Information Third Edition,
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/216528/dh_134486.pdf
[1] M. C. Johnston, M. Crilly, C. Black, G. J. Prescott, and S. W. Mercer, “Defining and measuring multimorbidity: a systematic review of systematic reviews,” European journal of public health, vol. 29, no. 1, pp. 182–189, 2019.
[2] B. P. Nunes, T. R. Flores, G. I. Mielke, E. Thumé, and L. A. Facchini, “Multimorbidity and mortality in older adults: A systematic review and meta-analysis.” Archives of gerontology and geriatrics, vol. 67, pp. 130–138, Dec. 2016, place: Netherlands.
25. Multiple Long Term Conditions: research at Newcastle
Characterising the inter-relationships between multiple long-term conditions (MLTC) and polypharmacy
Funding: NIHR, 2022-2024 (CO-I)
Disease clustering based on co-occurrence in patients’ medical timelines
Patient clustering based on timeline similarity
Predicting outcomes using diagnoses + prescriptions / deep learning
• Disease embeddings supervised learning for outcome prediction
Characterise patient pathways through hospital process modelling
26. Datasets, AI methods, Outputs & Impacts
Datasets:
- Discover (national discovery datasets): UK-Biobank, CPRD
- Test (local health intelligence datasets): GNCR, ELPR
- Replicate (replication datasets): NIHR AIM CISC, Connected Bradford, NIHR AIM OPTIMAL
AI methods:
- Event spatial & temporal order
- Event Characterisation
- Event Prediction
Outputs & Impacts (within 5 years):
- Shared standards
- Portable pipelines
- Identification of high-risk situations & tipping points
- Trial emulation
- Local and national policies (high risk groups)
- Training & capacity building
- Clinical dashboard
- Clinical support tools
- Communication of results
- Explainable research & Explainable AI
Aims: improve patient care, reduce health inequalities
27. Predicting MLTC-PP outcomes: hospital readmission
Model inputs:
- LTC embedding: 200x100
- Diagnosis embedding: 251x50
- Historical prescriptions embedding: 512x50
- Preadmission prescriptions embedding: 512x50
- Postadmission prescriptions embedding: 512x50
- Demographics vector: sex, ethnicity, Townsend, etc.
Feature vector (size: 6048), then Gradient Boost (size: 100 estimators), then output: {0, 1}
• Neural network + xgboost combination
• Our readmission cohort is more general vs domain specific cohorts in literature
• Our model performs better than current literature in spite of more complex problem
Adding explanations: which predictors are more relevant to explain unplanned readmission?
- LTCs and how they accumulate
- Prescriptions given between discharge and readmission?
28. AI-MULTIPLY
The landscape of slide 11, annotated for AI-MULTIPLY:
- Challenges in drug coding in UKBB
- Defining and predicting hospital readmission
- Defining and coding MLTC
- Reproduce DNN results across sites
- Disease clustering and cluster prediction
29. Challenges and opportunities
Data:
• Multi-site research presents opportunities for cross-validation of results, but also challenges
• Newcastle: UK Biobank
• QMUL: CPRD
• Projects like these tend to “piggyback” on existing data licenses, which may be restrictive
Modelling:
• LLMs and genAI have shown potential to “sidestep” some of the more traditional prediction techniques
Next disease prediction becomes a case of “sentence completion”
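The “sentence completion” framing can be made concrete with a deliberately simple stand-in for a language model: a bigram counter over diagnosis sequences. This is an illustrative sketch, not an LLM or BEHRT, and the patient timelines and diagnosis labels are invented. It only shows how a patient timeline can be treated as a sentence whose next “word” is predicted.

```python
from collections import Counter, defaultdict

def fit_bigrams(timelines):
    """Count how often each diagnosis is immediately followed by another."""
    follows = defaultdict(Counter)
    for timeline in timelines:
        for current, nxt in zip(timeline, timeline[1:]):
            follows[current][nxt] += 1
    return follows

def predict_next(follows, last_diagnosis):
    """Most frequent follower of last_diagnosis, or None if it was never seen."""
    counts = follows.get(last_diagnosis)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical patient timelines, each an ordered list of diagnoses
timelines = [
    ["hypertension", "type2_diabetes", "ckd"],
    ["hypertension", "type2_diabetes", "retinopathy"],
    ["obesity", "type2_diabetes", "ckd"],
]
model = fit_bigrams(timelines)
print(predict_next(model, "type2_diabetes"))  # ckd
```

A transformer-based model conditions on the whole history rather than only the last event, but the input and output have the same shape: a sequence of codes in, a likely next code out.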
Engineering / reproducibility:
• at this stage, prototyping and experimenting are distributed across sites and each piece is owned by one researcher
• reproducibility and reusability both seem like distant goals…
Patient and Public Involvement and Engagement:
• Establishing a productive and sustained relationship between PPIE members and researchers is a priority
30. Role of PPIE in Health Data Science / AI projects
PPIE involvement “built into” every NIHR-funded project: it’s an asset and opportunity
BUT: need to make it work!
What kind of involvement? Consultation vs research co-design
• Periodic, scheduled “themed” sessions at designated project checkpoints
• Key research questions defined upfront, but opportunities to revise / refine mid-flight
• The academic perspective and the lived experiences are very different
• Need to find a common language
• But also to find a way to ensure mutual benefit and a two-way learning experience
31. PPIE: some elements for reflection
Engendering trust in AI and in secure data management practices
• Where is your data held? How do TREs work?
• What are the boundaries of legitimate use of your data for research? How is the law changing?
• Transparency and explainability: How can we achieve effective communication on what an algorithm is doing?
What outcomes are most relevant? Are those aligned with the data we work with?
• Ex.: ensuring good Quality of Life for LTC patients: very important, but data hardly available
Medication / prescriptions:
• Meeting expectations like “predict the best combination of medicines” presents hard challenges
Data limitations: “you don’t know half my story”
32. Data governance issues: the emerging UK landscape
https://www.goldacrereview.org/
Build a small number of Trusted Research Environments, avoiding duplication
Promote culture of reuse of code (curation pipelines, analytics)
- “Reproducible Analytical Pipelines”, a set of best practices
- Promote high quality, shared, reviewable, re-usable, well-documented code for standardized data curation and analysis
- Promote transparency, avoid black box analysis
Adopt single governance rules for integrated data access
- Rationalise approvals: create one map of all approval processes
Build appropriate capabilities:
- Train academic researchers and NHS analysts in computational data science techniques
38. Summary
Enablers:
Data availability
Scalable data processing technology
Inexpensive, accurate self-monitoring
Mature data science and engineering methods
Rapidly advancing AI
A unique convergence of opportunities and challenges to achieve a “P4” vision of data-driven medicine and healthcare management
Blockers:
Data access and governance, data integration
Data Quality control, device tolerance, intrusiveness
Data engineering expensive and ad hoc
Still very experimental. Trustworthy, Ethical, Responsible AI
Hard “management” questions:
- how do you calculate the “total cost of operation” for data-driven medicine?
- at which point does it become cost-effective for the health service?
- what are the real benefits to patients?
- …
Editor's Notes
Mention "reusable analysis pipelines" (RAP)
NHS data in the UK are a prime example of retrospective data. In principle they are accessible for research, but there are governance issues and the data require coding and integration.
This slide illustrates the range of national and local datasets together with our planned outputs and impacts. We will use UK Biobank and CPRD for discovery, our local “health-intelligence” datasets in the North East and East London for testing, and then replicate by sharing code with colleagues in Edinburgh, Bradford and Birmingham.
Applying a range of AI methods, our outputs and impacts are wide ranging. Specifically, we will identify high-risk situations and tipping points to inform local and national policies. Together these outputs will improve patient care and reduce health inequalities.