A MACHINE LEARNING APPROACH TO PREDICTING COVID-19 CASES AMONGST SUSPECTED CASES AND THEIR CATEGORY OF ADMISSION
Dr. D Narayana, Amanpreet Singh, Aparna Ranjith, Divya Dixit, Abhay, Suparna Khan, Anwesh Reddy
Presented by Amanpreet Singh
COVID-19 timeline
DEC ’19: First documented Covid-19 hospital admission in Wuhan, Hubei
JAN ’20: Multiple cases registered from countries around the globe
FEB ’20: WHO named the disease Coronavirus Disease 2019 (COVID-19)
MAR ’20: UN Global Humanitarian Response Plan launched by WHO; complete lockdown imposed in several countries
APR ’20: Evidence of transmission from symptomatic, pre-symptomatic and asymptomatic people infected with Covid-19 was reported
Agenda
Objective
About the Data
Methodology
Results
Conclusion
Future Enhancements
Objective
● Predict confirmed COVID-19 cases amongst suspected cases based on the laboratory tests of their clinical samples.
● Predict admission to general, semi-ICU, and ICU wards among those predicted positive for COVID-19 in the first task.
The World Health Organization has emphasized the need for comprehensive testing in order to fight the virus. With testing kits in short supply worldwide, there is a call for novel testing methods that can help arrest the spread effectively.
About the data
Source: Hospital Israelita Albert Einstein, São Paulo, Brazil
Date range: data collected between March 28 and April 3, 2020
Total records: 5644
Attributes: 111
Data points used for prediction:
Subset I: 20 attributes, 598 records
Subset II: 48 attributes, 254 records
Standard deviation: 1
Mean: 0
Challenges: Missing values and imbalanced data
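The data is described as having a mean of 0 and a standard deviation of 1, which is what z-score standardization produces. A minimal sketch of that rescaling follows; the deck does not say how the hospital normalized the data, so the function name and the sample readings are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mu) / sigma for v in values]


# Hypothetical raw haemoglobin readings, not taken from the dataset.
raw_haemoglobin = [11.2, 13.5, 14.1, 12.8, 15.0]
z_scores = standardize(raw_haemoglobin)
```

After this step every attribute is on the same scale, so a value of 1.5 means "1.5 standard deviations above the average patient" regardless of the original unit.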
Clinical Parameters
● Complete Blood Count Tests
● Liver Function Tests
● Renal Analysis (Kidney)
● Influenza Tests
● Blood gas analysis
● Salt tests
Logical Grouping of Attributes
Blood: Haemoglobin, Red Blood Cells, Lymphocytes, C-Reactive Protein, Creatinine, Potassium
Liver Function: Alanine Transaminase, Aspartate Transaminase, Total Bilirubin, Indirect Bilirubin, Alkaline Phosphatase, GGT
Influenza: RSV, Influenza A, Influenza B, Rhinovirus/Enterovirus, Adenovirus, Parainfluenza 3
Renal (Urine): Urine-pH, Urine-Aspect, Urine-Color, Urine-RBC, Urine-Density, Urine-Haemoglobin
Statistical analysis
Ward Distribution of Patients
Of all the positive patients:
● 1.4% were admitted to ICU
● 1.4% were admitted to Semi-ICU
● 6.5% were admitted to Regular Ward
Methodology
● Data Cleaning
● Imbalanced Data – Sampling (Upsampling and Downsampling)
● Imputation
● Classification
● Results
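The sampling step above can be sketched as follows. This is an illustration only: the deck does not say which resampling routine was used, so the function `resample_to_balance` and the toy patient rows are hypothetical. Only the column name 'sars_cov2' comes from the slides.

```python
import random


def resample_to_balance(rows, label_key, mode="up", seed=0):
    """Balance classes by upsampling minorities or downsampling majorities."""
    if mode not in ("up", "down"):
        raise ValueError("mode must be 'up' or 'down'")
    rng = random.Random(seed)

    # Group rows by class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    sizes = [len(group) for group in by_class.values()]
    target = max(sizes) if mode == "up" else min(sizes)

    balanced = []
    for group in by_class.values():
        if len(group) < target:
            # Upsampling: draw extra rows with replacement.
            group = group + rng.choices(group, k=target - len(group))
        elif len(group) > target:
            # Downsampling: keep a random subset without replacement.
            group = rng.sample(group, target)
        balanced.extend(group)
    rng.shuffle(balanced)
    return balanced


# Toy data: 8 negative and 2 positive cases (hypothetical).
patients = [{"id": i, "sars_cov2": 0} for i in range(8)]
patients += [{"id": 100 + i, "sars_cov2": 1} for i in range(2)]

upsampled = resample_to_balance(patients, "sars_cov2", mode="up")
downsampled = resample_to_balance(patients, "sars_cov2", mode="down")
```

Resampling is normally applied to the training split only, so that duplicated rows cannot leak into the test set and inflate the reported accuracy.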
Feature Selection Criteria
The first subset takes the features related to the CBC, along with age quantile, three categorical variables representing hospital admission, and the target variable (‘sars_cov2’). These features carry information on RBC, WBC and platelets. A total of 20 features were chosen, with 598 records that have no missing values.
The second subset includes features related to blood, flu and liver parameters, along with age quantile, three categorical variables representing hospital admission, and the target variable (‘sars_cov2’). It contains 48 attributes with 254 records. After careful filtering, this larger subset still had missing values, which were eventually treated using a placeholder value.
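Building a subset with no missing values amounts to picking a list of columns and keeping only the rows that are complete for all of them. A minimal sketch, assuming missing entries are stored as `None`; apart from 'sars_cov2', the column names and rows are hypothetical.

```python
def complete_cases(rows, columns):
    """Keep the chosen columns, and only rows where none of them is missing."""
    kept = []
    for row in rows:
        if all(row.get(col) is not None for col in columns):
            kept.append({col: row[col] for col in columns})
    return kept


# Hypothetical records; None marks a test that was not performed.
records = [
    {"age_quantile": 4, "hematocrit": 0.2, "platelets": -0.5, "sars_cov2": 0},
    {"age_quantile": 9, "hematocrit": None, "platelets": 1.1, "sars_cov2": 1},
    {"age_quantile": 7, "hematocrit": -1.3, "platelets": 0.4, "sars_cov2": 1},
]
subset_columns = ["age_quantile", "hematocrit", "platelets", "sars_cov2"]
subset = complete_cases(records, subset_columns)
```

The more columns a subset requires, the fewer complete rows survive, which is consistent with the second subset having more attributes but fewer records than the first.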
For the ward prediction task, the ICU, semi-ICU and regular ward columns were merged into one categorical column with ordinal values 0, 1, 2, 3.
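The merge into a single ordinal column can be sketched as below. The slides give only the values 0 to 3, not which value stands for which ward, so the mapping here (0 = not admitted, 1 = regular ward, 2 = semi-ICU, 3 = ICU) is an assumption, as are the function and parameter names.

```python
# Assumed mapping; the deck does not state which code means which ward.
NOT_ADMITTED, REGULAR_WARD, SEMI_ICU, ICU = 0, 1, 2, 3


def encode_ward(regular, semi_icu, icu):
    """Collapse three binary admission flags into one ordinal code."""
    if regular + semi_icu + icu > 1:
        raise ValueError("a patient should be flagged for at most one ward")
    if icu:
        return ICU
    if semi_icu:
        return SEMI_ICU
    if regular:
        return REGULAR_WARD
    return NOT_ADMITTED


# Hypothetical flag triples: (regular, semi_icu, icu).
flags = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
ward_column = [encode_ward(*f) for f in flags]
```

Ordering the codes by level of care lets a multi-class classifier treat the column as a single target while keeping the severity ordering readable.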
RESULTS - AUC/ROC Metrics
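For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise definition follows; the labels and scores are made up for illustration and are not the study's predictions.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up labels and predicted probabilities.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)
```

Because it depends only on how cases are ranked, the AUC is less sensitive to class imbalance than raw accuracy, which matters for a dataset with few positive cases.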
Prediction Accuracy
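An accuracy figure is often reported with a 95 percent confidence interval. One common way to get such an interval is the normal approximation to the binomial proportion, sketched below. The deck does not say how its intervals were derived or how large the test set was, so both the method and the counts (73 correct out of 77) are assumptions chosen only for illustration.

```python
import math


def accuracy_confidence_interval(correct, total, z=1.96):
    """Normal-approximation interval for a proportion (z=1.96 for ~95%)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts: 73 correct predictions on 77 test records.
low, high = accuracy_confidence_interval(correct=73, total=77)
```

The interval narrows as the test set grows, which is one reason a larger dataset makes the reported accuracy more trustworthy.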
Model Performances
Model                 Training Accuracy (%)   Testing Accuracy (%)
Logistic Regression   91.52                   89.61
Decision Tree         90.39                   83.11
Random Forest         97.17                   94.80
Bagging               100                     96.10
Gradient Boost        100                     94.80
ADA Boosting          98.30                   93.50
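The table can be read more easily by ranking the models on testing accuracy and looking at the gap between training and testing accuracy, where a large gap suggests overfitting. A short sketch using the figures from the table; the helper name is hypothetical.

```python
# (training accuracy %, testing accuracy %) as listed in the table.
results = {
    "Logistic Regression": (91.52, 89.61),
    "Decision Tree": (90.39, 83.11),
    "Random Forest": (97.17, 94.80),
    "Bagging": (100.0, 96.10),
    "Gradient Boost": (100.0, 94.80),
    "ADA Boosting": (98.30, 93.50),
}


def rank_by_test_accuracy(table):
    """Return (model, train, test, gap) rows sorted by testing accuracy."""
    rows = [
        (name, train, test, round(train - test, 2))
        for name, (train, test) in table.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


ranked = rank_by_test_accuracy(results)
for name, train, test, gap in ranked:
    print(f"{name:<20} train={train:6.2f} test={test:6.2f} gap={gap:5.2f}")
```

A training accuracy of 100 percent alongside a lower testing accuracy is worth checking against a held-out or cross-validated estimate before drawing conclusions.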
Conclusion
● Our results demonstrate that adding more biomarkers can help improve the prediction of COVID-19 using a machine learning approach.
● Further, it can help in predicting the severity of the disease (ward).
● Adding these methods to testing protocols can minimize contact for healthcare workers.
● The model’s output can be used as a tool for prioritization and to support further medical decision-making processes.
Future Enhancements
● Larger Dataset
● More ethnically representative sample data
● Capturing critical features like D-dimer, CRP
● Including X-ray
● Considering co-morbidities
Thank You


Editor's Notes

  1. Hello everyone, my name is Amanpreet Singh. In this study we propose a machine learning approach to predicting Covid-19 cases among a sample population who have undergone other clinical tests and blood spectrum tests.
  2. Before moving forward, let’s talk a little about Covid-19. December ’19: the first documented COVID-19 patient was admitted to hospital in Wuhan, Hubei Province, China. January ’20: multiple cases were registered from countries around the globe, and WHO declared the COVID-19 outbreak the sixth public health emergency of international concern. February ’20: WHO named the disease caused by 2019-nCoV Coronavirus Disease-2019 (COVID-19). March ’20: the UN Global Humanitarian Response Plan was launched by the WHO, and complete lockdown was introduced in many countries. April ’20: evidence of transmission from symptomatic, pre-symptomatic and asymptomatic people infected with COVID-19 was reported, and WHO published a draft landscape of COVID-19 candidate vaccines.
  3. Developing countries with fewer testing resources have adopted conservative testing methods, reserving testing kits for symptomatic and mild-to-severe cases. As more evidence surfaces around the symptoms and associated risks of Covid-19, it has been observed that blood examination can play a key role in the route to extensive testing.
  4. The patient data used in this effort was donated by Hospital Israelita Albert Einstein, São Paulo, Brazil, for the purpose of research. The blood and clinical samples were collected from patients who visited the hospital with a suspected Covid infection. The data was made available in a standardized and normalized form, with a unit standard deviation and a mean of zero. There was an imbalance in class distribution across both target columns, ‘sars_cov2’ and ‘Ward’.
  5. The features, which comprise various clinical parameters, can be broadly categorized under CBC, liver function, renal analysis, salt tests, blood gas analysis (arterial and venous), and influenza tests. The CBC is indicative of an individual’s overall health and can detect a range of potential disorders, including but not limited to anaemia, leukaemia, or inflammation.
  6. Attributes with more than 98% missing values were eliminated. After data cleaning, we observed an imbalance in class distribution across both target columns, ‘sars_cov2’ and ‘Ward’, so we performed sampling there, both upsampling and downsampling. The missing data was not imputed, as imputation may compromise the results. Binary classification is used for the first problem statement, and multi-class classification for the second. For the results, we used AUC/ROC metrics along with the prediction accuracy of our models.
  7. Scenario-1: Area under the ROC-curve for Covid-19 Predictions was observed to be 87.58%. Scenario-2: Area under the ROC-curve for Covid-19 Predictions was observed to be 97.82%.
  8. Prediction accuracy is also an important factor in determining the efficiency of the model. When biomarkers are taken into consideration, we are able to predict with 87.0 to 97.4 percent accuracy, at a 95 percent confidence level, that a patient is suffering from Covid-19. Among those who tested positive, our model could predict with 87.0 to 100 percent accuracy, at a 95 percent confidence level, whether the patient would be admitted to a particular ward.
  9. We performed classification using different machine learning algorithms and compared the training and testing accuracy of the classifiers. The best accuracy was obtained using the Random Forest classifier, with a training accuracy of 97.17% and a testing accuracy of 94.80%.