Data science for pathogen genomic surveillance:
predicting quantitative phenotype from genotype
Eric J. Ma, Islam T. M. Hussein, Jonathan A. Runstadler
Department of Biological Engineering, MIT
Analysis: HIV Drug Resistance
Cross-Validated Prediction Performance
•	 Good prediction performance: high correlation, low error.
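The two summary statistics named above (correlation and error) can be computed on held-out predictions roughly as follows. This is a minimal sketch: the function names and the numbers are hypothetical and chosen only for illustration; they are not results from the poster.

```python
import math

def pearson_r(measured, predicted):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)

def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

# Hypothetical held-out values (illustration only, not data from the poster).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r = pearson_r(measured, predicted)
err = rmse(measured, predicted)
print(f"correlation = {r:.3f}, RMSE = {err:.3f}")
```

In a cross-validated setting these statistics would be computed only on samples the model did not see during fitting.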
Global Drug Resistance Prediction
•	 Model predictions are largely concordant with one another.
Important Amino Acids
•	 Match between expert-identified important positions and model predictions.
Position   Rel. importance   In database
10         47%               Y
33         12%               Y
88         5%                Y
84         3%                Y
47         2%                Y
46         2%                Y
54         2%                Y
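One way a per-position relative importance could be derived is by summing per-feature weights over all residues at a position and normalizing to percentages. The sketch below assumes features are keyed by (position, residue); that keying, the function name, and the toy weights are assumptions for illustration, not the poster's actual model output.

```python
from collections import defaultdict

def position_importance(feature_importance):
    """Sum per-(position, residue) weights by position and convert to percentages."""
    totals = defaultdict(float)
    for (position, _residue), weight in feature_importance.items():
        totals[position] += weight
    grand_total = sum(totals.values())
    # Rank positions from most to least important.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(pos, 100.0 * w / grand_total) for pos, w in ranked]

# Hypothetical feature weights (illustration only).
toy = {(10, "F"): 0.3, (10, "I"): 0.2, (33, "F"): 0.3, (88, "S"): 0.2}
ranking = position_importance(toy)
for pos, pct in ranking:
    print(f"position {pos}: {pct:.0f}%")
```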
Temporal Emergence of Drug Resistance
•	 FDA approval dates - IDV (indinavir): 1996, FPV (fosamprenavir): 2003, DRV
(darunavir): 2006 (arrows)
•	 Drug resistance emerged a few years after approval
•	 FPV and DRV have similar chemical structures
•	 Goal: Establish example pipeline for genomic surveillance
•	 Input: HIV protease sequence & drug resistance profile
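Before a model can use a protein sequence as input, the sequence has to be turned into numbers. A common choice is one-hot encoding, sketched below. The poster does not state which encoding was used, so this is an assumption, and the example sequence is a toy string rather than a real protease sequence.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def one_hot(sequence, alphabet=AMINO_ACIDS):
    """Flatten a protein sequence into a 0/1 vector, one block per position."""
    index = {aa: i for i, aa in enumerate(alphabet)}
    vector = []
    for residue in sequence:
        block = [0] * len(alphabet)
        if residue in index:  # unknown symbols (e.g. 'X' or gaps) stay all-zero
            block[index[residue]] = 1
        vector.extend(block)
    return vector

# Toy sequence (hypothetical): 4 positions x 20 residues = 80 features.
features = one_hot("MKAK")
print(len(features), sum(features))
```

Sequences must be aligned to a common length so that each feature column refers to the same position in every sample.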
Conclusions & Future Work
•	 Machine learning models can predict drug resistance from protein sequence.
•	 Genomic surveillance can capture the temporal rise of drug resistance.
•	 Applicable to other pathogens, given high-quality genotype-phenotype data.
Genomic Surveillance
•	 Zoonotic pathogens circulating in the wild may affect livestock and human health.
•	 Given sequence information, can we compute a pathogen “risk factor”?
•	 Given a computed risk, can we do preventative surveillance of zoonotic pathogens?
Influenza
Influenza Genome Structure
1 PB2 2.4 kb
2 PB1 2.4 kb
3 PA 2.2 kb
4 HA 1.8 kb
5 NP 1.6 kb
6 NA 1.5 kb
7 M 1.0 kb
8 NS 0.9 kb
Reassortment
•	 Influenza is a zoonotic pathogen that has a broad tropic range.
•	 Segmented genome allows reassortment, accelerating viral evolution.
•	 A high polymerase mutation rate rapidly generates novel sequence diversity.
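To give a sense of how much diversity reassortment alone can create, the sketch below enumerates segment combinations for two co-infecting strains. It is an idealized count: it assumes every combination of the eight segments is possible and ignores segment incompatibilities; the strain labels "A" and "B" are placeholders.

```python
from itertools import product

SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]

def reassortants(parents=("A", "B")):
    """Enumerate every assignment of a parent strain to each of the eight segments."""
    return [dict(zip(SEGMENTS, combo))
            for combo in product(parents, repeat=len(SEGMENTS))]

genotypes = reassortants()
# A genotype is novel if its segments come from more than one parent.
novel = [g for g in genotypes if len(set(g.values())) > 1]
print(len(genotypes), len(novel))  # 256 total, 254 novel
```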
Introduction
Difficulties
•	 Necessity: The presence of a point mutation may
“enhance” phenotype, but not necessarily cause
“dangerous” phenotype levels (right).
•	 Epistasis: Mutations interact, so the mapping from genotype to phenotype is not additive.
•	 Experiments: Require assays to measure
biochemical phenotype relevant to pathogenesis
without infecting humans.
•	 Data: Matched genotype-phenotype data are lacking.
•	 Biology: Novel sequence diversity generated
through error-prone polymerase.
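The epistasis difficulty can be made concrete with the usual pairwise definition: the deviation of a double mutant from what the two single mutants would predict if their effects simply added. The sketch below uses hypothetical phenotype values; nothing here is measured data.

```python
def pairwise_epistasis(wt, single_a, single_b, double_ab):
    """Deviation of the double mutant from the additive expectation.

    Additive expectation: wt + (single_a - wt) + (single_b - wt).
    """
    return double_ab - single_a - single_b + wt

# Hypothetical phenotype values (arbitrary units).
additive_case = pairwise_epistasis(wt=1.0, single_a=1.5, single_b=1.2, double_ab=1.7)
epistatic_case = pairwise_epistasis(wt=1.0, single_a=1.5, single_b=1.2, double_ab=3.0)
print(round(additive_case, 6), round(epistatic_case, 6))
```

A value near zero means the two mutations act independently; a large value means a model that scores mutations one at a time would mispredict the double mutant.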
[Figure labels: Genotypes, HT Assay, Phenotype; training data for ML; ML predicts phenotype]
Gene(s)   Phenotype              HT assay
HA        receptor affinity      α(2,6) binding
NA        inhibitor resistance   α(2,6) cleavage
HA, NA    antigenic distance     hemagglutinin/neuraminidase inhibition
Pol       replication activity   polymerase assay
Significance: infection potential, treatability, disease burden, immunity
Risk Prediction
[Figure: example sequences (MERI..., MKAK..., MNPN...) shown on axes labeled risk and phenotype]
Application
[Figure: Risk Profile and Informed Intervention. Example: NS1 (MDSN...); phenotype: dampening of IFN-β production; significance: innate immunity; axes labeled risk and phenotype]
Vision
•	 Assay: Biochemical, quantitative measure relevant to pathogenesis
•	 Characterize: population diversity
•	 Machine learning: learn non-linear mapping from genotype to phenotype
•	 Model: quantitative risk profile
Experimentation Plan
[Figure: pipeline stages: Rational Library (ATGGTAACCA), PacBio Sequencing, Polymerase Assay, Genotype-Phenotype, Machine Learning, Web Query]
Sequence   PEU
MERIKEL    26
MERIREL    10
MDRIKEL    9
MERIKNL    15
•	 Rational sampling to cover polymorphic diversity.
•	 High throughput library construction & verification.
•	 Safe, scalable, standardized assay of RNA replication rate.
•	 Matched phenotype to genotype
•	 Machine learning models to predict RNA replication rate.
•	 Open data release via web interface & API
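The last three steps (matched genotype-phenotype data, a predictive model, a query) can be sketched with a deliberately simple stand-in model. The poster does not specify the model; nearest-neighbour lookup by Hamming distance is used here only because it fits in a few lines. The four sequences and values are the ones shown in the poster figure, paired in the order listed (an assumption), and the query sequence is hypothetical.

```python
# Matched genotype-phenotype table from the figure (pairing in listed order is assumed).
LIBRARY = {"MERIKEL": 26, "MERIREL": 10, "MDRIKEL": 9, "MERIKNL": 15}

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))

def predict_nearest(query, library):
    """Average the phenotype of the library sequences closest to the query."""
    distances = {seq: hamming(query, seq) for seq in library}
    best = min(distances.values())
    nearest = [seq for seq, d in distances.items() if d == best]
    return sum(library[s] for s in nearest) / len(nearest)

# Hypothetical query: one mutation away from both MERIREL and MDRIKEL.
print(predict_nearest("MDRIREL", LIBRARY))
```

A lookup like this cannot extrapolate beyond the library, which is why rational sampling of polymorphic diversity matters for whatever model is eventually trained.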

BE Retreat 2015 Poster
