Atul Butte NIPS 2017 ML4H

University of California, San Francisco
Translating a Trillion Points of Data into
Diagnostics, Therapies and New Insights
in Health and Disease
atul.butte@ucsf.edu
@atulbutte
Atul Butte, MD, PhD
Director, Institute for Computational
Health Sciences
University of California, San Francisco
Conflicts of Interest
• Scientific founder and
advisory board membership
– Genstruct
– NuMedii
– Personalis
– Carmenta
• Honoraria for talks
– Lilly
– Pfizer
– Siemens
– Bristol Myers Squibb
– AstraZeneca
– Roche
– Genentech
– Warburg Pincus
• Past or present consultancy
– Lilly
– Johnson and Johnson
– Roche
– NuMedii
– Genstruct
– Tercica
– Ecoeos
– Helix
– Ansh Labs
– Prevendia
– Samsung
– Assay Depot
– Regeneron
– Verinata
– Pathway Diagnostics
– Geisinger Health
– Covance
– Wilson Sonsini Goodrich & Rosati
– Orrick
– 10X Genomics
– Medgenics
– GNS Healthcare
– Gerson Lehrman Group
– Coatue Management
• Corporate Relationships
– Northrop Grumman
– Aptalis
– Allergan
– Astellas
– Thomson Reuters
– Intel
– SAP
– SV Angel
– Progenity
– Illumina
• Speakers’ bureau
– None
• Companies started by students
– Carmenta
– Serendipity
– Stimulomics
– NunaHealth
– Praedicat
– MyTime
– Flipora
– Tumbl.in
@affymetrix
bit.ly/genedata
The Cancer Genome Atlas
• 14 thousand cases
• 39 types of cancers
• 13 types of data: molecular, clinical, sequencing
227 million substances x
1.3 million assays
More than a billion measurements
within a grid of 300 trillion cells
71 million meet Lipinski 5
1.2 million active substances
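
The grid arithmetic on this slide can be checked directly. The sketch below uses only the slide's rounded figures; the variable names are illustrative, and "more than a billion measurements" is treated as a lower bound of one billion, which is an assumption. The product comes to roughly 2.95 × 10^14 cells, which the slide rounds to 300 trillion, so only a few cells per million hold a measurement.

```python
# Rounded figures from the slide; variable names are illustrative.
substances = 227_000_000
assays = 1_300_000
measurements_floor = 1_000_000_000  # "more than a billion", used as a lower bound
lipinski_ok = 71_000_000
active = 1_200_000

# Every substance could in principle be tested in every assay.
grid_cells = substances * assays

# Lower bound on the share of the grid that actually holds a measurement.
fill_floor = measurements_floor / grid_cells

print(f"grid cells: {grid_cells:.3e}")
print(f"filled fraction, at least: {fill_floor:.1e}")
print(f"meet Lipinski 5: {lipinski_ok / substances:.1%}")
print(f"active: {active / substances:.2%}")
```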
http://www.nap.edu/catalog.php?record_id=13284
Marina Sirota
Preeclampsia: large cause of maternal and fetal death
• Incidence
• 5-8% of all pregnancies in the U.S. and worldwide
• 4.1 million births in the U.S. in 2009
• Up to 300K cases of preeclampsia annually in the U.S.
• Mortality
• Responsible for 18% of all maternal deaths in the U.S.
• Maternal death in 56 out of every 100,000 live births in US
• Neonatal death in 71 out of every 100,000 live births in US
• Cost
• $20 billion in direct costs in the U.S. annually
• Average hospital stay of 3.5 days
Linda Liu
Bruce Ling
Matt Cooper
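
The incidence figures above can be cross-checked with a line of arithmetic. This is a minimal sketch using only the slide's numbers, with illustrative variable names; it suggests that "up to 300K cases" is a rounded figure near the upper end of the 5-8% range.

```python
# Figures from the slide; variable names are illustrative.
births_2009 = 4_100_000
incidence_low_pct, incidence_high_pct = 5, 8

# Integer arithmetic avoids floating-point noise in the counts.
cases_low = births_2009 * incidence_low_pct // 100
cases_high = births_2009 * incidence_high_pct // 100

print(f"implied annual cases: {cases_low:,} to {cases_high:,}")
# prints "implied annual cases: 205,000 to 328,000"
```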
New blood markers for preeclampsia
Linda Liu
Bruce Ling
Matt Cooper
@MarchofDimes
bit.ly/preeclamp
• Need a diagnostic for preeclampsia
• Public big data available
• March of Dimes Center for Prematurity Research
• Data analyzed, diagnostic designed
• SPARK grant ($50k)
• Life Science Angels, other seed investors ($2 million)
@CarmentaBio
progenity.com
bit.ly/carm_prog
@MatthewHerper
bit.ly/newdrug1
Cancer Discovery 2013, 3:1.
Psychiatric Drug Imipramine Shows Significant Activity Against Small Cell Lung Cancer
Panels: vehicle control, imipramine
p53/Rb/p130 triple knockout model of SCLC
Mice dosed after tumor formation
Joel Dudley
Nadine Jahchan
Julien Sage
Alejandro Sweet-Cordero
Joel Neal
@NuMedii
Bin Chen
Wei Wei
Li Ma
Bin Yang
Mei-Sze Chua
Samuel So
Gastroenterology, 2017
• Need more drugs for more diseases
• Public big data available
• NIH funding
• Data analyzed, method designed
• Company launched, ARRA, StartX, Stanford license, first deal
• Claremont Creek, Lightspeed ($3.5 million)
@NuMedii
The next big open data: clinical trials
Download 100+ studies today
Drug repositioning, new patient subsets,
digital comparative effectiveness, more!
immport.org
Sanchita Bhattacharya
Elizabeth Thomson
Clinical Data Warehouse
A Big UC Healthcare Data Analytics Platform
Combining healthcare data from across the
six University of California medical schools and systems
The next big data: clinical data
ML lessons I’ve learned over 20 years
• Get the question right
– Solve the problems that health care professionals need solved: Don't just guess
• And verify good questions and good unmet needs with more than one doc
– Build a great diagnostic vs. understanding the biology
• Perfectly lassoed variables may miss the big picture biology
– Biologists and medical professionals really love explanations over black boxes
• Watch out for input limiting models
– Patients might not type in the right codes for their symptoms
– They barely enter their own race/ethnicity
– And docs?
• Learn what IRB, HIPAA, and BAA are. Learn what ICD-10 and CPT codes are. Also CLIA and CAP.
– And learn patience.
– Not all of us are cloud allergic.
• Not everything needs deep learning
• Having all data on everyone is super rare: genomics, images, and longitudinal EHR data?
• Health care inefficiency is not about friction
• Data integration and harmonization can happen if there is a business reason for it
• Platforms and their companies are seemingly commoditized
– Come to us with more medical knowledge and background. Convince us you care about this vertical.
– Show us that we are going to learn more from you than we are going to have to teach you.
Open challenges
• Can’t teach a computer with half the game. Need the start and finish of medical “stories”
– In medical care, need primary and tertiary care in the same database
– Compound data might be available, but trials data is not
– Means data integration and harmonization is a rate limiter
• Need the right diversity in data
– Otherwise might be extrapolating beyond what was learned
– Need enough data, big amounts of data, true-positive cases
• Need methods to handle complicated multi-modal data
• What does validation mean?
– Drug discovery: pre-clinical success? Or Phase 2 success?
– Clinical accuracy? Or clinical utility?
• Career fear, uncertainty, and doubt (FUD) in some circles
– Tech recruitment
– Too many startups?
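
One simple way to act on the "extrapolating beyond what was learned" point is to record the range of each input seen during training and flag new records that fall outside it. The sketch below is only an illustration under that assumption, not a method from the talk; the function names, feature names, and values are all hypothetical.

```python
from typing import Dict, List, Tuple

Ranges = Dict[str, Tuple[float, float]]


def fit_ranges(rows: List[Dict[str, float]]) -> Ranges:
    """Record the minimum and maximum seen for each feature in the training rows."""
    ranges: Ranges = {}
    for row in rows:
        for name, value in row.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


def out_of_range(row: Dict[str, float], ranges: Ranges) -> List[str]:
    """Return the features of a new row that are unseen or outside the training range."""
    flagged = []
    for name, value in row.items():
        if name not in ranges:
            flagged.append(name)
            continue
        low, high = ranges[name]
        if value < low or value > high:
            flagged.append(name)
    return flagged


# Hypothetical training data covering only younger patients.
training_rows = [
    {"age": 25.0, "systolic_bp": 110.0},
    {"age": 40.0, "systolic_bp": 135.0},
]
ranges = fit_ranges(training_rows)

# A new patient whose age lies outside anything the model has seen.
print(out_of_range({"age": 72.0, "systolic_bp": 120.0}, ranges))
```

A per-feature range check like this is deliberately crude: it misses unusual combinations of individually normal values, so it is a first warning rather than a full test of data diversity.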
UC Clinical Data Warehouse Team
Executive Team
• Atul Butte
• Joe Bengfort
• Michael Pfeffer
• Tom Andriola
• Chris Longhurst
Steering Committee
• Irfan Chaudhry
• Mohammed Mahbouba
• Lisa Dahm
• David Dobbs
• Kent Andersen
• Ralph James
• Jennifer Holland
• Eugene Lee
ETL Team
• Albert Dugan
• Tony Choe
• Michael Sweeney
• Timothy Satterwhite
• Ayan Patel
• Niranjan Wagle
• Ralph James
• Joseph Dalton
Data Harmonization
• Dana Ludwig
• Daniella Meeker
Data Quality
• Momeena Ali
• Jodie Nygaard
Epic
• Kevin Ames
• Ben Jenkins
• Steve Gesualdo
Business Analyst
• Ankeeta Shukla
Hardware
• Sandeep Chandra
• Jeff Love
• Scott Bailey
• Kwong Law
• Pallav Saxena
Support
• Jack Stobo
• Michael Blum
• Sam Hawgood
Support
Admin and Tech Staff
• Mary Lyall
• Mounira Kenaani
• Kevin Kaier
• Boris Oskotsky
• Mae Moredo
• Ada Chen
• University of California, San Francisco
• Priscilla Chan and Mark Zuckerberg
• NIH: NIAID, NLM, NIGMS, NCI, NHLBI, OD; NIDDK, NHGRI, NIA, NCATS
• March of Dimes
• Juvenile Diabetes Research Foundation
• Hewlett Packard
• Howard Hughes Medical Institute
• California Institute for Regenerative Medicine
• Luke Evnin and Deann Wright (Scleroderma Research Foundation)
• Clayville Research Fund
• PhRMA Foundation
• Stanford Cancer Center, Bio-X, SPARK
• Tarangini Deshpande
• Kimayani Butte
• Sam Hawgood and Keith Yamamoto
• Isaac Kohane