Atul Butte's presentation to the Association of Medical School Pediatric Department Chairs #AMSPDC on March 3, 2018.
Some pre-publication data slides have been removed from this deck.
1. Translating a Trillion Points of Data into Diagnostics, Therapies and New Insights in Health and Disease
Atul Butte, MD, PhD
Director, Institute for Computational Health Sciences
University of California, San Francisco
atul.butte@ucsf.edu
@atulbutte
5. The Cancer Genome Atlas
• 14 thousand cases
• 39 types of cancers
• 13 types of data: molecular, clinical, sequencing
6. 227 million substances x 1.3 million assays
More than a billion measurements within a grid of 300 trillion cells
71 million meet Lipinski's rule of five
1.2 million active substances
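The grid size quoted on this slide follows directly from the two counts. Here is a minimal sketch of that arithmetic, using only the rounded figures from the slide. The function names are illustrative, and "more than a billion" is treated as a lower bound of exactly one billion, so the fill fraction is only a rough floor.

```python
# Rounded figures quoted on the slide.
SUBSTANCES = 227_000_000
ASSAYS = 1_300_000
# The slide says "more than a billion"; one billion is assumed as a lower bound.
MEASUREMENTS_LOWER_BOUND = 1_000_000_000


def grid_cells(substances: int, assays: int) -> int:
    """Number of substance-by-assay cells if every pair were measured."""
    return substances * assays


def fill_fraction(measurements: int, cells: int) -> float:
    """Share of the grid that actually holds a measurement."""
    return measurements / cells


cells = grid_cells(SUBSTANCES, ASSAYS)
fraction = fill_fraction(MEASUREMENTS_LOWER_BOUND, cells)

print(f"grid cells: {cells:,}")
print(f"filled fraction (lower bound): {fraction:.2e}")
```

The product is roughly 295 trillion cells, which the slide rounds to 300 trillion. Under the one-billion assumption, only a few cells per million are filled, so the grid is extremely sparse.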
10. Preeclampsia: large cause of maternal and fetal death
• Incidence
• 5-8% of all pregnancies in the U.S. and worldwide
• 4.1 million births in the U.S. in 2009
• Up to 300K cases of preeclampsia annually in the U.S.
• Mortality
• Responsible for 18% of all maternal deaths in the U.S.
• Maternal death in 56 out of every 100,000 live births in US
• Neonatal death in 71 out of every 100,000 live births in US
• Cost
• $20 billion in direct costs in the U.S. annually
• Average hospital stay of 3.5 days
Linda Liu
Bruce Ling
Matt Cooper
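The incidence figures above can be cross-checked with simple arithmetic. This is a minimal sketch that applies the quoted 5-8% range to the quoted 4.1 million U.S. births. The function name is illustrative, and applying the percentage of pregnancies directly to the birth count is a simplifying assumption.

```python
# Figures quoted on the slide.
US_BIRTHS_2009 = 4_100_000
INCIDENCE_PERCENT_LOW = 5
INCIDENCE_PERCENT_HIGH = 8


def case_range(births: int, pct_low: int, pct_high: int) -> tuple[int, int]:
    """Estimated annual cases at the low and high ends of the incidence range.

    Integer percentages keep the arithmetic exact.
    """
    return births * pct_low // 100, births * pct_high // 100


low, high = case_range(US_BIRTHS_2009, INCIDENCE_PERCENT_LOW, INCIDENCE_PERCENT_HIGH)
print(f"estimated cases per year: {low:,} to {high:,}")
```

The estimate runs from 205,000 to 328,000 cases per year. The slide's "up to 300K" falls inside that range.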
11.
12. New blood markers for preeclampsia
Linda Liu
Bruce Ling
Matt Cooper
@MarchofDimes
bit.ly/preeclamp
13. Need a diagnostic for preeclampsia
Public big data available
March of Dimes Center for Prematurity Research
Data analyzed, diagnostic designed
SPARK grant ($50k)
Life Science Angels, other seed investors ($2 million)
@CarmentaBio
progenity.com
bit.ly/carm_prog
16. Cancer Discovery 2013, 3:1.
Psychiatric Drug Imipramine Shows Significant Activity Against Small Cell Lung Cancer
[Figure panels: Vehicle control; Imipramine]
p53/Rb/p130 triple knockout model of SCLC
Mice dosed after tumor formation
Joel Dudley
Nadine Jahchan
Julien Sage
Alejandro Sweet-Cordero
Joel Neal
@NuMedii
17. Bin Chen
Wei Wei
Li Ma
Bin Yang
Mei-Sze Chua
Samuel So
Gastroenterology, 2017
18. Need more drugs for more diseases
Public big data available
NIH funding
Data analyzed, method designed
Company launched, ARRA, StartX, Stanford license, first deal
Claremont Creek, Lightspeed ($3.5 million)
@NuMedii
19. The next big open data: clinical trials
Download 100+ studies today
Drug repositioning, new patient subsets, digital comparative effectiveness, more!
immport.org
Sanchita Bhattacharya
Elizabeth Thomson
25. Clinical Data Warehouse
A Big UC Healthcare Data Analytics Platform
Combining healthcare data from across the six University of California medical schools and systems
29. ML lessons I’ve learned over 20 years
• Get the question right
– Solve the problems that health care professionals need solved: don't just guess
• And verify good questions and good unmet needs with more than one doc
– Build a great diagnostic vs. understanding the biology
• Perfectly lassoed variables may miss the big picture biology
– Biologists and medical professionals really love explanations over black boxes
• Watch out for input-limited models
– Patients might not type in the right codes for their symptoms
– They barely enter their own race/ethnicity
– And docs?
• Learn what IRB, HIPAA, BAA are. Learn what ICD-10 and CPT codes are. CLIA and CAP.
– And learn patience.
– Not all of us are cloud allergic.
• Not everything needs deep learning
• Having all data on everyone is super rare: genomics, images, and longitudinal EHR data?
• Health care inefficiency is not about friction
• Data integration and harmonization can happen if there is a business reason for it
• Platforms and their companies are seemingly commoditized
– Come to us with more medical knowledge and background. Convince us you care about this vertical.
– Show us that we are going to learn more from you, than we are going to have to teach you.
30. Open challenges
• Can’t teach a computer with half the game. Need the start and finish of medical “stories”
– In medical care, need primary and tertiary care in the same database
– Compound data might be available, but trials data is not
– Means data integration and harmonization is a rate limiter
• Need the right diversity in data
– Otherwise might be extrapolating beyond what was learned
– Need enough data, big amounts of data, true-positive cases
• Need methods to handle complicated multi-modal data
• What does validation mean?
– Drug discovery: pre-clinical success? Or Phase 2 success?
– Clinical accuracy? Or clinical utility?
• Career fear, uncertainty, and doubt (FUD) in some circles
– Tech recruitment
– Too many startups?
31. UC Clinical Data Warehouse Team
Executive Team
• Atul Butte
• Joe Bengfort
• Michael Pfeffer
• Tom Andriola
• Chris Longhurst
Steering Committee
• Irfan Chaudhry
• Mohammed Mahbouba
• Lisa Dahm
• David Dobbs
• Kent Andersen
• Ralph James
• Jennifer Holland
• Eugene Lee
ETL Team
• Albert Dugan
• Tony Choe
• Michael Sweeney
• Timothy Satterwhite
• Ayan Patel
• Niranjan Wagle
• Ralph James
• Joseph Dalton
Data Harmonization
• Dana Ludwig
• Daniella Meeker
Data Quality
• Momeena Ali
• Jodie Nygaard
Epic
• Kevin Ames
• Ben Jenkins
• Steve Gesualdo
Business Analyst
• Ankeeta Shukla
Hardware
• Sandeep Chandra
• Jeff Love
• Scott Bailey
• Kwong Law
• Pallav Saxena
Support
• Jack Stobo
• Michael Blum
• Sam Hawgood
32. Support
Admin and Tech Staff
• Mary Lyall
• Mounira Kenaani
• Kevin Kaier
• Boris Oskotsky
• Mae Moredo
• Ada Chen
• University of California, San Francisco
• Priscilla Chan and Mark Zuckerberg
• NIH: NIAID, NLM, NIGMS, NCI, NHLBI, OD, NIDDK, NHGRI, NIA, NCATS
• March of Dimes
• Juvenile Diabetes Research Foundation
• Hewlett Packard
• Howard Hughes Medical Institute
• California Institute for Regenerative Medicine
• Luke Evnin and Deann Wright (Scleroderma Research Foundation)
• Clayville Research Fund
• PhRMA Foundation
• Stanford Cancer Center, Bio-X, SPARK
• Tarangini Deshpande
• Kimayani Butte
• Sam Hawgood and Keith Yamamoto
• Isaac Kohane