Big Data in Health Care
Strata – New York
September 28, 2016
Sabrina Dahlgren, Director, Health care Delivery and Innovation, Kaiser Permanente - @sabridahlgren
Taposh Dutta Roy, Health Lead, Innovation and Data Science, Decision Support, Kaiser Permanente - @taposhdr
Rajiv Synghal, Chief Architect, Big Data Strategy, Kaiser Permanente - @synghalr
The largest integrated delivery system in the US
primary care
specialty care
home care
lab
hospital
pharmacy
optical
dental
research
insurance
integrated
Approaching 11 million members
INTEGRATED CARE DELIVERY
200K employees
19,000 physicians
38 hospitals
608 medical offices and other outpatient facilities
$61 billion operating revenue (2016)
Mission: Kaiser Permanente exists to provide high-quality, affordable
health care services and to improve the health of our members and
the communities we serve.
TECHNOLOGY INNOVATION ORIGINS
1960s | Dr. Sidney Garfield & Dr. Morris Collen
“We should begin to take advantage of electronic digital computers.”
“Continuing total health care requires a continuing life record for each individual… The content of that life record, now made possible by computer information technology, will chart the course to be taken by each individual for optimal health.”
Sidney Garfield, MD
Hospital Computer Systems, 1974
Supporting Health Care with Technology
OUR TECHNOLOGY JOURNEY
EHR (KP Health Connect)
Circle of Support (continuous availability, ancillary systems)
Online (KP.org)
Transforming Health Care
Foundational (Facilities)
Precision Medicine
Mobile App (Video Visits)
Machine learning
Patient 360 View (Cloud, IoT)
HEALTH CARE INDUSTRY
The future is almost here …
Providers
Employers
Payers
Retail Pharmacies
Tech Players
Pharmaceutical Life Sciences
Patients/Consumer
HEALTHCARE ECOSYSTEM
Source: Bain and Co.
HRI Consumer Survey, PwC, 2013, 2015
Percentage of consumers with at least one medical, health or fitness app on their mobile devices doubled from 2013 to 2015*
TECHNOLOGY ADOPTION
Health 2050 Vision
Figure 1, Health 2050: The Realization of Personalized Medicine through Crowdsourcing, the Quantified Self, and the Participatory Biocitizen, Journal of Personalized Medicine
Traditional Model: Population-wide demographics
Emerging Model: Cohort-relevant measures
Future Model: Individual, N=1
Analytics and Enterprise Information Strategy…
right insights: Business intelligence and analytics
right time and form: Delivery and decision aids
good data: Data sources and platforms
information as a strategic asset
Frame right questions
Make better decisions fast
Link decisions to action
decision maker centered: Leaders & Managers; Care and Service Providers; Patients, Members, Groups
Approach, Strategy, Outcome
Easy to: frame right questions; know and do right things; make better decisions
right information + at the right time + in the right hands
Making the RIGHT Architecture Choices…
Feature | RDBMS Ecosystem | Distributed Ecosystem
Security | ✓ Easy within a system. Fragmented across the enterprise. | ✓ Five-tier security built in across all enterprise systems.
Governance | ✓ Easy within a domain. Difficult across domains. | ✓ Easy across systems. Easy to align fields and definitions.
Organizational Preparedness | ✓ Mature people, process. | ▪ Maturing people, processes.
Operational Readiness | ✓ Mature. | ▪ Maturing. 24 x 7 support. Active-active DR. Class of Service 0/1.
Storage Formats | ▪ Single. Opaque storage formats. | ✓ Multiple. Open storage formats.
Row vs. Columnar | ▪ Maturing | ✓ Mature
Data Partitioning | ▪ Add-on component. Afterthought. | ✓ Partitioning and hashing are first-class citizens.
OLAP | ▪ Design-time and pre-built aggregations. | ✓ On-demand, real-time aggregations. Agility.
Alignment | ▪ Version mismatch across functional components. | ✓ Functional components are an integral part of the ecosystem.
Search & Data Mining | ▪ Missing. Add-on component. | ✓ Native.
H/W Scalability | ▪ Each tier has to scale independently (CPU/memory). | ✓ Built-in scalability.
System Scalability | ▪ Vertical | ✓ Horizontal
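The "partitioning and hashing" and "horizontal" scalability rows in the comparison can be illustrated with a small sketch. This is a hypothetical, minimal example of hash partitioning, not the platform's actual mechanism; the function names `partition_for` and `partition_records`, the made-up member IDs, and the partition counts are all assumptions for illustration.

```python
import hashlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_records(records, key_field, num_partitions):
    """Group records by the partition that their key hashes to."""
    partitions = defaultdict(list)
    for record in records:
        partitions[partition_for(record[key_field], num_partitions)].append(record)
    return dict(partitions)


# Hypothetical records; the IDs are invented for illustration.
records = [{"member_id": f"M{i:04d}", "visits": i % 5} for i in range(1000)]

# Scaling out horizontally corresponds to spreading keys over more partitions.
four_nodes = partition_records(records, "member_id", 4)
eight_nodes = partition_records(records, "member_id", 8)
```

Plain modulo hashing reassigns many keys when the partition count changes, which is one reason distributed systems often use a fixed bucket count or consistent hashing instead.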
Centralized Data and Processing on a Distributed Platform.
Minimize data movement
Data and its various interaction points
Heuristic processing
Identify data linkages
Novel parallel processing techniques and algorithms
Liberate data from source systems
Knowledge silos
Fragmented Data and Processing on RDBMS Platforms.
Siloed IT departments
Fragmented data models
Data repurposing
Data rejuvenation
Fragmented profiling & data quality
Missing data linkages
Siloed business units
Fragmented data selection rules, business logic, algorithms…
Landing Zone (Home to Secure and Organized Data)
A Self Service Data Platform hosting both the raw and prepared data sets for quick business consumption.
• Security and Data Organization are both First Class Citizens: Five Layers of Security. Data Organized by domains and use cases.
• Unified, Interconnected Data: One place to store all data, in any format, large volumes.
  → Import external data to comingle and corroborate with internal data
  ← De-Identify internal data to share with and leverage external partners.
• Self-Service Exploratory BI: Allows users to explore, discover, and mine data, with full security, using interactive analytical tools
• Advanced Analytics: Simple tabular data can mix with more complex and multi-structured data in ways that were never before possible
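The de-identification step mentioned above could look roughly like the following. This is only a sketch under stated assumptions: it drops a few direct identifiers and replaces the member ID with a keyed hash. The field names, the `pseudonymize` and `de_identify` helpers, and the example key are hypothetical, and this alone should not be read as a complete HIPAA de-identification procedure.

```python
import hashlib
import hmac

# Hypothetical field names; a real data set would define its own list.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def de_identify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and pseudonymize the member ID."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["member_id"] = pseudonymize(record["member_id"], secret_key)
    return cleaned


# Assumption: the key is managed separately and never shared with partners.
key = b"example-key-kept-outside-the-shared-data"
record = {"member_id": "M0001", "name": "Jane Doe", "address": "1 Main St",
          "phone": "555-0100", "lab_result": 5.4}
shared = de_identify(record, key)
```

Using a keyed hash rather than a plain hash means the same member maps to the same pseudonym across data sets, while a partner without the key cannot recompute the mapping from known IDs.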
Data Platform Landing Zone
HDFS
All Data Encrypted @ Rest
Raw Data Zone, Refined Data Zone, User Defined Zone, Master Data Zone, Meta Data Zone, Reference Data Zone, Usage Data Zone
Smart Data Zone (Semantic Layer)
JDBC - Impala Lookup Query / Arcadia OLAP Query
Exploratory Intelligence: Analyze, Mine, Refine, Discover, Visualize
Building the Future Data Platform…
Collect, Curate, Enrich
Sample design: pipeline status dashboard
Collect: Ingest Pre-checks, Ingest Process
Curate: Verification, Data Profiling
Enrich: Enrich Pre-checks, Enrich Process
Each check shows a status (Pass, Failed, or NA) and a run timestamp (2016-08-25 5:35 in the sample).
Summary labels: Use-case, Source, Growth of run time, Growth of storage, Total tables processed
Total tables processed: 40 tables (Q1: 35 tables), 40 tables, 12 tables
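The staged checks on this dashboard can be sketched as a small runner. The stage and check names come from the slide; everything else is an assumption for illustration, including the `CheckResult` type, the `run_checks` function, the check predicates, and the rule that checks after a failure are recorded as "NA" rather than run.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CheckResult:
    stage: str
    check: str
    status: str        # "Pass", "Failed", or "NA"
    run_at: datetime


def run_checks(stages, rows, now=None):
    """Run each stage's checks in order. After the first failure,
    remaining checks are recorded as "NA" (an assumed policy)."""
    now = now or datetime.now()
    results, failed = [], False
    for stage, checks in stages:
        for name, check in checks:
            if failed:
                status = "NA"
            else:
                status = "Pass" if check(rows) else "Failed"
                failed = status == "Failed"
            results.append(CheckResult(stage, name, status, now))
    return results


# Hypothetical predicates standing in for real validation logic.
stages = [
    ("Collect", [("Ingest Pre-checks", lambda rows: len(rows) > 0),
                 ("Ingest Process", lambda rows: all("member_id" in r for r in rows))]),
    ("Curate", [("Verification", lambda rows: all(r["member_id"] for r in rows)),
                ("Data Profiling", lambda rows: all(r["lab_result"] is not None for r in rows))]),
    ("Enrich", [("Enrich Pre-checks", lambda rows: True),
                ("Enrich Process", lambda rows: True)]),
]
rows = [{"member_id": "M0001", "lab_result": 5.4},
        {"member_id": "M0002", "lab_result": None}]
results = run_checks(stages, rows)
statuses = [r.status for r in results]
```

With the sample rows, the missing lab result fails Data Profiling, so both Enrich checks are reported as "NA".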
Systematic Data Liberation…
Solving data repurposing – single copy, multiple use, enrich & context
archive
continuous archive
ingest
true ELT
semantic equivalence
episodes
pharmacy
membership
lab
kp.org
employee
hr
system logs
help desk
Consistent User Experience: One Platform
Data Platform – Landing Zone
HDFS
All Data Encrypted @ Rest
Raw Data Zone, Refined Data Zone, User Defined Zone, Master Data Zone, Meta Data Zone, Reference Data Zone, Usage Data Zone
Smart Data Zone (Semantic Layer)
JDBC - Impala Lookup Query / Arcadia OLAP Query
Exploratory Intelligence: Analyze, Mine, Refine, Discover, Visualize
episode groupings
high utilizers
actionable findings
search prescriptions
semantic
hr analytics
risk intelligence
Liberate Curate Enrich Consume Collect
360 Member View
Personal Behaviors (Life Style Choices, Preferences, Activities, QoL)
Social Factors (Friends, Family, Affiliations, Communication, Activities)
Demographic Factors (Age, Address, Employer, Industry)
Family History and Genetics
Personal “-omics” (Genomics, Proteomics, Transcriptomes, Metabolomics)
Medical Care (encounter, labs, Rx, medical devices, etc.)
Environmental Factors
Environment (Temperature, Humidity, Pollen Count, …)
Geographic (Closest Hospital, Pharmacy, Care Clinic, …)
Member
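One way to picture the 360 member view as a data structure is a single record per member with one slot per domain listed above. The sketch below is purely illustrative: the `Member360` class, its field names, and the sample values are assumptions, not the actual schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Member360:
    """One record per member, with one slot per domain from the slide."""
    member_id: str
    demographic_factors: dict = field(default_factory=dict)
    medical_care: list = field(default_factory=list)
    personal_behaviors: dict = field(default_factory=dict)
    social_factors: dict = field(default_factory=dict)
    family_history_and_genetics: dict = field(default_factory=dict)
    personal_omics: dict = field(default_factory=dict)
    environment: dict = field(default_factory=dict)
    geographic: dict = field(default_factory=dict)

    def populated_domains(self):
        """Names of the domains that currently hold any data."""
        return [f.name for f in fields(self)
                if f.name != "member_id" and getattr(self, f.name)]


# Hypothetical sample values.
member = Member360(member_id="M0001")
member.demographic_factors["age"] = 52
member.medical_care.append({"type": "lab", "test": "HbA1c", "value": 5.4})
member.environment["pollen_count"] = "high"
domains = member.populated_domains()
```

A helper such as `populated_domains` makes it easy to see how complete a member's view is before using it for analytics.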
Analytics - Current State
Self Serve Reporting
Machine Learning
Machine Learning - Current Process
Refining data (clean/impute)
Applying a variety of algorithms – training & testing datasets
Getting relevant data
Feature engineering
Data pipeline
Dashboard for model performance
Data pipeline
Continuous improvement
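The steps in this process can be sketched end to end with the standard library. This is a toy illustration under stated assumptions, not the team's actual pipeline: the data is synthetic, the "algorithms" are simple midpoint-threshold classifiers, and all function and field names (`impute_mean`, `add_bmi`, `fit_threshold`, `weight_kg`, and so on) are hypothetical.

```python
import random
import statistics


def impute_mean(rows, column):
    """Refine: fill missing values in a column with the column mean."""
    known = [r[column] for r in rows if r[column] is not None]
    mean = statistics.fmean(known)
    return [{**r, column: mean if r[column] is None else r[column]} for r in rows]


def add_bmi(rows):
    """Feature engineering: derive body mass index from weight and height."""
    return [{**r, "bmi": r["weight_kg"] / r["height_m"] ** 2} for r in rows]


def split(rows, test_fraction, seed):
    """Shuffle with a fixed seed, then split into training and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train, feature):
    """Toy model: predict 1 when the feature exceeds the midpoint
    between the two class means seen in training."""
    pos = [r[feature] for r in train if r["label"] == 1]
    neg = [r[feature] for r in train if r["label"] == 0]
    threshold = (statistics.fmean(pos) + statistics.fmean(neg)) / 2
    return lambda r: 1 if r[feature] > threshold else 0


def accuracy(model, rows):
    """Share of rows where the prediction matches the label."""
    return sum(model(r) == r["label"] for r in rows) / len(rows)


# Getting relevant data: synthetic rows stand in for a real extract.
rng = random.Random(0)
raw = []
for i in range(200):
    label = i % 2
    weight = (90 if label else 65) + rng.uniform(-5, 5)
    raw.append({"height_m": 1.7,
                "weight_kg": None if i % 25 == 0 else weight,
                "label": label})

rows = add_bmi(impute_mean(raw, "weight_kg"))
train, test = split(rows, test_fraction=0.25, seed=1)

# Apply more than one candidate model and collect scores for a dashboard.
models = {name: fit_threshold(train, name) for name in ("bmi", "weight_kg")}
scores = {name: accuracy(model, test) for name, model in models.items()}
```

The `scores` dictionary plays the role of the model-performance dashboard: each new candidate adds an entry, and continuous improvement amounts to rerunning the pipeline as data or models change.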
Our Analytics Strategy
1. Mentoring: Developed our people through training.
2. Challenging: Developed an internal crowdsourced machine learning challenge.
3. Enabling: Provided an infrastructure to explore and develop.
Mentoring
September 30, 2016
Training Programs | In Person Events | Ongoing Learning
Training our people on big data and statistical tools.
Developed learning forum and venues to support machine learning.
Lunch and learn seminars
The power of crowd
https://en.wikipedia.org/wiki/Board_of_Longitude
What was the first problem solved through crowd sourcing and when?
1714 – British Board of Longitude
Problem: To determine the longitude of a ship at sea.
Data Science Challenge
Nature (Aug 2016): Crowdsourcing biomedical research: leveraging communities as innovation engines, Julio Saez-Rodriguez et al.
Organizational Challenge Flow
Nature (Aug 2016): Crowdsourcing biomedical research: leveraging communities as innovation engines, Julio Saez-Rodriguez et al.
IMPROVED ANALYTICS INFRASTRUCTURE
Competition enabled us to test an analytics ecosystem.
Competition prompted us to develop a process to manage open source tools.
The data science team gave participants access to de-identified data, in compliance with HIPAA.
Results from Challenge
Fast facts
• Over 100 KP data scientists participated in the competition
• 1000 models were submitted in 6 weeks
• Model performance improved by more than 5%
• … in less than 10% of the time
Learnings
• Discussion forums were lively and collaboration increased
• New algorithm strategies were discovered
• Papers are planned to share the learning on the algorithm strategies
Training
Data Science
Competition
Leaderboard
Improvement
Culture Change
Seminars
STEP WISE APPROACH FOR CHANGE
Analytical Opportunities
Food Recommendations
Environment Monitoring
Resource Recommendations
Drive Recommendation
Biometric Monitoring
Alerts/Dashboards
Brand Sentiment
Triage Recommendations
Expert Advice
Event Prediction
Acknowledgement
We wish to acknowledge the contribution of
many to this work:
• The Permanente Medical Group Physicians and the
Permanente Federation
• Health Plan and Hospital Operations, Quality and Finance
teams
• Kaiser Permanente Information Technology
It takes a “virtual” village … !
