Data Science: Philosopher’s Stone
Open Data Science Conference, San Francisco
November 2015
Vin Sharma
@ciphr | vin.sharma@intel.com
Data science: Philosopher’s stone
Data Science has grown from a tongue-in-cheek epithet (see “rocket science”) into a real profession. Data Scientists now have great power in enterprises. We hold the Philosopher's Stone that transforms raw data into intelligence. But with great power comes great responsibility.
For Data Science to evolve into a peer of physical sciences like chemistry, our community needs to help it develop the essential character of a science: openness, methodological consistency, a substantive body of knowledge, reuse, reproducibility, open research questions, ethics, and professional responsibility.
Our team at Intel has been working on these issues, helping to evolve Data Science from alchemy to chemistry.
From alchemy to chemistry
Transmutation of Data into Value
Things (50 billion) + Data (35 ZB) = Value
• Revenue growth
• Cost savings
• Margin gain
Transmutation of Data into Value
Things (50 billion) + Data (35 ZB) = Value
• Value: revenue growth, cost savings, margin gain
• Outcomes listed under Value and Innovation: personalized, ubiquitous, new ventures, higher productivity, greater efficiency, better products, engaged customers, new solutions
Delays and detours
Things (50 billion) + Data (35 ZB) = Value (revenue growth, cost savings, margin gain)
• No trust, no insight, no proof
• Lack of use cases, scarcity of skills, complexity of systems
• → Fail to scale
• → Fail to secure
• → Fail to show ROI
Concept solutions
Use cases by solution:
• IoT Developer Platform: maker solutions on Intel® Galileo and Intel® Edison
• Wearables Developer Platform: customer device usage analyses for a fashion watch ODM
• Parkinson’s Research Platform: disease progression tracking via sensors
• Retail Analytics Solutions: RFID-based inventory tracking; social-media-based demand forecasting
• Power Distribution Analytics: grid overlay network data analysis
• Digital Oil Field: preventive maintenance for oil field assets
• Population Genomics: compare the anonymized genome data of a local patient with genome data in public data sets
Science friction
Data Science:
• Iterative error-prone drudgery
• One-off, ad hoc models in isolation
Analytics Processing:
• Single-threaded, single-node processing
• Proprietary, fixed-function solutions
Application Code:
• Monolithic architecture
• Legacy components
From data science to big data analytics: Less alchemy, more chemistry
Trusted Analytics Platform
An open source software project to accelerate the creation of cloud-native apps driven by big data analytics. The Trusted Analytics Platform (TAP) provides a shared environment for app developers to collaborate with data scientists, making it easier to use advanced analytics on big data in the cloud.
Graph
Trusted Analytics Platform
Platform layers:
• Connectors: message brokers and queues (Kafka, RabbitMQ; MQTT, WS, REST…)
• Processors: stream and batch (Hadoop, Spark, GearPump…)
• Manage: orchestration, telemetry, security
• Stores: polyglot persistence (HDFS, HBase, PostgreSQL, MySQL, Redis, MongoDB, InfluxDB, Objectivity, etc.)
• Models: develop, train, evaluate, deploy models as services (Intel, DataRobot, DL4J, H2O)
• Runtimes: polyglot app runtime (Python, R, Java, Scala, Go…)
Roles:
• Data Scientist: Develop → Deploy
• App Developer: develop, test, push applications; manage lifecycle
• System Operator: infrastructure (IaaS), appliance
Model building services
• Hypothesis Selection: define inferential or predictive hypothesis
• Data Preparation: join, filter, and cleanse data sets
• Model Training: use ML to find β
• Model Evaluation: accuracy measures, cross-validation
• Model Deployment: run in scoring engine, track concept drift
• Application Integration: invoke model via APIs
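The training and evaluation steps can be sketched in a few lines of plain Python. This is only an illustration, not TAP's API: the function names (`train`, `predict_proba`, `cross_validate`), the synthetic data set, and the learning rate and epoch count are all assumptions made for the example. It fits a logistic regression by gradient descent to find β, then estimates accuracy with k-fold cross-validation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(beta, x):
    """Probability of the positive class; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * xj for b, xj in zip(beta[1:], x)))


def train(rows, labels, epochs=300, lr=0.5):
    """Model training: batch gradient descent on the log-loss to find beta."""
    beta = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, labels):
            err = predict_proba(beta, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def cross_validate(rows, labels, k=5):
    """Model evaluation: mean accuracy over k held-out folds."""
    scores = []
    for fold in range(k):
        train_idx = [i for i in range(len(rows)) if i % k != fold]
        test_idx = [i for i in range(len(rows)) if i % k == fold]
        beta = train([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(
            (predict_proba(beta, rows[i]) >= 0.5) == bool(labels[i]) for i in test_idx
        )
        scores.append(hits / len(test_idx))
    return sum(scores) / k


# Synthetic data (an assumption for the sketch): label is 1 when x1 + x2 > 1.
rng = random.Random(0)
rows = [[rng.random(), rng.random()] for _ in range(200)]
labels = [1 if x[0] + x[1] > 1.0 else 0 for x in rows]

cv_accuracy = cross_validate(rows, labels)  # evaluation before deployment
beta = train(rows, labels)                  # final model fitted on all data
```

In a deployed setting the same accuracy measure could be recomputed on fresh labeled data at intervals; a sustained drop would be one simple signal of concept drift.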
TAP community
Case study: patient readmission prediction at Penn Medicine
LDA-derived medication features led to 15% improvement in accuracy
• Raw medication lists: 42,358 features; data are noisy and sparse
• Cleaned medication lists (text processing methods, regular expressions): 23,663 features; data are less noisy, but sparse
• LDA-derived features: 20 features (from the 23,663 cleaned features); data are neither noisy nor sparse
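The cleaning stage can be illustrated with a small regular-expression sketch. The patterns, the `clean_medication` helper, and the sample entries below are hypothetical and are not taken from the Penn Medicine pipeline; they only show how stripping dose strengths and dosage forms collapses several raw strings into fewer distinct features.

```python
import re

# Hypothetical patterns: dose strengths, dosage forms, and non-letter characters.
DOSE = re.compile(r"\b\d+(\.\d+)?\s*(mg|mcg|g|ml|units?)\b", re.IGNORECASE)
FORM = re.compile(r"\b(tab|tablet|cap|capsule|oral|solution|injection)s?\b", re.IGNORECASE)
NON_ALPHA = re.compile(r"[^a-z\s]")


def clean_medication(entry):
    """Reduce a raw medication entry to a normalized drug name."""
    text = DOSE.sub(" ", entry)
    text = FORM.sub(" ", text)
    text = NON_ALPHA.sub(" ", text.lower())
    return " ".join(text.split())  # collapse repeated whitespace


# Illustrative entries only.
raw = [
    "Furosemide 40 MG Tablet",
    "furosemide 20mg tab",
    "LISINOPRIL 10 mg oral tablet",
]
cleaned = [clean_medication(r) for r in raw]
vocabulary = sorted(set(cleaned))  # distinct features after cleaning
```

Here three distinct raw strings reduce to two features, the same kind of reduction the slide reports at much larger scale.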
Penn Medicine wants to identify and stratify heart failure patients
at risk of re-admission within 30 or 90 days of discharge.
• Patient phenotype approach to risk classification
• Use of patient medication history
• Applying unsupervised text analytics algorithms, such as
Latent Dirichlet Allocation (LDA), to model relationship
between medications and medical conditions
• Using this model with patient health records to identify high-
risk patient profiles
• Evaluating individual patient risk of re-admission for new and
existing patients
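As a rough illustration of how LDA turns sparse medication lists into a handful of dense features, the sketch below implements a small collapsed Gibbs sampler in plain Python. It is not the implementation used in the case study: the `lda_gibbs` function, the hyperparameters (`alpha`, `eta`, `n_iter`), the toy medication lists, and the choice of 2 topics (the slide reports 20) are all assumptions for the example.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, eta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # count of topic k in doc d
    topic_word = [[0] * n_words for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                           # total words in topic k
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(zs)
    # Resample each token's topic from its conditional distribution.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wi = assign[d][i], word_id[w]
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + eta)
                    / (topic_total[t] + n_words * eta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1
    # Smoothed topic proportions per document: the low-dimensional features.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


# Toy medication lists, one per patient (illustrative only).
patients = [
    ["furosemide", "carvedilol", "lisinopril"],
    ["furosemide", "lisinopril", "spironolactone"],
    ["metformin", "insulin", "glipizide"],
    ["metformin", "glipizide", "insulin"],
]
features = lda_gibbs(patients, n_topics=2)
```

Each row of `features` is a short, dense vector that sums to 1 and could feed a downstream risk classifier in place of thousands of sparse medication indicators.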

➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 

Data Science: Philosopher's Stone

  • 1. Philosopher's Stone. Open Data Science Conference, San Francisco, November 2015. Vin Sharma, @ciphr | vin.sharma@intel.com
  • 2. Data Science: Philosopher's Stone. Data Science has grown from a tongue-in-cheek epithet (see “rocket science”) into a real profession. Data Scientists now have great power in enterprises. We hold the Philosopher's Stone that transforms raw data into intelligence. But with great power comes great responsibility. For Data Science to evolve into a peer of physical sciences like chemistry, our community needs to help it develop the essential character of a science: openness, methodological consistency, a substantive body of knowledge, reuse, reproducibility, open research questions, ethics, and professional responsibility. Our team at Intel has been working on these issues, helping to evolve Data Science from alchemy to chemistry.
  • 4. Transmutation of Data into Value. THINGS (50 billion) + DATA (35 ZB) = VALUE: revenue growth, cost savings, margin gain.
  • 5. Transmutation of Data into Value. THINGS (50 billion) + DATA (35 ZB) = VALUE: revenue growth, cost savings, margin gain. Other labels on the slide: Personalized, Ubiquitous, New Ventures, Higher Productivity, Greater Efficiency, Better Products, Engaged Customers, New Solutions, Value, Innovation.
  • 6. Delays and detours. THINGS (50 billion) + DATA (35 ZB) = VALUE: revenue growth, cost savings, margin gain. Obstacles: no trust, no insight, no proof. Fail to scale; lack of use cases; fail to secure; scarcity of skills; complexity of systems; fail to show ROI.
  • 7. Concept solutions and their use cases. IoT Developer Platform: maker solutions on Intel® Galileo & Intel® Edison. Wearables Developer Platform: customer device usage analyses for fashion watch ODM. Parkinson’s Research Platform: disease progression tracking via sensors. Retail Analytics Solutions: RFID-based inventory tracking; social-media-based demand forecasting. Power Distribution Analytics: grid overlay network data analysis. Digital Oil Field: preventive maintenance for oil field assets. Population Genomics: compare the anonymized genome data of a local patient with genome data in public data sets.
  • 8. Science friction. Data Science: iterative, error-prone drudgery; one-off, ad hoc models in isolation. Analytics Processing: single-threaded, single-node processing; proprietary, fixed-function solutions. Application Code: monolithic architecture; legacy components. From data science to big data analytics: less alchemy, more chemistry.
  • 9. Trusted Analytics Platform (TAP). Open source software project to accelerate creation of cloud-native apps driven by big data analytics. TAP provides a shared environment for app developers to collaborate with data scientists, making it easier to use advanced analytics on big data in the cloud.
  • 10. Trusted Analytics Platform. Connectors: message brokers & queues (Kafka, RabbitMQ, MQTT, WS, REST…). Processors: stream & batch (Hadoop, Spark, GearPump…). Manage: orchestration, telemetry, security. Stores: polyglot persistence (HDFS, HBase, PostgreSQL, MySQL, Redis, MongoDB, InfluxDB, Objectivity, etc…). Models: develop, train, evaluate, deploy models as services (Intel, DataRobot, DL4J, H2O). Runtimes: polyglot app runtime (Python, R, Java, Scala, Go…); develop, test, push applications; manage lifecycle. Roles: Data Scientist, App Developer, System Operator. Base layer: Infrastructure (IaaS), Appliance.
  • 11. Model building services. Data Preparation: join, filter, and cleanse data sets. Model Evaluation: accuracy measures, cross-validation. Application Integration: invoke model via APIs. Hypothesis Selection: define inferential or predictive hypothesis. Model Training: use ML to find β. Model Deployment: run in scoring engine, track concept drift.
  • 13. Case study: patient readmission prediction at Penn Medicine. Penn Medicine wants to identify and stratify heart failure patients at risk of re-admission within 30 or 90 days of discharge. Approach: patient phenotype approach to risk classification; use of patient medication history; applying unsupervised text analytics algorithms, such as Latent Dirichlet Allocation (LDA), to model the relationship between medications and medical conditions; using this model with patient health records to identify high-risk patient profiles; evaluating individual patient risk of re-admission for new and existing patients. Feature pipeline: raw medication lists (data are noisy and sparse), then cleaned medication lists via text processing methods and regular expressions (data are less noisy, but sparse), then LDA-derived features (data are neither noisy nor sparse). Feature counts: 42,358 features reduced to 23,663 features; 23,663 features reduced to 20 features. Result: LDA-derived medication features led to a 15% improvement in accuracy.
  • 14. Vin Sharma / @ciphr / vin.sharma@intel.com
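
The model building steps named on slide 11 (data preparation, model training to find β, evaluation by cross-validation, and invoking the model through a function call) can be illustrated with a minimal sketch. This is not TAP code: the toy data, the function names, and the choice of ordinary least squares as the learning method are all assumptions made for illustration.

```python
# Illustrative sketch of the slide 11 steps; data and names are hypothetical.
from statistics import mean


def prepare(rows):
    """Data preparation: filter out records with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def train(data):
    """Model training: ordinary least squares for y = b0 + b1 * x."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in data)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1  # the fitted coefficients, i.e. beta


def predict(beta, x):
    """Scoring: the call an application would make through an API."""
    b0, b1 = beta
    return b0 + b1 * x


def cross_validate(data, k=3):
    """Model evaluation: mean squared error averaged over k folds."""
    errors = []
    for i in range(k):
        held_out = data[i::k]
        training = [d for j, d in enumerate(data) if j % k != i]
        beta = train(training)
        errors.append(mean((predict(beta, x) - y) ** 2 for x, y in held_out))
    return mean(errors)


# Toy data following y = 2x + 1, with one incomplete record to cleanse.
raw = [(0, 1.0), (1, 3.0), (2, None), (3, 7.0), (4, 9.0), (5, 11.0), (6, 13.0)]
data = prepare(raw)
beta = train(data)
cv_mse = cross_validate(data)
```

Tracking concept drift, the last item on the slide, would amount to recomputing an error measure like `cv_mse` on fresh data over time and retraining when it degrades; that part is left out of the sketch.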
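
Slide 13 mentions cleaning raw medication lists with text processing methods and regular expressions before deriving LDA features. The sketch below shows how such cleaning can shrink a noisy vocabulary. The medication strings and the two cleaning rules are hypothetical; the slides do not show the rules Penn Medicine actually used.

```python
import re

# Hypothetical raw medication strings; the real data is not shown in the slides.
RAW = [
    "Furosemide 40 MG Oral Tablet",
    "furosemide 20mg tab",
    "LISINOPRIL 10 mg tablet ",
    "Lisinopril 5 MG Oral Tablet",
    "Metoprolol Succinate 25 mg ER",
]

# Assumed cleaning rules: strip dose strengths and dosage-form words.
DOSE = re.compile(r"\b\d+(\.\d+)?\s*(mg|mcg|ml|g)\b")
FORM = re.compile(r"\b(oral|tablet|tab|capsule|cap|er)\b")


def clean(entry):
    """Lowercase, remove dose and form tokens, and collapse whitespace."""
    text = entry.lower()
    text = DOSE.sub(" ", text)
    text = FORM.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


raw_features = set(RAW)                   # every distinct raw string is a feature
clean_features = {clean(e) for e in RAW}  # distinct features after cleaning
```

Here five raw strings collapse to three cleaned features, a small-scale analogue of the feature reduction the slide reports for the cleaning step. The cleaned lists would then be the input to a topic model such as LDA.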