Machine Learning in Healthcare and Life Science
andrew.zhang@ibm.com
IBM Cloud Analytics
SoCal Data Science Conference 2017.10.22
Agenda
1. Let’s talk about machine learning
2. Watson Data Platform
3. Data Science Experience
4. ML inside healthcare and life science
5. Case Study: Deep learning in lung cancer detection
Machine Learning
• Classification: predict class from observations
- E.g. Spam Email Detection
• Regression (prediction): predict value from observations
- E.g. Energy consumption prediction
• Clustering: group observations into “meaningful” groups
- E.g. Amazon Recommendations
and many different technologies and libraries are available.
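The three task types above can be illustrated with a minimal sketch in plain Python. This is not from the talk: the function names, the spam word list, the threshold, and the sample numbers are all made up for illustration, and real projects would normally use one of the available libraries instead.

```python
from statistics import mean

# Classification: predict a class label from observations.
# Toy spam detector: flag a message that contains enough "spammy" words.
# SPAM_WORDS and the threshold are hypothetical, chosen only for the demo.
SPAM_WORDS = {"free", "winner", "prize"}

def classify_email(text, threshold=2):
    hits = sum(1 for word in text.lower().split() if word in SPAM_WORDS)
    return "spam" if hits >= threshold else "ham"

# Regression: predict a numeric value from observations.
# Ordinary least squares with one feature: y = slope * x + intercept.
def fit_line(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

# Clustering: group unlabeled observations.
# One-dimensional k-means with k=2, seeded at the smallest and largest value.
# Assumes the input holds at least two distinct values.
def two_means(values, iterations=10):
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        near_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        near_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(near_lo), mean(near_hi)
    return sorted(near_lo), sorted(near_hi)

label = classify_email("You are a WINNER claim your FREE prize")
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
groups = two_means([1, 2, 3, 10, 11, 12])
```

Classification and regression both learn from labeled examples, while clustering only sees the raw values, which is why `two_means` takes no labels.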
Why ML now?
The economics
Why aren't many companies using ML today?
1. Companies want ML but lack a deep understanding of it
2. Skills are hard to find
3. Machine learning at scale is hard
4. Many technology choices
5. Productionized ML is very complex and not standardized
IBM Cloud Analytics
Comprehensive
• Broadest selection of data and analytics services
• Seamless integrations
• Open-source leadership
Trusted
• Fully managed environment 24 x 7
• Secure infrastructure
• No installation, configuration, maintenance required
Flexible
• Cloud, on-premises & hybrid support
• No vendor lock-in
• Subscription pricing
• IBM Analytics for Apache Spark
• BigInsights for Apache Hadoop
• Cloudant
• dashDB
• DataConnect
• Elasticsearch by Compose
• Geospatial Analytics
• IBM DB2 on Cloud
• Insights for Twitter
• MongoDB by Compose
• Object Storage
• PostgreSQL by Compose
• Predictive Analytics
• Redis by Compose
• SQL Database
• Streaming Analytics
• Time Series Database
• Analytics for Apache Hadoop
• Insights for Weather
• Graph
• Blu Acceleration
IBM Watson Data Platform
Intelligent Data Fabric
• Roles: Data Engineering, Data Science, Business Analysis, App Development
• Data Sources: On-premises / cloud, Structured / unstructured [and content repositories], In-motion / at-rest, Internal / external
• Ingest: Integration, Matching / Quality, Streaming
• Persist: Hadoop, NoSQL / SQL, Object store
• Analyze: Discovery / Exploration, Machine learning, Model development
• Deploy: Reports / Dashboards, Applications, APIs
• Iterate
• Govern: Data Assessment, Metadata / Policies
• Find, Share, Collaborate
Watson Developer Cloud APIs
• Natural Language
• Vision Services
• Data Insight Services
ML pipeline in Watson Data Platform
[Pipeline diagram]
• Components: Node.js Web Form, Cloudant, Data Connect, dashDB, BigInsights HDFS (Hadoop), Data Science Experience
• Flow labels: New Records, Training Data, Convert to CSV, Predictions
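The pipeline names a "Convert to CSV" step for the training data but does not show how it is done. A minimal sketch of one way to do it, assuming the records arrive as a JSON array of documents (Cloudant is a JSON document store); the function name `docs_to_csv` and the fields `age` and `smoker` are hypothetical and used only for illustration:

```python
import csv
import io
import json

def docs_to_csv(json_text, columns):
    """Flatten a JSON array of documents into CSV text with a fixed column order."""
    docs = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for doc in docs:
        # Missing fields become empty cells; fields not listed in
        # `columns` (such as a document id) are left out.
        writer.writerow({name: doc.get(name, "") for name in columns})
    return buffer.getvalue()

# Hypothetical records shaped like documents a web form might store.
sample = '[{"_id": "a1", "age": 61, "smoker": "yes"}, {"_id": "b2", "age": 45}]'
csv_text = docs_to_csv(sample, ["age", "smoker"])
```

Fixing the column list up front keeps the CSV header stable even when individual documents omit a field, which matters when the same file layout is reused for training and for scoring new records.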
What is the IBM Data Science Experience (DSx)?
Core Attributes of the Data Science Experience
IBM Data Science Experience
Powered by IBM Watson Data Platform
Community
• Find tutorials and datasets
• Connect with Data Scientists
• Ask questions
• Read articles and papers
• Fork and share projects
Open Source
• Code in Scala/Python/R/SQL
• Jupyter Notebooks
• RStudio IDE and Shiny
• Spark ML
• Your favorite libraries
IBM Added Value
• Managed Spark Service
• Project, Catalog, Data Connectors
• ML Model Builder and Canvas
• IBM Machine Learning API
• Cloud, Desktop and Local Deployment
Machine Learning Impact Across Industries and Use Cases
$10s of billions in each industry and use case
Data Science Solutions
Case Study: Lung Cancer Detection
Source: Gilles Wainrib, AI to Accelerate Drug Discovery, OWKIN
Summary
1. Machine learning will show up wherever analytics is needed
2. Deep learning is a black box, but it has great potential
3. Data is important, but where do you get it?
4. Algorithms are hard, but anyone can use them with some understanding
5. Computing is cheap, but not that cheap
6. Read a lot of papers, replicate a lot of cases, and find a real (hardcore) problem to solve!
7. It is critical to shorten the life cycle of the process: be agile!
Good reading:
- Rob Thomas: A Practical Guide to Machine Learning: Understand, Differentiate, and Apply
http://www.robdthomas.com/2016/07/a-practical-guide-to-machine-learning.html
- Francois Chollet: Building powerful image classification models using very little data
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
- Gilles Wainrib: AI to Accelerate Drug Discovery
https://www.youtube.com/watch?v=JSdUHfUR5gg&index=26&list=PLudzo2N8PCHg1iDNqEbFYUDb5Cb16bkJR
Q&A