The document discusses machine learning in healthcare and life sciences. It provides an overview of machine learning techniques like classification, regression, and clustering. It then discusses IBM's Watson Data Platform and Data Science Experience for developing and deploying machine learning models at scale. The document presents a case study on using deep learning for lung cancer detection from medical images. It concludes with recommendations for applying machine learning including the importance of data and shortening development cycles.
Machine Learning
• Classification: predict class from observations
- E.g. Spam Email Detection
• Regression (prediction): predict value from observations
- E.g. Energy consumption prediction
• Clustering: group observations into “meaningful” groups
- E.g. Amazon Recommendations
Many different technologies and libraries are available.
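The three task types listed above can be illustrated with a minimal, standard-library-only sketch. The toy data, function names, and the choice of nearest-centroid, least-squares, and 1-D k-means are assumptions made for illustration; the slides do not prescribe any particular algorithm or library.

```python
from statistics import mean

# All names and data here are hypothetical, chosen only to illustrate the three task types.

def classify_nearest_centroid(train, point):
    """Classification: predict a class label from labelled observations."""
    by_label = {}
    for x, label in train:
        by_label.setdefault(label, []).append(x)
    # One centroid (mean feature value) per class.
    centroids = {label: mean(xs) for label, xs in by_label.items()}
    return min(centroids, key=lambda label: abs(centroids[label] - point))

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar

def kmeans_1d(values, centers, iterations=10):
    """Clustering: group unlabelled values around k centers (1-D k-means)."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)

# Feature: number of suspicious words in an email (made-up values).
emails = [(1, "ham"), (2, "ham"), (8, "spam"), (9, "spam")]
label = classify_nearest_centroid(emails, 7)
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
centers = kmeans_1d([1, 2, 3, 10, 11, 12], [0, 5])
```

Classification and regression both learn from labelled observations and differ only in whether the target is a class or a number, while clustering receives no labels at all.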
Why ML now?
The economics
Why aren't many companies using ML today?
1. Companies want ML but don't have a deep understanding of it
2. Skills are hard to find
3. Machine learning at scale is hard
4. There are many technology choices
5. Productionized ML is very complex and not standardized
IBM Cloud Analytics
Comprehensive
• Broadest selection of data and analytics services
• Seamless integrations
• Open-source leadership
Trusted
• Fully managed environment 24 x 7
• Secure infrastructure
• No installation, configuration, or maintenance required
Flexible
• Cloud, on-premises & hybrid support
• No vendor lock-in
• Subscription pricing
• IBM Analytics for Apache Spark
• BigInsights for Apache Hadoop
• Cloudant
• dashDB
• DataConnect
• Elasticsearch by Compose
• Geospatial Analytics
• IBM DB2 on Cloud
• Insights for Twitter
• MongoDB by Compose
• Object Storage
• PostgreSQL by Compose
• Predictive Analytics
• Redis by Compose
• SQL Database
• Streaming Analytics
• Time Series Database
• Analytics for Apache Hadoop
• Insights for Weather
• Graph
• Blu Acceleration
Data Engineering, Data Science, Business Analysis, App Development
Data Sources
• On-premises / cloud
• Structured / unstructured [and content repositories]
• In-motion / at-rest
• Internal / external
Hadoop
NoSQL / SQL
Object store
Discovery / Exploration
Machine learning
Model development
Reports / Dashboards
Applications
APIs
Integration
Matching / Quality
Streaming
Persist
Analyze
Ingest
Deploy
Iterate
Govern
Data Assessment
Metadata / Policies
Find Share Collaborate
Intelligent Data Fabric
IBM Watson Data Platform
Watson Developer Cloud
APIs
• Natural Language
• Vision Services
• Data Insight Services
ML pipeline in Watson Data Platform
BigInsights HDFS (Hadoop)
Data Connect
dashDB
Data Science Experience
Cloudant
Node.js Web Form
Training Data
Convert to CSV
Predictions
New Records
Predictions
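The pipeline above includes a "Convert to CSV" step for training data and a step that returns predictions for new records. A minimal sketch of those two steps follows, using only the Python standard library. The field names (`age`, `smoker`), the helper names, and the threshold rule standing in for a trained model are all hypothetical; the slides do not show how the actual pipeline implements these steps.

```python
import csv
import io

# Hypothetical helpers: the slides only name the steps, not their implementation.

def records_to_csv(json_docs, columns):
    """Flatten JSON-style documents into CSV text, keeping only the given columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for doc in json_docs:
        # Missing fields become empty cells; extra fields (such as "_id") are dropped.
        writer.writerow({c: doc.get(c, "") for c in columns})
    return buffer.getvalue()

def score_new_records(records, predict):
    """Attach a prediction to each new record using any model's predict function."""
    return [{**r, "prediction": predict(r)} for r in records]

# Made-up documents, shaped like records a web form might store.
docs = [{"_id": "a1", "age": 61, "smoker": 1}, {"_id": "b2", "age": 34}]
training_csv = records_to_csv(docs, ["age", "smoker"])

# A stand-in for a trained model: a simple threshold rule, not a real classifier.
scored = score_new_records([{"age": 70}], lambda r: "high" if r["age"] > 60 else "low")
```

Passing the model in as a plain function keeps the scoring step independent of whichever library trained it.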
IBM Data Science Experience
Powered by IBM Watson Data Platform
Core Attributes of the Data Science Experience
Community
• Find tutorials and datasets
• Connect with Data Scientists
• Ask questions
• Read articles and papers
• Fork and share projects
Open Source
• Code in Scala/Python/R/SQL
• Jupyter Notebooks
• RStudio IDE and Shiny
• Spark ML
• Your favorite libraries
IBM Added Value
• Managed Spark Service
• Project, Catalog, Data Connectors
• ML Model Builder and Canvas
• IBM Machine Learning API
• Cloud, Desktop and Local Deployment
Machine Learning Impact Across Industries and Use Cases
Tens of billions of dollars in each industry and use case
Data Science Solutions