IBM AI Solutions on Power Systems
--
Andrew Zhang
Data Scientist
IBM Cognitive Systems
andrew.zhang@ibm.com
04/23/2020
Agenda
• Introduction
• IBM Visual Insights
• IBM AI Solutions on Power Systems
• Use Case Demo: Breast Cancer Classification
• Use Case Demo: Diabetic Retinopathy Detection
• Response to Covid-19
• Open Issues
Cognitive Systems / Andrew Zhang / © 2020 IBM Corporation
About me
• Open Source
• Big Data Analytics
• Data and AI
• High Performance Computing
Introduction
Machine Learning vs Deep Learning
Machine Learning
• Traditional ML requires manual feature extraction/engineering
• Feature extraction for unstructured data is very difficult
Deep Learning
• Deep learning can automatically learn features in data
• Deep learning is largely a "black box" technique, updating learned weights at each layer
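To make "manual feature extraction" concrete, here is a minimal sketch in plain Python. The two features (average brightness and a crude horizontal edge measure) and the tiny example images are illustrative assumptions, not anything taken from the talk; a deep learning model would instead learn its own features from the raw pixels.

```python
from statistics import mean


def handcrafted_features(image):
    """Return (brightness, edge_strength) for a grayscale image given as a list of rows."""
    pixels = [p for row in image for p in row]
    brightness = mean(pixels)
    # Mean absolute difference between horizontally adjacent pixels:
    # a crude, hand-designed measure of how much edge content the image has.
    diffs = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    edge_strength = mean(diffs)
    return brightness, edge_strength


# Two made-up 2x3 images: one uniform, one with vertical stripes.
flat = [[10, 10, 10], [10, 10, 10]]
striped = [[0, 255, 0], [0, 255, 0]]

print(handcrafted_features(flat))
print(handcrafted_features(striped))
```

A traditional classifier would be trained on feature tuples like these, which is why choosing good features for unstructured data such as images is the hard part.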
Why are organizations struggling to capture the value of Data and AI?
Tools & Infrastructure
• Need an environment that enables a “fail fast” approach
• Discrete tools present barriers to productivity
Governance
• If the data isn’t secure, self-service isn’t a reality
• Understanding data lineage and getting to a system of truth is a challenge
Skills
• Data science skills are in low supply and high demand
• Nurturing new data professionals is challenging
Data
• Data resides in silos and is difficult to access
• Unstructured and external data wasn’t considered
IBM Visual Insights
Source: Fei-Fei Li, Andrej Karpathy & Justin Johnson (2016) cs231n, Lecture 8 - Slide 8, Spatial Localization and Detection (01/02/2016).
Available: http://cs231n.stanford.edu/slides/2016/winter1516_lecture8.pdf
IBM Visual Insights
• Image Classification
• Object Detection
• Image Segmentation
• Video Action Detection
• Built-in Data Augmentation
• Transfer Learning
• Advanced Labeling Assistance Features
• Video Labeling Support and Video Preview
• Deployment on FPGA, GPU, or CPU (x86 or P9)
• Built for (Private) Cloud Native on Kubernetes
Watson Machine Learning Accelerator: A Data Science & Enterprise AI Platform Architecture Overview
WMLA: End-to-End Enterprise AI Platform
Embracing Kubernetes & Containers (OpenShift)
Architecture layers: Resource Management, Resource Allocation, Workload Scheduler, Execution Logic, Example Frameworks / Development Tools / 3rd Party Support
Components:
• On Premise, K8S/Docker via OpenShift (Power and x86 Support)
• Kubernetes & Containers
• Advanced Kubernetes Scheduling Policy Engine
• Kubernetes Namespace with CPU/GPU Resources
• Advanced Workload Scheduler – Meta Session Scheduling Daemon (MSD)
• Training Execution – EDT
• Hyper-Parameter Optimization Execution – HPO
• Inference Execution – EDI
• SnapML
Top reasons to choose Watson ML Accelerator
Simplicity: Integrated Platform that Just Works
• Curate, test, and support fast moving Open Source
• Provide enterprise distributions
• Easy to deploy enterprise AI platform
Ease of Use, Unique Capabilities
• Large data & model support with NVLink
• Acceleration of analytics, ML and DL
• AutoDL: Visual Insights
• AutoML: H2O
• Elastic training: scale GPUs as required
Faster Model Training Time
• Faster training times from single server with scalability to 100s of servers
• Leads to faster insights and better economics
Open AI Platform w/ Ecosystem Partners
• Platform that Partners can build on
• Software Partners: H2O, IBM, Anaconda
• SIs, Solution Vendors & Accelerator Partners
Solution stack: Power9 CPU, GPU, WML-A, IBM SW, ISV SW, Solution, SIs
© IBM Corporation 2019
Use Case and Demo
Breast Cancer Classification
Use Case Description: Breast cancer affects 1 in 8 women during their lives. Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates. Recent studies have shown that deep learning systems performed as well as radiologists in standalone mode and improved the radiologists’ performance in support mode.
Solution: Classify breast cancer tumor types in less than 1 second with a high confidence level.
Dataset: https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/
• 9,109 microscopic images of breast tumor tissue
• Collected from 82 patients using different magnifying factors (40X, 100X, 200X, and 400X)
• 2,480 benign and 5,429 malignant samples
• Resolution: 700X460 pixels, 3-channel RGB, 8-bit depth in each channel, PNG format
B = Benign: Adenosis, Fibroadenoma, Tubular Adenoma, Phyllodes Tumor
M = Malignant: Ductal Carcinoma, Lobular Carcinoma, Mucinous Carcinoma (Colloid), Papillary Carcinoma
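The benign and malignant counts above are unbalanced, which usually matters when training a classifier. The sketch below computes inverse-frequency class weights from the two listed counts (which together sum to 7,909). The weighting formula is a common convention chosen here for illustration; it is an assumption, not something the talk describes.

```python
# Class counts as listed on the slide.
counts = {"benign": 2480, "malignant": 5429}


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (number_of_classes * class_count)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {name: total / (n_classes * n) for name, n in class_counts.items()}


weights = inverse_frequency_weights(counts)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

With these counts the minority (benign) class gets a weight above 1 and the majority (malignant) class a weight below 1, so errors on benign samples count for more during training.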
Breast Cancer Classification
The Goal - Classify 40x magnification cells
Demo: Breast Cancer Classification
Interactive Demo (10 mins)
Diabetic Retinopathy Detection
Use Case Description: Diabetic retinopathy (DR) is a leading cause of blindness. The number of people with DR is projected to grow from 126 million in 2010 to 191 million by 2030. The need for a comprehensive and automated method of DR screening has long been recognized.
Solution: Create an automatic image classification solution to assist medical diagnosis and research.
Dataset: https://www.kaggle.com/c/diabetic-retinopathy-detection
35,000+ images, each labeled with one of five classes:
0 - No DR
1 - Mild
2 - Moderate
3 - Severe
4 - Proliferative DR
Example images shown: Class 0 and Class 4
Diabetic Retinopathy Detection
Step 1 - Prepare Data: Catalog a large number of data files through metadata management. It is almost impossible to manage a large number of files manually.
Step 2 - Generate Dataset: Automatically select any dataset and subset for multiple training iterations.
Step 3 - Data Augmentation: Create a dataset with balanced categories to improve model accuracy. Simple UI to create new image data.
Step 4 - Train Models: Automatic model training using a pre-trained model and GPU to scale compute.
Step 5 - Deploy Models: One-click model API end-point deployment for any application.
Step 6 - Inferencing: Infuse AI with online (mobile/web app), batch, and real-time streaming processes.
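The idea of creating a dataset with balanced categories can be sketched outside the product as simple undersampling: keep the same number of files from every class, limited by the smallest class. The file names, labels, and the `balanced_subset` helper below are hypothetical and only illustrate the idea; they do not reproduce how IBM Visual Insights does it.

```python
import random
from collections import defaultdict


def balanced_subset(labels, seed=0):
    """Pick the same number of files per class, limited by the smallest class.

    labels maps a file name to its class id (0-4 for the DR grades).
    """
    by_class = defaultdict(list)
    for name, cls in labels.items():
        by_class[cls].append(name)
    per_class = min(len(names) for names in by_class.values())
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    chosen = []
    for cls in sorted(by_class):
        chosen.extend(rng.sample(sorted(by_class[cls]), per_class))
    return chosen


# Hypothetical labels: five class-0 images, two class-4 images.
labels = {f"img_{i}.jpeg": 0 for i in range(5)}
labels.update({"img_5.jpeg": 4, "img_6.jpeg": 4})

subset = balanced_subset(labels)
print(len(subset), sorted(labels[name] for name in subset))
```

Undersampling discards data from the larger classes; augmenting the smaller classes with new images, as the slide suggests, is the complementary approach.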
Demo: Diabetic Retinopathy Detection
Play Video (10 mins)
Diabetic Retinopathy Detection – Next Steps
Python and Open Source
• Exploratory Data Analysis
• Crop and Resize Images
• Rotate and Mirror Images
• Neural Network Architecture
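As a rough illustration of the crop, resize, rotate, and mirror steps listed above, the following sketch works on a grayscale image stored as a list of rows. It uses only plain Python so that it stays self-contained; a real pipeline would more likely use an imaging library, and the function names here are made up for the example.

```python
def center_crop(image, height, width):
    """Keep the central height x width window of the image."""
    top = (len(image) - height) // 2
    left = (len(image[0]) - width) // 2
    return [row[left:left + width] for row in image[top:top + height]]


def resize_nearest(image, new_h, new_w):
    """Resize with nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    return [[image[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def mirror(image):
    """Flip the image left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


# A 4x4 test image whose pixel values are 0..15, row by row.
image = [[r * 4 + c for c in range(4)] for r in range(4)]

print(center_crop(image, 2, 2))
print(resize_nearest(image, 2, 2))
print(mirror([[1, 2], [3, 4]]))
print(rotate90([[1, 2], [3, 4]]))
```

Mirroring and rotating produce extra training images from each original, while cropping and resizing bring all images to the fixed input size a neural network expects.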
IBM Visual Insights Use Cases in Medical Imaging
Open Issues
• AI Golden Age: Big Data, Big Compute, Deep Learning
• State-of-the-art deep neural network architectures
• Supervised pre-training/domain-specific fine-tuning
• Unsupervised learning – GAN (Generative adversarial network)
• Data + knowledge
• Open medical datasets: From ImageNet to ??
• Data Labeling and Annotation
• Data Privacy – Federated Learning
• High Performance Computing in FPGA, GPU, or CPU (x86 or P9)
• Benchmarking – MLPerf – Medical Imaging
• Model size reduction
• Deployment on Edge Devices
How can IBM make you successful?
Client Experience Centers Additional Resources
Discovery Workshop
• Discuss infrastructure and business challenges and identify potential use cases
• IBM provides a 4-hour free workshop
• Deliverable: Workshop and Use Cases
Design Sprint
• Apply IBM Design Thinking principles to evaluate current business and technology processes and define the minimum viable product (MVP)
• IBM provides one week of solution design, including an in-person workshop
• Deliverable: Workshop and MVP definition
Architectural Consulting
• Team with an architect to help you define the framework of your solution
• IBM provides one week of solution architecture consulting
• Deliverable: 40 hours of architecture consulting with an IBM architect
MVP (Minimum Viable Product) Build
• Develop a functioning solution using agile methodologies, leveraging IBM experts
• IBM provides an application development team for 6-8 weeks
• Deliverable: MVP application
Contact: design@us.ibm.com, aicoc@us.ibm.com
Questions?
Twitter: @a9zhang
Email the speaker: andrew.zhang@ibm.com
Covid-19 Response
IBM Covid-19 Data and AI Response
X-Ray Covid-19 Demo
Training and Inferencing
Metadata fields: patientid, offset, sex, age, finding, survival, intubated, intubation_present, went_icu, needed_supplemental_O2, extubated, temperature, pO2_saturation, leukocyte_count, neutrophil_count, lymphocyte_count, view, modality, date, location, folder, filename, doi, url, license, clinical_notes, other_notes
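A typical first step before training is to use metadata fields like those above to pick which images to include. The sketch below filters rows by `finding` and `view` with Python's standard `csv` module. The sample rows, field values, and the `select_images` helper are invented for illustration and assume the metadata is available as CSV text; they are not taken from the demo.

```python
import csv
import io

# Made-up sample rows using a subset of the metadata field names.
SAMPLE = """patientid,sex,age,finding,view,modality,filename
1,M,65,COVID-19,PA,X-ray,img_001.png
2,F,52,COVID-19,L,X-ray,img_002.png
3,M,40,No Finding,PA,X-ray,img_003.png
"""


def select_images(csv_text, finding, view):
    """Return file names whose 'finding' and 'view' columns match exactly."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["filename"] for row in reader
            if row["finding"] == finding and row["view"] == view]


selected = select_images(SAMPLE, "COVID-19", "PA")
print(selected)
```

Filtering on the view matters because frontal and lateral X-rays look very different, so mixing them in one training set can confuse a classifier.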
Call to Action
- Contact Ganesan or AICOC (aicoc@us.ibm.com) for guided lab exercises
- Collaborate to share data and models for both the DR and lung X-ray Covid-19 use cases
- Looking for PhD students with a medical background to develop advanced research on IBM supercomputers
- Slides will be posted on my Twitter and LinkedIn accounts
• Twitter: @a9zhang
• Email the speaker: andrew.zhang@ibm.com

OpenPOWER/POWER9 AI webinar
