IBM Systems Technical Events | ibm.com/training/events
© Copyright IBM Corporation 2017. Technical University/Symposia materials may not
be reproduced in whole or in part without the prior written permission of IBM.
Paulo Sergio Lemes Queiroz
Systems Consultant
IBM
PowerAI Deep Dive
Session objectives
• What is PowerAI
• PowerAI Components
  - Hardware requirements
  - CPU vs GPU
  - Volta / Tensors
• Using PowerAI Components
• Extending PowerAI
What is PowerAI
• A set of supporting libraries for developing machine learning and
deep learning applications
Hardware Requirements
The minimum requirement for accelerated machine learning is:
- Any Power server with an NVIDIA GPU
IBM Power System S824L (with NVIDIA technology)
IBM Power System S822LC for High Performance Computing
IBM Power System S822LC for Big Data
IBM Power System S821LC
The key starting point is having a GPU for acceleration
NVLink enables even further acceleration
NVLink usage is transparent
CPU vs GPU
A CPU is a general-purpose core, which has access to the system main memory and can perform all kinds of tasks
A GPU is composed of thousands of specialized cores that handle mathematical operations
GPUs are specialized in parallel tensor / matrix / vector / floating-point operations
Sequential operations are slower on GPUs
GPU limitations
The GPU doesn't have direct access to the system main memory; all data must be copied to the GPU and
copied back when processing is done (batch-like operations)
Non-SIMD (Single Instruction, Multiple Data) operations are not that fast
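To make the copy overhead concrete, here is a minimal arithmetic sketch. The function name copy_round_trip_ms and the batch size and bandwidth figures are hypothetical inputs chosen for illustration, not measurements of any system in this session.

```shell
# Estimate the time to copy a batch to the GPU and back again.
# $1 = batch size in MB, $2 = link bandwidth in GB/s (both hypothetical inputs).
# Assumes 1 GB = 1000 MB and ignores latency and any copy/compute overlap.
copy_round_trip_ms() {
    LC_ALL=C awk -v mb="$1" -v gbps="$2" 'BEGIN {
        # one-way time in ms = MB / (GB/s * 1000 MB/GB) * 1000 ms/s = MB / (GB/s)
        one_way = mb / gbps
        printf "%.2f\n", 2 * one_way
    }'
}

# Example: copy_round_trip_ms 512 16   prints 64.00
```

If the computation on the GPU takes less time than this round trip, the copy dominates, which is why small, non-batched workloads tend to see little benefit.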
Volta / Tensors
Volta GPUs will have dedicated tensor function units, which will allow neural network processing to
run even faster
However, this will require even higher bandwidth (and lower latency) between the main memory and
the GPU
Using PowerAI Components
PowerAI, by default, requires Ubuntu 16.04, and its install files can be found at:
https://public.dhe.ibm.com/software/server/POWER/Linux/mldl/ubuntu/README.html
The following components are included in the free package:
caffe-bvlc - Berkeley Vision and Learning Center (BVLC) upstream Caffe, v1.0.0
caffe-ibm - IBM Optimized version of BVLC Caffe, v1.0.0
caffe-nv - NVIDIA fork of Caffe, v0.15.14
chainer - Chainer, v1.23.0
digits - DIGITS, v5.0.0
tensorflow - Google TensorFlow, v1.1.0
ddl-tensorflow - Distributed Deep Learning custom operator for TensorFlow
theano - Theano, v0.9.0
torch - Torch, v7
It's important to highlight that PowerAI isn't limited to these pre-compiled libraries or versions
As on any Ubuntu machine, the user can compile any software
Using PowerAI Components
All software components of PowerAI are installed under /opt/DL
root@pq-s824l-kvm:/opt/DL# ls -l
total 0
drwxr-xr-x 4 root root 29 Jun 23 16:08 bazel
lrwxrwxrwx 1 root root 23 Jun 23 16:39 caffe -> /etc/alternatives/caffe
drwxr-xr-x 13 root root 151 Jun 23 16:38 caffe-bvlc
drwxr-xr-x 13 root root 151 Jun 23 16:38 caffe-ibm
drwxr-xr-x 13 root root 151 Jun 23 16:38 caffe-nv
drwxr-xr-x 7 root root 68 Jun 23 16:38 chainer
drwxr-xr-x 4 root root 28 Aug 8 14:21 ddl
drwxr-xr-x 6 root root 55 Aug 8 14:21 ddl-tensorflow
drwxr-xr-x 8 root root 210 Aug 6 16:47 digits
drwxr-xr-x 7 root root 67 Jun 23 16:38 nccl
drwxr-xr-x 6 root root 54 Jun 23 16:38 openblas
drwxr-xr-x 3 root root 44 Aug 8 14:19 repo
drwxr-xr-x 5 root root 54 Jun 23 16:10 tensorflow
drwxr-xr-x 6 root root 54 Jun 23 16:38 theano
drwxr-xr-x 9 root root 94 Jun 23 16:38 torch
root@pq-s824l-kvm:/opt/DL# pwd
/opt/DL
root@pq-s824l-kvm:/opt/DL#
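The listing above shows one directory per component, plus a caffe symlink managed through /etc/alternatives. As a small sketch, the helpers below list the component directories and resolve which Caffe variant the symlink points to. The function names list_components and active_caffe are hypothetical, and readlink -f assumes GNU coreutils, as found on Ubuntu.

```shell
# List component directories under a PowerAI-style install root.
# $1 = install root (defaults to /opt/DL, the location shown in the listing).
list_components() {
    root="${1:-/opt/DL}"
    for entry in "$root"/*; do
        if [ -d "$entry" ]; then
            basename "$entry"
        fi
    done
}

# Print the fully resolved target of the caffe symlink, i.e. the active variant.
active_caffe() {
    root="${1:-/opt/DL}"
    readlink -f "$root/caffe"
}

# Example: list_components; active_caffe
```

On a Debian-style system, update-alternatives --display caffe should give a similar view of the available variants, assuming the link is registered under the name caffe.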
Using PowerAI Components
By default these packages are not in the system PATH; however, a few helper scripts can be used to
add them to the PATH:
. /opt/DL/tensorflow/bin/tensorflow-activate
. /opt/DL/theano/bin/theano-activate
export PATH="${PATH}:/opt/DL/bazel/bin"
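A possible convenience wrapper around these helper scripts is sketched below. The function name activate_framework is hypothetical, and the path pattern <root>/<name>/bin/<name>-activate is only confirmed here for tensorflow and theano; other components may not follow it.

```shell
# Source the activation script for one framework, if it exists.
# $1 = framework name (e.g. tensorflow), $2 = install root (defaults to /opt/DL).
# Assumption: the script lives at <root>/<name>/bin/<name>-activate.
activate_framework() {
    name="$1"
    root="${2:-/opt/DL}"
    script="$root/$name/bin/$name-activate"
    if [ -r "$script" ]; then
        # Sourced (not executed) so that its environment changes persist in this shell.
        . "$script"
    else
        echo "no activation script at $script" >&2
        return 1
    fi
}

# Example: activate_framework tensorflow && activate_framework theano
```

Because the script is sourced, the function must be defined in the interactive shell (for instance from a shell startup file) rather than run as a separate program.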
Tuning recommendation
Change the CPU governor to performance to avoid performance fluctuations (either command works):
for i in /sys/devices/system/cpu/cpufreq/policy* ; do echo performance > "$i/scaling_governor" ; done
cpupower -c all frequency-set -g performance
Enable persistence mode for the GPU:
nvidia-smi -pm ENABLED
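The governor loop can be wrapped in a small function so it is easy to re-run after a reboot. This is a sketch only: the name set_governor and the optional second argument (an alternate cpufreq directory, useful for dry runs) are hypothetical additions.

```shell
# Write a governor name into every cpufreq policy that exposes a writable
# scaling_governor file.
# $1 = governor (e.g. performance)
# $2 = cpufreq directory (defaults to /sys/devices/system/cpu/cpufreq)
set_governor() {
    governor="$1"
    cpufreq_root="${2:-/sys/devices/system/cpu/cpufreq}"
    for policy in "$cpufreq_root"/policy*; do
        if [ -w "$policy/scaling_governor" ]; then
            echo "$governor" > "$policy/scaling_governor"
        fi
    done
}

# Example (as root): set_governor performance
```

Writing to sysfs requires root, and the setting does not persist across reboots, so it would need to be reapplied at boot time.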
Your feedback about this session is very important to us.
Submit a survey at:
ibmtechu.com

More Related Content

What's hot

TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
TECHNICAL OVERVIEW NVIDIA DEEP  LEARNING PLATFORM Giant Leaps in Performance ...TECHNICAL OVERVIEW NVIDIA DEEP  LEARNING PLATFORM Giant Leaps in Performance ...
TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
Willy Marroquin (WillyDevNET)
 
Intel's Machine Learning Strategy
Intel's Machine Learning StrategyIntel's Machine Learning Strategy
Intel's Machine Learning Strategy
inside-BigData.com
 
Deep Learning
Deep LearningDeep Learning
Deep Learning
Büşra İçöz
 
WML OpenPOWER presentation
WML OpenPOWER presentationWML OpenPOWER presentation
WML OpenPOWER presentation
Ganesan Narayanasamy
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY
 
CFD on Power
CFD on Power CFD on Power
CFD on Power
Ganesan Narayanasamy
 
A Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate ArraysA Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate Arrays
Taylor Riggan
 
OpenPOWER/POWER9 AI webinar
OpenPOWER/POWER9 AI webinar OpenPOWER/POWER9 AI webinar
OpenPOWER/POWER9 AI webinar
Ganesan Narayanasamy
 
OpenPOWER Foundation Overview
OpenPOWER Foundation OverviewOpenPOWER Foundation Overview
OpenPOWER Foundation Overview
NVIDIA Taiwan
 
Distributed deep learning optimizations for Finance
Distributed deep learning optimizations for FinanceDistributed deep learning optimizations for Finance
Distributed deep learning optimizations for Finance
geetachauhan
 
OpenPOWER/POWER9 Webinar from MIT and IBM
OpenPOWER/POWER9 Webinar from MIT and IBM OpenPOWER/POWER9 Webinar from MIT and IBM
OpenPOWER/POWER9 Webinar from MIT and IBM
Ganesan Narayanasamy
 
Distributed deep learning optimizations - AI WithTheBest
Distributed deep learning optimizations - AI WithTheBestDistributed deep learning optimizations - AI WithTheBest
Distributed deep learning optimizations - AI WithTheBest
geetachauhan
 
AWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI ResearchAWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI Research
Intel® Software
 
Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platform
inside-BigData.com
 
Amd ces tech day 2018 lisa su
Amd ces tech day 2018 lisa suAmd ces tech day 2018 lisa su
Amd ces tech day 2018 lisa su
Teddy Kuo
 
AIDC NY: BODO AI Presentation - 09.19.2019
AIDC NY: BODO AI Presentation - 09.19.2019AIDC NY: BODO AI Presentation - 09.19.2019
AIDC NY: BODO AI Presentation - 09.19.2019
Intel® Software
 
DDN: Protecting Your Data, Protecting Your Hardware
DDN: Protecting Your Data, Protecting Your HardwareDDN: Protecting Your Data, Protecting Your Hardware
DDN: Protecting Your Data, Protecting Your Hardware
inside-BigData.com
 
FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning
Dr. Swaminathan Kathirvel
 
AI OpenPOWER Academia Discussion Group
AI OpenPOWER Academia Discussion Group AI OpenPOWER Academia Discussion Group
AI OpenPOWER Academia Discussion Group
Ganesan Narayanasamy
 
MIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platformMIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platform
Ganesan Narayanasamy
 

What's hot (20)

TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
TECHNICAL OVERVIEW NVIDIA DEEP  LEARNING PLATFORM Giant Leaps in Performance ...TECHNICAL OVERVIEW NVIDIA DEEP  LEARNING PLATFORM Giant Leaps in Performance ...
TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
 
Intel's Machine Learning Strategy
Intel's Machine Learning StrategyIntel's Machine Learning Strategy
Intel's Machine Learning Strategy
 
Deep Learning
Deep LearningDeep Learning
Deep Learning
 
WML OpenPOWER presentation
WML OpenPOWER presentationWML OpenPOWER presentation
WML OpenPOWER presentation
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
 
CFD on Power
CFD on Power CFD on Power
CFD on Power
 
A Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate ArraysA Primer on FPGAs - Field Programmable Gate Arrays
A Primer on FPGAs - Field Programmable Gate Arrays
 
OpenPOWER/POWER9 AI webinar
OpenPOWER/POWER9 AI webinar OpenPOWER/POWER9 AI webinar
OpenPOWER/POWER9 AI webinar
 
OpenPOWER Foundation Overview
OpenPOWER Foundation OverviewOpenPOWER Foundation Overview
OpenPOWER Foundation Overview
 
Distributed deep learning optimizations for Finance
Distributed deep learning optimizations for FinanceDistributed deep learning optimizations for Finance
Distributed deep learning optimizations for Finance
 
OpenPOWER/POWER9 Webinar from MIT and IBM
OpenPOWER/POWER9 Webinar from MIT and IBM OpenPOWER/POWER9 Webinar from MIT and IBM
OpenPOWER/POWER9 Webinar from MIT and IBM
 
Distributed deep learning optimizations - AI WithTheBest
Distributed deep learning optimizations - AI WithTheBestDistributed deep learning optimizations - AI WithTheBest
Distributed deep learning optimizations - AI WithTheBest
 
AWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI ResearchAWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI Research
 
Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platform
 
Amd ces tech day 2018 lisa su
Amd ces tech day 2018 lisa suAmd ces tech day 2018 lisa su
Amd ces tech day 2018 lisa su
 
AIDC NY: BODO AI Presentation - 09.19.2019
AIDC NY: BODO AI Presentation - 09.19.2019AIDC NY: BODO AI Presentation - 09.19.2019
AIDC NY: BODO AI Presentation - 09.19.2019
 
DDN: Protecting Your Data, Protecting Your Hardware
DDN: Protecting Your Data, Protecting Your HardwareDDN: Protecting Your Data, Protecting Your Hardware
DDN: Protecting Your Data, Protecting Your Hardware
 
FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning
 
AI OpenPOWER Academia Discussion Group
AI OpenPOWER Academia Discussion Group AI OpenPOWER Academia Discussion Group
AI OpenPOWER Academia Discussion Group
 
MIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platformMIT's experience on OpenPOWER/POWER 9 platform
MIT's experience on OpenPOWER/POWER 9 platform
 

Similar to PowerAI Deep Dive ( key points )

Enabling POWER 8 advanced features on Linux
Enabling POWER 8 advanced features on LinuxEnabling POWER 8 advanced features on Linux
Enabling POWER 8 advanced features on Linux
Sebastien Chabrolles
 
AIX Performance Tuning Session at STU2017
AIX Performance Tuning Session at STU2017AIX Performance Tuning Session at STU2017
AIX Performance Tuning Session at STU2017
Paulo Sergio Lemes Queiroz
 
Visão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Visão geral do hardware do servidor System z e Linux on z - Concurso MainframeVisão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Visão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Anderson Bassani
 
l011029
l011029l011029
"Relax and Recover", an Open Source mksysb for Linux on Power
"Relax and Recover", an Open Source mksysb for Linux on Power"Relax and Recover", an Open Source mksysb for Linux on Power
"Relax and Recover", an Open Source mksysb for Linux on Power
Sebastien Chabrolles
 
Octobus technical university def
Octobus technical university   defOctobus technical university   def
Octobus technical university def
Daniela Zuppini
 
S016576 managing-data-footprint-reduction-brazil-v1708f
S016576 managing-data-footprint-reduction-brazil-v1708fS016576 managing-data-footprint-reduction-brazil-v1708f
S016576 managing-data-footprint-reduction-brazil-v1708f
Tony Pearson
 
S016394 pendulum-swings-melbourne-v1708d
S016394 pendulum-swings-melbourne-v1708dS016394 pendulum-swings-melbourne-v1708d
S016394 pendulum-swings-melbourne-v1708d
Tony Pearson
 
S014068 pendulum-swings-orlando-v1705c
S014068 pendulum-swings-orlando-v1705cS014068 pendulum-swings-orlando-v1705c
S014068 pendulum-swings-orlando-v1705c
Tony Pearson
 
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CSTCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
zOSCommserver
 
Accelerating Development Using Custom Hardware Accelerations with Amazon EC2 ...
Accelerating Development Using Custom Hardware Accelerations with Amazon EC2 ...Accelerating Development Using Custom Hardware Accelerations with Amazon EC2 ...
Accelerating Development Using Custom Hardware Accelerations with Amazon EC2 ...
Amazon Web Services
 
EnterpriseDB - IT innovation at the speed of business.
EnterpriseDB - IT innovation at the speed of business.EnterpriseDB - IT innovation at the speed of business.
EnterpriseDB - IT innovation at the speed of business.
Gerdan Santos
 
GPGPU in Commercial Software: Lessons From Three Cycles of the Adobe Creative...
GPGPU in Commercial Software: Lessons From Three Cycles of the Adobe Creative...GPGPU in Commercial Software: Lessons From Three Cycles of the Adobe Creative...
GPGPU in Commercial Software: Lessons From Three Cycles of the Adobe Creative...
Kevin Goldsmith
 
Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances (CMP405) - AW...
Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances (CMP405) - AW...Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances (CMP405) - AW...
Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances (CMP405) - AW...
Amazon Web Services
 
S104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809bS104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809b
Tony Pearson
 
[Café techno] - Ibm power7 - Les dernières annonces
[Café techno] - Ibm power7 - Les dernières annonces[Café techno] - Ibm power7 - Les dernières annonces
[Café techno] - Ibm power7 - Les dernières annonces
Groupe D.FI
 
Some experiences for porting application to Intel Xeon Phi
Some experiences for porting application to Intel Xeon PhiSome experiences for porting application to Intel Xeon Phi
Some experiences for porting application to Intel Xeon Phi
Maho Nakata
 
FPGA MeetUp
FPGA MeetUpFPGA MeetUp
FPGA MeetUp
Moya Brannan
 
S104874 toe-pool-jburg-v1809e
S104874 toe-pool-jburg-v1809eS104874 toe-pool-jburg-v1809e
S104874 toe-pool-jburg-v1809e
Tony Pearson
 
Best Practices and Performance Studies for High-Performance Computing Clusters
Best Practices and Performance Studies for High-Performance Computing ClustersBest Practices and Performance Studies for High-Performance Computing Clusters
Best Practices and Performance Studies for High-Performance Computing Clusters
Intel® Software
 

Similar to PowerAI Deep Dive ( key points ) (20)

Enabling POWER 8 advanced features on Linux
Enabling POWER 8 advanced features on LinuxEnabling POWER 8 advanced features on Linux
Enabling POWER 8 advanced features on Linux
 
AIX Performance Tuning Session at STU2017
AIX Performance Tuning Session at STU2017AIX Performance Tuning Session at STU2017
AIX Performance Tuning Session at STU2017
 
Visão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Visão geral do hardware do servidor System z e Linux on z - Concurso MainframeVisão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
Visão geral do hardware do servidor System z e Linux on z - Concurso Mainframe
 
l011029
l011029l011029
l011029
 
"Relax and Recover", an Open Source mksysb for Linux on Power
"Relax and Recover", an Open Source mksysb for Linux on Power"Relax and Recover", an Open Source mksysb for Linux on Power
"Relax and Recover", an Open Source mksysb for Linux on Power
 
Octobus technical university def
Octobus technical university   defOctobus technical university   def
Octobus technical university def
 
S016576 managing-data-footprint-reduction-brazil-v1708f
S016576 managing-data-footprint-reduction-brazil-v1708fS016576 managing-data-footprint-reduction-brazil-v1708f
S016576 managing-data-footprint-reduction-brazil-v1708f
 
S016394 pendulum-swings-melbourne-v1708d
S016394 pendulum-swings-melbourne-v1708dS016394 pendulum-swings-melbourne-v1708d
S016394 pendulum-swings-melbourne-v1708d
 
S014068 pendulum-swings-orlando-v1705c
S014068 pendulum-swings-orlando-v1705cS014068 pendulum-swings-orlando-v1705c
S014068 pendulum-swings-orlando-v1705c
 
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CSTCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
TCP/IP Stack Configuration with Configuration Assistant for IBM z/OS CS
 
Accelerating Development Using Custom Hardware Accelerations with Amazon EC2 ...
Accelerating Development Using Custom Hardware Accelerations with Amazon EC2 ...Accelerating Development Using Custom Hardware Accelerations with Amazon EC2 ...
Accelerating Development Using Custom Hardware Accelerations with Amazon EC2 ...
 
EnterpriseDB - IT innovation at the speed of business.
EnterpriseDB - IT innovation at the speed of business.EnterpriseDB - IT innovation at the speed of business.
EnterpriseDB - IT innovation at the speed of business.
 
GPGPU in Commercial Software: Lessons From Three Cycles of the Adobe Creative...
GPGPU in Commercial Software: Lessons From Three Cycles of the Adobe Creative...GPGPU in Commercial Software: Lessons From Three Cycles of the Adobe Creative...
GPGPU in Commercial Software: Lessons From Three Cycles of the Adobe Creative...
 
Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances (CMP405) - AW...
Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances (CMP405) - AW...Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances (CMP405) - AW...
Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances (CMP405) - AW...
 
S104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809bS104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809b
 
[Café techno] - Ibm power7 - Les dernières annonces
[Café techno] - Ibm power7 - Les dernières annonces[Café techno] - Ibm power7 - Les dernières annonces
[Café techno] - Ibm power7 - Les dernières annonces
 
Some experiences for porting application to Intel Xeon Phi
Some experiences for porting application to Intel Xeon PhiSome experiences for porting application to Intel Xeon Phi
Some experiences for porting application to Intel Xeon Phi
 
FPGA MeetUp
FPGA MeetUpFPGA MeetUp
FPGA MeetUp
 
S104874 toe-pool-jburg-v1809e
S104874 toe-pool-jburg-v1809eS104874 toe-pool-jburg-v1809e
S104874 toe-pool-jburg-v1809e
 
Best Practices and Performance Studies for High-Performance Computing Clusters
Best Practices and Performance Studies for High-Performance Computing ClustersBest Practices and Performance Studies for High-Performance Computing Clusters
Best Practices and Performance Studies for High-Performance Computing Clusters
 

Recently uploaded

RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 

Recently uploaded (20)

RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024

PowerAI Deep Dive (key points)

  • 1. PowerAI Deep Dive. Paulo Sergio Lemes Queiroz, Systems Consultant, IBM. IBM Systems Technical Events | ibm.com/training/events. © Copyright IBM Corporation 2017. Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM.
  • 2. Session objectives: what is PowerAI; PowerAI components (hardware requirements, CPU vs GPU, Volta / Tensors); using PowerAI components; extending PowerAI.
  • 3. What is PowerAI: a set of supporting libraries for developing machine learning and deep learning applications.
  • 4. Hardware requirements: the minimum requirement for accelerated machine learning is any Power server with an NVIDIA GPU: IBM Power System S824L (with NVIDIA technology), IBM Power System S822LC for High Performance Computing, IBM Power System S822LC for Big Data, or IBM Power System S821LC. The key starting point is having a GPU for acceleration. NVLink enables even further acceleration, and NVLink usage is transparent.
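As a rough way to check the hardware requirement on a given machine, the sketch below reports the CPU architecture and counts the GPUs that `nvidia-smi -L` lists. It assumes that the NVIDIA driver is installed and that a Linux-on-POWER host reports `ppc64le` from `uname -m`. The helper names `count_gpus` and `is_power_le` are hypothetical and are not part of PowerAI.

```shell
#!/usr/bin/env bash
# Count GPUs from `nvidia-smi -L` style output on stdin
# (one "GPU N: ..." line per device). Prints 0 when none match.
count_gpus() {
    grep -c '^GPU [0-9]' || true
}

# True when the machine architecture is little-endian POWER (assumption:
# the target host runs a ppc64le Linux distribution).
is_power_le() {
    [ "$(uname -m)" = "ppc64le" ]
}

# Only call nvidia-smi when it is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpus="$(nvidia-smi -L | count_gpus)"
else
    gpus=0
fi

if is_power_le; then
    arch_note="POWER (ppc64le)"
else
    arch_note="not ppc64le: $(uname -m)"
fi

echo "architecture: ${arch_note}; NVIDIA GPUs visible: ${gpus}"
```

A count of zero means either that no GPU is present or that the driver is not loaded, so the result is only a first indication.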
  • 5. CPU vs GPU: a CPU core is general-purpose and has access to the system's main memory, so it can perform all kinds of tasks. A GPU is composed of thousands of specialized cores that handle mathematical operations. GPUs are specialized for parallel tensor, matrix, vector, and floating-point operations. Sequential operations are slower on GPUs.
  • 6. GPU limitations: the GPU does not have access to the system's main memory; all data must be copied to the GPU and copied back when processing is done (batch-like operations). Operations that are not SIMD (Single Instruction, Multiple Data) are not nearly as fast.
  • 7. Volta / Tensors: the Volta GPU will have dedicated tensor function units, which will allow neural network processing to run even faster. However, this will require even higher bandwidth (and lower latency) between main memory and the GPU.
  • 8. Using PowerAI components: PowerAI, by default, requires Ubuntu 16.04, and its install files can be found at https://public.dhe.ibm.com/software/server/POWER/Linux/mldl/ubuntu/README.html. The following components are included in the free package:
      caffe-bvlc - Berkeley Vision and Learning Center (BVLC) upstream Caffe, v1.0.0
      caffe-ibm - IBM-optimized version of BVLC Caffe, v1.0.0
      caffe-nv - NVIDIA fork of Caffe, v0.15.14
      chainer - Chainer, v1.23.0
      digits - DIGITS, v5.0.0
      tensorflow - Google TensorFlow, v1.1.0
      ddl-tensorflow - Distributed Deep Learning custom operator for TensorFlow
      theano - Theano, v0.9.0
      torch - Torch, v7
    It is important to highlight that PowerAI is not limited to these precompiled libraries or versions. As on any Ubuntu machine, the user can compile any software.
  • 9. Using PowerAI components: all software components of PowerAI are installed under /opt/DL:
      root@pq-s824l-kvm:/opt/DL# ls -l
      total 0
      drwxr-xr-x  4 root root  29 Jun 23 16:08 bazel
      lrwxrwxrwx  1 root root  23 Jun 23 16:39 caffe -> /etc/alternatives/caffe
      drwxr-xr-x 13 root root 151 Jun 23 16:38 caffe-bvlc
      drwxr-xr-x 13 root root 151 Jun 23 16:38 caffe-ibm
      drwxr-xr-x 13 root root 151 Jun 23 16:38 caffe-nv
      drwxr-xr-x  7 root root  68 Jun 23 16:38 chainer
      drwxr-xr-x  4 root root  28 Aug  8 14:21 ddl
      drwxr-xr-x  6 root root  55 Aug  8 14:21 ddl-tensorflow
      drwxr-xr-x  8 root root 210 Aug  6 16:47 digits
      drwxr-xr-x  7 root root  67 Jun 23 16:38 nccl
      drwxr-xr-x  6 root root  54 Jun 23 16:38 openblas
      drwxr-xr-x  3 root root  44 Aug  8 14:19 repo
      drwxr-xr-x  5 root root  54 Jun 23 16:10 tensorflow
      drwxr-xr-x  6 root root  54 Jun 23 16:38 theano
      drwxr-xr-x  9 root root  94 Jun 23 16:38 torch
      root@pq-s824l-kvm:/opt/DL# pwd
      /opt/DL
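In the listing above, /opt/DL/caffe is a symlink into /etc/alternatives, which suggests that the Debian alternatives mechanism selects among the three Caffe builds. As a minimal sketch, the snippet below resolves that link chain to show which build is active. It assumes the chain ends in one of the `caffe-*` directories, which the slide does not show. The function name `active_variant` and the `CAFFE_LINK` variable are hypothetical.

```shell
#!/usr/bin/env bash
# Follow a chain of symlinks and print the name of the final directory.
# Relies on GNU `readlink -f`, which is available on Ubuntu.
active_variant() {
    basename "$(readlink -f "$1")"
}

# Path of the symlink from the slide; override CAFFE_LINK to inspect another link.
CAFFE_LINK="${CAFFE_LINK:-/opt/DL/caffe}"

if [ -L "$CAFFE_LINK" ]; then
    echo "active caffe variant: $(active_variant "$CAFFE_LINK")"
else
    echo "no symlink at $CAFFE_LINK" >&2
fi
```

If the link is indeed managed by the alternatives system, `update-alternatives --display caffe` should list the registered choices. That the alternative is registered under the name `caffe` is inferred from the link target and not confirmed by the slides.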
  • 10. Using PowerAI components: by default these packages are not on the system PATH. A few helper scripts can be used to add them:
      . /opt/DL/tensorflow/bin/tensorflow-activate
      . /opt/DL/theano/bin/theano-activate
      export PATH="${PATH}:/opt/DL/bazel/bin"
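The activation commands above could be wrapped so that a missing component is skipped instead of producing an error, for example in a login profile. This is a sketch only. The slide confirms the `tensorflow-activate` and `theano-activate` scripts; the assumption that other components follow the same `<name>/bin/<name>-activate` layout is not verified here. `POWERAI_ROOT`, `powerai_activate` and `path_append` are hypothetical names.

```shell
#!/usr/bin/env bash
# Root of the PowerAI install, as shown on the slides; overridable for testing.
POWERAI_ROOT="${POWERAI_ROOT:-/opt/DL}"

# Source <root>/<framework>/bin/<framework>-activate if it is readable.
# Returns 1 and prints a note on stderr when the script is missing.
powerai_activate() {
    local framework="$1"
    local script="${POWERAI_ROOT}/${framework}/bin/${framework}-activate"
    if [ -r "$script" ]; then
        . "$script"
    else
        echo "skip: $script not found" >&2
        return 1
    fi
}

# Append a directory to PATH only if it is not already present.
path_append() {
    case ":${PATH}:" in
        *":$1:"*) ;;
        *) PATH="${PATH}:$1" ;;
    esac
}

# The two frameworks named on the slide; a missing one is skipped.
powerai_activate tensorflow || true
powerai_activate theano || true
path_append "${POWERAI_ROOT}/bazel/bin"
export PATH
```

Because the scripts are sourced, not executed, the file must itself be sourced (for example `. ./powerai-env.sh`, a hypothetical file name) for the settings to reach the calling shell.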
  • 11. Tuning recommendation: change the CPU governor to performance in order to avoid performance fluctuations:
      for i in /sys/devices/system/cpu/cpufreq/policy* ; do echo performance > "$i/scaling_governor" ; done
      cpupower -c all frequency-set -g performance
    Enable persistence mode for the GPU:
      nvidia-smi -pm ENABLED
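The governor loop above can be made a little more defensive by skipping policies that are not writable and reporting how many were changed. The sketch below is one possible form and is not part of PowerAI. The function `set_governor` and the `APPLY_TUNING` switch are hypothetical, and the real system is only modified when that switch is set and the script runs as root.

```shell
#!/usr/bin/env bash
# Write the given governor into every cpufreq policy under a sysfs root
# (default /sys). Prints the number of policies that were updated.
set_governor() {
    local governor="$1" root="${2:-/sys}" count=0 policy
    for policy in "$root"/devices/system/cpu/cpufreq/policy*; do
        # Skip entries without a writable scaling_governor file
        # (this also covers the case where the glob matched nothing).
        [ -w "$policy/scaling_governor" ] || continue
        echo "$governor" > "$policy/scaling_governor" && count=$((count + 1))
    done
    echo "$count"
}

# Hypothetical opt-in switch so that merely loading this file changes nothing.
if [ "${APPLY_TUNING:-0}" = "1" ] && [ "$(id -u)" -eq 0 ]; then
    echo "policies updated: $(set_governor performance)"
    # Enable GPU persistence mode only when nvidia-smi is installed.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -pm ENABLED
    fi
fi
```

Neither setting is permanent when applied this way: a reboot restores the default governor and turns persistence mode off, so the commands would have to be reapplied at boot.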
  • 12. Your feedback about this session is very important to us. Submit a survey at: ibmtechu.com
  • 13. Continue the conversation: view event highlights, talk to tech experts, connect with attendees, and read training articles. Join the IBM Systems Technical Events LinkedIn community today: bit.ly/IBMTechUconnect
  • 14. Continue growing your IBM skills: ibm.com/training provides a comprehensive portfolio of skills and career accelerators that are designed to meet all your training needs. If you can't find the training that is right for you with our Global Training Providers, we can help. Contact IBM Training at dpmc@us.ibm.com