Slides from my talk at the useR Group Münster on 04/17/18 on how to get started with GPU-enabled deep learning in R. First I show how to create an NVIDIA-Docker-based image with RStudio, TensorFlow, and Keras for R; then comes an introduction to deep learning (classic MNIST classification with an MLP and a CNN).
From Zero to Hero - All you need to do serious deep learning stuff in R
1. From Zero to Hero
ALL YOU NEED TO DO SERIOUS DEEP LEARNING STUFF IN R
2. Agenda
• About Me
• How I use R/RStudio/(Nvidia)Docker
• Image recognition with R and Keras
3. Kai Lichtenberg, 33
[Timeline slide, Machine Learning / Deep Learning: working student in gearbox development; Mechanical Engineer (BSc & MSc Mechanical Engineering); Sales Engineer (BSc & MSc Sales Engineering & Product Management); PhD @ Bosch; Advanced Engineering Transmission Units; Advanced Engineering Connectivity Solutions; Data Scientist — today: founder, data scientist, accountant, office manager, janitor, … Language journey: classic stuff → TODAY (next stop: Quantum Computing with Q#)]
4. Some Projects
• Modeling reliability of equipment with high dimensional data (PhD)
• Non-linear autoregression with exogenous inputs (neural nets) to predict temperature
• Various production datasets, e.g. high pressure diesel pump and injector
• Who’s driving the machine? Driver classification and profiling
• Condition monitoring via acceleration and acoustic signals
• Customer churn prediction
• Image recognition with deep neural nets
• Blogging @Bosch
6. What my dev environment looked like
• Installed on the same system I use for office work, browsing, etc.
• A pretty big stack with numerous packages, IDEs, dependencies, drivers, venvs, environment variables, …
• Especially the Keras → TensorFlow → cuDNN toolchain is tricky
• Takes hours to set up and just seconds to mess up
• I tend to fiddle around and break things :-)
• Needs to be installed on my laptop and my workstation (and behave the same)
7. How I want my Dev Environment
• Very easy to install
• Portable to any Linux
• GPU support!
• No big overhead, use of native system resources
• No fiddling, tweaking and tuning
• All my beloved packages and tools
No cloud services! Only a container registry
8. Sneak Peek
#Docker CE
apt-get update
apt-get install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
apt-get update
apt-get install docker-ce
usermod -a -G docker $USER
# Add the package repositories for nvidia docker
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list
apt-get update
# Install nvidia-docker2 and reload the Docker daemon
apt-get install -y nvidia-docker2
pkill -SIGHUP dockerd
#Log into the nvidia repo and get tensorflow python 3
docker login nvcr.io
docker pull nvcr.io/nvidia/tensorflow:18.02-py3
#Clone the repo and build the images
git clone https://github.com/KaiLicht/DataScience_Toolbox
cd DataScience_Toolbox/dockerfiles/Rbase_keras_tf
docker build -t kailicht/rbase_keras_tf .
cd ../RStudio_deeplearning
docker build -t kailicht/rstudio_deeplearning .
cd ../My_Rstudio
docker build -t kailicht/myrstudio:1.0 .
No need to copy! I'm going to publish a blogpost on how to create this stuff from scratch and link it in the meetup group.
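Once the images are built, a container can be started via the NVIDIA runtime that nvidia-docker2 registers. This is a sketch using the image name from above; the `-p 8787:8787` mapping assumes RStudio Server's default port:

```shell
# Run the RStudio image with GPU passthrough (nvidia-docker2 runtime)
# and expose RStudio Server on http://localhost:8787
docker run --runtime=nvidia -d -p 8787:8787 kailicht/myrstudio:1.0

# Quick check that the GPU is visible inside the container
docker run --runtime=nvidia --rm kailicht/myrstudio:1.0 nvidia-smi
```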
9. What is Docker?
[Diagram: three stacks side by side — a VM stack (Server → Host OS → Hypervisor → Guest OS → Bins/Libs → App), a container stack (Server → Host OS → Docker Engine → Bins/Libs → App), and a GPU container stack (GPUs and CUDA driver on the host, CUDA toolkit inside each container)]
Virtualization (VMs):
• No shared resources
• The complete OS is emulated
• Big overhead
• Not really suited for data science environments
Isolation (containers):
• Resources shared with the host
• Only the file system is isolated
• Very little overhead
• Perfect for reproducible data science
Containers with GPU:
• GPU passthrough into the container
• NVIDIA offers preconfigured and optimized images for different use cases (e.g. a Python TensorFlow image)
See for yourself!
10. How to make a docker image
dockerfile:
FROM ubuntu:16.04
MAINTAINER Kai Lichtenberg <kai@sentin.ai>
# Set a default user.
RUN useradd docker \
 && mkdir /home/docker \
 && chown docker:docker /home/docker \
 && addgroup docker staff
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    ed \
    locales \
    vim-tiny \
    fonts-texgyre \
    gnupg2 \
    libcurl4-openssl-dev \
    libssl-dev \
    libssh2-1-dev \
    sudo \
 && rm -rf /var/lib/apt/lists/*
Building the image:
kai@XPS15:~$ docker build -t my_ubuntu:1.0 .
What is happening?
• By starting the build process, Docker first downloads the ubuntu:16.04 base image from Docker Hub
• Then it starts the base image and basically parses the commands in the dockerfile (here: adding a user and installing some stuff with apt-get)
• After everything is done, the image is saved with the tag my_ubuntu:1.0
Starting the image:
kai@XPS15:~$ docker run -it my_ubuntu:1.0
root@52d676fa5956:/workspace# uname -a
Linux 52d676fa5956 4.13.0-38-generic #43~16.04.1-Ubuntu SMP Wed Mar 14 17:48:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
11. My RStudio Docker containers
The foundation for this build is the tensorflow:18.02-py3 image from NVIDIA. It comes batteries included:
• Ubuntu 16.04 with Python 3.5
• NVIDIA CUDA
• cuBLAS (linear algebra)
• NCCL (for multi-GPU usage)
• Horovod distributed DL framework
• OpenMPI
• TensorBoard
• TensorRT to optimize deployment
Building it in 4 stages (let's have a look into the dockerfiles!):
0. Get the base image: tensorflow:18.02-py3
1. Install R, Keras & TensorFlow for R: kailicht/rbase_keras_tf
2. Install RStudio and dependencies: kailicht/rstudio_deeplearning
3. Customize (settings, packages): kailicht/my_rstudio:1.0
4. Install new packages: kailicht/my_rstudio:1.x
Images for different use cases:
• Do data science REPL work: kailicht/my_rstudio:1.x
• Deploy a Shiny app with image recognition: kailicht/rbase_keras_tf
• Deploy a Flask microservice with a TensorFlow model: tensorflow:18.02-py3
• …
13. But wait, isn't Python better for deep learning?
• No.
• In Python and R you only define the neural network!
• All the magic happens in the toolchain.
• It creates the computational graph and performs all the matrix operations needed on your CUDA-enabled GPU.
• Sorry, but using R or Python for deep learning is pretty much like using a GUI.
• If you want to deploy a trained network (e.g. as an API), then Python is the better choice.
• It's important to differentiate between creating a model and deploying it!
14. Before we start: The MNIST data set
In R:
• The "Hello World" of machine learning
• 70k handwritten digits as 28x28 monochrome pictures
• And of course the class labels (0-9)
btw: this is a tensor! (like any other matrix or vector)
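Loading the data with the Keras for R package could look like this (a sketch, assuming the keras package and a TensorFlow backend are installed):

```r
# Sketch: load MNIST via Keras for R
library(keras)

mnist <- dataset_mnist()
x_train <- mnist$train$x   # 60000 x 28 x 28 integer array -- a rank-3 tensor
y_train <- mnist$train$y   # class labels 0-9

# Rescale pixel values from 0-255 to 0-1 before training
x_train <- x_train / 255
dim(x_train)               # 60000 28 28
```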
15. Before we start: Piping
#Select 3 columns in a data frame
MyDataFrame2 <- select(MyDataFrame1, MyCol1, MyCol2, MyCol3)
#Is the same as:
MyDataFrame2 <- MyDataFrame1 %>%
  select(MyCol1, MyCol2, MyCol3)
#Chain it!
MyDataFrame2 <- MyDataFrame1 %>%
  select(MyCol1, MyCol2, MyCol3) %>%
  filter(MyCol1 <= "SomeValue") %>%
  mutate(MyCol4 = MyCol1 / MyCol2) %>%
  summarize_all(mean)
Also very neat for operating with tensors flowing through a neural net!
16. Before we start: The MLP architecture
[Diagram: a single neuron — inputs and weights feed a sum, plus a bias, through an activation function; neurons are stacked into input, hidden, and output layers]
• The weights and biases are optimized to best fit the data in a process called backpropagation
• There are a lot of (simple) tricks to prevent overfitting (dropout, learning rate decay, …)
• For a more detailed introduction I recommend the 2-hour talk from Martin Görner @ Google Cloud Next '17
17. But how does it work with a 28x28 picture?
It’s tidy! :)
Let’s code that!
https://cntk.ai/pythondocs/CNTK_103C_MNIST_MultiLayerPerceptron.html
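Defined with Keras for R, the MLP could be sketched as follows. This is a hedged example: the layer sizes, dropout rate, and optimizer are illustrative choices of mine, not the exact ones from the talk.

```r
library(keras)

# Sketch of an MLP for MNIST: the 28x28 image is flattened to a
# 784-element vector before the dense layers
model <- keras_model_sequential() %>%
  layer_flatten(input_shape = c(28, 28)) %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%                    # simple trick against overfitting
  layer_dense(units = 10, activation = "softmax")  # one output per digit class

model %>% compile(
  optimizer = "adam",
  loss = "sparse_categorical_crossentropy",  # integer labels 0-9
  metrics = "accuracy"
)

# model %>% fit(x_train, y_train, epochs = 5, validation_split = 0.1)
```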
19. What are we missing?
• The spatial correlation of the pixels in their 2D space is lost!
• This information is important! → Convolutional Neural Networks!
https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/#4
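A convolutional version that keeps the 2D pixel layout could be sketched in Keras for R like this (again a sketch: the filter counts and layer sizes are illustrative assumptions, not the exact architecture from the talk):

```r
library(keras)

# Sketch: a small CNN for MNIST that preserves the 2D structure;
# inputs must be reshaped to 28 x 28 x 1 (one color channel)
model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(28, 28, 1)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax")
```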