The Flow of TensorFlow
Jeongkyu Shin
Lablup Inc.
2017. 11. 12 / GDG DevFest Nanjing 2017
2017. 11. 19 / GDG DevFest Seoul 2017
Descript.ion
§ CEO / Co-founder, Lablup Inc.
§ Develops Backend.AI
§ Open-source devotee
§ Google Developer Experts (Machine Learning)
§ Principal Researcher, KOSSLab., Korea
§ Textcube open-source project maintainer (10th anniversary!)
§ Physicist / Neuroscientist
§ Adj. professor (Dept. of CSE, Hanyang Univ.)
§ Ph.D. in Statistical Physics (complex systems / computational neuroscience)
Jeongkyu Shin / @inureyes
Machine Learning Era: All came from dust
§ Machine learning
§ "Field of study that gives computers the ability to learn without being explicitly programmed" ‒ Arthur Samuel (1959)
§ "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." ‒ Tom Mitchell (1997)
§ Types of Machine Learning
§ Supervised learning
§ Unsupervised learning
§ Reinforcement learning
§ Recommender system
Artificial Intelligence
§ Definition
§ Alan Turing, "The Imitation Game" (1950) => Turing test
§ John McCarthy, Dartmouth Artificial Intelligence Conference (1956)
§ Information Processing Language (1955)
§ From axiom to theory
§ Heuristics to reduce probing space
§ Birth of the LISP programming language
§ First approach: IF-THEN rules
§ Probe every possible case and choose the pathway with the highest fitness
Artificial Neural Network: Basics
§ Effect of layers
A. K. Jain, J. Mao, K. M. Mohiuddin (1996), "Artificial Neural Networks: A Tutorial", IEEE Computer 29
Winter was coming
§ First winter (1970s)
§ Complex problems: too difficult to construct logic models (by hand)
§ Second winter (1990s)
§ Overfitting problem → pre-training, supervised backpropagation → dropout (2013)
§ Convergence → vanishing gradient problem (1991)
§ Divergence problem → weight decay / sparsity regularization
§ Slow training speed → IT evolution, mini-batches
§ And the spring: Environmental changes open the gate
§ Rise of big-data
§ Phenomenal computation cost reduction
Deep Learning: flower of the golden era
§ What if you have enough money to do (formerly) crazy experiments? Like:
§ Increase the number of hidden layers
§ Pour in an unlimited amount of data
§ Breakthrough of deep learning
§ Geoffrey Hinton (2005)
§ Andrew Ng (2012)
§ Convolutional Neural Network
§ Pooling layers + shared weights
§ Recurrent Neural Network
§ Feedforward routine with (long/short-term) memory
§ Deep Belief Network
§ Multipartite neural network with a generative model
§ Deep Q-Network
§ Using deep learning for reinforcement learning
AlphaGo as a mixture of Machine Learning techniques
§ Reducing search space
§ Breadth reduction
§ And depth reduction
§ Prediction
§ 13-layer convolutional NN
§ Value network
§ Policy network
§ Principal variation
Flow of TensorFlow
Still, less than two years have passed.
TensorFlow
§ Open-source software library for machine learning across a range of tasks
§ Developed by Google (Dec. 2015~)
§ Characteristics
§ Python API (like Theano)
§ Since 1.0, TensorFlow has expanded native API bindings to Java, C, etc.
§ Supports
§ Linux, macOS
§ NVidia GPUs (Pascal and above)
Before TensorFlow
§ User-friendly Deep-learning toolkits
§ Caffe (2012)
§ Brought a generalized programming method to researchers
§ Provides common NN blocks
§ Configuration file + training kernel program
§ Theano (2013~2017)
§ User code / configuration part is written in Python
§ Keras (2015~)
§ Meta-framework for Deep Learning programming
§ Supports various backends:
§ Theano (default) / TensorFlow (2016~) / MXNet (2017~) / CNTK (WIP)
§ ETC
§ Paddle, Chainer, DL4J…
TensorFlow: Summary
§ Statistics
§ More than 24,000 commits since Dec. 2015
§ More than 1,140 committers
§ More than 24,000 forks over the last 12 months
§ Dominates Bootstrap! (15,000 forks)
§ More than 6,400 TensorFlow-related repositories created on GitHub
§ Current
§ Complete ML model prototyping
§ Distributed training
§ CPU / GPU / TPU / Mobile support
§ TensorFlow Serving
§ Enables easier inference / model serving
§ XLA compiler (1.0~)
§ Supports various environments / speedups
§ Keras API Support (1.2~)
§ High-level programming API
§ Keras-compatible API
§ Eager Execution (1.4~)
§ Interactive mode of TensorFlow
§ Treats TensorFlow Python code as real Python code
https://www.infoworld.com/article/3233283/javascript/at-github-javascript-rules-in-usage-tensorflow-leads-in-forks.html
TensorFlow: Summary
§ 2016 → 2017 feature timeline:
⏤ TensorFlow Serving
⏤ Keras API
⏤ Eager Execution
⏤ TensorFlow Lite
⏤ XLA
⏤ OpenCL w/ ComputeCpp
⏤ Distributed TensorFlow
⏤ Multi-GPU support
⏤ Mobile TensorFlow
⏤ TensorFlow Datasets
⏤ SKLearn (contrib)
⏤ TensorFlow Slim
⏤ SyntaxNet
⏤ DRAGNN
⏤ TFLearn (contrib)
⏤ TensorFlow TimeSeries
How TensorFlow works
§ CPU
§ Multiprocessor
§ AVX-based acceleration
§ On-chip GPU
§ OpenMP
§ GPU
§ CUDA (NVidia) ➜ cuDNN
§ OpenCL (AMD) ➜ ComputeCPP / ROCm
§ TPU (1st, 2nd gen.)
§ ASIC for accelerating matrix calculation
§ In-house development by Google
https://www.tensorflow.org/get_started/graph_viz
How TensorFlow works
§ Python but not Python
§ The Python API is the default API for TensorFlow
§ However, the TF core is written in C++, with the cuDNN library (for GPU acceleration)
§ Computation Graph
§ User TF code is not code itself
§ It is a configuration that generates the computation graph
§ Session
§ Creates a computation graph and runs the training using the C++ core
§ Tedious debugging process
Google I/O 2017 / TensorFlow Frontiers
How TensorFlow works
TensorFlow Features
§ Recent TensorFlow core features
§ TensorFlow Estimators
§ Included in 1.4 (Oct. 2017) / high-level API for using and modeling well-known estimators (see the sketch after this list)
§ TensorFlow Serving (independent project)
§ TensorFlow Keras-compatible API (Sep. 2017)
§ Included in 1.3 (Sep. 2017)
§ TensorFlow Datasets
§ Included in 1.4 (Oct. 2017)
§ Upcoming/testing TensorFlow core features
§ TensorFlow eager execution
§ Introduced in 1.4 (Oct. 2017)
§ TensorFlow Lite
§ (Work-in-progress)
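To make the Estimators item above concrete: a minimal sketch of the premade-estimator workflow, assuming TF ≥ 1.3; the feature name and toy data are illustrative, not from the talk.

import numpy as np
import tensorflow as tf

# One numeric feature column feeding a premade linear regressor.
feature_columns = [tf.feature_column.numeric_column('x', shape=[1])]
estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)

# Hypothetical toy data, fed through the bundled numpy input helper.
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'x': np.array([1., 2., 3., 4.])},
    y=np.array([0., -1., -2., -3.]),
    batch_size=2, num_epochs=None, shuffle=True)

estimator.train(input_fn=input_fn, steps=100)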
XLA: linear algebra compiler for TensorFlow
Google I/O 2017 / TensorFlow Frontiers
TensorFlow Serving
§ Serving system for inference service
§ Components
§ Servables
§ Loaders
§ Managers
§ Features
§ Model building
§ Model versioning
§ Model saving / loading (sketched below)
§ Online inference support with RPC
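As a hedged illustration of the model saving workflow above: the sketch below exports a trivial graph in the SavedModel format that TensorFlow Serving loads; the stand-in model and export path are assumptions for the example.

import tensorflow as tf

# Stand-in "model": one variable scaling the input.
x = tf.placeholder(tf.float32, shape=[None, 1], name='x')
w = tf.Variable(2.0, name='w')
y = tf.multiply(x, w, name='y')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Export version 1 of the model to a directory TF Serving can watch.
    builder = tf.saved_model.builder.SavedModelBuilder('/tmp/demo_model/1')
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING])
    builder.save()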
Keras-compatible API for TensorFlow
§ Keras ( https://keras.io )
§ High-level API
§ Focus on user experience
§ “Deep learning accessible to everyone”
§ History
§ Announced in Feb. 2017
§ Bundled as a contrib package since TF 1.2
§ Official core package since 1.4
§ Characteristics
§ “Simplified workflow for TensorFlow users, more powerful features to Keras users”
§ Most Keras code can be used on TensorFlow (by changing keras. to tf.keras.)
§ Can mix Keras code with TensorFlow code (see the sketch below)
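A minimal sketch of that workflow (the layer sizes and toy data are assumptions, not from the talk): the familiar Keras model-building API, addressed through the tf.keras namespace.

import numpy as np
import tensorflow as tf

# The usual Keras workflow, under tf.keras instead of keras.
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1)
])
model.compile(optimizer='sgd', loss='mse')

# Hypothetical toy data, just to make the example runnable.
x = np.random.rand(100, 10).astype('float32')
y = np.random.rand(100, 1).astype('float32')
model.fit(x, y, epochs=1, batch_size=16)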
TensorFlow Datasets
§ New way to generate data pipelines
§ Dataset classes
§ TextLineDataset
§ TFRecordDataset
§ FixedLengthRecordDataset
§ Iterator (consumed in the sketch after the example below)
Example: Decoding and resizing image data
import tensorflow as tf

# Reads an image from a file, decodes it into a dense tensor, and resizes it
# to a fixed shape.
def _parse_function(filename, label):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_image(image_string)
    image_resized = tf.image.resize_images(image_decoded, [28, 28])
    return image_resized, label

# A vector of filenames.
filenames = tf.constant(["/var/data/image1.jpg", "/var/data/image2.jpg", ...])

# `labels[i]` is the label for the image in `filenames[i]`.
labels = tf.constant([0, 37, ...])

dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(_parse_function)
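The Iterator piece from the list above then consumes this pipeline; a minimal sketch, assuming graph mode and real paths in place of the placeholder filenames:

# One-shot iterator over the dataset defined above.
iterator = dataset.make_one_shot_iterator()
next_image, next_label = iterator.get_next()

with tf.Session() as sess:
    image, label = sess.run([next_image, next_label])  # first (image, label) pair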
Eager execution
§ Announced on Oct. 30, 2017
§ Makes TensorFlow execute operations immediately
§ Returns concrete values
§ Provides
§ A NumPy-like library for numerical computation
§ Support for GPU acceleration and automatic differentiation
§ A flexible platform for machine learning research and experiments
§ Advantages
§ Python debugger tools
§ Immediate error reporting
§ Easy control flow
§ Python data structures
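The eager-mode snippets in the examples that follow assume eager execution has been switched on once, at program startup; in TF 1.4 it still lives in contrib. A minimal sketch:

import tensorflow as tf
import tensorflow.contrib.eager as tfe

# Must run before any other TensorFlow operation.
tfe.enable_eager_execution()

print(tf.add(1, 2))  # tf.Tensor(3, shape=(), dtype=int32) -- no Session needed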
Example: Session
# Graph mode: building the op only returns a symbolic Tensor.
x = tf.placeholder(tf.float32, shape=[1, 1])
m = tf.matmul(x, x)
print(m)
# Tensor("MatMul:0", shape=(1, 1), dtype=float32)

with tf.Session() as sess:
    m_out = sess.run(m, feed_dict={x: [[2.]]})
    print(m_out)
# [[4.]]

# Eager mode: the same computation runs immediately.
x = [[2.]]
m = tf.matmul(x, x)
print(m)
# tf.Tensor([[4.]], dtype=float32, shape=(1, 1))
Example: Instant error
x = tf.gather([0, 1, 2], 7)
InvalidArgumentError: indices = 7 is not in [0, 3) [Op:Gather]
Example: removing metaprogramming
# Graph mode: a Session is needed just to inspect values.
x = tf.random_uniform([2, 2])
with tf.Session() as sess:
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            print(sess.run(x[i, j]))

# Eager mode: the same loop, no Session required.
x = tf.random_uniform([2, 2])
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        print(x[i, j])

Eager execution: Python Control Flow
# Ordinary Python control flow, driven by tensor values (Collatz sequence).
a = tf.constant(6)
while not tf.equal(a, 1):
    if tf.equal(a % 2, 0):
        a = a / 2
    else:
        a = 3 * a + 1
    print(a)
# Outputs
tf.Tensor(3, dtype=int32)
tf.Tensor(10, dtype=int32)
tf.Tensor(5, dtype=int32)
tf.Tensor(16, dtype=int32)
tf.Tensor(8, dtype=int32)
tf.Tensor(4, dtype=int32)
tf.Tensor(2, dtype=int32)
tf.Tensor(1, dtype=int32)
Eager execution: Gradients
def square(x):
    return tf.multiply(x, x)  # Or x * x

grad = tfe.gradients_function(square)
gradgrad = tfe.gradients_function(lambda x: grad(x)[0])

print(square(3.))    # tf.Tensor(9., dtype=tf.float32)
print(grad(3.))      # [tf.Tensor(6., dtype=tf.float32)]
print(gradgrad(3.))  # [tf.Tensor(2., dtype=tf.float32)]
Eager execution: Custom Gradients
def log1pexp(x):
    return tf.log(1 + tf.exp(x))

grad_log1pexp = tfe.gradients_function(log1pexp)
print(grad_log1pexp(0.))    # Works fine, prints [0.5]
print(grad_log1pexp(100.))  # [nan] due to numeric instability

@tfe.custom_gradient
def log1pexp(x):
    e = tf.exp(x)
    def grad(dy):
        return dy * (1 - 1 / (1 + e))
    return tf.log(1 + e), grad

grad_log1pexp = tfe.gradients_function(log1pexp)
# Gradient at x = 0 works as before.
print(grad_log1pexp(0.))    # [0.5]
# And now gradient computation at x = 100 works as well.
print(grad_log1pexp(100.))  # [1.0]
Eager execution: Using GPUs
tf.device() for manual placement:
with tf.device("/gpu:0"):
    x = tf.random_uniform([10, 10])
    y = tf.matmul(x, x)
    # x and y reside in GPU memory
Eager execution: Building Models
The same APIs as graph building (tf.layers, tf.train.Optimizer, tf.data, etc.):

model = tf.layers.Dense(units=1, use_bias=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Define a loss function
def loss(x, y):
    return tf.reduce_mean(tf.square(y - model(x)))
Eager execution: Training Models
Compute and apply gradients:

grad_fn = tfe.implicit_gradients(loss)
for (x, y) in get_next_batch():
    optimizer.apply_gradients(grad_fn(x, y))
Comparison

                         TensorFlow  TFLearn  TF Slim  TF Eager  Keras(TF)  Keras(MXNet)  PyTorch  CNTK  MXNet
Difficulty               ■■■■        ■■■      ■■       ■■        ■■         ■■■           ■        ■■■■  ■■■■
Extensibility            ■■■■        ■■■■     ■■■■     ■■        ■■         ■■            ■        ■■■■  ■■■■
Interactive mode         X           X        X        O         X          X             O        X     X
Multi-CPU (NUMA)         O           O        X        X         O          O             O        O     O
Multi-CPU (cluster)      O           O        O        X         O          O             X        O     O
Multi-GPU (single node)  O           O        O        X         O          O             ?*       O     O
Multi-GPU (cluster)      O           O        O        X         O          O             X        O     O

TF Eager = TF Eager Execution; Keras(TF) / Keras(MXNet) = Keras with TF / MXNet backend
* manual multi-batch
TensorFlow Lite
§ TensorFlow Lite: Embedded TensorFlow
§ No additional environment installation required
§ OS-level hardware acceleration
§ Leverages Android NN
§ XLA-based optimization support
§ Enables binding to various programming languages
§ Developer Preview (4 days ago)
§ Part of Android O-MR1
Google I/O 2017 / Android meets TensorFlow
TensorFlow Lite
§ Format
§ FlatBuffers instead of ProtocolBuffers
§ Provides converter
§ Models
§ InceptionV3
§ MobileNets: vision-specific model family
§ API
§ Java
§ C++
TensorFlow Lite: Why and How
§ Why? Less traffic / faster response
§ Image / OCR, Speech <-> Text, Translation, NLP
§ Motion, GPS and more
§ ML can extract the meaning from raw data
§ Image recognition: Send raw image vs. send detected label
§ Motion detection: Send raw motion vs. send feature vector
§ How? Model compression
§ Graph freezing (sketched below)
§ Graph conversion tools
§ Quantization
§ Weight
§ Calculation
§ Memory mapping
Google I/O 2017 / Android meets TensorFlow
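The graph-freezing step above can be sketched with the stock TF 1.x utility; the toy graph, output node name, and path here are assumptions for illustration.

import tensorflow as tf

# A toy graph: one variable feeding an output node named 'output'.
w = tf.Variable(3.0)
out = tf.multiply(w, 2.0, name='output')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Freezing bakes variables into constants, so the graph ships as one file.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), ['output'])

with tf.gfile.GFile('/tmp/frozen.pb', 'wb') as f:
    f.write(frozen.SerializeToString())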
Android Neural Network API
§ New APIs for NeuralNet
§ Part of Android Framework
§ Since next Android release
§ Reduces library duplication across apps
§ Supports Hardware acceleration
§ GPU, DSP, ISP, NeuralNet chips, etc.
Google I/O 2017 / Android meets TensorFlow
Flow goes to: market
What is flowing through the stream?
Market: API-based (personalized) deep learning service
§ Service with pre-baked models via API
§ Focuses on fields that do not require real-time responses
§ e.g. Microsoft Azure Cognitive service
§ Pre-trained ANN + personalized data = personalized NN
§ Easy personalization : server-side training
Market: User-side deep learning services
§ Inference with trained models
§ Does not require heavy calculation
§ e.g. ARMv7 with ~512MB / 1GB RAM
§ Toys / light products
§ Smart toys for kidults (kids + adults): self-driving R/C cars / drones
§ Home appliance and controllers
§ IoT + ML
§ Locality : Home (per room), Car, Office, etc.
§ E.g. Smart home resource management systems
Market: Deep Learning service for everyone
§ The digital assistant war
§ Digital assistants (with speakers): gateway to deep learning based services
§ Context extraction + inference + features
§ Echo (Amazon) / Google Home (Google)
§ Microsoft (Cortana in every MS product) / Apple (HomePod)
§ Korea? Also entering the battlefield
§ Naver: Wave / Friends
§ Kakao: Kakao mini
§ SK: Nugu
Flow goes to: tech.
What is flowing through the stream?
Portability and extensibility
§ Training on
§ Mac / Windows
§ GPU server
§ GPU / TPU on Cloud
§ Prediction / Inference using
§ Android / iOS
§ Raspberry Pi and TPU
§ Android Things
Google I/O 2017 / Android meets TensorFlow
Open-source Machine Learning Framework
§ Machine Learning Framework: (almost) open-source
§ Google: TensorFlow (2015~)
§ Microsoft: CNTK (2016~)
§ Amazon: MXNet (2015~)
§ Facebook: Caffe 2 (2017~) / PyTorch (2016~)
§ Baidu: PaddlePaddle (2016~)
§ Why?
§ 2017
§ General goal of new versions: user-friendly syntax
§ The rise of Keras and PyTorch led to TensorFlow's eager execution
Server-side machine learning
§ Machine learning workload characteristics
§ Training
§ Requires ultra-heavy computation resources
§ Need to feed big, indexed data
§ Or (for reinforcement learning) needs a paired model / training environment to give feedback
§ Serving
§ Requires (relatively) light resources:
§ Low CPU cost
§ Middle memory capacity (to load NeuralNet)
TensorFlow: Multiverse
§ TensorFlow AMD GPU acceleration
§ OpenCL with ComputeCPP (Feb. 2017)
§ Accelerates C++ code (Codeplay)
§ Khronos support / SYCL standard
§ Still in early stage
§ Only supports Linux
§ ROCm (AMD) based TensorFlow (Sep. 2017)
§ First open-source HPC/Hyperscale-class platform for GPU computing
§ LLVM based / HCC C++ / GCN compiler
§ https://github.com/ROCmSoftwarePlatform/hiptensorflow
Hand-held machine learning: Why?
§ Issues from real-time models / apps
§ Autopilot
§ Real-time effect on photos / videos
§ Voice recognition
§ Automators
§ Privacy issues
§ Increasing privacy information
§ ETC
§ Leads to network cost reduction
Hand-held machine learning: How?
§ Apple’s approach
§ Keeping user privacy with Differential Privacy
§ Gathers anonymized user data
§ User-specific machine learning models: kept on the phone
§ e.g. Photo face detection / voice recognition / smart keyboard
§ Core ML (iOS 11)
§ Supports machine learning models as functions (.mlmodel format)
§ Google’s approach
§ Ultra-large-scale server-side training using the TPU (2nd gen.)
§ Mobile: Handles data compression and feature extraction (to reduce traffic)
§ On the mobile:
§ Android NeuralNet API (Android O)
§ TensorFlow Lite on Android (Android O)
https://backchannel.com/an-exclusive-look-at-how-ai-and-machine-learning-work-at-apple-8dbfb131932b
Hand-held machine learning: How?
§ Train on server, Serve on smartphone
§ Enough to serve pre-trained models on smartphones
§ Both train and serve on smartphone
§ Keeps privacy / reduces traffic / enables personalization
§ Uses GPUs on recent smartphones
§ Working together
§ Feature extraction / compression / preprocessing ‒ Mobile side
§ Machine Learning model training / updating / streaming advanced models ‒ Server side
Hand-held machine learning: How?
§ TensorFlow
§ Supports both Android and iOS
§ XCode and Android Studio
§ XLA compiler framework since TensorFlow 1.0:
§ Will support diverse languages / environments
§ Also, optimizing for smartphones and tablets
§ MobileNet (Apr. 2017)
§ Efficient Convolutional Neural Networks for Mobile Vision Applications
§ TensorFlow Lite (Nov. 2017): development focus
§ Built-in operators for both quantized models (8-bit int / fixed point) and floating-point models (FP32, FP16)
§ Support for embedded GPUs / ASICs
Browser-side machine learning
§ Machine Learning without hassle
§ Ingredients for machine learning: Computation, Data, Algorithm
§ XLA: provides binary-code level optimization for various environments
§ Do we have cross-platform computation environment?
§ Java?
§ Browser!
§ Recent improvements in web browsers
§ WebGL
§ Unified programming environment for many GPU-enabled machines
§ WebAssembly
§ Binary-level optimization
§ Shipped in every mainstream browser! (just this week)
Convertible NeuralNet format
§ ONNX (Open Neural Network Exchange)
§ Microsoft / Facebook (Sep. 2017)
§ Caffe 2, PyTorch (by Facebook), CNTK (Microsoft)
§ MLMODEL (Core ML model, Machine Learning Model)
§ Apple (Aug. 2017)
§ Caffe, Keras, scikit-learn, LIBSVM (Open Source)
§ Provides Core ML converter / specification
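As a hedged sketch of what "convertible" means in practice (assuming PyTorch with its ONNX exporter; the model choice and filename are illustrative):

import torch
import torchvision

# Export a torchvision model to the ONNX interchange format.
model = torchvision.models.alexnet(pretrained=True)
dummy_input = torch.autograd.Variable(torch.randn(1, 3, 224, 224))
torch.onnx.export(model, dummy_input, "alexnet.onnx")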
Recap
§ Machine Learning / Artificial Intelligence
§ Flow of TensorFlow
§ TensorFlow Serving Project
§ Keras-compatible API
§ Datasets
§ Eager execution
§ TensorFlow Lite
§ Flow goes to
§ More user-friendly toolkits / frameworks
§ API-based / personalized
§ User-side inference / Hand-held ML
§ Convertible Machine Learning Model formats
End!
Thank you for listening
https://www.lablup.ai ‒ Lablup Inc.
https://backend.ai ‒ Backend.AI
https://cloud.backend.ai ‒ Backend.AI Cloud
https://www.codeonweb.com ‒ CodeOnWeb Service
https://github.com/lablup ‒ GitHub repository
