SlideShare a Scribd company logo
Advanced and
Meetup
May 26, 2016
Meetup Agenda
Meetup Updates (Chris F)
Technology Updates (Chris F)
Spark + TensorFlow Model Serving/Deployment (Chris F)
Neural Net and TensorFlow Landscape (Chris F)
TensorFlow Core, Best Practices, Hidden Gems (Sam A)
TensorFlow Distributed (Fabrizio M)
Meetup Updates
New Sponsor
Meetup Metrics
Co-presenters
Sam Abrahams
“LA Kid (from DC)
Fabrizio Milo
“Italian Stallion”
Chris Fregly
“Chicago Fire”
advancedspark.com
Workshop - June 4th - Spark ML + TensorFlow
Chris Fregly
Technology Updates
github.com/fluxcapacitor/pipeline
Neural Network Tool Landscape
TensorFlow Tool Landscape
TensorFlow Serving
Spark ML Serving
pipeline.io (Shh… Stealth)
Technology Updates...
Spark 2.0
Kafka 0.10.0 + Confluent 3.0
CUDA 8.0 + cuDNN v5
Whole-Stage Code Gen
● SPARK-12795
● Physically Fuse Together Operators (within a Stage) into 1 Operation
● Avoids Excessive Virtual Function Calls
● Utilize CPU Registers vs. L1, L2, Main Memory
● Loop Unrolling and SIMD Code Generation
● Speeds up CPU-bound workloads (not I/O-bound)
Vectorization (SPARK-12992)
● Operate on Batches of Data
● Reduces Virtual Function Calls
● Fallback if Whole-Stage Not an Option
Spark 2.0: Core
Save/Load Support for All Models and Pipelines!
● Python and Scala
● Saved as Parquet ONLY
Local Linear Algebra Library
● SPARK-13944, SPARK-14615
● Drop-in Replacement for Distributed Linear Algebra library
● Opens up door to my new pipeline.io prediction layer!
Spark 2.0: ML
Kafka v0.10 + Confluent v3.0 Platform
Kafka Streams
New Event-Creation Timestamp
Rack Awareness (THX NFLX!)
Kerberos + SASL Improvements
Ability to Pause a Connector (ie. Maintenance)
New max.poll.records to limit messages retrieved
CUDA Deep Neural Network (cuDNN) v5 Updates
LSTM (Long Short-Term Memory) RNN Support for NLP use cases (6x speedup)
Optimized for NVIDIA Pascal GPU Architecture including FP16 (low precision)
Highly-optimized networks with 3x3 convolutions (GoogleNet)
github.com/fluxcapacitor/pipeline Updates
Tools and Examples
● JupyterHub and Python 3 Support
● Spark-Redis Connector
● Theano
● Keras (TensorFlow + Theano Support)
Code
● Spark ML DecisionTree Code Generator (Janino JVM ByteCode Generator)
● Hot-swappable ML Model Watcher (similar to TensorFlow Serving)
● Eigenface-based Image Recommendations
● Streaming Matrix Factorization w/ Kafka
● Netflix Hystrix-based Circuit Breaker - Prediction Service @ Scale
Neural Network Landscape
Interesting Neural Net Use Case
Bonus: CPU vs. GPU
Bonus! GPUs and Branching
TensorFlow Landscape
TensorFlow Core
TensorFlow Distributed
TensorFlow Serving (similar to Prediction.IO, Pipeline.IO)
TensorBoard (Visualize Neural Network Training)
Playground (Toy)
SkFlow (Scikit-Learn + TensorFlow)
Keras (High-level API for both TensorFlow and Theano)
Models (Parsey McParseface/SyntaxNet)
TensorFlow Serving (Model Deployment)
Dynamic Model Loader
Model Versioning & Rollback
Written in C/C++
Extend to Serve any Model!
Demos!!
Spark ML Serving (Model Deployment)
Same thing except for Spark...
Keep an eye on pipeline.io!
Sam Abrahams
TensorFlow Core
Best Practices
Hidden Gems
TensorFlow Core Terminology
gRPC (?) - Should this go here or in distributed?
TensorBoard Overview (?)
Explaining this will go a long way ->
Who dis?
•My name is Sam Abrahams
•Los Angeles based machine learning engineer
•TensorFlow White Paper Notes
•TensorFlow for Raspberry Pi
•Contributor to the TensorFlow project
samjabrahams @sabraha
This Talk: Sprint Through:
• Core TensorFlow API and terminology
• TensorFlow workflow
• Example TensorFlow Code
• TensorBoard
TensorFlow Programming Model
• Very similar to Theano
• The primary user-facing API is Python, though there is a
partial C++ API for executing models
• Computational code is written in C++, implemented for
CPUs, GPUs, or both
• Integrates tightly with NumPy
TensorFlow Programming
Generally boils down to two steps:
1. Build the computational graph(s) that you’d
like to run
2. Use a TensorFlow Session to run your graph
one or more times
Graph
• The primary structure of a TensorFlow model
• Generally there is one graph per program, but
TensorFlow can support multiple Graphs
• Nodes represent computations or data transformations
• Edges represent data transfer or computational control
What is a data flow graph?
• Also known as a “computational graph”, or just “graph”
• Nice way to visualize series of mathematical computations
• Here’s a simple graph showing the addition of two variables:
a ba
a + b
Components of a Graph
Graphs are composed of two types of elements:
• Nodes
• These are the elliptical shapes in the graph, and represent some form of
computation
• Edges
• These are the arrows in the graph, and represent data flowing from one
node into another node.
NODE NODEEDGE
Why Use Graphs?
• They are highly compositional
• Useful for calculating derivatives
• It’s easier to implement distributed computation
• Computation is already segmented nicely
• Neural networks are already implemented as computational graphs!
Tensors
• N-Dimensional Matrices
• 0-Dimensional → Scalar
• 1-Dimensional → Vector
• 2-Dimensional → Matrix
• All data that moves through a TensorFlow graph is a Tensor
• TensorFlow can convert Python native types or NumPy arrays
into Tensor objects
Tensors
•Python
# Vector
> [1, 2, 3]
# 3-D Tensor
> [[ [0,0], [0,1], [0,2] ],
[ [1,0], [1,1], [1,2] ],
[ [2,0], [2,1], [2,2] ]
# Matrix as NumPy Array
> np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
dtype=np.int64)
# Scalar
> 3
# Matrix
> [[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]
•NumPy
Tensors
• Best practice is to use NumPy arrays when directly
defining Tensors
• Can explicitly set data type
• This presentation does not create Tensors with NumPy
• Space is limited
• I am lazy
• Tensors returned by TensorFlow graphs are NumPy
arrays
Operations
• “Op” for short
• Represent any sort of computation
• Take in zero or more Tensors as input, and output zero
or more Tensors
• Numerous uses: perform matrix algebra, initialize
variables, print info to the console, etc.
• Do not run when defined: they must be called from a
TensorFlow Session (coming up later)
Operations
•Quick Example
> import tensorflow as tf
> a = tf.mul(3,5) # Returns handle to new tf.mul node
> sess = tf.Session() # Creates TF Session
> sess.run(a) # Actually runs the Operation
out: 15
Placeholders
• Define “input” nodes
• Specifies information that be provided when graph is run
• Typically used for training data
• Define the tensor shape and data type when created:
import tensorflow as tf
# Create a placeholder of size 100x400
# With 32-bit floating point data type
my_placeholder = tf.placeholder(tf.float32, shape=(100,400))
TensorFlow Session
• In charge of coordinating graph execution
• Most important method is run()
• This is what actually runs the Graph
• It takes in two parameters, ‘fetches’ and ‘feed_dict’
• ‘fetches’ is a list of objects you’d like to get the results for, such
as the final layer in a neural network
• ‘feed_dict’ is a dictionary that maps tensors (often
Placeholders) to values those tensors should use
TensorFlow Variables
• Contain tensor data that persists each time you run your
Graph
• Typically used to hold weights and biases of a machine
learning model
• The final values of the weights and biases, along with
the Graph shape, define a trained model
• Before being run in a Graph, must be initialized (will
discuss this at the Session slide)
TensorFlow Variables
•Old school: Define with tensorflow.Variable()
•Best practices: tf.get_variable()
•Update its information with the assign() method
Import tensorflow as tf
# Create a variable with value 0 and name ‘my_variable’
my_var = tf.get_variable(0, name=‘my_variable’)
# Increment the variable by one
my_var.assign(my_var + 1)
Building the graph!
import tensorflow as tf
# create a placeholder for inputting floating point data
x = tf.placeholder(tf.float32)
# Make a Variable with the starting value of 0
start = tf.Variable(0.0)
# Create a node that is the value of (start + x)
y = start.assign(start + x)
a startx
y = x + start
Running a Session (using previous graph)
• Start a Session with tensorflow.Session()
• Close it when you’re done!
# Open up a TensorFlow Session
# and assign it to the handle 'sess'
sess = tf.Session()
# Important: initialize the Variable
init = tf.initialize_all_variables
sess.run(init)
# Run the graph to get the value of y
# Feed in different values of x each time
print(sess.run(y, feed_dict={x:1})) # Prints 1.0
print(sess.run(y, feed_dict={x:0.5})) # Prints 1.5
print(sess.run(y, feed_dict={x:2.2})) # Prints 3.7
# Close the Session
sess.close()
Devices
• A single device is a CPU, GPU, or other computational
unit (TPU?!)
• One machine can have multiple devices
• “Distributed” → Multiple machines
• Multi-device != Distributed
TensorBoard
• One of the most useful (and underutilized) aspects of
TensorFlow - Takes in serialized information from graph to
visualize data.
• Complete control over what data is stored
Let’s do a brief live demo. Hopefully nothing blows up!
TensorFlow Codebase
Standard and third party C++ libraries (Eigen)
TensorFlow core framework
C++ kernel implementation
Swig
Python client API
TensorFlow Codebase Structure: Core
tensorflow/core : Primary C++ implementations and runtimes
•core/ops – Registration of Operation signatures
•core/kernels – Operation implementations
•core/framework – Fundamental TensorFlow classes and functions
•core/platform – Abstractions for operating system platforms
Based on information from Eugene Brevdo, Googler and TensorFlow member
TensorFlow Codebase Structure: Python
tensorflow/python : Python implementations, wrappers, API
•python/ops – Code that is accessed through the Python API
•python/kernel_tests – Unit tests (which means examples)
•python/framework – TensorFlow fundamental units
•python/platform – Abstracts away system-level Python calls
contrib/ : contributed or non-fully adopted code
Based on information from Eugene Brevdo, Googler and TensorFlow member
Learning TensorFlow: Beyond the Tutorials
● There is a ton of excellent documentation in the code itself-
especially in the C++ implementations
● If you ever want to write a custom Operation or class, you need
to immerse yourself
● Easiest way to dive in: look at your #include statements!
● Use existing code as reference material
● If you see something new, learn something new
Tensor - tensorflow/core/framework/tensor.h
• dtype() - returns data type
• shape() - returns TensorShape representing tensor’s shape
• dims() – returns the number of dimensions in the tensor
• dim_size(int d) - returns size of specified dimension
• NumElements() - returns number of elements
• isSameSize(const Tensor& b) – do these Tensors dimensions match?
• TotalBytes() - estimated memory usage
• CopyFrom() – Copy another tensor and share its memory
• ...plus many more
Some files worth checking out:
tensorflow/core/framework
● tensor_util.h: deep copying, concatenation, splitting
● resource_mgr.h: Resource manager for mutable Operation state
● register_types.h: Useful helper macros for registering Operations
● kernel_def_builder.h: Full documentation for kernel definitions
tensorflow/core/util
● cuda_kernel_helper.h: Helper functions for cuda kernel implementations
● sparse/sparse_tensor.h: Documentation for C++ SparseTensor class
General advice for GPU implementation:
1. If possible, always register for GPU, even if it’s not a full implementation
○ Want to be able to run code on GPU if it won’t bottleneck the system
2. Before writing a custom GPU implementation, sanity check! Will it help?
○ Not everything benefits from parallelization
3. Utilize Eigen!
○ TensorFlow’s core Tensor class is based on Eigen
4. Be careful with memory: you are in charge of mutable data structures
Fabrizio Milo
TensorFlow Distributed
githbu.com/Mistobaan twitter.com/fabmilo
Distributed Tensorflow
gRPC (http://www.grpc.io/)
- A high performance, open source, general RPC framework that puts mobile and HTTP/2
first.
- Protocol Buffers
- HTTP2
HTTP2
HTTP2
gRPC
python.Tensor python.Tensor
tensor.proto
Distributed Tensorflow - Terminology
- Cluster
- Jobs
- Tasks
Distributed Tensorflow - Terminology
- Cluster: ClusterSpec
- Jobs: Parameter Server, Worker
- Tasks: Usually 0, 1, 2, …
Example of a Cluster Spec
{
"ps":[
"8.9.15.20:2222",
],
"workers":[
"8.34.25.90:2222",
"30.21.18.24:2222",
"4.17.19.14:2222"
]
}
One Session One Server One Job One Task
One Session One Server One Job One Task
ServerSession
Use Cases
- Training
- Hyper Parameters Optimization
- Ensembling
Graph Components
tf.python.training.session_manager.SessionManager
1. Checkpointing trained variables as the training
progresses.
2. Initializing variables on startup, restoring them from the
most recent checkpoint after a crash, or wait for
checkpoints to become available.
tf.python.training.supervisor.Supervisor
# Single Program
with tf.Graph().as_default():
...add operations to the graph...
# Create a Supervisor that will checkpoint the model in '/tmp/mydir'.
sv = Supervisor(logdir='/tmp/mydir')
# Get a Tensorflow session managed by the supervisor.
with sv.managed_session(FLAGS.master) as sess:
# Use the session to train the graph.
while not sv.should_stop():
sess.run(<my_train_op>)
tf.python.training.supervisor.Supervisor
# Multiple Replica Program
is_chief = (server_def.task_index == 0)
server = tf.train.Server(server_def)
with tf.Graph().as_default():
...add operations to the graph...
# Create a Supervisor that uses log directory on a shared file system.
# Indicate if you are the 'chief'
sv = Supervisor(logdir='/shared_directory/...', is_chief=is_chief)
# Get a Session in a TensorFlow server on the cluster.
with sv.managed_session(server.target) as sess:
# Use the session to train the graph.
while not sv.should_stop():
sess.run(<my_train_op>)
Use Cases: Asynchronous Training
Use Cases: Synchronous Training
tf.train.SyncReplicasOptimizer
This optimizer avoids stale gradients by collecting gradients from all replicas,
summing them, then applying them to the variables in one shot, after
which replicas can fetch the new variables and continue.
tf.train.replica_device_setter
The device setter will automatically place Variables ops on separate
parameter servers (ps). The non-Variable ops will be placed on the workers.
tf.train.replica_device_setter(cluster_def)
with tf.device(device_setter):
pass
Training  Replication In Graph Between Graphs
Asynchronous
Synchronous
Use Cases: Training
/job:ps /job:worker
In-Graph Replication
/task:0 /task:0
/task:1/task:1
with tf.device("/job:ps/task:0"):
weights_1 = tf.Variable(...)
biases_1 = tf.Variable(...)
with tf.device("/job:ps/task:1"):
weights_2 = tf.Variable(...)
biases_2 = tf.Variable(...)
with tf.device("/job:worker/task:0"):
input, labels = ...
layer_1 = tf.nn.relu(...)
with tf.device("/job:worker/task:0"):
train_op = ...
logits = tf.nn.relu(...)
with tf.Session() as sess:
for _ in range(10000):
sess.run(train_op)
Client
# To build a cluster with two ps jobs on hosts ps0 and ps1, and 3
worker
# jobs on hosts worker0, worker1 and worker2.
cluster_spec = {
"ps": ["ps0:2222", "ps1:2222"],
"worker": ["worker0:2222", "worker1:2222", "worker2:2222"]}
with tf.device(tf.replica_device_setter(cluster=cluster_spec)):
# Build your graph
v1 = tf.Variable(...) # assigned to /job:ps/task:0
v2 = tf.Variable(...) # assigned to /job:ps/task:1
v3 = tf.Variable(...) # assigned to /job:ps/task:0
# Run compute
...
/job:ps /job:worker
Replication Between Graph
/task:0 /task:0
/task:1/task:1
Use Cases: Train HyperParameter Optimization
- Grid Search
- Random Search
- Gradient Optimization
- Bayesian
Ensemble
Ensembling
Model 0
Model 1
Model 2
Ensemble
Comparison with other frameworks
mxnet
Surprise Demo : Rasberry Pi Cluster
Distributed TensorFlow Cluster
Distributed TensorFlow
TensorBoard Visualizations
Workshop Demo - LARGE GCE Cluster
Super-Large Cluster
Attendee GCE (Google Cloud Engine) Spec
50 GB RAM, 100 GB SSD, 8 CPUs (No GPUs)
TODO:
Build Script for Each Attendee to Run as Either a Worker or Parameter Server
Figure out split between Worker and Parameter Server
ImageNet
TODO: Train full 64-bit precision, quantitize down to 8-bit (Sam A)
Workshop Demo - Initial Cluster
8.34.215.90 (Sam)
130.211.128.240 (Fabrizio)
104.197.159.134 (Fregly)
Cluster Spec: “8.34.215.90, 130.211.128.240, 104.197.159.134”
SSH PEM file: http://advancedspark.com/keys/pipeline-training-gce.pem
chmod 600 pipeline-training-gce.pem
Username: pipeline-training
Password: password9
sudo docker images (VERIFY fluxcapacitor/pipeline)
START: sudo docker run -it --privileged --name pipeline --net=host -m 48g fluxcapacitor/pipeline bash
TensorFlow Operations: Hidden Gems
- Go over macros, functions provided in the library, but not mentioned in
documentation
- Brief brief overview of writing custom op
- Testing GPU code
- Leveraging Eigen and existing code
- Python wrapper

More Related Content

What's hot

On heap cache vs off-heap cache
On heap cache vs off-heap cacheOn heap cache vs off-heap cache
On heap cache vs off-heap cache
rgrebski
 
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsScaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Jim Dowling
 
Chainer ui v0.3 and imagereport
Chainer ui v0.3 and imagereportChainer ui v0.3 and imagereport
Chainer ui v0.3 and imagereport
Preferred Networks
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
DataWorks Summit/Hadoop Summit
 
Tensorflow internal
Tensorflow internalTensorflow internal
Tensorflow internal
Hyunghun Cho
 
Profiling PyTorch for Efficiency & Sustainability
Profiling PyTorch for Efficiency & SustainabilityProfiling PyTorch for Efficiency & Sustainability
Profiling PyTorch for Efficiency & Sustainability
geetachauhan
 
Chainer v3
Chainer v3Chainer v3
Chainer v3
Seiya Tokui
 
Apache Storm Tutorial
Apache Storm TutorialApache Storm Tutorial
Apache Storm Tutorial
Farzad Nozarian
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Chris Fregly
 
Storm Anatomy
Storm AnatomyStorm Anatomy
Storm Anatomy
Eiichiro Uchiumi
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time Computation
Sonal Raj
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
PeterAndreasEntschev
 
Tensorflow in Docker
Tensorflow in DockerTensorflow in Docker
Tensorflow in Docker
Eric Ahn
 
Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)
Robert Evans
 
Demystifying DataFrame and Dataset
Demystifying DataFrame and DatasetDemystifying DataFrame and Dataset
Demystifying DataFrame and Dataset
Kazuaki Ishizaki
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
DataWorks Summit/Hadoop Summit
 
Storm
StormStorm
Chainer v4 and v5
Chainer v4 and v5Chainer v4 and v5
Chainer v4 and v5
Preferred Networks
 
Apache Storm Internals
Apache Storm InternalsApache Storm Internals
Apache Storm Internals
Humoyun Ahmedov
 
Stream Processing Frameworks
Stream Processing FrameworksStream Processing Frameworks
Stream Processing Frameworks
SirKetchup
 

What's hot (20)

On heap cache vs off-heap cache
On heap cache vs off-heap cacheOn heap cache vs off-heap cache
On heap cache vs off-heap cache
 
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsScaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
 
Chainer ui v0.3 and imagereport
Chainer ui v0.3 and imagereportChainer ui v0.3 and imagereport
Chainer ui v0.3 and imagereport
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
 
Tensorflow internal
Tensorflow internalTensorflow internal
Tensorflow internal
 
Profiling PyTorch for Efficiency & Sustainability
Profiling PyTorch for Efficiency & SustainabilityProfiling PyTorch for Efficiency & Sustainability
Profiling PyTorch for Efficiency & Sustainability
 
Chainer v3
Chainer v3Chainer v3
Chainer v3
 
Apache Storm Tutorial
Apache Storm TutorialApache Storm Tutorial
Apache Storm Tutorial
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
 
Storm Anatomy
Storm AnatomyStorm Anatomy
Storm Anatomy
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time Computation
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
 
Tensorflow in Docker
Tensorflow in DockerTensorflow in Docker
Tensorflow in Docker
 
Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)
 
Demystifying DataFrame and Dataset
Demystifying DataFrame and DatasetDemystifying DataFrame and Dataset
Demystifying DataFrame and Dataset
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Storm
StormStorm
Storm
 
Chainer v4 and v5
Chainer v4 and v5Chainer v4 and v5
Chainer v4 and v5
 
Apache Storm Internals
Apache Storm InternalsApache Storm Internals
Apache Storm Internals
 
Stream Processing Frameworks
Stream Processing FrameworksStream Processing Frameworks
Stream Processing Frameworks
 

Viewers also liked

Kubernetes on EGO : Bringing enterprise resource management and scheduling to...
Kubernetes on EGO : Bringing enterprise resource management and scheduling to...Kubernetes on EGO : Bringing enterprise resource management and scheduling to...
Kubernetes on EGO : Bringing enterprise resource management and scheduling to...
Yong Feng
 
When HPC meet ML/DL: Manage HPC Data Center with Kubernetes
When HPC meet ML/DL: Manage HPC Data Center with KubernetesWhen HPC meet ML/DL: Manage HPC Data Center with Kubernetes
When HPC meet ML/DL: Manage HPC Data Center with Kubernetes
Yong Feng
 
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
Chris Fregly
 
Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark...
Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark...Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark...
Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark...
Chris Fregly
 
TensorFlow 深度學習快速上手班--電腦視覺應用
TensorFlow 深度學習快速上手班--電腦視覺應用TensorFlow 深度學習快速上手班--電腦視覺應用
TensorFlow 深度學習快速上手班--電腦視覺應用
Mark Chang
 
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Alex Pinto
 
陸永祥/全球網路攝影機帶來的機會與挑戰
陸永祥/全球網路攝影機帶來的機會與挑戰陸永祥/全球網路攝影機帶來的機會與挑戰
陸永祥/全球網路攝影機帶來的機會與挑戰
台灣資料科學年會
 
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Chris Fregly
 
Machine Learning without the Math: An overview of Machine Learning
Machine Learning without the Math: An overview of Machine LearningMachine Learning without the Math: An overview of Machine Learning
Machine Learning without the Math: An overview of Machine Learning
Arshad Ahmed
 
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
Sri Ambati
 
Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and Tensor...
Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and Tensor...Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and Tensor...
Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and Tensor...
Chris Fregly
 
[系列活動] 資料探勘速遊
[系列活動] 資料探勘速遊[系列活動] 資料探勘速遊
[系列活動] 資料探勘速遊
台灣資料科學年會
 
02 math essentials
02 math essentials02 math essentials
02 math essentials
Poongodi Mano
 
Boston Spark Meetup May 24, 2016
Boston Spark Meetup May 24, 2016Boston Spark Meetup May 24, 2016
Boston Spark Meetup May 24, 2016
Chris Fregly
 
Advanced Spark and TensorFlow Meetup 08-04-2016 One Click Spark ML Pipeline D...
Advanced Spark and TensorFlow Meetup 08-04-2016 One Click Spark ML Pipeline D...Advanced Spark and TensorFlow Meetup 08-04-2016 One Click Spark ML Pipeline D...
Advanced Spark and TensorFlow Meetup 08-04-2016 One Click Spark ML Pipeline D...
Chris Fregly
 
高嘉良/Open Innovation as Strategic Plan
高嘉良/Open Innovation as Strategic Plan高嘉良/Open Innovation as Strategic Plan
高嘉良/Open Innovation as Strategic Plan
台灣資料科學年會
 
Deploy Spark ML and Tensorflow AI Models from Notebooks to Microservices - No...
Deploy Spark ML and Tensorflow AI Models from Notebooks to Microservices - No...Deploy Spark ML and Tensorflow AI Models from Notebooks to Microservices - No...
Deploy Spark ML and Tensorflow AI Models from Notebooks to Microservices - No...
Chris Fregly
 
The Genome Assembly Problem
The Genome Assembly ProblemThe Genome Assembly Problem
The Genome Assembly Problem
Mark Chang
 
Machine Learning Essentials (dsth Meetup#3)
Machine Learning Essentials (dsth Meetup#3)Machine Learning Essentials (dsth Meetup#3)
Machine Learning Essentials (dsth Meetup#3)
Data Science Thailand
 
Machine Learning Preliminaries and Math Refresher
Machine Learning Preliminaries and Math RefresherMachine Learning Preliminaries and Math Refresher
Machine Learning Preliminaries and Math Refresherbutest
 

Viewers also liked (20)

Kubernetes on EGO : Bringing enterprise resource management and scheduling to...
Kubernetes on EGO : Bringing enterprise resource management and scheduling to...Kubernetes on EGO : Bringing enterprise resource management and scheduling to...
Kubernetes on EGO : Bringing enterprise resource management and scheduling to...
 
When HPC meet ML/DL: Manage HPC Data Center with Kubernetes
When HPC meet ML/DL: Manage HPC Data Center with KubernetesWhen HPC meet ML/DL: Manage HPC Data Center with Kubernetes
When HPC meet ML/DL: Manage HPC Data Center with Kubernetes
 
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
High Performance Distributed TensorFlow with GPUs - TensorFlow Chicago Meetup...
 
Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark...
Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark...Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark...
Gradient Descent, Back Propagation, and Auto Differentiation - Advanced Spark...
 
TensorFlow 深度學習快速上手班--電腦視覺應用
TensorFlow 深度學習快速上手班--電腦視覺應用TensorFlow 深度學習快速上手班--電腦視覺應用
TensorFlow 深度學習快速上手班--電腦視覺應用
 
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
 
陸永祥/全球網路攝影機帶來的機會與挑戰
陸永祥/全球網路攝影機帶來的機會與挑戰陸永祥/全球網路攝影機帶來的機會與挑戰
陸永祥/全球網路攝影機帶來的機會與挑戰
 
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
 
Machine Learning without the Math: An overview of Machine Learning
Machine Learning without the Math: An overview of Machine LearningMachine Learning without the Math: An overview of Machine Learning
Machine Learning without the Math: An overview of Machine Learning
 
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
 
Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and Tensor...
Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and Tensor...Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and Tensor...
Big Data Spain - Nov 17 2016 - Madrid Continuously Deploy Spark ML and Tensor...
 
[系列活動] 資料探勘速遊
[系列活動] 資料探勘速遊[系列活動] 資料探勘速遊
[系列活動] 資料探勘速遊
 
02 math essentials
02 math essentials02 math essentials
02 math essentials
 
Boston Spark Meetup May 24, 2016
Boston Spark Meetup May 24, 2016Boston Spark Meetup May 24, 2016
Boston Spark Meetup May 24, 2016
 
Advanced Spark and TensorFlow Meetup 08-04-2016 One Click Spark ML Pipeline D...
Advanced Spark and TensorFlow Meetup 08-04-2016 One Click Spark ML Pipeline D...Advanced Spark and TensorFlow Meetup 08-04-2016 One Click Spark ML Pipeline D...
Advanced Spark and TensorFlow Meetup 08-04-2016 One Click Spark ML Pipeline D...
 
高嘉良/Open Innovation as Strategic Plan
高嘉良/Open Innovation as Strategic Plan高嘉良/Open Innovation as Strategic Plan
高嘉良/Open Innovation as Strategic Plan
 
Deploy Spark ML and Tensorflow AI Models from Notebooks to Microservices - No...
Deploy Spark ML and Tensorflow AI Models from Notebooks to Microservices - No...Deploy Spark ML and Tensorflow AI Models from Notebooks to Microservices - No...
Deploy Spark ML and Tensorflow AI Models from Notebooks to Microservices - No...
 
The Genome Assembly Problem
The Genome Assembly ProblemThe Genome Assembly Problem
The Genome Assembly Problem
 
Machine Learning Essentials (dsth Meetup#3)
Machine Learning Essentials (dsth Meetup#3)Machine Learning Essentials (dsth Meetup#3)
Machine Learning Essentials (dsth Meetup#3)
 
Machine Learning Preliminaries and Math Refresher
Machine Learning Preliminaries and Math RefresherMachine Learning Preliminaries and Math Refresher
Machine Learning Preliminaries and Math Refresher
 

Similar to Advanced Spark and TensorFlow Meetup May 26, 2016

Meetup tensorframes
Meetup tensorframesMeetup tensorframes
Meetup tensorframes
Paolo Platter
 
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
Simplilearn
 
Introduction To Using TensorFlow & Deep Learning
Introduction To Using TensorFlow & Deep LearningIntroduction To Using TensorFlow & Deep Learning
Introduction To Using TensorFlow & Deep Learning
ali alemi
 
Deep Learning in your Browser: powered by WebGL
Deep Learning in your Browser: powered by WebGLDeep Learning in your Browser: powered by WebGL
Deep Learning in your Browser: powered by WebGL
Oswald Campesato
 
TensorFlow example for AI Ukraine2016
TensorFlow example  for AI Ukraine2016TensorFlow example  for AI Ukraine2016
TensorFlow example for AI Ukraine2016
Andrii Babii
 
running Tensorflow in Production
running Tensorflow in Productionrunning Tensorflow in Production
running Tensorflow in Production
Matthias Feys
 
Lecture Note DL&NN Tensorflow.pptx
Lecture Note DL&NN Tensorflow.pptxLecture Note DL&NN Tensorflow.pptx
Lecture Note DL&NN Tensorflow.pptx
BhaviniBhatt7
 
Neural Networks with Google TensorFlow
Neural Networks with Google TensorFlowNeural Networks with Google TensorFlow
Neural Networks with Google TensorFlow
Darshan Patel
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Chris Fregly
 
Tensorflow
TensorflowTensorflow
Tensorflow
marwa Ayad Mohamed
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
Databricks
 
Machine Learning - Introduction to Tensorflow
Machine Learning - Introduction to TensorflowMachine Learning - Introduction to Tensorflow
Machine Learning - Introduction to Tensorflow
Andrew Ferlitsch
 
Dive into Deep Learning
Dive into Deep LearningDive into Deep Learning
Dive into Deep Learning
Darío Garigliotti
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLabIntroduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
CloudxLab
 
Tensorflow 101 @ Machine Learning Innovation Summit SF June 6, 2017
Tensorflow 101 @ Machine Learning Innovation Summit SF June 6, 2017Tensorflow 101 @ Machine Learning Innovation Summit SF June 6, 2017
Tensorflow 101 @ Machine Learning Innovation Summit SF June 6, 2017
Ashish Bansal
 
Deep Learning, Scala, and Spark
Deep Learning, Scala, and SparkDeep Learning, Scala, and Spark
Deep Learning, Scala, and Spark
Oswald Campesato
 
Natural language processing open seminar For Tensorflow usage
Natural language processing open seminar For Tensorflow usageNatural language processing open seminar For Tensorflow usage
Natural language processing open seminar For Tensorflow usage
hyunyoung Lee
 
Spark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim Hunter
Spark Summit
 
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
confluent
 
Tensorflow - Intro (2017)
Tensorflow - Intro (2017)Tensorflow - Intro (2017)
Tensorflow - Intro (2017)
Alessio Tonioni
 

Similar to Advanced Spark and TensorFlow Meetup May 26, 2016 (20)

Meetup tensorframes
Meetup tensorframesMeetup tensorframes
Meetup tensorframes
 
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
What is TensorFlow? | Introduction to TensorFlow | TensorFlow Tutorial For Be...
 
Introduction To Using TensorFlow & Deep Learning
Introduction To Using TensorFlow & Deep LearningIntroduction To Using TensorFlow & Deep Learning
Introduction To Using TensorFlow & Deep Learning
 
Deep Learning in your Browser: powered by WebGL
Deep Learning in your Browser: powered by WebGLDeep Learning in your Browser: powered by WebGL
Deep Learning in your Browser: powered by WebGL
 
TensorFlow example for AI Ukraine2016
TensorFlow example  for AI Ukraine2016TensorFlow example  for AI Ukraine2016
TensorFlow example for AI Ukraine2016
 
running Tensorflow in Production
running Tensorflow in Productionrunning Tensorflow in Production
running Tensorflow in Production
 
Lecture Note DL&NN Tensorflow.pptx
Lecture Note DL&NN Tensorflow.pptxLecture Note DL&NN Tensorflow.pptx
Lecture Note DL&NN Tensorflow.pptx
 
Neural Networks with Google TensorFlow
Neural Networks with Google TensorFlowNeural Networks with Google TensorFlow
Neural Networks with Google TensorFlow
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
 
Tensorflow
TensorflowTensorflow
Tensorflow
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
 
Machine Learning - Introduction to Tensorflow
Machine Learning - Introduction to TensorflowMachine Learning - Introduction to Tensorflow
Machine Learning - Introduction to Tensorflow
 
Dive into Deep Learning
Dive into Deep LearningDive into Deep Learning
Dive into Deep Learning
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLabIntroduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
 
Tensorflow 101 @ Machine Learning Innovation Summit SF June 6, 2017
Tensorflow 101 @ Machine Learning Innovation Summit SF June 6, 2017Tensorflow 101 @ Machine Learning Innovation Summit SF June 6, 2017
Tensorflow 101 @ Machine Learning Innovation Summit SF June 6, 2017
 
Deep Learning, Scala, and Spark
Deep Learning, Scala, and SparkDeep Learning, Scala, and Spark
Deep Learning, Scala, and Spark
 
Natural language processing open seminar For Tensorflow usage
Natural language processing open seminar For Tensorflow usageNatural language processing open seminar For Tensorflow usage
Natural language processing open seminar For Tensorflow usage
 
Spark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim Hunter
 
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
Real Time Streaming Data with Kafka and TensorFlow (Yong Tang, MobileIron) Ka...
 
Tensorflow - Intro (2017)
Tensorflow - Intro (2017)Tensorflow - Intro (2017)
Tensorflow - Intro (2017)
 

More from Chris Fregly

AWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and DataAWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and Data
Chris Fregly
 
Pandas on AWS - Let me count the ways.pdf
Pandas on AWS - Let me count the ways.pdfPandas on AWS - Let me count the ways.pdf
Pandas on AWS - Let me count the ways.pdf
Chris Fregly
 
Ray AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Ray AI Runtime (AIR) on AWS - Data Science On AWS MeetupRay AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Ray AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Chris Fregly
 
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds UpdatedSmokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Chris Fregly
 
Amazon reInvent 2020 Recap: AI and Machine Learning
Amazon reInvent 2020 Recap:  AI and Machine LearningAmazon reInvent 2020 Recap:  AI and Machine Learning
Amazon reInvent 2020 Recap: AI and Machine Learning
Chris Fregly
 
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
Chris Fregly
 
Quantum Computing with Amazon Braket
Quantum Computing with Amazon BraketQuantum Computing with Amazon Braket
Quantum Computing with Amazon Braket
Chris Fregly
 
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
Chris Fregly
 
AWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:CapAWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:Cap
Chris Fregly
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
Chris Fregly
 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Chris Fregly
 
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
Chris Fregly
 
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
Chris Fregly
 
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Chris Fregly
 
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
Chris Fregly
 
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Chris Fregly
 
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
Chris Fregly
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
Chris Fregly
 
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
Chris Fregly
 
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
Chris Fregly
 

More from Chris Fregly (20)

AWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and DataAWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and Data
 
Pandas on AWS - Let me count the ways.pdf
Pandas on AWS - Let me count the ways.pdfPandas on AWS - Let me count the ways.pdf
Pandas on AWS - Let me count the ways.pdf
 
Ray AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Ray AI Runtime (AIR) on AWS - Data Science On AWS MeetupRay AI Runtime (AIR) on AWS - Data Science On AWS Meetup
Ray AI Runtime (AIR) on AWS - Data Science On AWS Meetup
 
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds UpdatedSmokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
Smokey and the Multi-Armed Bandit Featuring BERT Reynolds Updated
 
Amazon reInvent 2020 Recap: AI and Machine Learning
Amazon reInvent 2020 Recap:  AI and Machine LearningAmazon reInvent 2020 Recap:  AI and Machine Learning
Amazon reInvent 2020 Recap: AI and Machine Learning
 
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...Waking the Data Scientist at 2am:  Detect Model Degradation on Production Mod...
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod...
 
Quantum Computing with Amazon Braket
Quantum Computing with Amazon BraketQuantum Computing with Amazon Braket
Quantum Computing with Amazon Braket
 
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
 
AWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:CapAWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:Cap
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
 
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
 
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
 
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
 
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
 
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
 
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
 
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
 
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
 

Recently uploaded

Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Anthony Dahanne
 
Designing for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesDesigning for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web Services
KrzysztofKkol1
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
Visitor Management System in India- Vizman.app
Visitor Management System in India- Vizman.appVisitor Management System in India- Vizman.app
Visitor Management System in India- Vizman.app
NaapbooksPrivateLimi
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
kalichargn70th171
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
varshanayak241
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 

Recently uploaded (20)

Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
 
Designing for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesDesigning for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web Services
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Visitor Management System in India- Vizman.app
Visitor Management System in India- Vizman.appVisitor Management System in India- Vizman.app
Visitor Management System in India- Vizman.app
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 

Advanced Spark and TensorFlow Meetup May 26, 2016

  • 2. Meetup Agenda Meetup Updates (Chris F) Technology Updates (Chris F) Spark + TensorFlow Model Serving/Deployment (Chris F) Neural Net and TensorFlow Landscape (Chris F) TensorFlow Core, Best Practices, Hidden Gems (Sam A) TensorFlow Distributed (Fabrizio M)
  • 3. Meetup Updates New Sponsor Meetup Metrics Co-presenters Sam Abrahams “LA Kid (from DC) Fabrizio Milo “Italian Stallion” Chris Fregly “Chicago Fire”
  • 4. advancedspark.com Workshop - June 4th - Spark ML + TensorFlow
  • 5. Chris Fregly Technology Updates github.com/fluxcapacitor/pipeline Neural Network Tool Landscape TensorFlow Tool Landscape TensorFlow Serving Spark ML Serving pipeline.io (Shh… Stealth)
  • 6. Technology Updates... Spark 2.0 Kafka 0.10.0 + Confluent 3.0 CUDA 8.0 + cuDNN v5
  • 7. Whole-Stage Code Gen ● SPARK-12795 ● Physically Fuse Together Operators (within a Stage) into 1 Operation ● Avoids Excessive Virtual Function Calls ● Utilize CPU Registers vs. L1, L2, Main Memory ● Loop Unrolling and SIMD Code Generation ● Speeds up CPU-bound workloads (not I/O-bound) Vectorization (SPARK-12992) ● Operate on Batches of Data ● Reduces Virtual Function Calls ● Fallback if Whole-Stage Not an Option Spark 2.0: Core
  • 8. Save/Load Support for All Models and Pipelines! ● Python and Scala ● Saved as Parquet ONLY Local Linear Algebra Library ● SPARK-13944, SPARK-14615 ● Drop-in Replacement for Distributed Linear Algebra library ● Opens up door to my new pipeline.io prediction layer! Spark 2.0: ML
  • 9. Kafka v0.10 + Confluent v3.0 Platform Kafka Streams New Event-Creation Timestamp Rack Awareness (THX NFLX!) Kerberos + SASL Improvements Ability to Pause a Connector (ie. Maintenance) New max.poll.records to limit messages retrieved
  • 10. CUDA Deep Neural Network (cuDNN) v5 Updates LSTM (Long Short-Term Memory) RNN Support for NLP use cases (6x speedup) Optimized for NVIDIA Pascal GPU Architecture including FP16 (low precision) Highly-optimized networks with 3x3 convolutions (GoogleNet)
  • 11. github.com/fluxcapacitor/pipeline Updates Tools and Examples ● JupyterHub and Python 3 Support ● Spark-Redis Connector ● Theano ● Keras (TensorFlow + Theano Support) Code ● Spark ML DecisionTree Code Generator (Janino JVM ByteCode Generator) ● Hot-swappable ML Model Watcher (similar to TensorFlow Serving) ● Eigenface-based Image Recommendations ● Streaming Matrix Factorization w/ Kafka ● Netflix Hystrix-based Circuit Breaker - Prediction Service @ Scale
  • 15. Bonus! GPUs and Branching
  • 16. TensorFlow Landscape TensorFlow Core TensorFlow Distributed TensorFlow Serving (similar to Prediction.IO, Pipeline.IO) TensorBoard (Visualize Neural Network Training) Playground (Toy) SkFlow (Scikit-Learn + TensorFlow) Keras (High-level API for both TensorFlow and Theano) Models (Parsey McParseface/SyntaxNet)
  • 17. TensorFlow Serving (Model Deployment) Dynamic Model Loader Model Versioning & Rollback Written in C/C++ Extend to Serve any Model! Demos!!
  • 18. Spark ML Serving (Model Deployment) Same thing except for Spark... Keep an eye on pipeline.io!
  • 19. Sam Abrahams TensorFlow Core Best Practices Hidden Gems
  • 20. TensorFlow Core Terminology gRPC (?) - Should this go here or in distributed? TensorBoard Overview (?) Explaining this will go a long way ->
  • 21. Who dis? •My name is Sam Abrahams •Los Angeles based machine learning engineer •TensorFlow White Paper Notes •TensorFlow for Raspberry Pi •Contributor to the TensorFlow project samjabrahams @sabraha
  • 22. This Talk: Sprint Through: • Core TensorFlow API and terminology • TensorFlow workflow • Example TensorFlow Code • TensorBoard
  • 23. TensorFlow Programming Model • Very similar to Theano • The primary user-facing API is Python, though there is a partial C++ API for executing models • Computational code is written in C++, implemented for CPUs, GPUs, or both • Integrates tightly with NumPy
  • 24. TensorFlow Programming Generally boils down to two steps: 1. Build the computational graph(s) that you’d like to run 2. Use a TensorFlow Session to run your graph one or more times
  • 25. Graph • The primary structure of a TensorFlow model • Generally there is one graph per program, but TensorFlow can support multiple Graphs • Nodes represent computations or data transformations • Edges represent data transfer or computational control
  • 26. What is a data flow graph? • Also known as a “computational graph”, or just “graph” • Nice way to visualize series of mathematical computations • Here’s a simple graph showing the addition of two variables: a ba a + b
  • 27. Components of a Graph Graphs are composed of two types of elements: • Nodes • These are the elliptical shapes in the graph, and represent some form of computation • Edges • These are the arrows in the graph, and represent data flowing from one node into another node. NODE NODEEDGE
  • 28. Why Use Graphs? • They are highly compositional • Useful for calculating derivatives • It’s easier to implement distributed computation • Computation is already segmented nicely • Neural networks are already implemented as computational graphs!
  • 29. Tensors • N-Dimensional Matrices • 0-Dimensional → Scalar • 1-Dimensional → Vector • 2-Dimensional → Matrix • All data that moves through a TensorFlow graph is a Tensor • TensorFlow can convert Python native types or NumPy arrays into Tensor objects
  • 30. Tensors •Python # Vector > [1, 2, 3] # 3-D Tensor > [[ [0,0], [0,1], [0,2] ], [ [1,0], [1,1], [1,2] ], [ [2,0], [2,1], [2,2] ] # Matrix as NumPy Array > np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int64) # Scalar > 3 # Matrix > [[1, 2, 3], [4, 5, 6], [7, 8, 9]] •NumPy
  • 31. Tensors • Best practice is to use NumPy arrays when directly defining Tensors • Can explicitly set data type • This presentation does not create Tensors with NumPy • Space is limited • I am lazy • Tensors returned by TensorFlow graphs are NumPy arrays
  • 32. Operations • “Op” for short • Represent any sort of computation • Take in zero or more Tensors as input, and output zero or more Tensors • Numerous uses: perform matrix algebra, initialize variables, print info to the console, etc. • Do not run when defined: they must be called from a TensorFlow Session (coming up later)
  • 33. Operations •Quick Example > import tensorflow as tf > a = tf.mul(3,5) # Returns handle to new tf.mul node > sess = tf.Session() # Creates TF Session > sess.run(a) # Actually runs the Operation out: 15
  • 34. Placeholders • Define “input” nodes • Specifies information that be provided when graph is run • Typically used for training data • Define the tensor shape and data type when created: import tensorflow as tf # Create a placeholder of size 100x400 # With 32-bit floating point data type my_placeholder = tf.placeholder(tf.float32, shape=(100,400))
  • 35. TensorFlow Session • In charge of coordinating graph execution • Most important method is run() • This is what actually runs the Graph • It takes in two parameters, ‘fetches’ and ‘feed_dict’ • ‘fetches’ is a list of objects you’d like to get the results for, such as the final layer in a neural network • ‘feed_dict’ is a dictionary that maps tensors (often Placeholders) to values those tensors should use
  • 36. TensorFlow Variables • Contain tensor data that persists each time you run your Graph • Typically used to hold weights and biases of a machine learning model • The final values of the weights and biases, along with the Graph shape, define a trained model • Before being run in a Graph, must be initialized (will discuss this at the Session slide)
  • 37. TensorFlow Variables •Old school: Define with tensorflow.Variable() •Best practices: tf.get_variable() •Update its information with the assign() method Import tensorflow as tf # Create a variable with value 0 and name ‘my_variable’ my_var = tf.get_variable(0, name=‘my_variable’) # Increment the variable by one my_var.assign(my_var + 1)
  • 38. Building the graph! import tensorflow as tf # create a placeholder for inputting floating point data x = tf.placeholder(tf.float32) # Make a Variable with the starting value of 0 start = tf.Variable(0.0) # Create a node that is the value of (start + x) y = start.assign(start + x) a startx y = x + start
  • 39. Running a Session (using previous graph) • Start a Session with tensorflow.Session() • Close it when you’re done! # Open up a TensorFlow Session # and assign it to the handle 'sess' sess = tf.Session() # Important: initialize the Variable init = tf.initialize_all_variables sess.run(init) # Run the graph to get the value of y # Feed in different values of x each time print(sess.run(y, feed_dict={x:1})) # Prints 1.0 print(sess.run(y, feed_dict={x:0.5})) # Prints 1.5 print(sess.run(y, feed_dict={x:2.2})) # Prints 3.7 # Close the Session sess.close()
  • 40. Devices • A single device is a CPU, GPU, or other computational unit (TPU?!) • One machine can have multiple devices • “Distributed” → Multiple machines • Multi-device != Distributed
  • 41. TensorBoard • One of the most useful (and underutilized) aspects of TensorFlow - Takes in serialized information from graph to visualize data. • Complete control over what data is stored Let’s do a brief live demo. Hopefully nothing blows up!
  • 42. TensorFlow Codebase Standard and third party C++ libraries (Eigen) TensorFlow core framework C++ kernel implementation Swig Python client API
  • 43. TensorFlow Codebase Structure: Core tensorflow/core : Primary C++ implementations and runtimes •core/ops – Registration of Operation signatures •core/kernels – Operation implementations •core/framework – Fundamental TensorFlow classes and functions •core/platform – Abstractions for operating system platforms Based on information from Eugene Brevdo, Googler and TensorFlow member
  • 44. TensorFlow Codebase Structure: Python tensorflow/python : Python implementations, wrappers, API •python/ops – Code that is accessed through the Python API •python/kernel_tests – Unit tests (which means examples) •python/framework – TensorFlow fundamental units •python/platform – Abstracts away system-level Python calls contrib/ : contributed or non-fully adopted code Based on information from Eugene Brevdo, Googler and TensorFlow member
  • 45. Learning TensorFlow: Beyond the Tutorials ● There is a ton of excellent documentation in the code itself- especially in the C++ implementations ● If you ever want to write a custom Operation or class, you need to immerse yourself ● Easiest way to dive in: look at your #include statements! ● Use existing code as reference material ● If you see something new, learn something new
  • 46. Tensor - tensorflow/core/framework/tensor.h • dtype() - returns data type • shape() - returns TensorShape representing tensor’s shape • dims() – returns the number of dimensions in the tensor • dim_size(int d) - returns size of specified dimension • NumElements() - returns number of elements • isSameSize(const Tensor& b) – do these Tensors dimensions match? • TotalBytes() - estimated memory usage • CopyFrom() – Copy another tensor and share its memory • ...plus many more
  • 47. Some files worth checking out: tensorflow/core/framework ● tensor_util.h: deep copying, concatenation, splitting ● resource_mgr.h: Resource manager for mutable Operation state ● register_types.h: Useful helper macros for registering Operations ● kernel_def_builder.h: Full documentation for kernel definitions tensorflow/core/util ● cuda_kernel_helper.h: Helper functions for cuda kernel implementations ● sparse/sparse_tensor.h: Documentation for C++ SparseTensor class
  • 48. General advice for GPU implementation: 1. If possible, always register for GPU, even if it’s not a full implementation ○ Want to be able to run code on GPU if it won’t bottleneck the system 2. Before writing a custom GPU implementation, sanity check! Will it help? ○ Not everything benefits from parallelization 3. Utilize Eigen! ○ TensorFlow’s core Tensor class is based on Eigen 4. Be careful with memory: you are in charge of mutable data structures
  • 51. gRPC (http://www.grpc.io/) - A high performance, open source, general RPC framework that puts mobile and HTTP/2 first. - Protocol Buffers - HTTP2 HTTP2 HTTP2
  • 53. Distributed Tensorflow - Terminology - Cluster - Jobs - Tasks
  • 54. Distributed Tensorflow - Terminology - Cluster: ClusterSpec - Jobs: Parameter Server, Worker - Tasks: Usually 0, 1, 2, …
  • 55. Example of a Cluster Spec { "ps":[ "8.9.15.20:2222", ], "workers":[ "8.34.25.90:2222", "30.21.18.24:2222", "4.17.19.14:2222" ] }
  • 56. One Session One Server One Job One Task
  • 57. One Session One Server One Job One Task ServerSession
  • 58. Use Cases - Training - Hyper Parameters Optimization - Ensembling
  • 60. tf.python.training.session_manager.SessionManager 1. Checkpointing trained variables as the training progresses. 2. Initializing variables on startup, restoring them from the most recent checkpoint after a crash, or wait for checkpoints to become available.
  • 61. tf.python.training.supervisor.Supervisor # Single Program with tf.Graph().as_default(): ...add operations to the graph... # Create a Supervisor that will checkpoint the model in '/tmp/mydir'. sv = Supervisor(logdir='/tmp/mydir') # Get a Tensorflow session managed by the supervisor. with sv.managed_session(FLAGS.master) as sess: # Use the session to train the graph. while not sv.should_stop(): sess.run(<my_train_op>)
  • 62. tf.python.training.supervisor.Supervisor # Multiple Replica Program is_chief = (server_def.task_index == 0) server = tf.train.Server(server_def) with tf.Graph().as_default(): ...add operations to the graph... # Create a Supervisor that uses log directory on a shared file system. # Indicate if you are the 'chief' sv = Supervisor(logdir='/shared_directory/...', is_chief=is_chief) # Get a Session in a TensorFlow server on the cluster. with sv.managed_session(server.target) as sess: # Use the session to train the graph. while not sv.should_stop(): sess.run(<my_train_op>)
  • 64. Use Cases: Synchronous Training tf.train.SyncReplicasOptimizer This optimizer avoids stale gradients by collecting gradients from all replicas, summing them, then applying them to the variables in one shot, after which replicas can fetch the new variables and continue.
  • 65. tf.train.replica_device_setter The device setter will automatically place Variables ops on separate parameter servers (ps). The non-Variable ops will be placed on the workers. tf.train.replica_device_setter(cluster_def) with tf.device(device_setter): pass
  • 66. Training Replication In Graph Between Graphs Asynchronous Synchronous Use Cases: Training
  • 67. /job:ps /job:worker In-Graph Replication /task:0 /task:0 /task:1/task:1 with tf.device("/job:ps/task:0"): weights_1 = tf.Variable(...) biases_1 = tf.Variable(...) with tf.device("/job:ps/task:1"): weights_2 = tf.Variable(...) biases_2 = tf.Variable(...) with tf.device("/job:worker/task:0"): input, labels = ... layer_1 = tf.nn.relu(...) with tf.device("/job:worker/task:0"): train_op = ... logits = tf.nn.relu(...) with tf.Session() as sess: for _ in range(10000): sess.run(train_op) Client
  • 68. # To build a cluster with two ps jobs on hosts ps0 and ps1, and 3 worker # jobs on hosts worker0, worker1 and worker2. cluster_spec = { "ps": ["ps0:2222", "ps1:2222"], "worker": ["worker0:2222", "worker1:2222", "worker2:2222"]} with tf.device(tf.replica_device_setter(cluster=cluster_spec)): # Build your graph v1 = tf.Variable(...) # assigned to /job:ps/task:0 v2 = tf.Variable(...) # assigned to /job:ps/task:1 v3 = tf.Variable(...) # assigned to /job:ps/task:0 # Run compute ... /job:ps /job:worker Replication Between Graph /task:0 /task:0 /task:1/task:1
  • 69. Use Cases: Train HyperParameter Optimization - Grid Search - Random Search - Gradient Optimization - Bayesian
  • 71. Comparison with other frameworks mxnet
  • 72. Surprise Demo : Rasberry Pi Cluster Distributed TensorFlow Cluster Distributed TensorFlow TensorBoard Visualizations
  • 73. Workshop Demo - LARGE GCE Cluster Super-Large Cluster Attendee GCE (Google Cloud Engine) Spec 50 GB RAM, 100 GB SSD, 8 CPUs (No GPUs) TODO: Build Script for Each Attendee to Run as Either a Worker or Parameter Server Figure out split between Worker and Parameter Server ImageNet TODO: Train full 64-bit precision, quantitize down to 8-bit (Sam A)
  • 74. Workshop Demo - Initial Cluster 8.34.215.90 (Sam) 130.211.128.240 (Fabrizio) 104.197.159.134 (Fregly) Cluster Spec: “8.34.215.90, 130.211.128.240, 104.197.159.134” SSH PEM file: http://advancedspark.com/keys/pipeline-training-gce.pem chmod 600 pipeline-training-gce.pem Username: pipeline-training Password: password9 sudo docker images (VERIFY fluxcapacitor/pipeline) START: sudo docker run -it --privileged --name pipeline --net=host -m 48g fluxcapacitor/pipeline bash
  • 75. TensorFlow Operations: Hidden Gems - Go over macros, functions provided in the library, but not mentioned in documentation - Brief brief overview of writing custom op - Testing GPU code - Leveraging Eigen and existing code - Python wrapper