The document discusses machine learning and TensorFlow. It provides three ways to get started with machine learning of varying complexity: using cloud APIs, retraining existing models, or developing new models. It then discusses Google Cloud machine learning APIs for vision, natural language, speech and translation. It provides examples of using the vision API and mobile vision API. It introduces TensorFlow as an open source machine learning library for research and production with Python and C++ frontends.
2. How Can You Get Started with Machine Learning?
Three ways, with varying complexity:
(1) Use a Cloud-based or Mobile API (Vision, Natural Language, etc.)
(2) Use an existing model architecture, and retrain it or fine-tune it on your dataset
(3) Develop your own machine learning models for new problems
Going from (1) to (3): more flexible, but more effort required
#DevFestGRX
GDG Granada
6. Cloud Vision API
Faces: Faces, facial landmarks, emotions
OCR: Read and extract text, with support for > 10 languages
Label: Detect entities from furniture to transportation
Logos: Identify product logos
Landmarks & Image Properties: Detect landmarks & dominant color of image
Safe Search: Detect explicit content - adult, violent, medical and spoof
7. API Usage: Detect Objects in an Image
Image → Vision API → Detected Items
1. Create JSON request with the image or pointer to an image
2. Call the REST API
3. Process the JSON response
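Steps 1 and 3 can be sketched in Python. This is a minimal sketch, not official client code: it assumes the v1 REST endpoint and the LABEL_DETECTION feature type, and the helper names, the image bytes and the sample response are hypothetical, made up only for illustration. The HTTP call itself (step 2) is left out so the block runs offline.

```python
import base64
import json

# Assumed v1 REST endpoint; a real call also needs an API key or OAuth token.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes, max_results=5):
    """Step 1: build the JSON request body with the image inlined as base64."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
            }
        ]
    }


def parse_labels(response_body):
    """Step 3: pull (description, score) pairs out of the JSON response."""
    labels = []
    for result in response_body.get("responses", []):
        for annotation in result.get("labelAnnotations", []):
            labels.append((annotation["description"], annotation["score"]))
    return labels


# Placeholder bytes standing in for a real image file.
request_body = build_label_request(b"not-a-real-image")
request_json = json.dumps(request_body)

# Step 2 would POST request_json to VISION_ENDPOINT; omitted here.
# Hypothetical response, shaped like a label-detection reply, for illustration only.
sample_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "cat", "score": 0.98},
                {"description": "pet", "score": 0.91},
            ]
        }
    ]
}
labels = parse_labels(sample_response)
print(labels)
```

To send a pointer instead of the bytes, the image field would carry a source entry (for example a Cloud Storage URI) in place of content.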
9. Cloud Natural Language API
Syntax Analysis: Extract sentences, identify parts of speech and create dependency parse trees for each sentence.
Entity Recognition: Identify entities and label by types such as person, organization, location, events, products and media.
Sentiment Analysis: Understand the overall sentiment of a block of text.
11. Cloud Speech API
Automatic Speech Recognition: Automatic Speech Recognition (ASR) powered by deep learning neural networks, for applications like voice search or speech transcription.
Global Vocabulary: Recognizes over 80 languages and variants with an extensive vocabulary.
Streaming Recognition: Returns partial recognition results immediately, as they become available.
Inappropriate Content Filtering: Filter inappropriate content in text results.
Real-time or Buffered Audio Support: Audio input can be captured by an application’s microphone or sent from a pre-recorded audio file. Multiple audio file formats are supported, including FLAC, AMR, PCMU and linear-16.
Noisy Audio Handling: Handles noisy audio from many environments without requiring additional noise cancellation.
Integrated API: Audio files can be uploaded in the request and, in future releases, integrated with Google Cloud Storage.
13. Mobile Vision API
Face API: faces, facial landmarks, eyes open, smiling
Barcode API: 1D and 2D barcodes
Text API: Latin-based text / structure
Common: Support for fast image and video on-device detection and tracking.
14. Googly Eyes Android App
Video credit: Google
1. Create a face detector for facial landmarks (e.g., eyes)
    // The builder needs an Android Context
    FaceDetector detector = new FaceDetector.Builder(context)
        .setLandmarkType(FaceDetector.ALL_LANDMARKS)
        .build();
2. Detect faces in the image
    // image is a Frame wrapping the bitmap or camera preview
    SparseArray<Face> faces = detector.detect(image);
3. For each face, draw the eyes
    for (int i = 0; i < faces.size(); ++i) {
        Face face = faces.valueAt(i);
        for (Landmark landmark : face.getLandmarks()) {
            // Draw eyes
        }
    }
21. TensorFlow
● Open source Machine Learning library
● Especially useful for Deep Learning
● For research and production
● Apache 2.0 license
22. Architecture
● Core in C++
● Different front ends
○ Python and C++ today, community may add more
Diagram, top to bottom: front ends (C++ front end, Python front end, ...), Core TensorFlow Execution System, devices (CPU, GPU, Android, iOS, ...)
25. What is TensorFlow?
From the whitepaper: “TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms.”
In short: TensorFlow is Theano++.
● Symbolic ML dataflow framework that compiles to native / GPU code
26. Data Flow Graphs
Computation is defined as a directed acyclic graph (DAG) to optimize an objective function
● Graph is defined in high-level language (Python)
● Graph is compiled and optimized
● Graph is executed (in parts or fully) on available low-level devices (CPU, GPU)
● Data (tensors) flow through the graph
● TensorFlow can compute gradients automatically
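How gradients can fall out of a dataflow graph automatically can be shown with a toy example. The sketch below is plain Python, not TensorFlow: the Node class and the add, mul and backward helpers are hypothetical names invented for illustration, and it assumes scalar values and only two operations. Each node records its inputs together with the local derivative, and a backward pass over the DAG applies the chain rule.

```python
class Node:
    """One node of a toy dataflow graph (hypothetical, not a TensorFlow class)."""

    def __init__(self, value, parents=()):
        self.value = value      # result of the forward computation
        self.parents = parents  # pairs of (input node, local derivative)
        self.grad = 0.0         # d(output)/d(this node), filled in by backward()


def add(a, b):
    return Node(a.value + b.value, ((a, 1.0), (b, 1.0)))


def mul(a, b):
    return Node(a.value * b.value, ((a, b.value), (b, a.value)))


def backward(output):
    """Propagate gradients from the output back to every input (chain rule)."""
    order, seen = [], set()

    def visit(node):
        # Post-order walk: a node is appended only after all of its inputs.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed order handles every node before the inputs it depends on.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


# Objective: loss = (w * x + b)^2 with x = 3, w = 2, b = 1.
x, w, b = Node(3.0), Node(2.0), Node(1.0)
pred = add(mul(w, x), b)
loss = mul(pred, pred)
backward(loss)
print(loss.value, w.grad, b.grad)  # 49.0 42.0 14.0
```

The printed gradients match the hand-derived values 2 * pred * x = 42 and 2 * pred = 14.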
28. Graph?
Variables are 0-ary stateful nodes which output their current value.
(State is retained across multiple executions of a graph.)
(parameters, gradient stores, eligibility traces, …)
29. Graph?
Placeholders are 0-ary nodes whose value is fed in at execution time.
(inputs, variable learning rates, …)
30. Graph?
Mathematical operations:
MatMul: Multiply two matrix values.
Add: Add elementwise (with broadcasting).
ReLU: Activate with elementwise rectified linear function.
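What these three operations compute can be seen on a tiny example. The following is a plain-Python sketch on nested lists, not TensorFlow code: the function names are hypothetical, and broadcasting is simplified to the single case of adding one bias vector to every row of a matrix.

```python
def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix, both given as lists of rows."""
    return [
        [sum(a_ik * b_kj for a_ik, b_kj in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def add_bias(matrix, bias):
    """Add a length-n vector to every row of an (m x n) matrix (row broadcasting)."""
    return [[value + bias_j for value, bias_j in zip(row, bias)] for row in matrix]


def relu(matrix):
    """Elementwise rectified linear function: max(0, value)."""
    return [[max(0.0, value) for value in row] for row in matrix]


x = [[1.0, -2.0], [0.5, 3.0]]             # 2 examples, 2 features
W = [[1.0, 0.0, -1.0], [2.0, 1.0, 0.0]]   # 2 features -> 3 hidden units
b = [4.0, 0.0, 0.0]                       # one bias per hidden unit

h = relu(add_bias(matmul(x, W), b))
print(h)  # [[1.0, 0.0, 0.0], [10.5, 3.0, 0.0]]
```

Negative pre-activations are clipped to zero, which is why most entries of the first row vanish.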
31. In code, please!
    import tensorflow as tf
1. Create model weights, including initialization
a. W ~ Uniform(-1, 1); b = 0
    b = tf.Variable(tf.zeros((100,)))
    W = tf.Variable(tf.random_uniform((784, 100), -1, 1))
2. Create input placeholder x
a. m * 784 input matrix
    x = tf.placeholder(tf.float32, (None, 784))
3. Create computation graph
    h_i = tf.nn.relu(tf.matmul(x, W) + b)
32. How do we run it?
So far we have defined a graph.
We can deploy this graph with a session: a binding to a particular execution context (e.g. CPU, GPU)
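33. Getting output
sess.run(fetches, feeds)
Fetches: List of graph nodes. Return the outputs of these nodes.
Feeds: Dictionary mapping from graph nodes to concrete values. Specifies the value of each graph node given in the dictionary.
    import numpy as np
    import tensorflow as tf
    # 1. Create model weights
    b = tf.Variable(tf.zeros((100,)))
    W = tf.Variable(tf.random_uniform((784, 100), -1, 1))
    # 2. Create input placeholder
    x = tf.placeholder(tf.float32, (None, 784))
    # 3. Create computation graph
    h_i = tf.nn.relu(tf.matmul(x, W) + b)
    # Bind the graph to a session and initialize the variables
    # (later 1.x releases name this tf.global_variables_initializer())
    sess = tf.Session()
    sess.run(tf.initialize_all_variables())
    # Fetch h_i while feeding x; np.random.random takes the shape as one tuple
    sess.run(h_i, {x: np.random.random((64, 784))})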
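The fetch and feed semantics of sess.run can be mimicked with a toy evaluator. The sketch below is plain Python and deliberately not TensorFlow: Placeholder, Variable, Op and run are hypothetical stand-ins that assume scalar values and skip compilation and device placement entirely. It only shows that fetches name the nodes to compute, feeds supply concrete values, and variable state persists between runs.

```python
class Placeholder:
    """0-ary node whose value must be supplied in the feed dictionary."""

    def evaluate(self, feeds):
        return feeds[self]


class Variable:
    """0-ary stateful node; its value persists between run() calls."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, feeds):
        # A fed value overrides the stored state for this run only.
        return feeds.get(self, self.value)


class Op:
    """Node computed from the values of its input nodes."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def evaluate(self, feeds):
        if self in feeds:
            return feeds[self]
        return self.fn(*(node.evaluate(feeds) for node in self.inputs))


def run(fetches, feeds=None):
    """Return the outputs of the fetched nodes, given the fed values."""
    feeds = feeds or {}
    return [node.evaluate(feeds) for node in fetches]


x = Placeholder()
w = Variable(2.0)
b = Variable(0.5)
product = Op(lambda a, c: a * c, x, w)
y = Op(lambda a, c: a + c, product, b)

first = run([y, product], {x: 3.0})
w.value = 4.0                      # update the variable's state
second = run([y], {x: 3.0})        # the new state is used on the next run
print(first, second)  # [6.5, 6.0] [12.5]
```

Fetching y without feeding x raises a KeyError here, which loosely mirrors the requirement that every placeholder a fetch depends on must be fed.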
34. Basic flow
1. Build a graph
a. Graph contains parameter specifications, model architecture, optimization process, …
2. Initialize a session
3. Fetch and feed data with Session.run
a. Compilation, optimization, etc. happen at this step; you probably won’t notice
36. What's Next
tensorflow.org
github.com/tensorflow
Want to learn more?
Udacity class on Deep Learning, goo.gl/iHssII
Guides, codelabs, videos:
MNIST for Beginners, goo.gl/tx8R2b
TF Learn Quickstart, goo.gl/uiefRn
TensorFlow for Poets, goo.gl/bVjFIL
ML Recipes, goo.gl/KewA03
TensorFlow and Deep Learning without a PhD, goo.gl/pHeXe7