Introduction to tf.data (TF2)
H2O Meetup
Galvanize San Francisco
02/19/2020
Oswald Campesato
ocampesato@yahoo.com
Highlights/Overview
What is tf.data?
Working with TF 2 tf.data.Dataset
Intermediate operators
Terminal operators
filter() and map()
zip() and batch()
Working with TF 2 generators
tf.data: TF Input Pipeline
 An input pipeline is useful:
 for streaming data
 when data is too big to fit in memory
 When data requires preprocessing
 When you need to shuffle large data
 Can be scaled to multiple hosts
 => ETL functionality
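For example, a minimal ETL-style pipeline might look like the sketch below (illustrative only; it uses an in-memory NumPy array instead of a real data source):
import tensorflow as tf
import numpy as np
x = np.arange(100)                          # Extract: in-memory data
ds = tf.data.Dataset.from_tensor_slices(x)
ds = ds.shuffle(buffer_size=100)            # Transform: shuffle ...
ds = ds.map(lambda v: v * 2)                # ... and preprocess
ds = ds.batch(16).prefetch(1)               # Load: batch for a model
for batch in ds.take(1):
  print(batch)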
What are tf.data.Datasets
 Simple example:
 1) define a Numpy array of numbers
 2) create a TF Dataset ds
 3) iterate through the dataset ds
What are TF Datasets
 import tensorflow as tf # tf-dataset1.py
 import numpy as np
 x = np.array([1,2,3,4,5])
 ds = tf.data.Dataset.from_tensor_slices(x)
 # iterate through the elements:
for value in ds.take(len(x)):
  print(value)
What are Lambda Expressions
 a lambda expression is an anonymous function
 use lambda expressions to define local functions
 pass lambda expressions as arguments
 return them as the value of function calls
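A quick plain-Python illustration (not from the slides):
add_one = lambda x: x + 1                     # anonymous function bound to a name
print(add_one(3))                             # 4
print(list(map(lambda x: x * 2, [1, 2, 3])))  # lambda passed as an argument: [2, 4, 6]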
Some tf.data “lazy operators”
map()
filter()
flat_map()
batch()
take()
zip()
flatten()
=> Combined via “method chaining”
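A small method-chaining sketch combining several of these operators (illustrative):
import tensorflow as tf
ds = (tf.data.Dataset.range(10)        # 0..9
        .filter(lambda x: x % 2 == 0)  # keep the even values
        .map(lambda x: x * 10)         # scale each value
        .batch(2)                      # group values into pairs
        .take(2))                      # keep only the first two batches
for value in ds:
  print(value)                         # [0 20], then [40 60]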
tf.data “lazy operators”
 filter():
 uses Boolean logic to "filter" the elements in an array to
determine which elements satisfy the Boolean condition
 map(): a projection
 this operator "applies" a lambda expression to each input
element
 flat_map():
 maps a single element of the input dataset to a Dataset of
elements
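A minimal flat_map() sketch (illustrative):
import tensorflow as tf
ds = tf.data.Dataset.from_tensor_slices([[1, 2, 3], [4, 5, 6]])
# each row is mapped to its own Dataset, and the results are flattened
ds = ds.flat_map(lambda row: tf.data.Dataset.from_tensor_slices(row))
for value in ds:
  print(value)   # scalar tensors 1, 2, 3, 4, 5, 6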
tf.data “lazy operators”
 batch(n):
 processes a "batch" of n elements during each
iteration
 repeat(n):
 repeats its input values n times
 take(n):
 operator "takes" n input values
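A short sketch that uses batch(), repeat(), and take() together (illustrative):
import tensorflow as tf
ds = tf.data.Dataset.range(4)     # 0, 1, 2, 3
ds = ds.repeat(2).batch(3)        # repeat the values twice, then batch them
for value in ds.take(3):
  print(value)                    # [0 1 2], [3 0 1], [2 3]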
tf.data.Dataset.from_tensors()
import tensorflow as tf
 #combine the input into one element
 t1 = tf.constant([[1, 2], [3, 4]])
 ds1 = tf.data.Dataset.from_tensors(t1)
 # output: [[1, 2], [3, 4]]
tf.data.Dataset.from_tensor_slices()
import tensorflow as tf
#separate element for each item
t2 = tf.constant([[1, 2], [3, 4]])
ds1 = tf.data.Dataset.from_tensor_slices(t2)
 # output: [1, 2], [3, 4]
TF2 Datasets: code sample
 import tensorflow as tf
 import numpy as np
x = np.arange(0, 10)
 # create a dataset from a Numpy array
ds = tf.data.Dataset.from_tensor_slices(x)
TF filter() operator: ex #1
import tensorflow as tf # tf2_filter1.py
import numpy as np
x = np.array([1,2,3,4,5])
ds = tf.data.Dataset.from_tensor_slices(x)
 print("First iteration:")
 for value in ds:
 print("value:",value)
TF filter() operator: ex #1
 First iteration:
 value: tf.Tensor(1, shape=(), dtype=int64)
 value: tf.Tensor(2, shape=(), dtype=int64)
 value: tf.Tensor(3, shape=(), dtype=int64)
 value: tf.Tensor(4, shape=(), dtype=int64)
 value: tf.Tensor(5, shape=(), dtype=int64)
TF filter() operator: ex #2
import tensorflow as tf # tf2_filter2.py
import numpy as np
x = np.array([1,2,3,4,5])
ds = tf.data.Dataset.from_tensor_slices(x)
 print("First iteration:")
 for value in ds:
 print("value:",value)
TF filter() operator: ex #2
 # "tf.math.equal(x, y)" is required
 # for equality comparison
 def filter_fn(x):
 return tf.math.equal(x, 1)
 ds = ds.filter(filter_fn)
 print("Second iteration:")
 for value in ds:
 print("value:",value)
TF filter() operator: ex #2
 First iteration:
 value: tf.Tensor(1, shape=(), dtype=int64)
 value: tf.Tensor(2, shape=(), dtype=int64)
 value: tf.Tensor(3, shape=(), dtype=int64)
 Second iteration:
 value: tf.Tensor(1, shape=(), dtype=int64)
What are Lambda Expressions
 a lambda expression takes an input variable
 performs an operation on that variable
 A "bare bones" lambda expression:
lambda x: x + 1
 => this adds 1 to an input variable x
TF filter() operator: ex #3
 import tensorflow as tf # tf2_filter3.py
 import numpy as np
 ds = tf.data.Dataset.from_tensor_slices([1,2,3,4,5])
 ds = ds.filter(lambda x: x < 4) # [1,2,3]
 print("First iteration:")
 for value in ds:
 print("value:",value)
TF filter() operator: ex #3
 # "tf.math.equal(x, y)" is required
 # for equality comparison
 def filter_fn(x):
 return tf.math.equal(x, 1)
 ds = ds.filter(filter_fn)
 print("Second iteration:")
 for value in ds:
 print("value:",value)
TF filter() operator: ex #3
 First iteration:
 value: tf.Tensor(1, shape=(), dtype=int32)
 value: tf.Tensor(2, shape=(), dtype=int32)
 value: tf.Tensor(3, shape=(), dtype=int32)
 Second iteration:
 value: tf.Tensor(1, shape=(), dtype=int32)
TF filter() operator: ex #4
 import tensorflow as tf # tf2_filter5.py
 import numpy as np
 def filter_fn(x):
 return tf.equal(x % 2, 0)
 x = np.array([1,2,3,4,5,6,7,8,9,10])
 ds = tf.data.Dataset.from_tensor_slices(x)
 ds = ds.filter(filter_fn)
 for value in ds:
 print("value:",value)
TF filter() operator: ex #4
 value: tf.Tensor(2, shape=(), dtype=int64)
 value: tf.Tensor(4, shape=(), dtype=int64)
 value: tf.Tensor(6, shape=(), dtype=int64)
 value: tf.Tensor(8, shape=(), dtype=int64)
 value: tf.Tensor(10, shape=(), dtype=int64)
TF filter() operator: ex #5
 import tensorflow as tf # tf2_filter5.py
 import numpy as np
 def filter_fn(x):
 return tf.reshape(tf.not_equal(x % 2, 1), [])
 x = np.array([1,2,3,4,5,6,7,8,9,10])
 ds = tf.data.Dataset.from_tensor_slices(x)
 ds = ds.filter(filter_fn)
 for value in ds:
 print("value:",value)
TF filter() operator: ex #5
 value: tf.Tensor(2, shape=(), dtype=int64)
 value: tf.Tensor(4, shape=(), dtype=int64)
 value: tf.Tensor(6, shape=(), dtype=int64)
 value: tf.Tensor(8, shape=(), dtype=int64)
 value: tf.Tensor(10, shape=(), dtype=int64)
TF map() operator: ex #1
import tensorflow as tf # tf2-map.py
import numpy as np
 x = np.array([[1],[2],[3],[4]])
 ds = tf.data.Dataset.from_tensor_slices(x)
 ds = ds.map(lambda x: x*2)
 for value in ds:
 print("value:",value)
TF map() operator: ex #1
 value: tf.Tensor([2], shape=(1,), dtype=int64)
 value: tf.Tensor([4], shape=(1,), dtype=int64)
 value: tf.Tensor([6], shape=(1,), dtype=int64)
 value: tf.Tensor([8], shape=(1,), dtype=int64)
TF map() and filter() operators
import tensorflow as tf # tf2_map_filter.py
import numpy as np
 x = np.array([1,2,3,4,5,6,7,8,9,10])
 ds = tf.data.Dataset.from_tensor_slices(x)
 ds1 = ds.filter(lambda x: tf.equal(x % 4, 0))
 ds1 = ds1.map(lambda x: x*x)
 ds2 = ds.map(lambda x: x*x)
 ds2 = ds2.filter(lambda x: tf.equal(x % 4, 0))
TF map() and filter() operators
 for value1 in ds1:
 print("value1:",value1)
 for value2 in ds2:
 print("value2:",value2)
TF map() and filter() operators
 value1: tf.Tensor(16, shape=(), dtype=int64)
 value1: tf.Tensor(64, shape=(), dtype=int64)
 value2: tf.Tensor(4, shape=(), dtype=int64)
 value2: tf.Tensor(16, shape=(), dtype=int64)
 value2: tf.Tensor(36, shape=(), dtype=int64)
 value2: tf.Tensor(64, shape=(), dtype=int64)
 value2: tf.Tensor(100, shape=(), dtype=int64)
TF map() operator: ex #2
import tensorflow as tf # tf2-map2.py
import numpy as np
 x = np.array([[1],[2],[3],[4]])
 ds = tf.data.Dataset.from_tensor_slices(x)
 # METHOD #1: THE LONG WAY
 # a lambda expression to double each value
 #ds = ds.map(lambda x: x*2)
 # a lambda expression to add one to each value
 #ds = ds.map(lambda x: x+1)
 # a lambda expression to cube each value
 #ds = ds.map(lambda x: x**3)
TF map() operator: ex #2
 # METHOD #2: A SHORTER WAY
ds = ds.map(lambda x: x*2).map(lambda x: x+1).map(lambda x: x**3)
 for value in ds:
print("value:",value)
 # an example of “Method Chaining”
TF map() operator: ex #2
 value: tf.Tensor([27], shape=(1,), dtype=int64)
 value: tf.Tensor([125], shape=(1,), dtype=int64)
 value: tf.Tensor([343], shape=(1,), dtype=int64)
 value: tf.Tensor([729], shape=(1,), dtype=int64)
TF take() operator: ex #1
import tensorflow as tf # tf2-take.py
import numpy as np
ds = tf.data.Dataset.from_tensor_slices(tf.range(8))
ds = ds.take(5)
 for value in ds.take(20):
 print("value:",value)
TF take() operator: ex #1
 value: tf.Tensor(0, shape=(), dtype=int32)
 value: tf.Tensor(1, shape=(), dtype=int32)
 value: tf.Tensor(2, shape=(), dtype=int32)
 value: tf.Tensor(3, shape=(), dtype=int32)
 value: tf.Tensor(4, shape=(), dtype=int32)
TF take() operator: ex #2
import tensorflow as tf # tf2_take.py
import numpy as np
 x = np.array([[1],[2],[3],[4]])
 # make a ds from a numpy array
 ds = tf.data.Dataset.from_tensor_slices(x)
ds = ds.map(lambda x: x*2).map(lambda x: x+1).map(lambda x: x**3)
 for value in ds.take(4):
 print("value:",value)
TF 2 map() and take(): output
 value: tf.Tensor([27], shape=(1,), dtype=int64)
 value: tf.Tensor([125], shape=(1,), dtype=int64)
 value: tf.Tensor([343], shape=(1,), dtype=int64)
 value: tf.Tensor([729], shape=(1,), dtype=int64)
TF zip() operator: ex #1
 import tensorflow as tf # tf2_zip1.py
 import numpy as np
 dx = tf.data.Dataset.from_tensor_slices([0,1,2,3,4])
 dy = tf.data.Dataset.from_tensor_slices([1,1,2,3,5])
 # zip the two datasets together
 d2 = tf.data.Dataset.zip((dx, dy))
 for value in d2:
 print("value:",value)
TF zip() operator: ex #1
value: (<tf.Tensor: id=11, shape=(), dtype=int32, numpy=0>, <tf.Tensor: id=12, shape=(), dtype=int32, numpy=1>)
value: (<tf.Tensor: id=13, shape=(), dtype=int32, numpy=1>, <tf.Tensor: id=14, shape=(), dtype=int32, numpy=1>)
value: (<tf.Tensor: id=15, shape=(), dtype=int32, numpy=2>, <tf.Tensor: id=16, shape=(), dtype=int32, numpy=2>)
=> Plus two more rows of output
TF zip() operator: ex #2
 import tensorflow as tf # tf2_zip_take.py
 import numpy as np
 x = np.arange(0, 10)
 y = np.arange(1, 11)
 dx = tf.data.Dataset.from_tensor_slices(x)
 dy = tf.data.Dataset.from_tensor_slices(y)
 # zip the two datasets together
 d2 = tf.data.Dataset.zip((dx, dy)).batch(3)
 for value in d2.take(8):
 print("value:",value)
TF zip() operator: ex #2
value: (<tf.Tensor: id=11, shape=(), dtype=int32, numpy=0>, <tf.Tensor: id=12, shape=(), dtype=int32, numpy=1>)
value: (<tf.Tensor: id=15, shape=(), dtype=int32, numpy=1>, <tf.Tensor: id=16, shape=(), dtype=int32, numpy=1>)
value: (<tf.Tensor: id=19, shape=(), dtype=int32, numpy=2>, <tf.Tensor: id=20, shape=(), dtype=int32, numpy=2>)
value: (<tf.Tensor: id=23, shape=(), dtype=int32, numpy=3>, <tf.Tensor: id=24, shape=(), dtype=int32, numpy=3>)
value: (<tf.Tensor: id=27, shape=(), dtype=int32, numpy=4>, <tf.Tensor: id=28, shape=(), dtype=int32, numpy=5>)
TF zip() operator: ex #3
 import tensorflow as tf # tf2_zip_batch.py
 import numpy as np
 ds1 = tf.data.Dataset.range(100)
 ds2 = tf.data.Dataset.range(0, -100, -1)
 ds3 = tf.data.Dataset.zip((ds1, ds2))
 ds4 = ds3.batch(4)
 for value in ds4.take(4):
 print("value:",value)
TF zip() operator: ex #3
value: (<tf.Tensor: id=21, shape=(4,), dtype=int64, numpy=array([0, 1, 2, 3])>, <tf.Tensor: id=22, shape=(4,), dtype=int64, numpy=array([ 0, -1, -2, -3])>)
value: (<tf.Tensor: id=25, shape=(4,), dtype=int64, numpy=array([4, 5, 6, 7])>, <tf.Tensor: id=26, shape=(4,), dtype=int64, numpy=array([-4, -5, -6, -7])>)
value: (<tf.Tensor: id=29, shape=(4,), dtype=int64, numpy=array([ 8, 9, 10, 11])>, <tf.Tensor: id=30, shape=(4,), dtype=int64, numpy=array([ -8, -9, -10, -11])>)
value: (<tf.Tensor: id=33, shape=(4,), dtype=int64, numpy=array([12, 13, 14, 15])>, <tf.Tensor: id=34, shape=(4,), dtype=int64, numpy=array([-12, -13, -14, -15])>)
TF 2 generators
Python functions
Containing your custom code
Specified in the dataset definition
Invoked when data is requested
Generator Functions (1)
 import tensorflow as tf # tf2_generator1.py
 import numpy as np
 x = np.arange(0, 7)
 def gener():
 i = 0
 while(i < len(x)):
 yield i
 i += 1
Generator Functions (1)
ds=tf.data.Dataset.from_generator(gener,(tf.int64))
 size = 2*len(x)
 for value in ds.take(size):
 print("value:",value)
 # value: tf.Tensor(0, shape=(), dtype=int64)
 # value: tf.Tensor(1, shape=(), dtype=int64)
 # value: tf.Tensor(2, shape=(), dtype=int64)
 # value: tf.Tensor(3, shape=(), dtype=int64)
 # value: tf.Tensor(4, shape=(), dtype=int64)
 # value: tf.Tensor(5, shape=(), dtype=int64)
 # value: tf.Tensor(6, shape=(), dtype=int64)
Generator Functions (2)
import tensorflow as tf # tf2-timesthree.py
import numpy as np
x = np.arange(0, 5) # 0, 1, 2, 3, 4
def gener():
  for i in x:
    yield (3*i)
ds = tf.data.Dataset.from_generator(gener, (tf.int64))
for value in ds.take(len(x)):
  print("1value:",value)
for value in ds.take(2*len(x)):
  print("2value:",value)
Generator Functions (2)
 1value: tf.Tensor(0, shape=(), dtype=int64)
 1value: tf.Tensor(3, shape=(), dtype=int64)
 1value: tf.Tensor(6, shape=(), dtype=int64)
 1value: tf.Tensor(9, shape=(), dtype=int64)
 1value: tf.Tensor(12, shape=(), dtype=int64)
 2value: tf.Tensor(0, shape=(), dtype=int64)
 2value: tf.Tensor(3, shape=(), dtype=int64)
 2value: tf.Tensor(6, shape=(), dtype=int64)
 2value: tf.Tensor(9, shape=(), dtype=int64)
 2value: tf.Tensor(12, shape=(), dtype=int64)
Generator Functions (3)
 import tensorflow as tf # tf2_generator3.py
 import numpy as np
 x = np.arange(0, 7)
def gener():
  i = 0
  while(i < len(x/3)): # NB: len(x/3) == len(x)
    yield (i, i+1, i+2)
    i += 3
Generator Functions (3)
 ds = tf.data.Dataset.from_generator(
gener,
(tf.int64,tf.int64,tf.int64))
 third = int(len(x)/3)
 for value in ds.take(third):
 print("value:",value)
Generator Functions (3)
#value:
#(<tf.Tensor: id=35, shape=(),dtype=int64,numpy=0>,
# <tf.Tensor: id=36, shape=(),dtype=int64,numpy=1>,
# <tf.Tensor: id=37, shape=(),dtype=int64,numpy=2>)
#value:
#(<tf.Tensor: id=41, shape=(),dtype=int64,numpy=3>,
# <tf.Tensor: id=42, shape=(),dtype=int64,numpy=4>,
# <tf.Tensor: id=43, shape=(),dtype=int64,numpy=5>)
Processing Text Files (1)
 define a TF Dataset with lines in file.txt
 skip lines that start with a “#” character
 then display only the first two lines
Contents of file.txt
 #this is file line #1
 #this is file line #2
 this is file line #3
 #this is file line #4
 this is file line #5
 #this is file line #6
Processing Text Files (2)
 import tensorflow as tf # tf2_flatmap_filter.py
filenames = ["file.txt"]
 ds = tf.data.Dataset.from_tensor_slices(filenames)
Processing Text Files (3)
 ds = ds.flat_map(
 lambda filename: (
 tf.data.TextLineDataset(filename)
 .skip(1)
 .filter(lambda line:
tf.not_equal(tf.strings.substr(line,0,1),"#"))))
 for value in ds.take(2):
 print("value:",value)
Text Output (first two lines)
('value:', <tf.Tensor: id=16, shape=(5,), dtype=string, numpy=array(['this', 'is', 'file', 'line', '#3'], dtype=object)>)
('value:', <tf.Tensor: id=18, shape=(5,), dtype=string, numpy=array(['this', 'is', 'file', 'line', '#5'], dtype=object)>)
Tokenizers and tf.text
 import tensorflow as tf # NB: requires TF 2
 import tensorflow_text as text
 # pip3 install -q tensorflow-text
 docs = tf.data.Dataset.from_tensor_slices(
[['Chicago Pizza'],
["how are you"]])
 tokenizer = text.WhitespaceTokenizer()
tokenized_docs = docs.map(
    lambda x: tokenizer.tokenize(x))
Tokenizers and tf.text
 iterator = iter(tokenized_docs)
 print(next(iterator).to_list())
 print(next(iterator).to_list())
# output:
[[b'Chicago', b'Pizza']]
[[b'how', b'are', b'you']]
tf.data and MNIST
 import tensorflow as tf # tf2_mnist.py
train, test = tf.keras.datasets.mnist.load_data()
mnist_x, mnist_y = train
mnist_ds=tf.data.Dataset.from_tensor_slices(mnist_x)
print(mnist_ds)
 for value in mnist_ds.take(2):
 print("value:",value)
tf.data and MNIST
value: tf.Tensor(
[[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   3  18  18  18 126 136 175  26 166 255 247 127   0   0   0   0]
 [  0   0   0   0   0   0   0   0  30  36  94 154 170 253 253 253 253 253 225 172 253 242 195  64   0   0   0   0]
 [  0   0   0   0   0   0   0  49 238 253 253 253 253 253 253 253 253 251  93  82  82  56  39   0   0   0   0   0]
 ... (remaining rows of the 28x28 image omitted)
TF2 generator Example
import tensorflow as tf # tf2_generator2.py
import numpy as np
x = np.arange(0, 12)
def gener():
  i = 0
  while(i < len(x/3)): # NB: len(x/3) == len(x)
    yield (i, i+1, i+2) # three integers at a time
    i += 3
ds = tf.data.Dataset.from_generator(gener, (tf.int64,tf.int64,tf.int64))
third = int(len(x)/3)
for value in ds.take(third):
  print("value:",value)
TF2 generator Example
value: (<tf.Tensor: id=35, shape=(), dtype=int64, numpy=0>, <tf.Tensor: id=36, shape=(), dtype=int64, numpy=1>, <tf.Tensor: id=37, shape=(), dtype=int64, numpy=2>)
value: (<tf.Tensor: id=38, shape=(), dtype=int64, numpy=3>, <tf.Tensor: id=39, shape=(), dtype=int64, numpy=4>, ...
TF 2 tf.data.TFRecordDataset
 import tensorflow as tf
 ds = tf.data.TFRecordDataset(tf-records)
 ds = ds.map(your-pre-processing)
 ds = ds.batch(batch_size=32)
 OR:
 ds = tf.data.TFRecordDataset(a-tf-record)
 .map(your-pre-processing)
 .batch(batch_size=32)
 model = . . . [Keras]
 model.fit(ds, epochs=20)
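A slightly more concrete sketch, assuming a TFRecord file "data.tfrecord" whose records are tf.train.Example protos with a single float feature named "x" (both names are hypothetical):
import tensorflow as tf
feature_spec = {"x": tf.io.FixedLenFeature([], tf.float32)}
def parse_fn(serialized):
  # decode one serialized tf.train.Example into a dict of tensors
  return tf.io.parse_single_example(serialized, feature_spec)
ds = (tf.data.TFRecordDataset(["data.tfrecord"])
        .map(parse_fn)
        .batch(batch_size=32))
# model.fit(ds, epochs=20)  # the dataset can be passed directly to Keras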
Use prefetch() for Performance
 import tensorflow as tf
 ds = tf.data.TFRecordDataset(tf-records)
 .map(your-pre-processing)
 .batch(batch_size=32)
 .prefetch(buffer_size=X)
 model = . . . [Keras]
 model.fit(ds, epochs=20)
Parallelize Data Transformations
 import tensorflow as tf
 ds = tf.data.TFRecordDataset(tf-records)
 .map(preprocess,num_parallel_calls=Y)
 .batch(batch_size=32)
 .prefetch(buffer_size=X)
 model = . . . [Keras]
 model.fit(ds, epochs=20)
 => uses background threads & internal buffer
Parallelize Data “Readers”
 import tensorflow as tf
 ds = tf.data.TFRecordDataset(tf-records,
 num_parallel_readers=Z)
 .map(preprocess,num_parallel_calls=Y)
 .batch(batch_size=32)
 .prefetch(buffer_size=X)
 model = . . . [Keras]
 model.fit(ds, epochs=20)
 => for sharded data
Parallelize Data “Readers”
 Selecting optimal values:
tf.data.experimental.AUTOTUNE
 Uses Reinforcement Learning
 To tune values during data input
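For example (a sketch; AUTOTUNE lets tf.data pick these values at runtime):
import tensorflow as tf
AUTOTUNE = tf.data.experimental.AUTOTUNE
ds = (tf.data.Dataset.range(1000)
        .map(lambda x: x * 2, num_parallel_calls=AUTOTUNE)
        .batch(32)
        .prefetch(buffer_size=AUTOTUNE))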
tf.data.Options
 Setting global options:
 Deterministic/non-deterministic
 Statistics
 Optimizations (ex: autotuning)
 Threading (ex: private thread pool)
tf.data.Options
 import tensorflow as tf
 ds = tf.data.TFRecordDataset(tf-records)
 .map(your-pre-processing)
 .batch(batch_size=32)
 .prefetch(buffer_size=X)
op = tf.data.Options()
op.experimental_optimization.map_parallelization = True
ds = ds.with_options(op)
TF 2: Built-in Datasets
tf.keras.datasets.boston_housing
tf.keras.datasets.cifar10
tf.keras.datasets.cifar100
tf.keras.datasets.fashion_mnist
tf.keras.datasets.imdb
tf.keras.datasets.mnist
tf.keras.datasets.reuters
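Each of these returns NumPy arrays that can be wrapped in a tf.data.Dataset, for example (a sketch):
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
ds = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
        .shuffle(buffer_size=1024)
        .batch(32))
for images, labels in ds.take(1):
  print(images.shape, labels.shape)   # (32, 28, 28) (32,)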
About Me: Recent Books
 1) Python3 and Machine Learning (2020)
 2) Angular 9 and Deep Learning (2020)
 3) Angular 8 & Machine Learning (2020)
 4) AI/ML/DL: Concepts and Code (2020)
 5) Bash Programming on Mac (2020)
 6) TensorFlow 2 Pocket Primer (2019)
 7) TensorFlow 1.x Pocket Primer (2019)
 8) Python for TensorFlow (2019)
 9) C Programming Pocket Primer (2019)
About Me: Less Recent Books
 10) RegEx Pocket Primer (2018)
 11) Data Cleaning Pocket Primer (2018)
 12) Angular Pocket Primer (2017)
 13) Android Pocket Primer (2017)
 14) CSS3 Pocket Primer (2016)
 15) SVG Pocket Primer (2016)
 16) Python Pocket Primer (2015)
 17) D3 Pocket Primer (2015)
 18) HTML5 Mobile Pocket Primer (2014)
About Me: Older Books
 19) jQuery, CSS3, and HTML5 (2013)
 20) HTML5 Pocket Primer (2013)
 21) jQuery Pocket Primer (2013)
 22) HTML5 Canvas (2012)
 23) Flash on Android (2011)
 24) Web 2.0 Fundamentals (2010)
 25) MS Silverlight Graphics (2008)
 26) Fundamentals of SVG (2003)
 27) Java Graphics Library (2002)
ML/DL/NLP/DRL Classes
ML/DL/NLP/RL Instructor at UCSC (Santa Clara):
Deep Learning (TF2/Keras) (01/30/2020) 10 weeks
NLP (Transformer/BERT/etc) (02/21/2020) 10 weeks
ML (ML, NLP, RL) (05/01/2020) 10 weeks
DRL (PPO/A3C/SAC/etc) (05/07/2020) 10 weeks
ML (ML, NLP, RL) (06/16/2020) 10 weeks
NLP (Transformer/BERT/etc) (06/29/2020) 10 weeks
UCSC link:
https://www.ucsc-extension.edu/certificate-program/offering/deep-learning-and-artificial-intelligence-tensorflow