These slides were designed for a talk at the IT-Meetup League of Geeks in Passau. They contain an introduction to the concept of TF and its major improvements in version 2.0. Furthermore, basics of Machine and Deep Learning are explained. Finally, I explain how to do Computer Vision in TensorFlow 2.
The full talk can be found on YouTube: https://www.youtube.com/channel/UCycbEYf8CJSaAVCYgfMOAPQ
Code is on GitHub: https://github.com/sastemmler/leagueofgeeks
Understanding how high-powered ML models arrive at their predictions is an important aspect of Machine Learning, and SHAP is a powerful tool that enables practitioners to understand how different features combine to help a model arrive at a prediction.
This slide deck is from a presentation given at PyData Global on the theoretical foundations of SHAP, as well as how to use its library. A link to the presentation can be found here: https://pydata.org/global2021/schedule/presentation/3/behind-the-black-box-how-to-understand-any-ml-model-using-shap/
In machine learning, model selection is a bit more nuanced than simply picking the 'right' or 'wrong' algorithm. In practice, the workflow includes (1) selecting and/or engineering the smallest and most predictive feature set, (2) choosing a set of algorithms from a model family, and (3) tuning the algorithm hyperparameters to optimize performance. Recently, much of this workflow has been automated through grid search methods, standardized APIs, and GUI-based applications. In practice, however, human intuition and guidance can more effectively home in on quality models than exhaustive search.
This talk presents a new open source Python library, Yellowbrick, which extends the Scikit-Learn API with a visual transformer (visualizer) that can incorporate visualizations of the model selection process into pipelines and the modeling workflow. Visualizers enable machine learning practitioners to visually interpret the model selection process, steer workflows toward more predictive models, and avoid common pitfalls and traps. For users, Yellowbrick can help evaluate the performance, stability, and predictive value of machine learning models, and assist in diagnosing problems throughout the machine learning workflow.
A typical programming task can be divided into two phases:
Problem-solving phase: produce an ordered sequence of steps that describes the solution to the problem; this sequence of steps is called an algorithm.
Implementation phase: implement the program in some programming language.
Every algorithm must satisfy the following criteria:
Input. Zero or more quantities are externally supplied.
Output. At least one quantity is produced.
Definiteness. Each instruction must be clear and unambiguous (have a unique meaning).
Finiteness. An algorithm terminates in a finite number of steps.
Effectiveness. Every instruction must be basic enough to be carried out; that is, not overly complex.
An algorithm is a finite set of steps defining the solution of a particular problem.
What is the difference between an algorithm and a program?
A program is an implementation of an algorithm to be run on a specific computer and operating system.
An algorithm is more abstract: it does not deal with machine-specific details. Think of it as a method to solve a problem.
What is a good algorithm?
Efficient algorithms are good; we generally measure the efficiency of an algorithm based on:
Time: the algorithm should take minimum time to execute.
Space: the algorithm should use less memory.
What is the difference between an algorithm and pseudocode?
An algorithm is a well-defined sequence of steps that provides a solution for a given problem, while pseudocode is one of the methods that can be used to represent an algorithm.
While algorithms can be written in natural language, pseudocode is written in a format that is closely related to high-level programming language structures.
But pseudocode does not use specific programming language syntax and therefore could be understood by programmers who are familiar with different programming languages. Additionally, transforming an algorithm presented in pseudocode to programming code could be much easier than converting an algorithm written in natural language.
Hands-On Machine Learning with Scikit-Learn and TensorFlow - Chapter 8 (Hakky St)
This is documentation from a study meeting in our lab.
The book title is "Hands-On Machine Learning with Scikit-Learn and TensorFlow", and this covers Chapter 8.
Overview of TensorFlow For Natural Language Processing (ananth)
TensorFlow, recently open-sourced by Google, is one of the key frameworks that support the development of deep learning architectures. In this slideset, part 1, we get started with a few basic primitives of TensorFlow. We will also discuss when and when not to use TensorFlow.
We live with an abundance of ML resources; from open source tools, to GPU workstations, to cloud-hosted autoML. What’s more, the lines between AI research and everyday ML have blurred; you can recreate a state-of-the-art model from arxiv papers at home. But can you afford to? In this talk, we explore ways to recession-proof your ML process without sacrificing on accuracy, explainability, or value.
Regression takes a group of random variables, thought to be predicting Y, and tries to find a mathematical relationship between them. This relationship is typically in the form of a straight line (linear regression) that best approximates all the individual data points.
Algorithm and C code related to data structure (Self-Employed)
In the world of coding, everything lies inside an algorithm; algorithm formation is the basis of data structures and their manipulation in computer science and information technology, and it is ultimately used to find the solution to a particular problem.
"A Framework for Developing Trading Models Based on Machine Learning" by Kris...Quantopian
Presented at QuantCon Singapore 2016, Quantopian's quantitative finance and algorithmic trading conference, November 11th.
Machine learning is improving facets of our lives as diverse as health screening, transportation and even our entertainment choices. It stands to reason that machine learning can also improve trading performance; however, the practical application is fraught with pitfalls and obstacles that nullify the benefits and present a high barrier to entry. Building on background information and introductory material, Kris will propose a framework for efficient and robust experimentation with machine learning methods for algorithmic trading. The framework's objective is to arrive at parsimonious models whose positive past performance is unlikely to be due to chance. The framework is demonstrated via practical examples of various machine learning models for algorithmic trading.
Presentation given to the Vietnam Japan AI Community on 2019-05-26.
The presentation summarizes what I've learned about Regularization in Deep Learning.
Disclaimer: The presentation was given at a community event, so it wasn't thoroughly reviewed or revised.
The Power of Auto ML and How Does it Work (Ivo Andreev)
Automated ML is an approach to minimize the need for data science effort by enabling domain experts to build ML models without deep knowledge of algorithms, mathematics, or programming skills. The mechanism works by allowing end users to simply provide data, and the system automatically does the rest by determining the approach to perform the particular ML task. At first this may sound discouraging to those aspiring to the "sexiest job of the 21st century", the data scientists. However, Auto ML should be considered a democratization of ML, rather than automatic data science.
In this session we will talk about how Auto ML works, how it is implemented by Microsoft, and how it could improve the productivity of even professional data scientists.
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St... (Maninda Edirisooriya)
Bias and variance are among the deepest concepts in ML, and they drive the decision making of an ML project. Regularization is a solution to the high-variance problem. This was one of the lectures of a full course I taught at the University of Moratuwa, Sri Lanka, in the second half of 2023.
Practical deep learning for computer vision (Eran Shlomo)
This is the presentation given at TLV DLD 2017. In this presentation we walk through the planning and implementation of a deep learning solution for image recognition, with a focus on the data.
It is based on the work we do at dataloop.ai and with its customers.
This is a slide deck from a presentation that my colleague Shirin Glander (https://www.slideshare.net/ShirinGlander/) and I did together. As we created our respective parts of the presentation on our own, it is quite easy to figure out who did which part of the presentation, as the two slide decks look quite different ... :)
For the sake of simplicity and completeness, I just copied the two slide decks together. As I did the "surrounding" part, I added Shirin's part at the place where she took over and then added my concluding slides at the end. Well, I'm sure you will figure it out easily ... ;)
The presentation was intended to be an introduction to deep learning (DL) for people who are new to the topic. It starts with some DL success stories as motivation. Then a quick classification and a bit of history follows before the "how" part starts.
The first part of the "how" is some theory of DL, to demystify the topic and explain and connect some of the most important terms on the one hand, but also to give an idea of the broadness of the topic on the other hand.
After that the second part dives deeper into the question how to actually implement DL networks. This part starts with coding it all on your own and then moves on to less coding step by step, depending on where you want to start.
The presentation ends with some pitfalls and challenges that you should have in mind if you want to dive deeper into DL - plus the invitation to become part of it.
As always the voice track of the presentation is missing. I hope that the slides are of some use for you, though.
This is a slide deck from a presentation that my colleague Uwe Friedrichsen (https://www.slideshare.net/ufried/) and I did together. As we created our respective parts of the presentation on our own, it is quite easy to figure out who did which part of the presentation, as the two slide decks look quite different ... :)
For the sake of simplicity and completeness, Uwe copied the two slide decks together. As he did the "surrounding" part, he added my part at the place where I took over and then added concluding slides at the end. Well, I'm sure, you will figure it out easily ... ;)
The presentation was intended to be an introduction to deep learning (DL) for people who are new to the topic. It starts with some DL success stories as motivation. Then a quick classification and a bit of history follows before the "how" part starts.
The first part of the "how" is some theory of DL, to demystify the topic and explain and connect some of the most important terms on the one hand, but also to give an idea of the broadness of the topic on the other hand.
After that the second part dives deeper into the question how to actually implement DL networks. This part starts with coding it all on your own and then moves on to less coding step by step, depending on where you want to start.
The presentation ends with some pitfalls and challenges that you should have in mind if you want to dive deeper into DL - plus the invitation to become part of it.
As always the voice track of the presentation is missing. I hope that the slides are of some use for you, though.
2. Outline
• Introduction to Machine Learning
• Framing: Key ML Terminology
• Descending into ML
• Reducing Loss
• First Steps with TF
3. Introduction to Machine Learning
• Reduce time programming
  • Feed a machine learning tool some examples, and get a more reliable program in a small fraction of the time.
• Customize and scale products
  • To support multiple languages, you can collect data in each language and feed it into the exact same machine learning model.
• Complete seemingly "unprogrammable" tasks
  • ML lets you solve problems that you, as a programmer, have no idea how to do by hand, e.g., recognizing faces.
4. Introduction to Machine Learning
• Coding
  • We use assertions to prove that properties of our program are correct.
• ML
  • The focus shifts from a mathematical science to a natural science: we're making observations about an uncertain world, running experiments, and using statistics, not logic, to analyze the results of the experiments.
5. Framing: Key ML Terminology
• Label
  • A label is the thing we're predicting: the y variable in simple linear regression.
  • It is the quantity we already have an answer for in the training data.
• Feature
  • A feature is an input variable: the x variable in simple linear regression.
  • It is the kind of data we already have as input.
6. Framing: Key ML Terminology
• Example
  • An example is a particular instance of data, x. (We put x in boldface to indicate that it is a vector.)
• Labeled examples
  • {features, label}: (x, y)
  • Used to train the model.
• Unlabeled examples
  • {features, ?}: (x, ?)
  • What we want to make predictions on.
8. Descending into ML
• Linear Regression
  • Find the closest linear relationship (prediction) between x and y.
  • The prediction can be defined as y' = b + w1 * x1, where b is the bias and w1 is the weight on feature x1.
9. Descending into ML
• Loss
  • A number indicating how bad the model's prediction was on a single example.
10. Descending into ML
• Loss Function
  • Squared Loss (L2 loss)
    • The square of the difference between the label and the prediction: (y - y')^2
  • Mean Squared Error (MSE)
    • Sum up all the L2 losses, then divide by the number of examples: MSE = (1/N) * sum((y - y')^2) (see the sketch below)
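As a minimal sketch of the two formulas above (my own illustration, not code from the deck), squared loss and MSE can be computed in plain NumPy:

import numpy as np

def squared_loss(y, y_pred):
    # L2 loss for a single example: (label - prediction)^2
    return (y - y_pred) ** 2

def mse(y, y_pred):
    # Mean squared error: average L2 loss over all examples
    y, y_pred = np.asarray(y, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean((y - y_pred) ** 2)

print(mse([3.0, 5.0, 7.0], [2.5, 5.0, 8.0]))  # (0.25 + 0.0 + 1.0) / 3 = 0.4166...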
13. Reducing Loss
• An iterative, trial-and-error approach to training a model:
  • Start with an initial guess for the weights and bias.
  • Iteratively adjust those guesses...
  • ...until the weights and bias with the lowest possible loss are learned.
• When the overall loss stops changing, or at least changes extremely slowly, the model is said to have converged.
16. Reducing Loss
• Gradient descent
  • Find a learning rate (a hyperparameter) large enough that gradient descent converges efficiently, but not so large that it never converges (see the sketch below).
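To make the mechanics concrete, here is a minimal from-scratch sketch of full-batch gradient descent for simple linear regression (my own illustration; the slides only describe the idea). The weight and bias are nudged against the gradient of the MSE, scaled by the learning rate:

import numpy as np

def gradient_descent(x, y, learning_rate=0.1, num_steps=500):
    # Fit y' = b + w * x by repeatedly stepping against the MSE gradient.
    w, b = 0.0, 0.0                        # initial guess for weight and bias
    n = len(x)
    for _ in range(num_steps):
        error = (b + w * x) - y            # prediction error for every example
        dw = (2.0 / n) * np.dot(error, x)  # dMSE/dw
        db = (2.0 / n) * error.sum()       # dMSE/db
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0                          # true relationship: w = 2, b = 1
print(gradient_descent(x, y))              # converges close to (2.0, 1.0)

With a learning rate that is too large, the updates overshoot and the loss diverges instead of converging.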
18. Reducing Loss
• Batch
  • The total number of examples you use to calculate the gradient in a single iteration.
  • Small batch: less computation per step, but a noisier gradient; large batch: more computation per step, but a less noisy gradient.
• Stochastic gradient descent (SGD): one example (a batch size of 1) per iteration.
• Mini-batch stochastic gradient descent (mini-batch SGD): batches of between 10 and 1,000 examples (see the mini-batch sketch below).
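The same loop can be made stochastic by estimating the gradient from a randomly sampled mini-batch on each step (again just an illustrative sketch; a batch size of 1 gives plain SGD):

import numpy as np

def minibatch_sgd(x, y, learning_rate=0.05, num_steps=2000, batch_size=2, seed=0):
    # Fit y' = b + w * x, estimating the gradient from a random mini-batch each step.
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0
    for _ in range(num_steps):
        idx = rng.choice(len(x), size=batch_size, replace=False)  # sample a mini-batch
        xb, yb = x[idx], y[idx]
        error = (b + w * xb) - yb
        w -= learning_rate * (2.0 / batch_size) * np.dot(error, xb)
        b -= learning_rate * (2.0 / batch_size) * error.sum()
    return w, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
print(minibatch_sgd(x, y))  # noisier per step than full-batch GD, but still close to (2.0, 1.0)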
21. First Steps with TensorFlow
• Pandas
  • Used to handle the examples (the input data, x) before they are fed into TensorFlow.
  • Data structures:
    • DataFrame: holds the examples; contains one or more Series.
    • Series: corresponds to a single feature (one column). (See the sketch below.)
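A tiny, made-up example of the two structures (the column names echo the California housing data commonly used with this exercise, but any columns would do):

import pandas as pd

# Each Series is one column: one feature or the label.
total_rooms = pd.Series([5612.0, 7650.0, 720.0])
median_house_value = pd.Series([66900.0, 80100.0, 85700.0])

# The DataFrame holds the examples: one row per example, one column per Series.
df = pd.DataFrame({
    "total_rooms": total_rooms,
    "median_house_value": median_house_value,
})
print(df.describe())  # quick summary statistics (count, mean, min, max, ...)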
22. First Steps with TensorFlow
• TensorFlow
• Build the First Model
• Tweak the Model Hyperparameters
23. First Steps with TensorFlow
• Build the First Model
• Define and Configure Feature
• Define the Target (y)
• Configure the LinearRegressor
• Define the Input Function
• Train the Model
• Evaluate the Model
24. First Steps with TensorFlow
• Define and Configure Feature
  • Configure the data type for TF’s feature column (a sketch follows below):
    • Categorical data
    • Numerical data
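A sketch of how a numeric feature column might be declared with the TF 1.x-style Estimator API this exercise is based on (total_rooms and the DataFrame df are assumptions carried over from the pandas sketch above):

import tensorflow as tf

# Select the input feature from the DataFrame.
my_feature = df[["total_rooms"]]

# Describe it to TensorFlow as a numeric feature column.
# Categorical data would instead use something like
# tf.feature_column.categorical_column_with_vocabulary_list.
feature_columns = [tf.feature_column.numeric_column("total_rooms")]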
26. First Steps with TensorFlow
• Configure the LinearRegressor
  • Apply gradient clipping via clip_gradients_by_norm.
  • This ensures the magnitude of the gradients does not become too large during training; overly large gradients can cause gradient descent to fail. (A sketch follows below.)
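Roughly how this looks with the TF 1.x API the original exercise uses (clip_gradients_by_norm lives in tf.contrib.estimator there; the learning rate is just an example value):

import tensorflow as tf

# Plain gradient descent with a deliberately small learning rate.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
# Cap the gradient norm at 5.0 so the gradients cannot blow up during training.
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Linear regression model over the feature columns defined earlier.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer,
)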
27. First Steps with TensorFlow
• Define the Input Function
  • Instructs TensorFlow how to preprocess the data, as well as how to batch, shuffle, and repeat it during model training.
  • Convert our pandas feature data into a dict of NumPy arrays.
  • Use the TensorFlow Dataset API to construct a dataset object.
  • Break the data into batches of batch_size, to be repeated for the specified number of epochs (num_epochs). (A sketch follows below.)
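An input function along those lines (TF 1.x Dataset API; the structure mirrors the bullets above):

import numpy as np
import tensorflow as tf

def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):
    # 1. Convert pandas feature data into a dict of NumPy arrays.
    features = {key: np.array(value) for key, value in dict(features).items()}

    # 2. Construct a Dataset and configure batching / repeating.
    ds = tf.data.Dataset.from_tensor_slices((features, targets))
    ds = ds.batch(batch_size).repeat(num_epochs)

    # 3. Shuffle the data if requested.
    if shuffle:
        ds = ds.shuffle(buffer_size=10000)

    # 4. Return the next batch of (features, labels).
    features, labels = ds.make_one_shot_iterator().get_next()
    return features, labels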
29. First Steps with TensorFlow
• Train the Model
  • Call train() on our linear_regressor to train the model (see the sketch below).
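With the pieces above in place, training reduces to a single call. In this sketch, my_feature comes from the feature-column step and targets is assumed to be the label Series (for instance df["median_house_value"] from the pandas sketch); 100 steps is just an example value:

_ = linear_regressor.train(
    input_fn=lambda: my_input_fn(my_feature, targets, batch_size=1),
    steps=100,
)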
30. First Steps with TensorFlow
• Evaluate the Model
  • Compare the Root Mean Squared Error (RMSE) against the min, max, and mean of the target values to judge whether the error is acceptable (see the sketch below).
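One way to make that comparison (a sketch under the same assumptions as above; predictions are drawn with a non-shuffling, single-epoch input function, and RMSE is computed with scikit-learn):

import numpy as np
from sklearn import metrics

prediction_input_fn = lambda: my_input_fn(
    my_feature, targets, num_epochs=1, shuffle=False)

predictions = linear_regressor.predict(input_fn=prediction_input_fn)
predictions = np.array([item["predictions"][0] for item in predictions])

rmse = np.sqrt(metrics.mean_squared_error(targets, predictions))
print("RMSE: %.3f" % rmse)
print("Targets: min %.3f, mean %.3f, max %.3f"
      % (targets.min(), targets.mean(), targets.max()))

If the RMSE is a large fraction of the target range, the model is not yet useful and the hyperparameters need tweaking (next slide).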
31. First Steps with TensorFlow
• Tweak the Model Hyperparameters
  • learning_rate, steps, batch_size, input_feature
  • Tips:
    • Lower the learning rate.
    • Use a larger number of steps or a larger batch size.