工業技術研究院機密資料 禁止複製、轉載、外流 ITRI CONFIDENTIAL DOCUMENT DO NOT COPY OR DISTRIBUTE
Machine Learning, Deep Learning and Data Analysis: An Introduction
劉得彥
teyen.liu@gmail.com
Outline
2
• Overview of ML, DL and Data Analysis
• What is Machine Learning
▪ Take a Look At Linear Regression
▪ Other ML Algorithms at a Glance
▪ What is Neural Network?
• What is Deep Learning?
• Deep Learning using TensorFlow
• Data Analysis
▪ Case 1, 2 and 3
▪ Multivariate Analysis
My Experience with Machine Learning
3
• I took quite a few detours along the way while learning!!
▪ Hope this gives you some useful experience and guidelines
• Take courses:
▪ Coursera: Machine Learning (got certificate)
▪ Udemy: Data Science: Deep Learning in Python (ongoing)
• Study on-line resources:
▪ YouTube, ML/DL tutorials, and so on
▪ https://morvanzhou.github.io/
▪ http://bangqu.com/gpu/blog
▪ http://www.jiqizhixin.com/insights
• Get your hands dirty
▪ Python programming
a. TensorFlow Deep Learning Library
b. Scikit-Learn Library
c. NumPy, Pandas, Matplotlib, …
From AI to Deep Learning
4
• Recommended viewing: AI talks from the GeekPark (极客公园) 2017 conference
▪ Google Chief Scientist Fei-Fei Li (李飛飛): https://www.youtube.com/watch?v=uZ-7DVzRCy8
https://blogs.nvidia.com.tw/2016/07/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/
(Figure: the breakthrough behind deep learning, driven by CPU/GPU computing power, big data, and algorithms)
ML, DL and Data Analysis
5
• Visually linking AI, machine learning, deep learning, big data and data science
• What we focus on today
https://whatsthebigdata.com/2016/10/17/visually-linking-ai-machine-learning-deep-learning-big-data-and-data-science/
Myth??
Data Analysis
WHAT IS MACHINE
LEARNING
6
Machine Learning definition
7
• Arthur Samuel (1959). Machine Learning:
Field of study that gives computers the
ability to learn without being explicitly
programmed.
• Tom Mitchell (1998) Well-posed
Learning Problem: A computer program
is said to learn from experience E with
respect to some task T and some
performance measure P, if its
performance on T, as measured by P,
improves with experience E.
Machine Learning definition
8
• Suppose your email program watches
which emails you do or do not mark as
spam, and based on that learns how to
better filter spam. What is the task T in
this setting?
▪ Classifying emails as spam or not spam. (T)
▪ Watching you label emails as spam or not spam. (E)
▪ The number (or fraction) of emails correctly classified
as spam/not spam. (P)
▪ None of the above—this is not a machine learning
problem
What is Machine Learning?
• Without writing any
custom code
specific to the
problem
• Feed data to the
generic algorithm
• It builds its own logic
Two styles of Machine Learning
• Supervised Learning (監督式學習): use the learned logic to predict, e.g., the sales price from labeled data (Features → Label)
• Unsupervised Learning (非監督式學習): figure out if there is a pattern or grouping in the data
What are machine learning
algorithms?
• Regression Algorithms
▪ Linear Regression
▪ Logistic Regression
▪ LASSO
• Decision Tree Algorithms
▪ Classification and Regression Tree (CART)
▪ Iterative Dichotomiser 3 (ID3)
▪ C4.5 and C5.0 (different versions of a powerful approach)
• Bayesian Algorithms
▪ Naive Bayes
• Clustering Algorithms (unsupervised)
▪ k-Means
• Support Vector Machines
• Principal Component Analysis
• Anomaly Detection
• Recommender Systems
• Artificial Neural Network Algorithms
LET’S TAKE A LOOK AT
LINEAR REGRESSION
12
Linear Regression
13
• The Hypothesis Function
• Cost Function
• Gradient Descent for Multiple Variables
https://www.coursera.org/learn/machine-learning/
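For reference, the three items above are, in the notation used by the Coursera course:

h_\theta(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^T x
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^2
\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr) x_j^{(i)} \quad \text{(update all } \theta_j \text{ simultaneously)}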
Gradient Descent
14
• How to choose the learning rate α
https://www.coursera.org/learn/machine-learning/
Gradient Descent
15
• Convergence of gradient descent with an appropriate
learning rate α
(Plot: cost function J(θ) versus the number of iterations)
https://www.coursera.org/learn/machine-learning/
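A minimal NumPy sketch of batch gradient descent for linear regression (the data and variable names below are illustrative, not the course's code):

import numpy as np

def gradient_descent(X, y, alpha=0.05, iterations=5000):
    """Batch gradient descent for linear regression.
    X: (m, n) feature matrix whose first column is all ones; y: (m,) targets."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        error = X.dot(theta) - y          # h_theta(x) - y for every example
        gradient = X.T.dot(error) / m     # partial derivatives of J(theta)
        theta -= alpha * gradient         # simultaneous update of all theta_j
    return theta

# Toy usage: fit y ≈ 1 + 2x; the learned theta should be close to [1, 2]
X = np.c_[np.ones(50), np.linspace(0, 5, 50)]
y = 1 + 2 * X[:, 1]
print(gradient_descent(X, y))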
Linear Regression
16
• Training data with linear regression fit
https://www.coursera.org/learn/machine-learning/
OTHER ML ALGORITHMS
AT A GLANCE
17
Logistic Regression
18
• Training data with decision boundary
linear decision boundary
non-linear decision boundary
https://www.coursera.org/learn/machine-learning/
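For reference, the hypothesis that produces these decision boundaries is the sigmoid of a linear (or, for the non-linear boundary, polynomial) combination of the features:

h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}, \qquad \text{predict } y = 1 \text{ when } h_\theta(x) \ge 0.5, \text{ i.e. } \theta^T x \ge 0
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]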
Support Vector Machines
19
• The difference between the kernels in SVM
▪ Linear
▪ Polynomial
▪ Gaussian (RBF)
▪ Sigmoid
• SVM (Gaussian Kernel) Decision Boundary
▪ Choose gamma ( auto )
Gaussian (RBF)
Non-linear decision boundary
https://www.coursera.org/learn/machine-learning/
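A minimal scikit-learn sketch comparing these kernels (the toy data below is made up; it is not the course's dataset):

import numpy as np
from sklearn.svm import SVC

# Toy 2-D data: class 1 inside a circle, class 0 outside (not linearly separable)
rng = np.random.RandomState(0)
X = rng.uniform(-2, 2, size=(300, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.5).astype(int)

for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel, gamma="auto")   # gamma='auto' as on the slide
    clf.fit(X, y)
    print(kernel, "training accuracy:", clf.score(X, y))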
K-Means
20
• The original 128x128 image has 24-bit color (three 8-bit channels)
• Using K-means (K=16) to find the 16 colors that best group (cluster) the pixels in the 3-dimensional RGB space
• K=3: computing the centroid means iteratively
https://www.coursera.org/learn/machine-learning/
Unsupervised Learning (非監督式學習)
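A minimal scikit-learn sketch of the K=16 color-quantization idea (it assumes img is an H x W x 3 RGB array that you load yourself, e.g. with matplotlib):

import numpy as np
from sklearn.cluster import KMeans

def quantize_colors(img, k=16):
    """Cluster the pixels in 3-D RGB space and replace each pixel by its centroid color."""
    h, w, _ = img.shape
    pixels = img.reshape(-1, 3).astype(float)
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    quantized = kmeans.cluster_centers_[kmeans.labels_]   # one of the k colors per pixel
    return quantized.reshape(h, w, 3).astype(np.uint8)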
Principal Component Analysis
• An example of image dimensionality reduction and approximate recovery.
Faces Dataset
Recovered faces
Principal components
Unsupervised Learning (非監督式學習)
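A minimal scikit-learn sketch of the same idea; the Olivetti faces loader here is only an illustrative stand-in for the dataset shown on the slide:

from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA

faces = fetch_olivetti_faces()                 # 400 images of 64x64 = 4096 pixels
X = faces.data                                 # shape (400, 4096)

pca = PCA(n_components=100)                    # keep 100 principal components
X_reduced = pca.fit_transform(X)               # compressed representation, (400, 100)
X_recovered = pca.inverse_transform(X_reduced) # approximate recovery, (400, 4096)
print("explained variance:", pca.explained_variance_ratio_.sum())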
WHAT IS A NEURAL NETWORK
(We will review the previous concepts a little bit)
22
ML -- write that program by
ourselves
• To estimate the price of a house
▪ If we could just figure out the perfect weights to use that work
for every house, our function could predict house prices!
▪ How to do that with ML?
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
price = 0 # a little pinch of this
price += num_of_bedrooms * .841231951398213
price += sqft * 1231.1231231
price += neighborhood * 2.3242341421
price += 201.23432095
return price
ML -- write that program by
ourselves
• Step 1
▪ Initialize weights to 1.0
• Step 2
▪ See the difference and how far off the function is at guessing the
correct price
• Step 3
▪ Repeat Step 2 over and over with every single possible
combination of weights.
ML -- What about that whole “try
every number” bit in Step 3?
θ represents your current weights; J(θ) is the cost of your current weights.
• There are clever ways to quickly find good values for those weights without
having to try very many.
• If we graph this cost equation for all possible values of
our weights for number_of_bedrooms and sqft
Gradient Descent
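That cost J(θ) can be written as a plain Python function for the house-price example (the names below are illustrative):

def cost(weights, houses):
    """Mean squared error between guessed prices and real prices.
    weights: [w_bedrooms, w_sqft, w_neighborhood, bias]
    houses:  list of (num_of_bedrooms, sqft, neighborhood, real_price) tuples."""
    total = 0.0
    for bedrooms, sqft, neighborhood, real_price in houses:
        guess = (bedrooms * weights[0] + sqft * weights[1]
                 + neighborhood * weights[2] + weights[3])
        total += (guess - real_price) ** 2
    return total / (2 * len(houses))   # the 1/(2m) factor from J(theta)

Gradient descent walks downhill on the graph of this function instead of trying every weight combination.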
Making Smarter Guesses
26
• We ended up with this simple estimation function
def estimate_house_sales_price(num_of_bedrooms, sqft, neighborhood):
price = 0 # a little pinch of this
price += num_of_bedrooms * 0.123
price += sqft * 0.41
price += neighborhood * 0.57
return price
a linear relationship with the input
What if the situation is more complicated?
▪ Different weights for different house sizes
What is a Neural Network
• Now we have four different price estimates.
• Let’s combine those four price estimates
into one final estimate.
neurons
This is a neural network
What is a Neural Network
• Human Brains
http://www.slideshare.net/tw_dsconf/ss-62245351
What is a Neural Network
• Different connections lead to differently structured networks.
http://www.slideshare.net/tw_dsconf/ss-62245351
Fully Connected Feedforward
Network
http://www.slideshare.net/tw_dsconf/ss-62245351
Fully Connected Feedforward
Network
http://www.slideshare.net/tw_dsconf/ss-62245351
Fully connected feedforward network
• Matrix
Operation
http://www.slideshare.net/tw_dsconf/ss-62245351
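The matrix operation can be sketched in NumPy: each layer is just a = sigmoid(W·a + b), applied layer after layer (the layer sizes below are made up):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Fully connected feedforward pass: a = sigmoid(W a + b) for every layer."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W.dot(a) + b)
    return a

# Example: 3 inputs -> 4 hidden units -> 2 outputs, with random parameters
rng = np.random.RandomState(0)
weights = [rng.randn(4, 3), rng.randn(2, 4)]
biases = [rng.randn(4), rng.randn(2)]
print(forward(np.array([1.0, -1.0, 0.5]), weights, biases))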
Output Layer
• Softmax layer as the output layer
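A sketch of the softmax itself, which turns the raw output scores into probabilities that sum to 1:

import numpy as np

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([3.0, 1.0, -3.0])))   # roughly [0.88, 0.12, 0.00]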
Neural Network Playground
• http://playground.tensorflow.org/
WHAT IS DEEP
LEARNING?
36
What is Deep Learning?
37
• Deep Learning is Large Neural Networks
• Deep Learning attracts lots of attention
http://static.googleusercontent.com/media/research.google.com/en//people/jeff/BayLearn2015.pdf
Why Deep Learning?
• The more data, the better the performance.
• Game Changer
▪ DL accuracy/performance can exceed 99% on some tasks
Deep Learning Models
39
• Convolutional Neural Network
▪ Inception-V3
• Recurrent Neural Network
▪ LSTM
• Auto-encoder
• Reinforcement Learning
▪ Q-Learning
▪ Policy Gradient
• Wide and Deep Learning
▪ Recommender system
Deep Learning is not so simple
• Backpropagation
▪ an efficient way to compute the gradients needed by gradient descent
• Overfitting
• Choosing Loss function
▪ Square Error, Cross Entropy, and so on…
• Mini-Batch
• Too deep ( many hidden layers )
▪ ReLU, MaxOut, …
• Learning Rates
• Momentum
▪ Adam ( optimizer )
• Weight Decay
• Dropout
Backpropagation
41
• A common method of training artificial neural networks, used in conjunction
with an optimization method such as gradient descent.
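A tiny hand-rolled illustration of what backpropagation computes, for a single sigmoid neuron with squared-error loss (this is a sketch of the chain rule, not any library's API):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass for one neuron: prediction = sigmoid(w.x + b)
x, y_true = np.array([0.5, -1.0]), 1.0
w, b = np.array([0.2, 0.4]), 0.1
y_pred = sigmoid(w.dot(x) + b)
loss = 0.5 * (y_pred - y_true) ** 2

# Backward pass: the chain rule gives the gradients that gradient descent needs
dloss_dy = y_pred - y_true            # derivative of 0.5*(y - t)^2
dy_dz = y_pred * (1.0 - y_pred)       # derivative of the sigmoid
grad_w = dloss_dy * dy_dz * x         # dz/dw = x
grad_b = dloss_dy * dy_dz             # dz/db = 1

w -= 0.1 * grad_w                     # one gradient-descent step
b -= 0.1 * grad_b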
Underfitting and Overfitting
• Bias-Variance Tradeoff
Convolutional Neural Network (CNN)
• Why is CNN suited for images?
▪ The first layer of a fully connected network would be very large
The solution is Convolution
The solution is Convolution
Convolutional Neural Network (CNN)
Adding Even More Steps
Convolutional Neural Network (CNN)
DEEP LEARNING USING
TENSORFLOW
48
Linear Regression in TensorFlow
49
X_data = array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8,
0.9, 1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7,
1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, …
Y_data = array([ 0. , 0.29999667, 0.59997333, 0.89991 ,
1.19978668, 1.49958339, 1.79928013, 2.09885695,
2.39829388, 2.69757098, 2.99666833, 3.29556602, …
Linear Regression in TensorFlow
50
Linear Regression in TensorFlow
51
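The code on slides 50 and 51 did not survive text extraction; a minimal sketch in the TensorFlow 1.x style of that era, fitting y ≈ W·x + b to data like the arrays above, would look roughly like this:

import numpy as np
import tensorflow as tf   # TensorFlow 1.x-style API

x_data = np.arange(0, 10, 0.1).astype(np.float32)
y_data = 3.0 * x_data + np.random.randn(len(x_data)).astype(np.float32) * 0.01

W = tf.Variable(0.0)
b = tf.Variable(0.0)
y_pred = W * x_data + b

loss = tf.reduce_mean(tf.square(y_pred - y_data))            # mean squared error
train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2000):
        sess.run(train)
    print(sess.run([W, b]))    # should end up close to W = 3.0, b = 0.0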
MNIST
52
• Handwritten digit recognition
• MNIST dataset
▪ 55000 samples (50000 for training, 5000 for testing)
▪ Each sample has an X part and a y part.
▪ X is the image: 28*28 pixels in 8-bit gray scale
▪ y is the label: 0, 1, 2, …, 9
MNIST
53
• X, y can be represented as follows
MNIST
54
• If you want to get the accuracy more than 99%, check it out:
• https://gotocon.com/dl/goto-london-2016/slides/MartinGorner_TensorflowAndDeepLearningWithoutAPhD.pdf
92% (the accuracy reached by a simple softmax regression model)
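A minimal TF 1.x-era sketch of that ~92% softmax-regression baseline, following the layout of the classic TensorFlow MNIST tutorial (treat it as an illustrative sketch):

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data   # TF 1.x tutorial helper

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])        # 28*28 pixels, flattened
y_ = tf.placeholder(tf.float32, [None, 10])        # one-hot labels 0..9
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)             # softmax output layer

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), axis=1))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
correct = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):                          # mini-batch training
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))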
Image Recognition and Retraining
• The Inception-v3 model is pre-trained and released by Google
▪ It took Google researchers two weeks to train it on a desktop with eight NVIDIA
Tesla K40s.
▪ It can recognize more than 1000 categories
• Retraining
▪ To prepare the new images and categories
▪ Do training and testing
Plate Number Recognition
• There is an example that uses UK plate numbers and characters to train a
TensorFlow CNN model
• Training takes 3 days with a GPU card (GTX 750 Ti)
http://matthewearl.github.io/2016/05/06/cnn-anpr/
Technical breakthrough for Deep-ANPR
http://matthewearl.github.io/2016/05/06/cnn-anpr/
Autoencoder
• Encode the input data (MNIST data) and then decode it back
• It is similar to PCA
autoencoder
original
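A minimal Keras sketch of an MNIST autoencoder (one dense encode/decode pair; the layer sizes are illustrative, not the slide's exact model):

import numpy as np
from keras.datasets import mnist
from keras.layers import Input, Dense
from keras.models import Model

(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

inputs = Input(shape=(784,))
encoded = Dense(32, activation="relu")(inputs)         # encode: 784 -> 32 (like a non-linear PCA)
decoded = Dense(784, activation="sigmoid")(encoded)    # decode: reconstruct the original pixels

autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(x_train, x_train, epochs=10, batch_size=256,
                validation_data=(x_test, x_test))

reconstructed = autoencoder.predict(x_test[:10])       # compare with the original images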
LSTM (RNN)
• It is a special kind of RNN, capable of learning
long-term dependencies
LSTM
Training with Recurrent Neural Network
60
(Figure: RNN model, training data, and generated output)
Play this map with Super Mario Maker
https://medium.com/@ageitgey/machine-learning-is-fun-part-2-a26a10b68df3
(BIG) DATA ANALYSIS
61
Data Analysis
62
• The steps to do data analysis
▪ Data Collection
a. From CSV files, database, … and so on.
▪ Data Preprocessing (very, very important; see the pandas sketch after this list)
a. Regularization, Normalization, …
b. Table joins
▪ Feature Extraction
a. Reduce the dimensions ….
▪ Feature Selection
a. Select the important features
▪ Machine Learning / Deep Learning
a. To train the model
b. Make predictions and classifications with the trained model
c. Apply or deploy the model to a system
• But it still needs:
▪ domain experts to be involved!!
▪ studying related papers and research
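A small pandas/scikit-learn sketch of the first two steps (the file names and column names below are hypothetical, purely for illustration):

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Data Collection: read CSV files (hypothetical file and column names)
sensors = pd.read_csv("sensors.csv")       # e.g. columns: machine_id, temp, pressure
labels = pd.read_csv("labels.csv")         # e.g. columns: machine_id, failed

# Data Preprocessing: table join + normalization
df = sensors.merge(labels, on="machine_id", how="inner")
df = df.dropna()                           # drop incomplete rows
X = StandardScaler().fit_transform(df[["temp", "pressure"]])   # zero mean, unit variance
y = df["failed"].values                    # target for the ML/DL step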
Analysis Tools and Libraries
63
• Open Source (Python)
▪ Machine Learning
a. SciKit-Learn
b. NumPy
c. Matplotlib
d. Pandas
▪ Deep Learning
a. TensorFlow
b. Keras
▪ Hadoop & Spark
• Commercial Software (rarely used…)
▪ PolyAnalyst 6.5
▪ SAS
Data Analysis
64
• In terms of data analysis experience, I am still a "rookie"…
• Manufacturing Data Science: From Predictive Thinking to Prescriptive Decisions (製造資料科學：從預測性思維到處方性決策)
▪ http://www.slideshare.net/tw_dsconf/ss-71780267
Reference
• Machine Learning is Fun! – Medium
• Machine Learning is Fun! Part 2 – Medium
• Machine Learning is Fun! Part 3: Deep Learning and ... - Medium
• Deep Learning Tutorial
• FIRST CONTACT WITH TENSORFLOW
• https://ireneli.eu/2016/02/03/deep-learning-05-talk-about-convolutional-neural-network%EF%BC%88cnn%EF%BC%89/
• http://www.slideshare.net/tw_dsconf/ss-71780267
• https://morvanzhou.github.io/tutorials/python-basic/
• https://media.readthedocs.org/pdf/python-for-multivariate-analysis/latest/python-for-multivariate-analysis.pdf
• http://blog.topspeedsnail.com/
• http://www.leiphone.com/news/201702/vJpJqREn7EyoAd09.html
• scikit-learn, the Python machine learning package (Python 之機器學習套件)
▪ https://machine-learning-python.kspax.io/
▪ version >= 0.17
Thank You
66
If you understand machine learning, you will smile knowingly when you see the following sentence.
