Demystified Artificial Intelligence with NLP
Taka Wang
2021/11/19
About me
• Engineering Team Lead @ Nilvana - Computer Vision & ML Applications
• BS, MS, PhD in Computer Science
• PhD thesis in Machine Learning
• Previously
• Tech Lead @ ADLINK IoT Team
• Software Product Manager @ Delta Research Center
• Engineering Manager @ Networking Company
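Agenda
1. Moravec’s Paradox: the difficulty of AI problems runs against our common sense
2. Hype Cycle: what the technology hype cycle is
3. Critical Thinking: AI-related events and the daily work of data scientists
4. ML & NN: a quick introduction to machine learning and neural networks
5. NLP: the natural language processing pipeline and its difficulties
6. Case Study: the AI-assisted custody ruling prediction system developed by National Tsing Hua University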
Moravec’s Paradox
In the 60s, Marvin Minsky assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they'd have the problem solved by the end of the summer. Half a century later, we're still working on it. [1]
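It is comparatively easy to make computers exhibit adult-level performance on intelligence tests or at chess, but difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility. [2]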
2. Hype Cycle
What is the technology hype cycle?
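The hype cycle is used to analyze technology development trends and the life cycle of technology products.
[Figure: hype cycle curve; x-axis: Time, y-axis: Visibility]
1. Innovation Trigger
2. Peak of Inflated Expectations
3. Trough of Disillusionment
4. Slope of Enlightenment
5. Plateau of Productivity
[3] [4] [5] [6]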
3. Critical Thinking
AI-related events and the daily work of data scientists
[7]
[8]
• Vendors of value-added, annotated databases
• Third-party search service providers
• AI lawyer services
萬律
[9]
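Driverless cars still have accidents
• Purely human driving, no assistance
• Human-led, with computer assistance
• Computer-led, with human assistance
• Basic automated driving; the human handles emergencies
• Highly automated driving; the human can still take part
• Fully automated driving; the human is purely a passenger
[10]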
What do data scientists actually do?
[Figure: Venn diagram with the labels Data Science, Machine Learning, Computer Science, Data Processing, Domain Expertise, Statistical Research, Mathematics]
Data Science Workflow
1. Start with a Question
2. Get & Clean Data
3. Perform EDA
4. Apply Techniques
5. Share Insights
1. Start with a Question
“If I study more, will I get a higher grade?”
2. Get & Clean Data
Student   Hours Studied   Grade
Alice     20              90
Bob       5               70
Charlie   10              96
David     15              82
Eve       two             62
Frank     16              87
Grace     22              998
Cleaning: Eve's hours “two” become 2, and Grace's grade 998 becomes 98.
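A minimal sketch of the cleaning step, assuming grades must lie between 0 and 100 and that spelled-out numbers can be fixed with a small lookup table. The names `WORD_TO_NUMBER`, `parse_hours`, and `find_invalid_grades` are hypothetical, not part of the talk. The code only flags Grace's grade; deciding that 998 should be 98 remains a human judgment.

```python
# Raw rows as they appear on the slide: (student, hours studied, grade).
raw_rows = [
    ("Alice", "20", "90"), ("Bob", "5", "70"), ("Charlie", "10", "96"),
    ("David", "15", "82"), ("Eve", "two", "62"), ("Frank", "16", "87"),
    ("Grace", "22", "998"),
]

# Hypothetical lookup for numbers that were typed as words.
WORD_TO_NUMBER = {"two": 2}


def parse_hours(text):
    """Convert an hours entry to int, accepting a few spelled-out numbers."""
    return WORD_TO_NUMBER[text] if text in WORD_TO_NUMBER else int(text)


def find_invalid_grades(rows, low=0, high=100):
    """Return the students whose grade falls outside the assumed valid range."""
    return [name for name, _, grade in rows if not low <= int(grade) <= high]


hours = {name: parse_hours(h) for name, h, _ in raw_rows}
suspicious = find_invalid_grades(raw_rows)
print(hours["Eve"])  # 2
print(suspicious)    # ['Grace']
```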
3. Perform EDA
[Figure: scatter plot of Grade (0 to 100) against Hours Studied (0 to 25)]
Finding #1. The more you study, the higher the grade you will get.
Finding #2. Also, Charlie is a smarty pants.
4. Apply Techniques
[Figure: the scatter plot of Grade against Hours Studied with a fitted line]
Linear Regression
Grade = 1.5 * Hours + 65
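The fitted line can be reproduced with ordinary least squares. The sketch below assumes the cleaned values from the data slide (Eve studied 2 hours, Grace scored 98). `fit_line` is a hypothetical helper written for this illustration.

```python
# Cleaned data from the slide (Eve: "two" -> 2, Grace: 998 -> 98).
hours = [20, 5, 10, 15, 2, 16, 22]
grades = [90, 70, 96, 82, 62, 87, 98]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, grades)
print(round(slope, 1), round(intercept))  # 1.5 65
```

The unrounded coefficients are slightly smaller than 1.5 and 65, so the slide's formula is a rounded version of this fit.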
5. Share Insights
Yes, there is a positive correlation between the number of hours you study and the grade you will get.
Specifically, the relationship is: Grade = 1.5 * Hours + 65
So if you study 10 hours, you can expect to get an 80.
However, Charlie is a smarty pants and is inflating the grade estimate. You’ll probably get slightly less than 80.
Is deep learning a panacea?
Rocket = AI, Engine = DL, Fuel = Data
Scale drives deep learning progress [11, 12]
[Figure: Model Performance plotted against Labelled Data size]
Copyright © 2021, inwinSTACK, Inc. or its affiliates. All rights reserved.
4. ML & NN
A quick introduction to machine learning and neural networks
[Figure: nested circles, from outermost to innermost: Artificial Intelligence, Machine Learning, Neural Nets, Deep Learning; with the note “Dozens of different ML methods”]
[15]
Rule-Based
Write a program that decides whether a picture shows a cat:
If ears = 2
and legs = 4
and tail = 1
and teeth = 30
…
return “this is a cat”
This approach lists every rule exhaustively.
IBM Watson
• 90 servers
• 2,880 CPUs
• Loaded with 200 million pages of content
• 6 million rules
• Answers a question within 3 seconds
[16]
Programming Paradigm
Classical programming: Rules + Input → Output (Answers)
Machine learning: Input + Output (Answers) → Rules
Types of Machine Learning
[17]
Supervised Learning
• Classification
• Regression
[18]
Supervised Learning (Training)
1. Features X and parameters Θ go into the prediction function
2. The prediction function produces the output Ŷ
3. The cost compares the output Ŷ with the labels Y
[19]
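The training diagram can be sketched as a small loop. The assumptions here are mine, not the slide's: a linear prediction function, mean squared error as the cost, plain gradient descent to update Θ, and toy data generated from y = 2x + 1.

```python
# Toy features X and labels Y, generated from y = 2x + 1 (assumed for this sketch).
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [1.0, 3.0, 5.0, 7.0, 9.0]


def predict(x, theta):
    """Prediction function: Y_hat = w * x + b, with theta = (w, b)."""
    w, b = theta
    return w * x + b


def cost(theta):
    """Mean squared error between the outputs Y_hat and the labels Y."""
    return sum((predict(x, theta) - y) ** 2 for x, y in zip(X, Y)) / len(X)


def gradient(theta):
    """Partial derivatives of the cost with respect to w and b."""
    n = len(X)
    dw = sum(2 * (predict(x, theta) - y) * x for x, y in zip(X, Y)) / n
    db = sum(2 * (predict(x, theta) - y) for x, y in zip(X, Y)) / n
    return dw, db


theta = (0.0, 0.0)
learning_rate = 0.05
for _ in range(2000):
    dw, db = gradient(theta)
    theta = (theta[0] - learning_rate * dw, theta[1] - learning_rate * db)

print(round(theta[0], 2), round(theta[1], 2))  # close to 2.0 1.0
```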
Unsupervised Learning
Unlabeled Training Set Clustering
Customer
[18]
Semi-Supervised Learning
[20]
Reinforcement Learning
Sequential decision-making tasks; approximate dynamic programming
[18]
Neural Networks
[21]
Inspiration for neural networks
Neuron, signal
Biological fact: All or Nothing
[Figure: a dial with a threshold; a signal that reaches the threshold produces an output, otherwise there is no output]
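The all-or-nothing behaviour can be sketched as a step function. The weights and the threshold value below are hypothetical numbers chosen for the illustration.

```python
def neuron(inputs, weights, threshold):
    """Fire (1) only when the weighted signal reaches the threshold, else stay silent (0)."""
    signal = sum(x * w for x, w in zip(inputs, weights))
    return 1 if signal >= threshold else 0


weights = [0.6, 0.6]  # hypothetical connection strengths

print(neuron([1, 1], weights, threshold=1.0))  # 1, output
print(neuron([1, 0], weights, threshold=1.0))  # 0, no output
```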
Handwritten Digit Recognition
[22]
[Figure: diagram relating AI, ML, DL, and NLP]
NLP Tasks
• Google Search
• Email Spam, Sentiment Analysis
• Calendar Event Mentioned in Social Media
• Hidden Structure
• Siri, Alexa
[24]
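NLP Industry Applications
• Marketing: social media buzz, public opinion analysis, chatbots
• Financial Industry: credit rating, investment analysis
• Medical: diagnosis classification, treatment and prevention
• Legal: contract review, contract search and analysis, litigation prediction
[25]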
Approaches to NLP
• Heuristics
• Machine Learning
• Deep Learning
Generic NLP pipeline
1. Data Acquisition
2. Text Processing (Cleaning & Pre-Processing)
3. Feature Extraction
4. Modeling
5. Evaluation
6. Deployment
7. Monitoring and Model Updating
Today’s Highlight (within the end-to-end pipeline): Text Processing, Feature Extraction, Modeling
Text Processing
Input: the raw HTML of a web page
<html>
<head>
<title>Coronavirus diseases</title>
</head>
<body>
<h1>Coronavirus diseases</h1>
<div style="text-align: center;">
<img src="Coronavirus.jpg" width="400" alt="Coronavirus
diseases">
</div>
<p><b>Coronavirus diseases</b> are caused by <a href="/wiki/Virus"
title="Virus">viruses</a> in the <a href="/wiki/Coronavirus"
title="Coronavirus">coronavirus</a> subfamily, a group of related <a
href="/wiki/Orthornavirae" title="Orthornavirae">RNA viruses</a> that
cause diseases in <a href="/wiki/Mammal"
title="Mammal">mammals</a> and <a href="/wiki/Bird"
title="Bird">birds</a>. In humans and birds, the group of viruses cause
<a href="/wiki/Respiratory_tract_infection" title="Respiratory tract
infection">respiratory tract infections</a> that can range from mild to
lethal. </p>
</body>
</html>
Output: the extracted text
Coronavirus diseases
Coronavirus diseases are caused by viruses in the coronavirus subfamily, a group of related RNA viruses that cause diseases in mammals and birds. In humans and birds, the group of viruses cause respiratory tract infections that can range from mild to lethal.
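One way to get from the HTML to the plain text is Python's standard `html.parser` module. This is a minimal sketch under that assumption; the talk does not say which tool was used. `TextExtractor` and `html_to_text` are hypothetical names, and the sample page is a shortened version of the one on the slide.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <title>, <script> and <style>."""

    SKIP = {"title", "script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse line breaks and repeated spaces
        if text and not self._skip_depth:
            self.parts.append(text)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


page = """<html><head><title>Coronavirus diseases</title></head>
<body><h1>Coronavirus diseases</h1>
<p><b>Coronavirus diseases</b> are caused by <a href="/wiki/Virus">viruses</a>
in the coronavirus subfamily.</p></body></html>"""

text = html_to_text(page)
print(text)
```

Joining fragments with a space is a simplification: it would put a space before punctuation that directly follows a closing tag.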
Characters already map to numbers that can be compared: A = 65, B = 66, C = 67, so A < B < C. [27]
Words do not: which numbers should stand for “nilvana”, “Is”, and “cool”?
Tokenization
Panic-buying of goods is starting to appear everywhere.
→ Panic-buying | of | goods | is | starting | to | appear | everywhere
The same sentence in Chinese, 各地開始出現貨品搶購潮, can be segmented in several ways:
• 各 | 地 | 開 | 始 | 出 | 現 | 貨 | 品 | 搶 | 購 | 潮
• 各地 | 開始 | 出現 | 貨品 | 搶購 | 潮
• 各地 | 開始 | 出 | 現貨 | 品 | 搶購潮
• 各地 | 開始出現 | 貨品搶購潮
[28]
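A minimal sketch of both cases. English is split with a regular expression; Chinese uses forward maximum matching against a word list, which is one classic dictionary-based approach and not necessarily what the talk used. Both dictionaries below are hypothetical.

```python
import re


def tokenize_english(sentence):
    """Keep runs of letters, digits, hyphens and apostrophes; drop everything else."""
    return re.findall(r"[A-Za-z0-9'-]+", sentence)


def tokenize_chinese(sentence, dictionary, max_len=4):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in dictionary:  # fall back to a single character
                tokens.append(candidate)
                i += size
                break
    return tokens


english = tokenize_english("Panic-buying of goods is starting to appear everywhere.")
print(english)

small_dictionary = {"各地", "開始", "出現", "貨品", "搶購"}
larger_dictionary = small_dictionary | {"搶購潮"}
print(tokenize_chinese("各地開始出現貨品搶購潮", small_dictionary))
print(tokenize_chinese("各地開始出現貨品搶購潮", larger_dictionary))
```

The two Chinese results differ only because the dictionary differs, which is the ambiguity the slide illustrates.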
Stemming & Lemmatization
Stemming: adjustable → adjust, formality → formaliti, formaliti → formal, airliner → airlin
Lemmatization: was → (to) be, better → good, meeting → meeting
Common Text Processing Steps
Text → Sentence Tokenization → Sentences
For each sentence:
• Lowercasing
• Removal of Punctuation
• Removal of Stopwords
• Stemming & Lemmatization
[24]
“@Jamie went back to University[http://cdn.thu.edu.tw].”
1. Cleaning: jamie went back to university
2. Tokenize: jamie | went | back | to | university
3. Remove Stop Words: jamie | went | university
4. Stem / Lemmatize: jamie | go | univers
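The four steps can be sketched with the standard library alone. The stop-word list and the `NORMAL_FORMS` lookup are tiny hypothetical stand-ins; a real pipeline would use a proper stop list and a stemmer or lemmatizer from an NLP library.

```python
import re

STOP_WORDS = {"back", "to", "the", "a", "of"}           # hypothetical, tiny stop list
NORMAL_FORMS = {"went": "go", "university": "univers"}  # toy stand-in for stemming / lemmatization


def clean(text):
    """Drop URLs, lowercase, and remove everything that is not a letter or a space."""
    text = re.sub(r"\[?https?://[^\s\]]+\]?", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return " ".join(text.split())


def preprocess(text):
    """Clean, tokenize on whitespace, remove stop words, then normalize each token."""
    tokens = clean(text).split()
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [NORMAL_FORMS.get(t, t) for t in tokens]


raw = "@Jamie went back to University[http://cdn.thu.edu.tw]."
print(clean(raw))       # jamie went back to university
print(preprocess(raw))  # ['jamie', 'go', 'univers']
```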
Linguistic Diversity Challenge
Hey, when we date we always eat at the coffeeshop (one).
Feature Extraction
How should words such as learn, study, teach, read, hear, discover, and listen be represented?
• Graphic Model: WordNet
• Statistical Model: a vector such as 0.2 0.8 0.1 0.2 0.4
Document Level (spam detection / sentiment analysis)
• doc2vec
• Bag of words
Bag of Words
Corpus (D), with each title after stop-word removal and stemming:
• Little House on the Prairie → littl hous prairi
• Mary had a Little Lamb → mari littl lamb
• The Silence of the Lambs → silenc lamb
• Twinkle Twinkle Little Star → twinkl littl star
Vocabulary (V): littl hous prairi mari lamb silenc twinkl star
Document-Term Matrix
                              littl  hous  prairi  mari  lamb  silenc  twinkl  star
Little House on the Prairie     1     1      1      0     0      0       0      0
Mary had a Little Lamb          1     0      0      1     1      0       0      0
The Silence of the Lambs        0     0      0      0     1      1       0      0
Twinkle Twinkle Little Star     1     0      0      0     0      0       2      1
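A minimal sketch that rebuilds the vocabulary and the document-term matrix from the stemmed tokens shown on the slides. The tokens are typed in by hand here; stemming itself is not performed by this code.

```python
# Stemmed tokens per document, copied from the slides.
docs = {
    "Little House on the Prairie": ["littl", "hous", "prairi"],
    "Mary had a Little Lamb": ["mari", "littl", "lamb"],
    "The Silence of the Lambs": ["silenc", "lamb"],
    "Twinkle Twinkle Little Star": ["twinkl", "twinkl", "littl", "star"],
}

# Vocabulary in order of first appearance.
vocabulary = []
for tokens in docs.values():
    for token in tokens:
        if token not in vocabulary:
            vocabulary.append(token)

# One row per document, one column per vocabulary term, cell = term count.
matrix = [[tokens.count(term) for term in vocabulary] for tokens in docs.values()]

print(vocabulary)
for row in matrix:
    print(row)
```

Word order is lost: each document is reduced to how often each vocabulary term occurs in it.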
[Figure: document vectors such as <0.7, 0.7, 0.3, 0.4, 0.7> and <0.3, 0.9, 0.8, 0.4, 0.2> are combined with weights W and classified as + or -]
Word Level (natural language generation / machine translation)
[Figure: each word of “nilvana Is cool” is mapped to its own column of numbers: 0.2 0.3 0.9 0.2 0.1 0.8 0.1 0.5 0.7 0.3 0.5 0.4]
• word2vec
• GloVe
One-Hot Encoding
          littl  hous  prairi  mari  lamb  silenc  twinkl  star
hous        0     1      0      0     0      0       0      0
lamb        0     0      0      0     1      0       0      0
silenc      0     0      0      0     0      1       0      0
twinkl      0     0      0      0     0      0       1      0
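One-hot encoding takes only a few lines. The sketch uses the stemmed vocabulary from the bag-of-words slides; `one_hot` is a hypothetical helper name.

```python
vocabulary = ["littl", "hous", "prairi", "mari", "lamb", "silenc", "twinkl", "star"]


def one_hot(word, vocabulary):
    """Return a vector with a single 1 at the word's position in the vocabulary."""
    return [1 if term == word else 0 for term in vocabulary]


print(one_hot("hous", vocabulary))  # [0, 1, 0, 0, 0, 0, 0, 0]
print(one_hot("lamb", vocabulary))  # [0, 0, 0, 0, 1, 0, 0, 0]
```

Every pair of distinct one-hot vectors is equally far apart, so the encoding carries no notion of similarity between words.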
Word Embeddings
[Figure: child and kid plotted close together, away from chair and horse]
[Figure: man, woman, king, and queen plotted so that man → woman runs parallel to king → queen]
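The man / woman / king / queen picture can be illustrated with vector arithmetic. The three-dimensional vectors below are invented for the example (roughly: person, royal, female); trained embeddings have hundreds of dimensions and only approximate this behaviour.

```python
import math

# Hypothetical hand-made embeddings: [person, royal, female].
embeddings = {
    "man":   [1.0, 0.0, 0.0],
    "woman": [1.0, 0.0, 1.0],
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 1.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(vector):
    """Return the vocabulary word whose embedding is most similar to the vector."""
    return max(embeddings, key=lambda word: cosine(vector, embeddings[word]))


# king - man + woman
query = [k - m + w for k, m, w in zip(embeddings["king"], embeddings["man"], embeddings["woman"])]
answer = nearest(query)
print(answer)  # queen
```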
Word2Vec
The quick brown fox jumps over the lazy dog
• Continuous Bag of Words (CBoW)
• Continuous Skip-gram
Skip-gram Model
[Figure: the one-hot vector of the centre word “jumps” is the input; the one-hot vectors of the surrounding words “brown”, “fox”, “over”, and “the” are the targets; the layer in between is the word vector]
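The training pairs for skip-gram can be generated as below. A window of two words on each side is an assumption that matches the four context words shown for “jumps”. The sketch covers only pair generation, not the network that learns the vectors.

```python
def skipgram_pairs(tokens, window=2):
    """Return (centre word, context word) pairs for every position in the sentence."""
    pairs = []
    for i, centre in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((centre, tokens[j]))
    return pairs


tokens = "the quick brown fox jumps over the lazy dog".split()
pairs = skipgram_pairs(tokens)
context_of_jumps = [context for centre, context in pairs if centre == "jumps"]
print(context_of_jumps)  # ['brown', 'fox', 'over', 'the']
```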
Why do embeddings work?
Would you like a cup of _____?
I like my _____ black.
I need my morning _____ before I can do anything.
coffee, tea
[29]
Solving optimization problems
6. Case Study
The AI-assisted custody ruling prediction system developed by National Tsing Hua University
https://custodyprediction.herokuapp.com
http://www.phys.nthu.edu.tw/~aicmt/Civil%20Law%20Project.html
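[Figure: system overview; the labels on the slides read as follows]
• Source: Judicial Yuan Law and Regulations Retrieval System (司法院法學資料檢索系統); Civil Code; best interests of the child; 4,000 documents
• Type features: outcome, willingness, identities of both parties, nationality
• Rationale features: the textual reasons for and against
• Methods: Chinese word segmentation, Word Vector, Doc2Vec, LDA, Topic Modeling, Auto-encoder, K-means, XGBoost
• Models: ruling-text intent model, custody ruling prediction model
• Topic composition classes: custody / non-custody, willing / unwilling
• Figures shown: 99.98%, 94.56%, 80%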
Ready?
https://law.judicial.gov.tw/FJUD/qryresult.aspx?q=65ef164c97b5f356128f013fa28380d4&akw=
Keywords: 子女最佳利益 (best interests of the child), 離婚 (divorce), 斟酌 (weigh / consider); period: 2015-2017
https://www.selenium.dev/selenium-ide/
https://links.nilvana.ai/nlp2021
LDA Auto-Encoder
K-Means XGBoost
Latent Dirichlet Allocation
• Every document consists of a mix of topics
• Every topic consists of a mix of words
[Figure: the words bananas, kitten, ohio, kale, puppy, frog, and cute distributed over topic1 and topic2, and the topics distributed over doc1, doc2, and doc3]
Auto-Encoder
One-Hot Encoding: the word 主要照顧者 (primary caregiver) as a vocabulary-sized vector, 1 0 0 0
Vector Semantics: a lower-dimensional code, 1.2 0.8 0.5
Autoencoder - Deepfake [30, 31]
Original Face A → Encoder → Latent Face A → Decoder A → Reconstructed Face A
Original Face B → Encoder → Latent Face B → Decoder B → Reconstructed Face B
Original Face A → Encoder → Latent Face A → Decoder B → Reconstructed Face B from A
K-Means
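The slides for this part are figures only. As a stand-in, here is a minimal K-Means sketch on six invented 2-D points with two hand-picked starting centroids; real uses normally pick the starting centroids at random and run on document vectors.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]  # invented data
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(clusters[0])  # [(1, 1), (1, 2), (2, 1)]
print(clusters[1])  # [(8, 8), (9, 8), (8, 9)]
```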
Decision Tree
Golf course staffing problem [32]
#    Outlook    Temperature  Humidity  Windy  Play?
1    Sunny      85           85        No     No
2    Sunny      80           90        Yes    No
3    Overcast   83           78        No     Yes
4    Rainy      70           96        No     Yes
5    Rainy      68           80        No     Yes
6    Rainy      65           70        Yes    No
7    Overcast   64           65        Yes    Yes
8    Sunny      72           95        No     No
9    Sunny      69           70        No     Yes
10   Rainy      75           80        No     Yes
11   Sunny      75           70        Yes    Yes
12   Overcast   72           90        Yes    Yes
13   Overcast   81           75        No     Yes
14   Rainy      71           80        Yes    No
Resulting tree (counts are play / don't play):
Outlook (9/5)
  Overcast → Play (4/0)
  Sunny → Humidity (2/3)
    <=70 → Play (2/0)
    >70 → Don't play (0/3)
  Rainy → Windy
    No → Play (3/0)
    Yes → Don't play (0/2)
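The tree on the slide can be written as nested conditions and checked against the 14 rows. The sketch encodes the finished tree by hand, with the table values translated into English; it does not learn the tree from the data.

```python
# (outlook, temperature, humidity, windy, play) for the 14 days in the table.
golf = [
    ("sunny", 85, 85, False, False), ("sunny", 80, 90, True, False),
    ("overcast", 83, 78, False, True), ("rainy", 70, 96, False, True),
    ("rainy", 68, 80, False, True), ("rainy", 65, 70, True, False),
    ("overcast", 64, 65, True, True), ("sunny", 72, 95, False, False),
    ("sunny", 69, 70, False, True), ("rainy", 75, 80, False, True),
    ("sunny", 75, 70, True, True), ("overcast", 72, 90, True, True),
    ("overcast", 81, 75, False, True), ("rainy", 71, 80, True, False),
]


def predict_play(outlook, humidity, windy):
    """Follow the slide's tree: outlook first, then humidity (sunny) or wind (rainy)."""
    if outlook == "overcast":
        return True
    if outlook == "sunny":
        return humidity <= 70
    return not windy  # rainy


correct = sum(
    predict_play(outlook, humidity, windy) == play
    for outlook, _, humidity, windy, play in golf
)
print(correct, "of", len(golf))  # 14 of 14
```

Temperature is never consulted, which matches the tree on the slide.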
Ensemble Method
[33]
Random Forest
Uses the same golf data as the decision tree.
• Subsets of features: each tree sees only some of the columns, e.g. (Outlook, Temperature), (Temperature, Humidity), (Humidity, Windy), (Temperature, Windy)
• A crowd of experts
• Majority vote
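The voting step can be sketched as follows. The three "experts" are hand-written rules that each look at a single feature; they stand in for trained trees and are assumptions made for this illustration, as is the humidity cut-off of 80.

```python
from collections import Counter


# Hypothetical experts, each seeing only part of the features.
def expert_outlook(day):
    return day["outlook"] != "sunny"


def expert_humidity(day):
    return day["humidity"] <= 80


def expert_windy(day):
    return not day["windy"]


def majority_vote(day, experts):
    """Ask every expert and return the most common answer."""
    votes = Counter(expert(day) for expert in experts)
    return votes.most_common(1)[0][0]


experts = [expert_outlook, expert_humidity, expert_windy]  # odd count avoids ties

day_1 = {"outlook": "sunny", "humidity": 85, "windy": False}
day_3 = {"outlook": "overcast", "humidity": 78, "windy": False}
print(majority_vote(day_1, experts))  # False, two of three say don't play
print(majority_vote(day_3, experts))  # True
```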
Gradient Boost
Suppose someone is 30 years old and we use GBM to guess their age:
Guess   Remaining difference
20      10
6       4
3       1
1       0
Each new guess targets what is still missing: 20 + 6 + 3 + 1 = 30.
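The age example is a running sum of corrections. The sketch below replays the numbers from the slide; in a real GBM each correction would come from a small model fitted to the current residual, which is not shown here.

```python
target = 30
learner_outputs = [20, 6, 3, 1]  # the successive guesses from the slide

prediction = 0
history = []
for output in learner_outputs:
    prediction += output            # add the new learner's contribution
    residual = target - prediction  # what the next learner still has to explain
    history.append((output, residual))

print(history)     # [(20, 10), (6, 4), (3, 1), (1, 0)]
print(prediction)  # 30
```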
Takeaways
• Choose a baseline
• Manage your expectations
• Start without ML if …
[34]
References
[1] Ending with humour
[2] Dartmouth Workshop: The Birthplace Of AI
[3] 想投資科技股,先了解技術炒作週期
[4] 5 Trends Emerge in the Gartner Hype Cycle for Emerging Technologies, 2018
[5] 2 Megatrends Dominate the Gartner Hype Cycle for Artificial Intelligence, 2020
[6] The relentless threat of artificial intelligence taking our jobs away
[7] 繼IBM後,「Google健康」也開始裁員⋯醫療AI為什麼讓大廠一一敗退?
[8] 為什麼AI醫療容易失敗?李建璋:最大原因可能在資料
[9] 法律AI公司Ross Intelligence倒闭:头顶是星空,脚下是薄冰
[10] This is what the evolution of self-driving cars looks like
[11] What is Deep Learning and Why you need it?
[12] How Scale is Enabling Deep Learning
[13] Scaling to Very Very Large Corpora for Natural Language Disambiguation
[14] The unreasonable effectiveness of data
[15] Artificial Intelligence vs. Machine Learning vs. Deep Learning: Essentials
[16] 人工智慧新革命--超級電腦「華生」
[17] What is Artificial Intelligence?
[18] Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition
[19] Natural Language Processing with Classification and Vector Spaces
[20] Use the People album in Photos on your iPhone, iPad, or iPod touch
[21] Neural network playground
[22] Neural Network 3D Simulation
[23] 機器學習基石
[24] Practical Natural Language Processing
[25] 最火熱的AI應用:NLP在含金量最高的2個產業率先發光
[27] Machine Learning - Convolution with color images
[28] NLP的基本執行步驟(I)
[29] Every Machine Learning Algorithm Can Be Represented as a Neural Network
[30] Deep Learning for Deepfakes Creation and Detection: A Survey
[31] Deepfake大解密!「換臉」技術更簡單,到底怎麼辦到的?
[32] 維基百科 - 决策树
[33] General Ensemble Method
[34] The First Rule of Machine Learning: Start without Machine Learning
