SlideShare a Scribd company logo
1 of 31
Download to read offline
Feature Engineering in
Machine Learning
Chun-Liang Li (李俊良)
chunlial@cs.cmu.edu
About Me
2
Academic Competition
Working
• NTU CSIE BS/MS (2012/2013)
• Advisor: Prof. Hsuan-Tien Lin

• CMU MLD PhD (2014-)
• Advisor: Prof. Jeff Schneider
Prof. Barnabás Póczos
• KDD Cup 2011 Champions

KDD Cup 2013 Champions
• With Prof. Chih-Jen Lin

Prof. Hsuan-Tien Lin

Prof. Shou-De Lin

Many students
(2012 intern) (2015 intern)
What is Machine Learning?
• What is Machine Learning?

3
Learning Prediction
Existing Data
Machine (Algorithm)
Model
New Data
Model
Prediction
Data: Several length-d vectors
Data? Algorithm?
• In academic
• Assume we are given good enough data (in d-dimensional
of course )
• Focus on designing better algorithms

Sometimes complicated algorithms imply publications
• In practice
• Where is your good enough data?
• Or, how to transform your data into a d-dimensional one?
4
From Zero to One:
Create your features by your observations
5
An Apple
6
How to describe this picture?
More Fruits
• Method I: Use size of picture

• Method II: Use RGB average

• Many more powerful features 

developed in computer vision
7
(640, 580) (640, 580)
(219, 156, 140) (243, 194, 113) (216, 156, 155)
Case Study (KDD Cup 2013)
• Determine whether a paper is written by a given
author
8
We are given raw text 

of these
Data: https://www.kaggle.com/c/kdd-cup-2013-author-paper-identification-challenge
NTU Approaches
9
Feature Engineering
Several Algorithms
Combining Different Models
Pipeline Feature Engineering
Observation
Encode into Feature
Result
First observation:
Authors Information• Are these my (Chun-Liang Li) papers? (Easy! check author names)

1. Chun-Liang Li and Hsuan-Tien Lin. Condensed filter tree for cost-sensitive multi-label 

classification.

2. Yao-Nan Chen and Hsuan-Tien Lin. Feature-aware label space dimension reduction for 

multi-label classification.
• Encode by name similarities (e.g., how many characters are the same)
• Are Li, Chun-Liang and Chun-Liang Li the same?
• Yes! Eastern and Western order
• How about Li Chun-Liang? (Calculate the similarity of the reverse order)
• Also take co-authors into account
• 29 features in total
10
Second Observation:
Affiliations
• Are Dr. Chi-Jen Lu and Prof. Chih-Jen Lin the same?
• Similar name: Chi-Jen Lu v.s. Chih-Jen Lin
• Shared co-author (me!)
• Take affiliations into account!
• Academia Sinica v.s. National Taiwan University
• 13 features in total
11
Last of KDD Cup 2013
• Many other features, including
• Can you live for more than 100 years? At least I
think I can’t do research after 100 years
• More advanced: social network features
12
Summary
The 97 features designed by students won the competition
Furthermore
• If I can access the content, can I do better? 

13
Author: Robert Galbraith
Who is Robert Galbraith?
“I thought it was by a very 

mature writer, and not a 

first-timer.” — Peter James
Definitely
Writing Style?
• “I was testing things like word length, sentence
length, paragraph length, frequency of particular
words and the pattern of punctuation” 

— Peter Millican (University of Oxford)
14
1 2
3 4
5
15
Game Changing Point:
Deep Learning
Common Type of Data
• Image





• Text
16
Representation Learning
• Deep Learning as learning hidden representations









• An active research topic in academia and industry
17
Use last layer to extract features (Krizhevsky et al., 2012)
(Check Prof. Lee’s talk and go to deep learning session later )
Raw data
Use Pre-trained Network
• Yon don’t need to train a network by yourself
• Use existing pre-trained network to extract features
• AlexNet
• VGG
• Word2Vector
18
Result
Simply using deep learning features achieves state-of-the-art
performance in many applications
Successful Example
• The PASCAL Visual Object Classes Challenge
19
MeanAverage
Precision
0
0.15
0.3
0.45
0.6
2005 2007 2008 2009 2010 2012 2013 2014
Deep learning result

(Girshick et al. 2014)
HoG feature Slow progress on feature engineering and 

algorithms before deep learning
20
Curse of Dimensionality:
Feature Selection and Dimension Reduction
The more, the better?
21
Practice
If we have 1,000,000 data
with 100,000 dimensions,
how much memory do we
need?


Ans:
Theory
Without any assumption,
you need data to
achieve error for d-
dimensional data
106
⇥ 105
⇥ 8
= 8 ⇥ 1011
(B)
= 800 (GB)
✏
O(
1
✏d
)
Noisy Feature
Is every feature useful?
Redundancy?
Feature Selection
• Select import features
• Reduce dimensions
• Explainable Results
22
Commonly Used Tools
• LASSO (Sparse Constraint)
• Random Forests
• Many others
KDD Cup Again
• In KDD Cup 2013, we actually generated more
than 200 features (some secrets you won’t see in the paper )
• Use random forests to select only 97 features,
since many features are unimportant and even
harmful, but why?
23
Non-useful Features
• Duplicated features
• Example I: Country (Taiwan) v.s. Coordinates (121, 23.5)
• Example II: Date of birth (1990) v.s. Age (26)
• Noisy features
• Noisy information (something wrong in your data)
• Missing values (something missing in your data)
• What if we still have too many features?
24
Dimension Reduction
• Let’s visualize the data (a perfect example)





• Non-perfect example in practice
25
Commonly Used Tools
• Principal Component Analysis (PCA)
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 0.2 0.4 0.6 0.8 1 1.2 1.4
−0.5
−0.4
−0.3
−0.2
−0.1
0
0.1
0.2
0.3
0.4
0.5
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 0.2 0.4 0.6 0.8 1 1.2 1.4
−0.5
−0.4
−0.3
−0.2
−0.1
0
0.1
0.2
0.3
0.4
0.5
Trade-off between 

information and space
One dimension is enough
PCA — Intuition
• Let’s apply PCA on these faces 

(raw pixels) and visualize the 

coordinates
26
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
http://comp435p.tk/
PCA — Intuition (cont.)
• We can use very few base faces to approximate
(describe) the original faces
27
http://comp435p.tk/
(Sirovich and Kirby, Low-dimensional procedure for the characterization of human faces)
1 2 3 4 5 6 7 8 9
PCA — Case Study
• CIFAR-10 image classification 

with raw pixels as features and 

using approximated kernel SVM
28
Dimensions Accuracy Time
3072 (all) 63.1% ~2 Hrs
100 (PCA) 59.8% 250 s
(Li and Pòczos, Utilize Old Coordinates: Faster Doubly Stochastic Gradients for
Kernel Methods, UAI 2016)
Trade-off between information, space and time
PCA in Practice
• Practical concern:
• Time complexity:
• Space complexity:
• Remark: Use fast approximation for large-scale problem (e.g.,
>100k dimensions)
1. PCA with random projection (implemented in scikit-learn)

(Halko et al., Finding Structure with Randomness, 2011)
2. Stochastic algorithms (easy to implement from scratch)

(Li et al., Rivalry of Two Families of Algorithms for Memory-Restricted Streaming PCA,
AISTATS 2016)
29
O(Nd2
)
O(d2
)
Small Problem
PCA takes <10 seconds for
CIFAR-10 dataset (d=3072) by
using 12 cores (E5-2620)
Conclusion
• Observe the data and encode them into meaningful features







• Deep learning is a powerful tool to use
• Reduce number of features if necessary
• Reduce non-useful features
• Computational concern
30
Existing Data Machine (Algorithm)
Features (Simple) AlgorithmExisting Data
Beginning:
Now:
31
Thanks!
Any Question?

More Related Content

What's hot

Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...AI Frontiers
 
Deep Learning for Artificial Intelligence (AI)
Deep Learning for Artificial Intelligence (AI)Deep Learning for Artificial Intelligence (AI)
Deep Learning for Artificial Intelligence (AI)Er. Shiva K. Shrestha
 
Training at AI Frontiers 2018 - Ni Lao: Weakly Supervised Natural Language Un...
Training at AI Frontiers 2018 - Ni Lao: Weakly Supervised Natural Language Un...Training at AI Frontiers 2018 - Ni Lao: Weakly Supervised Natural Language Un...
Training at AI Frontiers 2018 - Ni Lao: Weakly Supervised Natural Language Un...AI Frontiers
 
Begin with Data Scientist
Begin with Data ScientistBegin with Data Scientist
Begin with Data ScientistNarong Intiruk
 
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesMachine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesCodePolitan
 
Creative AI & multimodality: looking ahead
Creative AI & multimodality: looking aheadCreative AI & multimodality: looking ahead
Creative AI & multimodality: looking aheadRoelof Pieters
 
Deep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingDeep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingGrigory Sapunov
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsXavier Amatriain
 
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI mission
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI missionIlya Sutskever at AI Frontiers : Progress towards the OpenAI mission
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI missionAI Frontiers
 
A friendly introduction to GANs
A friendly introduction to GANsA friendly introduction to GANs
A friendly introduction to GANsCsongor Barabasi
 
Everything you need to know about AutoML
Everything you need to know about AutoMLEverything you need to know about AutoML
Everything you need to know about AutoMLArpitha Gurumurthy
 
ML DL AI DS BD - An Introduction
ML DL AI DS BD - An IntroductionML DL AI DS BD - An Introduction
ML DL AI DS BD - An IntroductionDony Riyanto
 
Intelligent Multimedia Recommendation
Intelligent Multimedia RecommendationIntelligent Multimedia Recommendation
Intelligent Multimedia RecommendationWanjin Yu
 
Deep learning trends
Deep learning trendsDeep learning trends
Deep learning trends준호 김
 
Computer Vision Crash Course
Computer Vision Crash CourseComputer Vision Crash Course
Computer Vision Crash CourseJia-Bin Huang
 
Machine Learning in Big Data
Machine Learning in Big DataMachine Learning in Big Data
Machine Learning in Big DataDataWorks Summit
 

What's hot (20)

Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
 
Deep Learning for Artificial Intelligence (AI)
Deep Learning for Artificial Intelligence (AI)Deep Learning for Artificial Intelligence (AI)
Deep Learning for Artificial Intelligence (AI)
 
Training at AI Frontiers 2018 - Ni Lao: Weakly Supervised Natural Language Un...
Training at AI Frontiers 2018 - Ni Lao: Weakly Supervised Natural Language Un...Training at AI Frontiers 2018 - Ni Lao: Weakly Supervised Natural Language Un...
Training at AI Frontiers 2018 - Ni Lao: Weakly Supervised Natural Language Un...
 
Begin with Data Scientist
Begin with Data ScientistBegin with Data Scientist
Begin with Data Scientist
 
Machine Learning for dummies!
Machine Learning for dummies!Machine Learning for dummies!
Machine Learning for dummies!
 
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & OpportunitiesMachine Learning - Challenges, Learnings & Opportunities
Machine Learning - Challenges, Learnings & Opportunities
 
Creative AI & multimodality: looking ahead
Creative AI & multimodality: looking aheadCreative AI & multimodality: looking ahead
Creative AI & multimodality: looking ahead
 
Deep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingDeep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image Processing
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systems
 
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI mission
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI missionIlya Sutskever at AI Frontiers : Progress towards the OpenAI mission
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI mission
 
A friendly introduction to GANs
A friendly introduction to GANsA friendly introduction to GANs
A friendly introduction to GANs
 
Implementing Artificial Intelligence with Big Data
Implementing Artificial Intelligence with Big DataImplementing Artificial Intelligence with Big Data
Implementing Artificial Intelligence with Big Data
 
Everything you need to know about AutoML
Everything you need to know about AutoMLEverything you need to know about AutoML
Everything you need to know about AutoML
 
ML DL AI DS BD - An Introduction
ML DL AI DS BD - An IntroductionML DL AI DS BD - An Introduction
ML DL AI DS BD - An Introduction
 
Ml intro
Ml introMl intro
Ml intro
 
Intelligent Multimedia Recommendation
Intelligent Multimedia RecommendationIntelligent Multimedia Recommendation
Intelligent Multimedia Recommendation
 
Deep learning trends
Deep learning trendsDeep learning trends
Deep learning trends
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Computer Vision Crash Course
Computer Vision Crash CourseComputer Vision Crash Course
Computer Vision Crash Course
 
Machine Learning in Big Data
Machine Learning in Big DataMachine Learning in Big Data
Machine Learning in Big Data
 

Similar to 李俊良/Feature Engineering in Machine Learning

Three Tools for "Human-in-the-loop" Data Science
Three Tools for "Human-in-the-loop" Data ScienceThree Tools for "Human-in-the-loop" Data Science
Three Tools for "Human-in-the-loop" Data ScienceAditya Parameswaran
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSangwoo Mo
 
Chengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big dataChengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big datajins0618
 
The Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureThe Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureIvo Andreev
 
Domain-Driven Design: The "What" and the "Why"
Domain-Driven Design: The "What" and the "Why"Domain-Driven Design: The "What" and the "Why"
Domain-Driven Design: The "What" and the "Why"bincangteknologi
 
The Opportunities and Challenges of Putting the Latest Computer Vision and De...
The Opportunities and Challenges of Putting the Latest Computer Vision and De...The Opportunities and Challenges of Putting the Latest Computer Vision and De...
The Opportunities and Challenges of Putting the Latest Computer Vision and De...Albert Y. C. Chen
 
Promises of Deep Learning
Promises of Deep LearningPromises of Deep Learning
Promises of Deep LearningDavid Khosid
 
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3Dr. Aparna Varde
 
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Lucidworks
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning SystemsXavier Amatriain
 
The Frontier of Deep Learning in 2020 and Beyond
The Frontier of Deep Learning in 2020 and BeyondThe Frontier of Deep Learning in 2020 and Beyond
The Frontier of Deep Learning in 2020 and BeyondNUS-ISS
 
Алексей Ященко и Ярослав Волощук "False simplicity of front-end applications"
Алексей Ященко и Ярослав Волощук "False simplicity of front-end applications"Алексей Ященко и Ярослав Волощук "False simplicity of front-end applications"
Алексей Ященко и Ярослав Волощук "False simplicity of front-end applications"Fwdays
 
Best Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesBest Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesAlan Said
 
NYC Open Data Meetup-- Thoughtworks chief data scientist talk
NYC Open Data Meetup-- Thoughtworks chief data scientist talkNYC Open Data Meetup-- Thoughtworks chief data scientist talk
NYC Open Data Meetup-- Thoughtworks chief data scientist talkVivian S. Zhang
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareTigerGraph
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Turi, Inc.
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning InfrastructureSigOpt
 
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural NetworksJunho Cho
 

Similar to 李俊良/Feature Engineering in Machine Learning (20)

Three Tools for "Human-in-the-loop" Data Science
Three Tools for "Human-in-the-loop" Data ScienceThree Tools for "Human-in-the-loop" Data Science
Three Tools for "Human-in-the-loop" Data Science
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture Note
 
Chengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big dataChengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big data
 
The Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureThe Machine Learning Workflow with Azure
The Machine Learning Workflow with Azure
 
Domain-Driven Design: The "What" and the "Why"
Domain-Driven Design: The "What" and the "Why"Domain-Driven Design: The "What" and the "Why"
Domain-Driven Design: The "What" and the "Why"
 
The Opportunities and Challenges of Putting the Latest Computer Vision and De...
The Opportunities and Challenges of Putting the Latest Computer Vision and De...The Opportunities and Challenges of Putting the Latest Computer Vision and De...
The Opportunities and Challenges of Putting the Latest Computer Vision and De...
 
Promises of Deep Learning
Promises of Deep LearningPromises of Deep Learning
Promises of Deep Learning
 
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3
Information to Wisdom: Commonsense Knowledge Extraction and Compilation - Part 3
 
Icml2017 overview
Icml2017 overviewIcml2017 overview
Icml2017 overview
 
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
 
The Frontier of Deep Learning in 2020 and Beyond
The Frontier of Deep Learning in 2020 and BeyondThe Frontier of Deep Learning in 2020 and Beyond
The Frontier of Deep Learning in 2020 and Beyond
 
Алексей Ященко и Ярослав Волощук "False simplicity of front-end applications"
Алексей Ященко и Ярослав Волощук "False simplicity of front-end applications"Алексей Ященко и Ярослав Волощук "False simplicity of front-end applications"
Алексей Ященко и Ярослав Волощук "False simplicity of front-end applications"
 
Best Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesBest Practices in Recommender System Challenges
Best Practices in Recommender System Challenges
 
NYC Open Data Meetup-- Thoughtworks chief data scientist talk
NYC Open Data Meetup-- Thoughtworks chief data scientist talkNYC Open Data Meetup-- Thoughtworks chief data scientist talk
NYC Open Data Meetup-- Thoughtworks chief data scientist talk
 
Lean Analytics: How to get more out of your data science team
Lean Analytics: How to get more out of your data science teamLean Analytics: How to get more out of your data science team
Lean Analytics: How to get more out of your data science team
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA Hardware
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning Infrastructure
 
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
 

More from 台灣資料科學年會

[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用台灣資料科學年會
 
[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告台灣資料科學年會
 
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰台灣資料科學年會
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機台灣資料科學年會
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機台灣資料科學年會
 
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話台灣資料科學年會
 
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇台灣資料科學年會
 
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 [TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 台灣資料科學年會
 
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵台灣資料科學年會
 
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用台灣資料科學年會
 
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告台灣資料科學年會
 
[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話台灣資料科學年會
 
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人台灣資料科學年會
 
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維台灣資料科學年會
 
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察台灣資料科學年會
 
[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰台灣資料科學年會
 
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT台灣資料科學年會
 
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達台灣資料科學年會
 
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳台灣資料科學年會
 

More from 台灣資料科學年會 (20)

[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用[台灣人工智慧學校] 人工智慧技術發展與應用
[台灣人工智慧學校] 人工智慧技術發展與應用
 
[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告[台灣人工智慧學校] 執行長報告
[台灣人工智慧學校] 執行長報告
 
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
 
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
[台灣人工智慧學校] 開創台灣產業智慧轉型的新契機
 
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
[台灣人工智慧學校] 台北總校第三期結業典禮 - 執行長談話
 
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
[TOxAIA台中分校] AI 引爆新工業革命,智慧機械首都台中轉型論壇
 
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察 [TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
[TOxAIA台中分校] 2019 台灣數位轉型 與產業升級趨勢觀察
 
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
[TOxAIA台中分校] 智慧製造成真! 產線導入AI的致勝關鍵
 
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用[台灣人工智慧學校] 從經濟學看人工智慧產業應用
[台灣人工智慧學校] 從經濟學看人工智慧產業應用
 
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
[台灣人工智慧學校] 台中分校第二期開學典禮 - 執行長報告
 
台灣人工智慧學校成果發表會
台灣人工智慧學校成果發表會台灣人工智慧學校成果發表會
台灣人工智慧學校成果發表會
 
[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話[台中分校] 第一期結業典禮 - 執行長談話
[台中分校] 第一期結業典禮 - 執行長談話
 
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
[TOxAIA新竹分校] 工業4.0潛力新應用! 多模式對話機器人
 
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
[TOxAIA新竹分校] AI整合是重點! 竹科的關鍵轉型思維
 
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
[TOxAIA新竹分校] 2019 台灣數位轉型與產業升級趨勢觀察
 
[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰[TOxAIA新竹分校] 深度學習與Kaggle實戰
[TOxAIA新竹分校] 深度學習與Kaggle實戰
 
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
[台灣人工智慧學校] Bridging AI to Precision Agriculture through IoT
 
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
[2018 台灣人工智慧學校校友年會] 產業經驗分享: 如何用最少的訓練樣本,得到最好的深度學習影像分析結果,減少一半人力,提升一倍品質 / 李明達
 
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
[2018 台灣人工智慧學校校友年會] 啟動物聯網新關鍵 - 未來由你「喚」醒 / 沈品勳
 

Recently uploaded

Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degreeyuu sss
 

Recently uploaded (20)

Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 

李俊良/Feature Engineering in Machine Learning

  • 1. Feature Engineering in Machine Learning Chun-Liang Li (李俊良) chunlial@cs.cmu.edu
  • 2. About Me 2 Academic Competition Working • NTU CSIE BS/MS (2012/2013) • Advisor: Prof. Hsuan-Tien Lin
 • CMU MLD PhD (2014-) • Advisor: Prof. Jeff Schneider Prof. Barnabás Póczos • KDD Cup 2011 Champions
 KDD Cup 2013 Champions • With Prof. Chih-Jen Lin
 Prof. Hsuan-Tien Lin
 Prof. Shou-De Lin
 Many students (2012 intern) (2015 intern)
  • 3. What is Machine Learning? • What is Machine Learning?
 3 Learning Prediction Existing Data Machine (Algorithm) Model New Data Model Prediction Data: Several length-d vectors
  • 4. Data? Algorithm? • In academic • Assume we are given good enough data (in d-dimensional of course ) • Focus on designing better algorithms
 Sometimes complicated algorithms imply publications • In practice • Where is your good enough data? • Or, how to transform your data into a d-dimensional one? 4
  • 5. From Zero to One: Create your features by your observations 5
  • 6. An Apple 6 How to describe this picture?
  • 7. More Fruits • Method I: Use size of picture
 • Method II: Use RGB average
 • Many more powerful features 
 developed in computer vision 7 (640, 580) (640, 580) (219, 156, 140) (243, 194, 113) (216, 156, 155)
  • 8. Case Study (KDD Cup 2013) • Determine whether a paper is written by a given author 8 We are given raw text 
 of these Data: https://www.kaggle.com/c/kdd-cup-2013-author-paper-identification-challenge
  • 9. NTU Approaches 9 Feature Engineering Several Algorithms Combining Different Models Pipeline Feature Engineering Observation Encode into Feature Result
  • 10. First observation: Authors Information• Are these my (Chun-Liang Li) papers? (Easy! check author names)
 1. Chun-Liang Li and Hsuan-Tien Lin. Condensed filter tree for cost-sensitive multi-label 
 classification.
 2. Yao-Nan Chen and Hsuan-Tien Lin. Feature-aware label space dimension reduction for 
 multi-label classification. • Encode by name similarities (e.g., how many characters are the same) • Are Li, Chun-Liang and Chun-Liang Li the same? • Yes! Eastern and Western order • How about Li Chun-Liang? (Calculate the similarity of the reverse order) • Also take co-authors into account • 29 features in total 10
  • 11. Second Observation: Affiliations • Are Dr. Chi-Jen Lu and Prof. Chih-Jen Lin the same? • Similar name: Chi-Jen Lu v.s. Chih-Jen Lin • Shared co-author (me!) • Take affiliations into account! • Academia Sinica v.s. National Taiwan University • 13 features in total 11
  • 12. Last of KDD Cup 2013 • Many other features, including • Can you live for more than 100 years? At least I think I can’t do research after 100 years • More advanced: social network features 12 Summary The 97 features designed by students won the competition
  • 13. Furthermore • If I can access the content, can I do better? 
 13 Author: Robert Galbraith Who is Robert Galbraith? “I thought it was by a very 
 mature writer, and not a 
 first-timer.” — Peter James Definitely
  • 14. Writing Style? • “I was testing things like word length, sentence length, paragraph length, frequency of particular words and the pattern of punctuation” 
 — Peter Millican (University of Oxford) 14 1 2 3 4 5
  • 16. Common Type of Data • Image
 
 
 • Text 16
  • 17. Representation Learning • Deep Learning as learning hidden representations
 
 
 
 
 • An active research topic in academia and industry 17 Use last layer to extract features (Krizhevsky et al., 2012) (Check Prof. Lee’s talk and go to deep learning session later ) Raw data
  • 18. Use Pre-trained Network • Yon don’t need to train a network by yourself • Use existing pre-trained network to extract features • AlexNet • VGG • Word2Vector 18 Result Simply using deep learning features achieves state-of-the-art performance in many applications
  • 19. Successful Example • The PASCAL Visual Object Classes Challenge 19 MeanAverage Precision 0 0.15 0.3 0.45 0.6 2005 2007 2008 2009 2010 2012 2013 2014 Deep learning result
 (Girshick et al. 2014) HoG feature Slow progress on feature engineering and 
 algorithms before deep learning
  • 20. 20 Curse of Dimensionality: Feature Selection and Dimension Reduction
  • 21. The more, the better? 21 Practice If we have 1,000,000 data with 100,000 dimensions, how much memory do we need? 
 Ans: Theory Without any assumption, you need data to achieve error for d- dimensional data 106 ⇥ 105 ⇥ 8 = 8 ⇥ 1011 (B) = 800 (GB) ✏ O( 1 ✏d ) Noisy Feature Is every feature useful? Redundancy?
  • 22. Feature Selection • Select import features • Reduce dimensions • Explainable Results 22 Commonly Used Tools • LASSO (Sparse Constraint) • Random Forests • Many others
  • 23. KDD Cup Again • In KDD Cup 2013, we actually generated more than 200 features (some secrets you won’t see in the paper ) • Use random forests to select only 97 features, since many features are unimportant and even harmful, but why? 23
  • 24. Non-useful Features • Duplicated features • Example I: Country (Taiwan) v.s. Coordinates (121, 23.5) • Example II: Date of birth (1990) v.s. Age (26) • Noisy features • Noisy information (something wrong in your data) • Missing values (something missing in your data) • What if we still have too many features? 24
  • 25. Dimension Reduction • Let’s visualize the data (a perfect example)
 
 
 • Non-perfect example in practice 25 Commonly Used Tools • Principal Component Analysis (PCA) 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.2 0.4 0.6 0.8 1 1.2 1.4 −0.5 −0.4 −0.3 −0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.2 0.4 0.6 0.8 1 1.2 1.4 −0.5 −0.4 −0.3 −0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5 Trade-off between 
 information and space One dimension is enough
  • 26. PCA — Intuition • Let’s apply PCA on these faces 
 (raw pixels) and visualize the 
 coordinates 26 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 http://comp435p.tk/
  • 27. PCA — Intuition (cont.) • We can use very few base faces to approximate (describe) the original faces 27 http://comp435p.tk/ (Sirovich and Kirby, Low-dimensional procedure for the characterization of human faces) 1 2 3 4 5 6 7 8 9
  • 28. PCA — Case Study • CIFAR-10 image classification 
 with raw pixels as features and 
 using approximated kernel SVM 28 Dimensions Accuracy Time 3072 (all) 63.1% ~2 Hrs 100 (PCA) 59.8% 250 s (Li and Pòczos, Utilize Old Coordinates: Faster Doubly Stochastic Gradients for Kernel Methods, UAI 2016) Trade-off between information, space and time
  • 29. PCA in Practice • Practical concern: • Time complexity: • Space complexity: • Remark: Use fast approximation for large-scale problem (e.g., >100k dimensions) 1. PCA with random projection (implemented in scikit-learn)
 (Halko et al., Finding Structure with Randomness, 2011) 2. Stochastic algorithms (easy to implement from scratch)
 (Li et al., Rivalry of Two Families of Algorithms for Memory-Restricted Streaming PCA, AISTATS 2016) 29 O(Nd2 ) O(d2 ) Small Problem PCA takes <10 seconds for CIFAR-10 dataset (d=3072) by using 12 cores (E5-2620)
  • 30. Conclusion • Observe the data and encode them into meaningful features
 
 
 
 • Deep learning is a powerful tool to use • Reduce number of features if necessary • Reduce non-useful features • Computational concern 30 Existing Data Machine (Algorithm) Features (Simple) AlgorithmExisting Data Beginning: Now: