Understanding Deep Learning for
Big Data
Le Song
http://www.cc.gatech.edu/~lsong/
College of Computing
Georgia Institute of Technology
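AlexNet: deep convolutional neural networks
[Figure: AlexNet architecture. Input image $x$ of size 224 × 224 × 3; convolution feature maps of size 55 × 55 × 96, 27 × 27 × 256, 13 × 13 × 384, 13 × 13 × 384, and 13 × 13 × 256, with filter sizes 11 × 11, 5 × 5, and 3 × 3; two fully connected layers of width 4096; 1000 outputs. Convolution layers: 3.7 million parameters. Fully connected layers: 58.6 million parameters.]
Rectified linear unit: $h(u) = \max\{0, u\}$
$\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$
Image $x$ → Label $y$: cat / bike / …?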
ImageNet: a benchmark image classification problem
~ 1.3 million examples, ~ 1 thousand classes
Training is end-to-end
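Minimize negative log-likelihood over $m$ data points $\{(x_i, y_i)\}_{i=1}^{m}$
$\min_{f \in \mathcal{F}} R(W_1, \ldots, W_8) := -\frac{1}{m} \sum_{i=1}^{m} \log \Pr(y_i \mid x_i)$
(Stochastic) gradient descent
$W_8^{t+1} = W_8^{t} - \eta \frac{\partial R}{\partial W_8}$
…
$W_1^{t+1} = W_1^{t} - \eta \frac{\partial R}{\partial W_1}$
$\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$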
AlexNet achieves ~40% top-1 error
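The (stochastic) gradient descent update on the negative log-likelihood can be sketched on a much smaller model. The example below is only an illustration under stated assumptions: it uses a single weight matrix W instead of the eight layers W1, …, W8, a hypothetical four-point toy dataset, and an arbitrary step size.

```python
import math
import random

def softmax(scores):
    """Turn raw class scores into probabilities Pr(y | x)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, x):
    """Pr(y | x) proportional to exp(W x), one row of W per class."""
    return softmax([sum(w_j * x_j for w_j, x_j in zip(row, x)) for row in W])

def nll(W, data):
    """R(W) = -(1/m) * sum_i log Pr(y_i | x_i)."""
    return -sum(math.log(predict_proba(W, x)[y]) for x, y in data) / len(data)

def sgd_step(W, x, y, eta):
    """One update W <- W - eta * dR/dW, estimated from a single example."""
    p = predict_proba(W, x)
    for k, row in enumerate(W):
        err = p[k] - (1.0 if k == y else 0.0)  # gradient w.r.t. the score of class k
        for j in range(len(row)):
            row[j] -= eta * err * x[j]

# Hypothetical toy data: two classes separated along the first coordinate;
# the last coordinate is a constant bias input.
rng = random.Random(0)
data = [([1.0, 0.2, 1.0], 0), ([0.9, -0.1, 1.0], 0),
        ([-1.0, 0.1, 1.0], 1), ([-0.8, -0.3, 1.0], 1)]
W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

loss_before = nll(W, data)
for _ in range(200):
    x, y = rng.choice(data)        # "stochastic": one random example per step
    sgd_step(W, x, y, eta=0.5)
loss_after = nll(W, data)
```

With all-zero weights both classes are equally likely, so the starting loss is log 2; the updates then drive it down on this separable toy set.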
Traditional image features not learned end-to-end
Handcrafted feature extractor (e.g. SIFT)
Divide image into patches
Combine features
Learn classifier
Deep learning not fully understood
[Figure: the AlexNet architecture (224 × 224 × 3 input image $x$, five convolution layers, two fully connected layers of width 4096, 1000 outputs), annotated with three questions.]
Rectified linear unit: $h(u) = \max\{0, u\}$
Fully connected layers (58.6 million parameters) crucial?
Convolution layers (3.7 million parameters) crucial?
Train end-to-end important?
$\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$
Experiments
1. Fully connected layers crucial?
2. Convolution layers crucial?
3. Learn parameters end-to-end crucial?
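Kernel methods: alternative nonlinear model
Combination of random basis functions $k(w, x)$
$f(x) = \sum_{i=1}^{T} \alpha_i k(w_i, x)$
[Figure: a function of $x$ built from 7 basis functions centered at $w_1, \ldots, w_7$ with weights $\alpha_1, \ldots, \alpha_7$: $\sum_{i=1}^{7} \alpha_i \exp(-\|w_i - x\|^2)$, where $k(w_i, x) = \exp(-\|w_i - x\|^2)$]
[Dai et al. NIPS 14]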
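The basis-function model on this slide can be written out directly. This is a minimal sketch, assuming Gaussian basis functions as drawn on the slide, 7 randomly placed centers in 2 dimensions, and unit weights; all of these concrete values are hypothetical.

```python
import math
import random

def gaussian_kernel(w, x):
    """k(w, x) = exp(-||w - x||^2)."""
    return math.exp(-sum((w_j - x_j) ** 2 for w_j, x_j in zip(w, x)))

def kernel_model(alphas, centers, x):
    """f(x) = sum_i alpha_i * k(w_i, x)."""
    return sum(a * gaussian_kernel(w, x) for a, w in zip(alphas, centers))

# Hypothetical setup: T = 7 random basis points w_i in 2 dimensions.
rng = random.Random(0)
T, dim = 7, 2
centers = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(T)]
alphas = [1.0] * T   # in practice these weights are what gets learned

value_at_center = kernel_model(alphas, centers, centers[0])
value_far_away = kernel_model(alphas, centers, [100.0, 100.0])
```

Each basis function contributes most near its own center and essentially nothing far from it, which is why many of them are needed to cover the input space.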
Replace fully connected by kernel methods
I. Jointly trained neural nets (AlexNet): $\Pr(y \mid x) \propto \exp\big(W_8 h_7(W_7 h_6(\ldots h_1(W_1 x)))\big)$, all layers learned
II. Fixed neural nets: convolution layers fixed, fully connected layers learned
III. Scalable kernel methods [Dai et al. NIPS 14]: convolution layers fixed, kernel classifier learned
Learn classifiers from a benchmark subset of ImageNet:
~ 1.3 million examples, ~ 1 thousand classes
Kernel machine learns faster
ImageNet: 1.3M original images and 1000 classes
Random cropping and mirroring of images in streaming fashion
[Figure: test top-1 error (%) versus number of training samples ($10^5$ to $10^8$) for jointly-trained neural net, fixed neural net, and doubly SGD. Labeled error values: 47.8, 44.5, 42.6. Annotations: training 1 week using GPU; random guessing gives 99.9% error.]
Similar results with MNIST8M
Classification with handwritten digits
8M images, 10 classes
LeNet5
Similar results with CIFAR10
Classification with internet images
60K images, 10 classes
Experiments
1. Fully connected layers crucial? No
2. Convolution layers crucial?
3. Learn parameters end-to-end crucial?
Kernel methods directly on inputs?
[Figure: bar charts of test error comparing "Fixed convolution" with "Without convolution" on MNIST (2 convolution layers), CIFAR10 (2 convolution layers), and ImageNet (5 convolution layers)]
Kernel methods + random convolutions?
[Figure: bar charts of test error comparing "Fixed convolution", "Without convolution", and "Random convolution" on MNIST (2 convolution layers) and CIFAR10 (2 convolution layers); # random conv ≫ # fixed conv]
Structured composition useful
Not just fully connected layers and plain composition
$f(x) = h_n(h_{n-1}(\ldots h_1(x)))$
Structured composition of nonlinear functions
$f(x) = h_n\big(h_{n-1}\big(\ldots \big(h_1(x_{\text{patch}_1}), h_1(x_{\text{patch}_2}), \ldots, h_1(x_{\text{patch}_m})\big)\big)\big)$
where $h_1$ is the same function on every patch
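The difference between plain and structured composition is easiest to see in code. The following is a small sketch, assuming a 1-D input, non-overlapping patches, ReLU nonlinearities, and only two levels; the weights and the helper names are hypothetical.

```python
def relu(u):
    return max(0.0, u)

def h1(patch, weights):
    """Shared first-layer function applied to one patch."""
    return relu(sum(w * p for w, p in zip(weights, patch)))

def split_into_patches(x, patch_size):
    """Cut a 1-D input into consecutive, non-overlapping patches."""
    return [x[i:i + patch_size] for i in range(0, len(x), patch_size)]

def structured_f(x, patch_size, shared_weights, top_weights):
    """f(x) = h2(h1(x_patch1), ..., h1(x_patchm)), same h1 on every patch."""
    features = [h1(p, shared_weights) for p in split_into_patches(x, patch_size)]
    return relu(sum(w * f for w, f in zip(top_weights, features)))

x = [1.0, 2.0, -3.0, 0.5, 4.0, 1.0]
shared = [1.0, -1.0]      # hypothetical shared filter, reused on all patches
top = [1.0, 1.0, 1.0]     # hypothetical top-layer weights

patch_features = [h1(p, shared) for p in split_into_patches(x, 2)]
output = structured_f(x, 2, shared, top)
```

Because `shared` is reused for every patch, the first layer has 2 weights here instead of one weight per input coordinate, which is the parameter saving that convolution provides.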
Experiments
1. Fully connected layers crucial? No
2. Convolution layers crucial? Yes
3. Learn parameters end-to-end crucial?
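Lots of random features used
[Figure: AlexNet (13 × 13 × 256 convolution features, fully connected layers of width 4096 and 4096, 1000 outputs): 58M parameters, error 42.6%. Scalable kernel method (fixed 13 × 13 × 256 convolution features, 131K random features, 1000 outputs): 131M parameters, error 44.5%.]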
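131M parameters needed?
[Figure: AlexNet (13 × 13 × 256 convolution features, fully connected layers of width 4096 and 4096, 1000 outputs): 58M parameters, error 42.6%. Scalable kernel method (fixed 13 × 13 × 256 convolution features, 32K random features, 1000 outputs): 32M parameters, error 50.0%.]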
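The parameter counts quoted on these slides can be checked with a few multiplications. This is a rough sketch under two assumptions that are not on the slides: the fully connected stack is taken to read a 6 × 6 × 256 pooled feature map (the usual pooled size of the 13 × 13 × 256 map), and "K" is read as exactly 1000. Biases are ignored.

```python
def dense_params(n_in, n_out):
    """Weights of one fully connected layer, ignoring biases."""
    return n_in * n_out

# AlexNet fully connected stack. The 6 x 6 x 256 input size is an assumption,
# not a number given on the slide.
alexnet_fc = (dense_params(6 * 6 * 256, 4096)
              + dense_params(4096, 4096)
              + dense_params(4096, 1000))

# Kernel classifier: one weight per (random feature, class) pair.
kernel_131k = dense_params(131_000, 1000)
kernel_32k = dense_params(32_000, 1000)
```

Under these assumptions `alexnet_fc` comes out near 58.6 million, and the two kernel classifiers at 131 million and 32 million weights, matching the figures on the slides.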
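Basis function adaptation crucial
Integrated squared approximation error by $T$ basis functions [Barron '93], up to constant factors ($d$ is the input dimension):
Error of adapting basis functions $\le \frac{1}{T}$
Error of fixed basis functions $\ge \frac{1}{T^{2/d}}$
[Figure: with fixed basis functions, $f(x) = \sum_{i=1}^{7} \alpha_i k(x_i, x)$ uses 7 terms centered at $x_1, \ldots, x_7$; with adapted basis functions, $f(x) = \sum_{i=1}^{2} \alpha_i k_{\theta_i}(x_i, x)$ uses 2 terms centered at $x_1, x_2$]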
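Plugging numbers into the two rates shows how large the gap becomes in high dimension. The values of T and d below are hypothetical, and constant factors are dropped, so this only illustrates the shape of the bounds.

```python
def adaptive_bound(T):
    """Upper-bound shape for adapted basis functions: 1 / T."""
    return 1.0 / T

def fixed_bound(T, d):
    """Lower-bound shape for fixed basis functions: 1 / T^(2/d)."""
    return 1.0 / T ** (2.0 / d)

T = 10_000   # number of basis functions (hypothetical)
d = 100      # input dimension (hypothetical)

adaptive = adaptive_bound(T)
fixed = fixed_bound(T, d)
fixed_low_dim = fixed_bound(T, 2)   # for d = 2 the two rates coincide
```

With these values the adaptive rate is 1e-4 while the fixed-basis rate is still above 0.8, so in high dimension adding more fixed basis functions helps very slowly.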
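Learning random features helps a lot
[Figure: AlexNet (13 × 13 × 256 convolution features, fully connected layers of width 4096 and 4096, 1000 outputs): 58M parameters, error 42.6%. Scalable kernel method with learned features and basis adaptation (fixed 13 × 13 × 256 convolution features, 32K features, 1000 outputs): 32M parameters, error 43.7%.]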
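Learning convolution together helps more
[Figure: AlexNet (13 × 13 × 256 convolution features, fully connected layers of width 4096 and 4096, 1000 outputs): 58M parameters, error 42.6%. Scalable kernel method with learned features and basis adaptation, convolution layers jointly learned (13 × 13 × 256 convolution features, 32K features, 1000 outputs): 32M parameters, error 41.9%.]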
Lesson learned:
Exploit Structure & Train End-to-End
Deep learning over (time-varying) graph
Co-evolutionary features
[Figure: users Christine, Alice, David, and Jacob interact with items along a timeline]
Item embedding $f_i(t)$
User embedding $f_u(t)$
User-item interactions evolve over time
Co-evolutionary embedding
[Figure: users Christine, Alice, David, and Jacob; each interaction event is a tuple $(u_n, i_n, t_n, q_n)$]
Initialize item embedding from item raw profile features $f_{i_n}^{0}$: $f_{i_n}(t_0) = h(V_0 \cdot f_{i_n}^{0})$
Initialize user embedding from user raw profile features $f_{u_n}^{0}$: $f_{u_n}(t_0) = h(W_0 \cdot f_{u_n}^{0})$
Update U2I (User → Item):
$f_{i_n}(t_n) = h\big(V_1 \cdot f_{i_n}(t_n^-) + V_2 \cdot f_{u_n}(t_n^-) + V_3 \cdot q_n + V_4 \cdot (t_n - t_{n-1})\big)$
Update I2U (Item → User):
$f_{u_n}(t_n) = h\big(W_1 \cdot f_{u_n}(t_n^-) + W_2 \cdot f_{i_n}(t_n^-) + W_3 \cdot q_n + W_4 \cdot (t_n - t_{n-1})\big)$
In both updates the four terms are, in order: evolution, co-evolution, context, drift
[Dai et al. Recsys16]
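The two update rules share one form, so a single function can sketch both. This is an illustration only, assuming $h$ is a ReLU, 2-dimensional embeddings, a 1-dimensional context $q_n$, and made-up weight values; `update_embedding` and all the numbers are hypothetical.

```python
def relu(u):
    return max(0.0, u)

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def update_embedding(A1, A2, A3, a4, own_prev, other_prev, q, dt):
    """h(A1 * own(t-) + A2 * other(t-) + A3 * q + a4 * dt), with h = ReLU.

    With (A1..a4) = (V1..V4), own = item, other = user: the U2I update.
    With (A1..a4) = (W1..W4), own = user, other = item: the I2U update.
    """
    terms = [matvec(A1, own_prev),      # evolution
             matvec(A2, other_prev),    # co-evolution
             matvec(A3, q),             # context
             [a * dt for a in a4]]      # drift
    return [relu(sum(vals)) for vals in zip(*terms)]

# Hypothetical weights, shared by both updates here only to keep the example short.
identity = [[1.0, 0.0], [0.0, 1.0]]
half = [[0.5, 0.0], [0.0, 0.5]]
ctx = [[1.0], [-1.0]]
drift = [0.1, 0.1]

f_user, f_item = [1.0, 0.0], [0.0, 2.0]
q_n, t_prev, t_n = [0.5], 1.0, 3.0

# Both updates read the embeddings from just before the event (t_n minus).
new_item = update_embedding(identity, half, ctx, drift, f_item, f_user, q_n, t_n - t_prev)
new_user = update_embedding(identity, half, ctx, drift, f_user, f_item, q_n, t_n - t_prev)
```

Computing both new embeddings from the old pair before overwriting either one is what keeps the user and item updates symmetric.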
Deep learning with time-varying computation graph
[Figure: interaction events at times $t_0, t_1, t_2, t_3$ along a time axis, grouped into mini-batch 1]
Computation graph of RNN determined by
1. The bipartite interaction graph
2. The temporal ordering of events
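One way to picture how the two ingredients fix the computation graph: sort the events by time, then link each event to the most recent earlier event of the same user and of the same item. The sketch below is an assumption about how such dependencies could be derived, not the authors' implementation; the event log is hypothetical.

```python
def build_dependencies(events):
    """For each event (user, item, time), find the previous event of the same
    user and of the same item; these are the inputs its update must read."""
    ordered = sorted(events, key=lambda e: e[2])   # temporal ordering
    last_user_event, last_item_event = {}, {}
    deps = []
    for idx, (user, item, _time) in enumerate(ordered):
        deps.append((last_user_event.get(user), last_item_event.get(item)))
        last_user_event[user] = idx
        last_item_event[item] = idx
    return ordered, deps

# Hypothetical interaction log: (user, item, time).
events = [("Alice", "book", 2.0), ("David", "book", 1.0),
          ("Alice", "film", 3.0), ("David", "film", 4.0)]
ordered, deps = build_dependencies(events)
```

Each entry of `deps` holds the indices of the earlier events that feed the current one (`None` where the user or item has no history), so a different event log yields a different graph.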
Much improved prediction on Reddit dataset
1,000 users, 1,403 groups, ~10K interactions
[Figure: two panels, next item prediction (MAR) and return time prediction (MAE)]
MAR: mean absolute rank difference
MAE: mean absolute error (hours)
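The two metrics can be sketched as follows. MAE is standard; for MAR the slide gives only the name, so the code assumes it means the average distance of the true item from rank 1 in the predicted ordering. That reading, the helper names, and the sample numbers are all assumptions.

```python
def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rank_of(true_item, scores):
    """1-based rank of true_item when items are sorted by descending score."""
    ordering = sorted(scores, key=scores.get, reverse=True)
    return ordering.index(true_item) + 1

def mean_absolute_rank_difference(true_items, score_lists):
    """Assumed reading of MAR: average distance of the true item from rank 1."""
    ranks = [rank_of(t, s) for t, s in zip(true_items, score_lists)]
    return sum(abs(r - 1) for r in ranks) / len(ranks)

# Hypothetical return-time predictions and true values, in hours.
mae = mean_absolute_error([5.0, 12.0, 30.0], [6.0, 10.0, 27.0])

# Hypothetical item scores for two test events with true items "a" and "c".
mar = mean_absolute_rank_difference(
    ["a", "c"],
    [{"a": 0.9, "b": 0.1, "c": 0.0}, {"a": 0.5, "b": 0.3, "c": 0.2}],
)
```

Lower is better for both: an MAR of 0 would mean the true next item is always ranked first.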
Predicting efficiency of solar panel materials
Dataset: Harvard Clean Energy Project
Data points: 2.3 million
Type: molecule
Atom types: 6
Avg. nodes: 28
Avg. edges: 33
Task: predict Power Conversion Efficiency (PCE, 0-12%) of organic solar panel materials
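Structure2Vec
[Figure: a graph $\chi$ with nodes $X_1, \ldots, X_6$. Each node $i$ starts from an embedding $\mu_i^{(0)}$, which is updated from its neighbors' embeddings in iteration 1 to give $\mu_i^{(1)}$, and so on up to iteration $T$, giving $\mu_i^{(T)}$.]
Aggregate: $\mu_1^{(T)} + \mu_2^{(T)} + \cdots = \mu_a(W, \chi)$
Label $y$: classification/regression with parameter $V$
[Dai et al. ICML 16]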
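The iterate-then-aggregate pattern in the figure can be sketched with scalar embeddings. The slide does not give the update formula, so the rule used here (ReLU of a weighted node feature plus a weighted sum of neighbor embeddings) is an assumption, as are the graph, the features, and the weights.

```python
def relu(u):
    return max(0.0, u)

def embed_graph(node_features, neighbors, w_self, w_neigh, T):
    """Run T rounds of neighbor aggregation, then sum the node embeddings.

    Assumed update (scalar embeddings for brevity):
        mu_i <- relu(w_self * x_i + w_neigh * sum of mu_j over neighbors j)
    """
    mu = {i: 0.0 for i in node_features}               # mu^(0)
    for _ in range(T):
        # The right-hand side reads the old mu, so all nodes update together.
        mu = {i: relu(w_self * node_features[i]
                      + w_neigh * sum(mu[j] for j in neighbors[i]))
              for i in node_features}
    return sum(mu.values())                            # aggregate over nodes

# Hypothetical 3-node path graph 1 - 2 - 3 with scalar node features.
features = {1: 1.0, 2: 0.0, 3: 2.0}
adjacency = {1: [2], 2: [1, 3], 3: [2]}

embedding_T1 = embed_graph(features, adjacency, w_self=1.0, w_neigh=0.5, T=1)
embedding_T2 = embed_graph(features, adjacency, w_self=1.0, w_neigh=0.5, T=2)
```

After one iteration each node only reflects its own feature; after two, node 2 also carries information from nodes 1 and 3. A final classifier or regressor with its own parameters would then map the aggregated embedding to the label $y$.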
Improved prediction with small model
Structure2vec gets ~4% relative error with 10,000 times smaller model!
Method          Test MAE  Test RMSE  # parameters
Mean predictor  1.986     2.406      1
WL level-3      0.143     0.204      1.6 m
WL level-6      0.096     0.137      1378 m
structure2vec   0.085     0.117      0.1 m
10% data for testing
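The two headline numbers can be approximately recovered from the table. The slide does not say how "relative error" is defined, so the sketch below assumes it is the test MAE divided by the mean predictor's test MAE, and it compares model size against WL level-6, the closest competitor in accuracy.

```python
mae = {"Mean predictor": 1.986, "WL level-3": 0.143,
       "WL level-6": 0.096, "structure2vec": 0.085}
params_millions = {"WL level-3": 1.6, "WL level-6": 1378.0, "structure2vec": 0.1}

# Assumed reading of "relative error": MAE relative to the mean predictor's MAE.
relative_error = mae["structure2vec"] / mae["Mean predictor"]

# Model-size ratio against WL level-6.
size_ratio = params_millions["WL level-6"] / params_millions["structure2vec"]
```

Under this reading the relative error is a little over 4%, and the size ratio is on the order of 10,000.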
Take Home Message:
Deep fully connected layers not the key
Exploit structure (CNN, Coevolution, Structure2vec)
Train end-to-end

More Related Content

What's hot

Andrew Musselman, Committer and PMC Member, Apache Mahout, at MLconf Seattle ...
Andrew Musselman, Committer and PMC Member, Apache Mahout, at MLconf Seattle ...Andrew Musselman, Committer and PMC Member, Apache Mahout, at MLconf Seattle ...
Andrew Musselman, Committer and PMC Member, Apache Mahout, at MLconf Seattle ...MLconf
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processingananth
 
Generating Sequences with Deep LSTMs & RNNS in julia
Generating Sequences with Deep LSTMs & RNNS in juliaGenerating Sequences with Deep LSTMs & RNNS in julia
Generating Sequences with Deep LSTMs & RNNS in juliaAndre Pemmelaar
 
Corinna Cortes, Head of Research, Google, at MLconf NYC 2017
Corinna Cortes, Head of Research, Google, at MLconf NYC 2017Corinna Cortes, Head of Research, Google, at MLconf NYC 2017
Corinna Cortes, Head of Research, Google, at MLconf NYC 2017MLconf
 
Wapid and wobust active online machine leawning with Vowpal Wabbit
Wapid and wobust active online machine leawning with Vowpal Wabbit Wapid and wobust active online machine leawning with Vowpal Wabbit
Wapid and wobust active online machine leawning with Vowpal Wabbit Antti Haapala
 
Basic ideas on keras framework
Basic ideas on keras frameworkBasic ideas on keras framework
Basic ideas on keras frameworkAlison Marczewski
 
Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15
Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15
Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15MLconf
 
Exploring Optimization in Vowpal Wabbit
Exploring Optimization in Vowpal WabbitExploring Optimization in Vowpal Wabbit
Exploring Optimization in Vowpal WabbitShiladitya Sen
 
Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)Oswald Campesato
 
TensorFlow Tutorial Part2
TensorFlow Tutorial Part2TensorFlow Tutorial Part2
TensorFlow Tutorial Part2Sungjoon Choi
 
Josh Patterson MLconf slides
Josh Patterson MLconf slidesJosh Patterson MLconf slides
Josh Patterson MLconf slidesMLconf
 
Online learning, Vowpal Wabbit and Hadoop
Online learning, Vowpal Wabbit and HadoopOnline learning, Vowpal Wabbit and Hadoop
Online learning, Vowpal Wabbit and HadoopHéloïse Nonne
 
Terascale Learning
Terascale LearningTerascale Learning
Terascale Learningpauldix
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural NetworksDatabricks
 
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15MLconf
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in PythonImry Kissos
 

What's hot (20)

Andrew Musselman, Committer and PMC Member, Apache Mahout, at MLconf Seattle ...
Andrew Musselman, Committer and PMC Member, Apache Mahout, at MLconf Seattle ...Andrew Musselman, Committer and PMC Member, Apache Mahout, at MLconf Seattle ...
Andrew Musselman, Committer and PMC Member, Apache Mahout, at MLconf Seattle ...
 
TensorFlow in 3 sentences
TensorFlow in 3 sentencesTensorFlow in 3 sentences
TensorFlow in 3 sentences
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processing
 
Generating Sequences with Deep LSTMs & RNNS in julia
Generating Sequences with Deep LSTMs & RNNS in juliaGenerating Sequences with Deep LSTMs & RNNS in julia
Generating Sequences with Deep LSTMs & RNNS in julia
 
nn network
nn networknn network
nn network
 
Corinna Cortes, Head of Research, Google, at MLconf NYC 2017
Corinna Cortes, Head of Research, Google, at MLconf NYC 2017Corinna Cortes, Head of Research, Google, at MLconf NYC 2017
Corinna Cortes, Head of Research, Google, at MLconf NYC 2017
 
Wapid and wobust active online machine leawning with Vowpal Wabbit
Wapid and wobust active online machine leawning with Vowpal Wabbit Wapid and wobust active online machine leawning with Vowpal Wabbit
Wapid and wobust active online machine leawning with Vowpal Wabbit
 
Basic ideas on keras framework
Basic ideas on keras frameworkBasic ideas on keras framework
Basic ideas on keras framework
 
Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15
Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15
Narayanan Sundaram, Research Scientist, Intel Labs at MLconf SF - 11/13/15
 
Exploring Optimization in Vowpal Wabbit
Exploring Optimization in Vowpal WabbitExploring Optimization in Vowpal Wabbit
Exploring Optimization in Vowpal Wabbit
 
Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)
 
TensorFlow Tutorial Part2
TensorFlow Tutorial Part2TensorFlow Tutorial Part2
TensorFlow Tutorial Part2
 
Josh Patterson MLconf slides
Josh Patterson MLconf slidesJosh Patterson MLconf slides
Josh Patterson MLconf slides
 
Online learning, Vowpal Wabbit and Hadoop
Online learning, Vowpal Wabbit and HadoopOnline learning, Vowpal Wabbit and Hadoop
Online learning, Vowpal Wabbit and Hadoop
 
Terascale Learning
Terascale LearningTerascale Learning
Terascale Learning
 
Deep Learning in theano
Deep Learning in theanoDeep Learning in theano
Deep Learning in theano
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
Melanie Warrick, Deep Learning Engineer, Skymind.io at MLconf SF - 11/13/15
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in Python
 

Viewers also liked

Amy Langville, Professor of Mathematics, The College of Charleston in South C...
Amy Langville, Professor of Mathematics, The College of Charleston in South C...Amy Langville, Professor of Mathematics, The College of Charleston in South C...
Amy Langville, Professor of Mathematics, The College of Charleston in South C...MLconf
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016MLconf
 
Beverly Wright, Executive Director, Business Analytics Center, Georgia Instit...
Beverly Wright, Executive Director, Business Analytics Center, Georgia Instit...Beverly Wright, Executive Director, Business Analytics Center, Georgia Instit...
Beverly Wright, Executive Director, Business Analytics Center, Georgia Instit...MLconf
 
Teresa Larsen, Founder & Director, ScientificLiteracy.org at MLconf ATL 2016
Teresa Larsen, Founder & Director, ScientificLiteracy.org at MLconf ATL 2016Teresa Larsen, Founder & Director, ScientificLiteracy.org at MLconf ATL 2016
Teresa Larsen, Founder & Director, ScientificLiteracy.org at MLconf ATL 2016MLconf
 
Michael Galvin, Sr. Data Scientist, Metis at MLconf ATL 2016
Michael Galvin, Sr. Data Scientist, Metis at MLconf ATL 2016Michael Galvin, Sr. Data Scientist, Metis at MLconf ATL 2016
Michael Galvin, Sr. Data Scientist, Metis at MLconf ATL 2016MLconf
 
Brian Lucena, Senior Data Scientist, Metis at MLconf SF 2016
Brian Lucena, Senior Data Scientist, Metis at MLconf SF 2016Brian Lucena, Senior Data Scientist, Metis at MLconf SF 2016
Brian Lucena, Senior Data Scientist, Metis at MLconf SF 2016MLconf
 
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016MLconf
 
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016MLconf
 
Jonathan Lenaghan, VP of Science and Technology, PlaceIQ at MLconf ATL 2016
Jonathan Lenaghan, VP of Science and Technology, PlaceIQ at MLconf ATL 2016Jonathan Lenaghan, VP of Science and Technology, PlaceIQ at MLconf ATL 2016
Jonathan Lenaghan, VP of Science and Technology, PlaceIQ at MLconf ATL 2016MLconf
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016MLconf
 
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016MLconf
 
Tanvi Motwani, Lead Data Scientist, Guided Search at A9.com at MLconf ATL 2016
Tanvi Motwani, Lead Data Scientist, Guided Search at A9.com at MLconf ATL 2016Tanvi Motwani, Lead Data Scientist, Guided Search at A9.com at MLconf ATL 2016
Tanvi Motwani, Lead Data Scientist, Guided Search at A9.com at MLconf ATL 2016MLconf
 
Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16
Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16
Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16MLconf
 
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016MLconf
 
Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineeri...
Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineeri...Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineeri...
Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineeri...MLconf
 
Stephanie deWet, Software Engineer, Pinterest at MLconf SF 2016
Stephanie deWet, Software Engineer, Pinterest at MLconf SF 2016Stephanie deWet, Software Engineer, Pinterest at MLconf SF 2016
Stephanie deWet, Software Engineer, Pinterest at MLconf SF 2016MLconf
 
Jean-François Puget, Distinguished Engineer, Machine Learning and Optimizatio...
Jean-François Puget, Distinguished Engineer, Machine Learning and Optimizatio...Jean-François Puget, Distinguished Engineer, Machine Learning and Optimizatio...
Jean-François Puget, Distinguished Engineer, Machine Learning and Optimizatio...MLconf
 
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016MLconf
 
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016MLconf
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016MLconf
 

Viewers also liked (20)

Amy Langville, Professor of Mathematics, The College of Charleston in South C...
Amy Langville, Professor of Mathematics, The College of Charleston in South C...Amy Langville, Professor of Mathematics, The College of Charleston in South C...
Amy Langville, Professor of Mathematics, The College of Charleston in South C...
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
 
Beverly Wright, Executive Director, Business Analytics Center, Georgia Instit...
Beverly Wright, Executive Director, Business Analytics Center, Georgia Instit...Beverly Wright, Executive Director, Business Analytics Center, Georgia Instit...
Beverly Wright, Executive Director, Business Analytics Center, Georgia Instit...
 
Teresa Larsen, Founder & Director, ScientificLiteracy.org at MLconf ATL 2016
Teresa Larsen, Founder & Director, ScientificLiteracy.org at MLconf ATL 2016Teresa Larsen, Founder & Director, ScientificLiteracy.org at MLconf ATL 2016
Teresa Larsen, Founder & Director, ScientificLiteracy.org at MLconf ATL 2016
 
Michael Galvin, Sr. Data Scientist, Metis at MLconf ATL 2016
Michael Galvin, Sr. Data Scientist, Metis at MLconf ATL 2016Michael Galvin, Sr. Data Scientist, Metis at MLconf ATL 2016
Michael Galvin, Sr. Data Scientist, Metis at MLconf ATL 2016
 
Brian Lucena, Senior Data Scientist, Metis at MLconf SF 2016
Brian Lucena, Senior Data Scientist, Metis at MLconf SF 2016Brian Lucena, Senior Data Scientist, Metis at MLconf SF 2016
Brian Lucena, Senior Data Scientist, Metis at MLconf SF 2016
 
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
 
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016
 
Jonathan Lenaghan, VP of Science and Technology, PlaceIQ at MLconf ATL 2016
Jonathan Lenaghan, VP of Science and Technology, PlaceIQ at MLconf ATL 2016Jonathan Lenaghan, VP of Science and Technology, PlaceIQ at MLconf ATL 2016
Jonathan Lenaghan, VP of Science and Technology, PlaceIQ at MLconf ATL 2016
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
 
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
Ryan Curtin, Principal Research Scientist, Symantec at MLconf ATL 2016
 
Tanvi Motwani, Lead Data Scientist, Guided Search at A9.com at MLconf ATL 2016
Tanvi Motwani, Lead Data Scientist, Guided Search at A9.com at MLconf ATL 2016Tanvi Motwani, Lead Data Scientist, Guided Search at A9.com at MLconf ATL 2016
Tanvi Motwani, Lead Data Scientist, Guided Search at A9.com at MLconf ATL 2016
 
Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16
Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16
Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16
 
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
 
Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineeri...
Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineeri...Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineeri...
Alex Dimakis, Associate Professor, Dept. of Electrical and Computer Engineeri...
 
Stephanie deWet, Software Engineer, Pinterest at MLconf SF 2016
Stephanie deWet, Software Engineer, Pinterest at MLconf SF 2016Stephanie deWet, Software Engineer, Pinterest at MLconf SF 2016
Stephanie deWet, Software Engineer, Pinterest at MLconf SF 2016
 
Jean-François Puget, Distinguished Engineer, Machine Learning and Optimizatio...
Jean-François Puget, Distinguished Engineer, Machine Learning and Optimizatio...Jean-François Puget, Distinguished Engineer, Machine Learning and Optimizatio...
Jean-François Puget, Distinguished Engineer, Machine Learning and Optimizatio...
 
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
Elena Grewal, Data Science Manager, Airbnb at MLconf SF 2016
 
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
 

Similar to Le Song, Assistant Professor, College of Computing, Georgia Institute of Technology at MLconf ATL 2016

Digit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDigit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDing Li
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningCastLabKAIST
 
Gan seminar
Gan seminarGan seminar
Gan seminarSan Kim
 
Neural network basic and introduction of Deep learning
Neural network basic and introduction of Deep learningNeural network basic and introduction of Deep learning
Neural network basic and introduction of Deep learningTapas Majumdar
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedOmid Vahdaty
 
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI ProjectsDiscovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI ProjectsWee Hyong Tok
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdfnyomans1
 
Lesson_8_DeepLearning.pdf
Lesson_8_DeepLearning.pdfLesson_8_DeepLearning.pdf
Lesson_8_DeepLearning.pdfssuser7f0b19
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...MLconf
 
[第34回 WBA若手の会勉強会] Microsoft AI platform
[第34回 WBA若手の会勉強会] Microsoft AI platform[第34回 WBA若手の会勉強会] Microsoft AI platform
[第34回 WBA若手の会勉強会] Microsoft AI platformNaoki (Neo) SATO
 
Convolution Neural Network Lecture Slides
Convolution Neural Network Lecture SlidesConvolution Neural Network Lecture Slides
Convolution Neural Network Lecture SlidesAdnanHaider234505
 
B Eng Final Year Project Presentation
B Eng Final Year Project PresentationB Eng Final Year Project Presentation
B Eng Final Year Project Presentationjesujoseph
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用CHENHuiMei
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentationOwin Will
 
COCOA: Communication-Efficient Coordinate Ascent
COCOA: Communication-Efficient Coordinate AscentCOCOA: Communication-Efficient Coordinate Ascent
COCOA: Communication-Efficient Coordinate Ascentjeykottalam
 
딥러닝 중급 - AlexNet과 VggNet (Basic of DCNN : AlexNet and VggNet)
딥러닝 중급 - AlexNet과 VggNet (Basic of DCNN : AlexNet and VggNet)딥러닝 중급 - AlexNet과 VggNet (Basic of DCNN : AlexNet and VggNet)
딥러닝 중급 - AlexNet과 VggNet (Basic of DCNN : AlexNet and VggNet)Hansol Kang
 

Similar to Le Song, Assistant Professor, College of Computing, Georgia Institute of Technology at MLconf ATL 2016 (20)

Digit recognizer by convolutional neural network
Digit recognizer by convolutional neural networkDigit recognizer by convolutional neural network
Digit recognizer by convolutional neural network
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
 
Gan seminar
Gan seminarGan seminar
Gan seminar
 
Eye deep
Eye deepEye deep
Le Song, Assistant Professor, College of Computing, Georgia Institute of Technology at MLconf ATL 2016

  • 1. Understanding Deep Learning for Big Data. Le Song, http://www.cc.gatech.edu/~lsong/, College of Computing, Georgia Institute of Technology
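  • 2. AlexNet: deep convolutional neural networks. [Figure: AlexNet architecture. Input image $x$ of size 224×224×3; five convolution layers with filters 11×11, 5×5, 3×3, 3×3, 3×3 and feature maps 55×55×96, 27×27×256, 13×13×384, 13×13×384, 13×13×256 (3.7 million parameters); fully connected layers of 4096, 4096 and 1000 units (58.6 million parameters).] Rectified linear unit: $h(u) = \max\{0, u\}$. Model: $\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$, mapping an image $x$ to a label $y$ (cat / bike / ...?).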
  • 3. ImageNet: a benchmark image classification problem with ~1.3 million examples and ~1 thousand classes
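  • 4. Training is end-to-end. Minimize the negative log-likelihood over $m$ data points $\{(x_i, y_i)\}_{i=1}^{m}$: $\min_{f \in \mathcal{F}} R(W_1, \dots, W_8) := -\frac{1}{m} \sum_{i=1}^{m} \log \Pr(y_i \mid x_i)$, where $\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$. (Stochastic) gradient descent: $W_8^{t+1} = W_8^{t} - \eta \frac{\partial R}{\partial W_8}$, ..., $W_1^{t+1} = W_1^{t} - \eta \frac{\partial R}{\partial W_1}$. AlexNet achieves ~40% top-1 error.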
  • 5. Traditional image features are not learned end-to-end. Pipeline: divide image into patches → handcrafted feature extractor (e.g. SIFT) → combine features → learn classifier.
  • 6. Deep learning not fully understood. [Figure: the AlexNet architecture of slide 2 (convolution layers with 3.7 million parameters, fully connected layers with 58.6 million parameters), annotated with three questions.] Fully connected layers crucial? Convolution layers crucial? Train end-to-end important? Rectified linear unit: $h(u) = \max\{0, u\}$. Model: $\Pr(y \mid x) \propto \exp\big(W_8 h(W_7 h(W_6 h(W_5 h(W_4 h(W_3 h(W_2 h(W_1 x)))))))\big)$.
  • 7. Experiments 1. Fully connected layers crucial? 2. Convolution layers crucial? 3. Learn parameters end-to-end crucial?
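  • 8. Kernel methods: alternative nonlinear model. Combination of random basis functions $k(w, x)$: $f(x) = \sum_{i=1}^{T} \alpha_i k(w_i, x)$, with $k(w_i, x) = \exp(-\|w_i - x\|^2)$. [Figure: example with seven basis functions centered at $w_1, \dots, w_7$ and weighted by $\alpha_1, \dots, \alpha_7$, i.e. $f(x) = \sum_{i=1}^{7} \alpha_i \exp(-\|w_i - x\|^2)$.] [Dai et al. NIPS 14]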
  • 9. Replace fully connected by kernel methods. I. Jointly trained neural nets (AlexNet): $\Pr(y \mid x) \propto \exp\big(W_8 h_7(W_7 h_6(\dots h_1(W_1 x)))\big)$ (figure label: Learn). II. Fixed neural nets (figure labels: Learn, Fix). III. Scalable kernel methods [Dai et al. NIPS 14] (figure labels: Learn, Fix).
  • 10. Learn classifiers from a benchmark subset of ImageNet: ~1.3 million examples, ~1 thousand classes
  • 11. Kernel machine learns faster. ImageNet: 1.3M original images and 1000 classes. Random cropping and mirroring of images in streaming fashion. Training for 1 week using GPU. [Figure: test top-1 error (%) versus number of training samples ($10^5$ to $10^8$) for jointly-trained neural net, fixed neural net, and doubly SGD; annotated final errors 47.8, 44.5 and 42.6. Random guessing gives 99.9% error.]
  • 12. Similar results with MNIST8M. Classification of handwritten digits: 8M images, 10 classes. LeNet5.
  • 13. Similar results with CIFAR10. Classification of internet images: 60K images, 10 classes.
  • 14. Experiments 1. Fully connected layers crucial? No 2. Convolution layers crucial? 3. Learn parameters end-to-end crucial?
  • 15. Kernel methods directly on inputs? [Figure: bar charts comparing fixed convolution against without convolution on MNIST (2 convolution layers), CIFAR10 (2 convolution layers) and ImageNet (5 convolution layers).]
  • 16. Kernel methods + random convolutions? [Figure: bar charts comparing fixed convolution, without convolution, and random convolution on MNIST (2 convolution layers) and CIFAR10 (2 convolution layers).] # random conv ≫ # fixed conv.
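  • 17. Structured composition useful. Not just fully connected layers and plain composition, $f(x) = h_n(h_{n-1}(\dots h_1(x) \dots))$, but structured composition of nonlinear functions, $f(x) = h_n\big(h_{n-1}\big(\dots\big(h_1(x_{\text{patch}_1}), h_1(x_{\text{patch}_2}), \dots, h_1(x_{\text{patch}_m})\big)\dots\big)\big)$, where $h_1$ is the same function on every patch.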
  • 18. Experiments 1. Fully connected layers crucial? No 2. Convolution layers crucial? Yes 3. Learn parameters end-to-end crucial?
  • 19. Lots of random features used. AlexNet: 13×13×256 → 4096 → 4096 → 1000, 58M parameters, error 42.6%. Scalable kernel method: 13×13×256 (fixed) → 131K random features → 1000, 131M parameters, error 44.5%.
  • 20. 131M parameters needed? AlexNet: 13×13×256 → 4096 → 4096 → 1000, 58M parameters, error 42.6%. Scalable kernel method: 13×13×256 (fixed) → 32K random features → 1000, 32M parameters, error 50.0%.
  • 21. Basis function adaptation crucial. Integrated squared approximation error with $T$ basis functions [Barron '93]: error with adapted basis functions $\le \frac{1}{T}$; error with fixed basis functions $\ge \frac{1}{T^{2/d}}$. [Figure: a fixed-basis fit $f(x) = \sum_{i=1}^{7} \alpha_i k(x_i, x)$ using seven basis functions at $x_1, \dots, x_7$, next to an adapted-basis fit $f(x) = \sum_{i=1}^{2} \alpha_i k_{\theta_i}(x_i, x)$ using only two.]
  • 22. Learning random features helps a lot. AlexNet: 13×13×256 → 4096 → 4096 → 1000, 58M parameters, error 42.6%. Scalable kernel method with learned basis functions (basis adaptation): 13×13×256 (fixed) → 32K → 1000, 32M parameters, error 43.7%.
  • 23. Learning convolution together helps more. AlexNet: 13×13×256 → 4096 → 4096 → 1000, 58M parameters, error 42.6%. Scalable kernel method with learned basis functions (basis adaptation) and jointly learned convolution layers: 13×13×256 → 32K → 1000, 32M parameters, error 41.9%.
  • 24. Lesson learned: exploit structure and train end-to-end. Deep learning over (time-varying) graphs.
  • 25-30. Co-evolutionary features. [Figure repeated across slides 25 to 30, with labels Christine, Alice, David, Jacob.] Item embedding $f_i(t)$, user embedding $f_u(t)$. User-item interactions evolve over time.
  • 31. Co-evolutionary embedding. Each event is a tuple $(u_n, i_n, t_n, q_n)$. Initialize item embedding from item raw profile features: $f_{i_n}(t_0) = h(V_0 \cdot f_{i_n}^{0})$. Initialize user embedding from user raw profile features: $f_{u_n}(t_0) = h(W_0 \cdot f_{u_n}^{0})$. Update U2I (user to item): $f_{i_n}(t_n) = h\big(V_1 \cdot f_{i_n}(t_n^-) + V_2 \cdot f_{u_n}(t_n^-) + V_3 \cdot q_n + V_4 \cdot (t_n - t_{n-1})\big)$. Update I2U (item to user): $f_{u_n}(t_n) = h\big(W_1 \cdot f_{u_n}(t_n^-) + W_2 \cdot f_{i_n}(t_n^-) + W_3 \cdot q_n + W_4 \cdot (t_n - t_{n-1})\big)$. In both updates the four terms are, in order, evolution, co-evolution, context, and drift. [Dai et al. Recsys16]
  • 32. Deep learning with time-varying computation graph. [Figure: events on a time axis at $t_0, t_1, t_2, t_3$, grouped into mini-batch 1.] The computation graph of the RNN is determined by (1) the bipartite interaction graph and (2) the temporal ordering of events.
  • 33. Much improved prediction on the Reddit dataset: 1,000 users, 1,403 groups, ~10K interactions. [Figure: results for next item prediction and return time prediction.] MAR: mean absolute rank difference. MAE: mean absolute error (hours).
  • 34. Predicting efficiency of solar panel materials. Task: predict the Power Conversion Efficiency (PCE, 0-12%) of organic solar panel materials. Dataset: Harvard Clean Energy Project; data points: 2.3 million; type: molecule; atom types: 6; avg. nodes: 28; avg. edges: 33.
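  • 35. Structure2Vec. [Figure: a structured input $\chi$ with nodes $X_1, \dots, X_6$; each node $i$ carries an embedding initialized as $\mu_i^{(0)}$ and updated to $\mu_i^{(1)}$ in iteration 1, and so on up to $\mu_i^{(T)}$ in iteration $T$. The final embeddings are aggregated, $\mu_1^{(T)} + \mu_2^{(T)} + \dots = \mu_a(W, \chi)$, and passed to classification/regression with parameter $V$ to predict the label $y$.] [Dai et al. ICML 16]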
  • 36. Improved prediction with small model. Structure2vec gets ~4% relative error with a 10,000 times smaller model! (10% of the data used for testing.)
      Method          Test MAE  Test RMSE  # parameters
      Mean predictor  1.986     2.406      1
      WL level-3      0.143     0.204      1.6 m
      WL level-6      0.096     0.137      1378 m
      structure2vec   0.085     0.117      0.1 m
  • 37. Take Home Message: Deep fully connected layers are not the key. Exploit structure (CNN, Coevolution, Structure2vec). Train end-to-end.
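
The kernel model on slide 8, f(x) = Σ α_i k(w_i, x) with k(w, x) = exp(−‖w − x‖²), can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the doubly SGD method of [Dai et al. NIPS 14]: the centers w_i are drawn once at random and kept fixed, the weights α are fitted by plain SGD on squared error, and the toy 1-D data and the names `gaussian_basis`, `predict`, `fit_alphas` and `demo` are hypothetical.

```python
import math
import random


def gaussian_basis(w, x):
    """Basis function from slide 8: k(w, x) = exp(-||w - x||^2)."""
    return math.exp(-sum((wi - xi) ** 2 for wi, xi in zip(w, x)))


def predict(alphas, centers, x):
    """f(x) = sum_i alpha_i * k(w_i, x)."""
    return sum(a * gaussian_basis(w, x) for a, w in zip(alphas, centers))


def fit_alphas(xs, ys, centers, lr=0.05, epochs=200, seed=0):
    """Fit the weights alpha by SGD on squared error; the centers stay fixed.

    Assumption: squared loss and a constant step size, chosen for brevity.
    """
    rng = random.Random(seed)
    alphas = [0.0] * len(centers)
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for n in order:
            feats = [gaussian_basis(w, xs[n]) for w in centers]
            err = sum(a * f for a, f in zip(alphas, feats)) - ys[n]
            # Gradient of 0.5 * err^2 with respect to alpha_i is err * k(w_i, x).
            alphas = [a - lr * err * f for a, f in zip(alphas, feats)]
    return alphas


def demo():
    """Fit a toy 1-D target (hypothetical data) with T = 7 random centers."""
    rng = random.Random(1)
    xs = [[-2.0 + 4.0 * i / 39] for i in range(40)]
    ys = [math.sin(2.0 * x[0]) for x in xs]
    centers = [[rng.uniform(-2.0, 2.0)] for _ in range(7)]
    alphas = fit_alphas(xs, ys, centers)
    mse = sum((predict(alphas, centers, x) - y) ** 2
              for x, y in zip(xs, ys)) / len(xs)
    baseline = sum(y ** 2 for y in ys) / len(ys)  # error of the all-zero model
    return mse, baseline
```

Calling `demo()` returns the training error of the fitted model next to the error of the all-zero predictor. Only α is learned here, which is the fixed-basis setting that slide 21 contrasts with adapting the basis functions themselves.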

Editor's Notes

  1. Why the performance rather than interpret the results
  2. The task: classification (maybe one slide)
  3. Have one slide for the neural networks.
  4. The task: classification (maybe one slide)
  5. The actual classification number Not improving, finish it. Make the meaning of convergence clearer: given sample, fewer error. Same error, fewer samples. Emphasize what does it mean by scalable. (compare to alternative methods).
  6. Take features from the last pooling layer Le-Net5 [LeCun’12]
  7. H(x) the same line!!! Too busy!!! Remove the top. Smaller figure. Fewer gs.
  8. Need theory cited. Lower bound.
  9. Here we tried a large dataset, where the task is to predict the power conversion efficiency of molecules. Accurate prediction is essential for screening new forms of energy and materials. The dataset we used consists of 2.3 million samples from the Harvard Clean Energy Project. The figure here shows that the PCE ranges from 0 to 11.
  10. Now is the time to put them together. We start with zero embeddings and then perform one step of the fixed-point equation update. For example, to update mu_2, we use its neighbors' embeddings and its input features. Similarly, we can update all the other posterior marginal embeddings. As in traditional graphical model inference, we need to iterate the fixed-point update several times. Intuitively, this allows each embedding to capture more and more neighborhood information. In the last step, we merge the marginal embeddings to get a vector representation of the entire structured input. This model can be trained end-to-end. Also, the parameters in the embedding iteration layers are shared, which makes it similar to a recurrent neural network. We can simply extend it by using an LSTM to formulate the fixed-point equation.
  11. Here is the result we reported. We compared with the Weisfeiler-Lehman kernel at different degrees. Since the kernel matrix cannot be computed at this scale, we manually created a high-dimensional explicit feature map for it. Due to its high dimensionality, we can work with at most degree 6. We get 4% relative error in prediction. To get a comparable result, the Weisfeiler-Lehman kernel requires 1.3 billion parameters. We get better results with only 0.1m parameters, which is a 10k times smaller model than the alternatives.
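
The embedding iteration described in note 10 can be sketched as follows. This is a minimal illustration, not the exact parameterization of [Dai et al. ICML 16]: the update rule mu_i ← relu(W1·x_i + W2·Σ_{j∈N(i)} mu_j), the sum aggregation and the linear readout V are assumptions chosen to match the description (zero initial embeddings, shared parameters across iterations, final merge into one vector), and all function names are hypothetical.

```python
def relu(vec):
    """Elementwise rectified linear unit, h(u) = max{0, u}."""
    return [max(0.0, v) for v in vec]


def matvec(matrix, vec):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def vec_add(a, b):
    """Elementwise sum of two vectors of equal length."""
    return [x + y for x, y in zip(a, b)]


def embed_structure(features, neighbors, W1, W2, num_iters):
    """Run num_iters synchronous updates, then sum the node embeddings.

    features[i] is the input feature vector of node i; neighbors[i] lists
    the indices of its neighbors. W1 and W2 are shared across iterations.
    """
    dim = len(W1)
    mu = [[0.0] * dim for _ in features]  # start from zero embeddings
    for _ in range(num_iters):
        updated = []
        for i, x in enumerate(features):
            neigh_sum = [0.0] * dim
            for j in neighbors[i]:
                neigh_sum = vec_add(neigh_sum, mu[j])
            # Assumed update: own features plus summed neighbor embeddings.
            updated.append(relu(vec_add(matvec(W1, x), matvec(W2, neigh_sum))))
        mu = updated
    total = [0.0] * dim
    for m in mu:  # merge node embeddings into one vector for the structure
        total = vec_add(total, m)
    return total


def predict(features, neighbors, W1, W2, V, num_iters=3):
    """Linear readout V applied to the aggregated embedding (regression)."""
    embedding = embed_structure(features, neighbors, W1, W2, num_iters)
    return sum(v * e for v, e in zip(V, embedding))
```

On a three-node path graph where only the first node has a nonzero feature, each extra iteration lets that feature reach one more hop, which is the "more and more neighborhood information" effect mentioned in the note. Because every step is differentiable almost everywhere, W1, W2 and V could be trained jointly by gradient descent in an automatic-differentiation framework.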