  1. Deep Learning at NMC. Devin Jones, Director, Machine Learning Lab, Nielsen
  2. Introduction: Devin Jones ● Machine Learning & Statistics ○ Research: Classification, Inference, Time Series ○ Application: Large scale, Streaming ● Columbia University: CS/ML ● Rutgers University: Statistics, Econ, Operations Research ● Ad Tech (7 years)
  3. Agenda: ML at NMC, Intro to Deep Learning, Deep Learning Research at NMC
  4. Machine Learning at NMC
  5. The ML Challenge at NMC: Look-Alike Modeling. “Used to build larger audiences from smaller audience segments to create reach for advertisers. In theory, they reflect similar characteristics to a benchmark set of characteristics the original audience segment represents, such as in-market kitchen-appliance shoppers.” (adage.com)
  6. The ML Challenge at NMC: An Example
  7. ML at NMC: Supervised Classification for Online Ad Targeting [diagram: Data → Algorithm → Model]
  8. Supervised What? Machine Learning has two main categories: Supervised Learning and Unsupervised Learning
  9. Supervised What? Machine Learning has two main categories: Supervised Learning (inference on labeled data) and Unsupervised Learning (inference on unlabeled data)
  10. Supervised vs Unsupervised Learning. Supervised: Spam or Ham? Unsupervised: Clustering Wikipedia Articles
  11. NMC high-level architecture: Machine Learning
  12. The Feature Set & Scale. The quality of data for a model will influence the model’s success. At NMC, we have access to high-dimensional, sparse data: ~4,000 Segments + ~200 Publishers + User Agent + Geographic Info (zip code), resulting in over 100k features to choose from. Models are trained in batches of 100,000 to 100,000,000 users depending on the purpose.
  13. ML Algorithms at NMC. To date, we have implemented these algorithms in our real-time scoring engine: Binary Linear Model, kNN, Multinomial Linear Models, Online Learning for Linear Models, Random Forest, and of course, Deep Learning. We score billions of events per day using these models and our ML infrastructure.
  14. Deep Learning at NMC
  15. Topics: Motivation, Intro to DL, NN Architecture, GPU vs CPU
  16. Motivation
  17. Motivation: 1. Recent success in Deep Learning. 2. NMC data is similar to Natural Language Processing (NLP) data. 3. Certain ad targeting problems can be framed as expressive, hierarchical relationships.
  18. Deep Learning: Recent Success ▪ AlphaGo defeats the world’s top professional Go players ▪ Image and speech recognition exceed human abilities ▪ AI in consumer products: Amazon Echo, Google Home, Autonomous Driving. All of these recent AI breakthroughs are based on Deep Neural Networks!
  19. NMC Data & NLP Data. NLP data: Observation: [‘This’, ‘is’, ‘a’, ‘tokenized’, ‘feature’, ‘vector’, ‘used’, ‘for’, ‘machine’, ‘learning’, ‘in’, ‘NLP’]. NMC data: User: [‘segment: Likes Outdoors’, ‘segment: Male 25-35’, ‘location: New York, NY’]
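The parallel between NLP tokens and NMC user features comes down to the same encoding trick: each categorical feature becomes an index into a large, sparse vocabulary. A minimal sketch of that idea (the vocabulary and the `encode` helper are hypothetical, for illustration only):

```python
# Sketch: encoding NMC-style categorical user features the way NLP
# tokens are encoded, as indices into a sparse, high-dimensional space.
# This toy vocabulary stands in for the ~100k real features.
vocab = {
    "segment: Likes Outdoors": 0,
    "segment: Male 25-35": 1,
    "location: New York, NY": 2,
    "segment: In-Market Kitchen Appliances": 3,
}

def encode(user_features, vocab):
    """Map a user's feature list to a sparse one-hot (bag-of-features) vector."""
    vec = [0.0] * len(vocab)
    for feat in user_features:
        idx = vocab.get(feat)
        if idx is not None:  # unknown features are simply dropped
            vec[idx] = 1.0
    return vec

user = ["segment: Likes Outdoors", "segment: Male 25-35", "location: New York, NY"]
print(encode(user, vocab))  # [1.0, 1.0, 1.0, 0.0]
```

In production such a vector would be kept in a sparse representation, since only a handful of the 100k+ entries are nonzero for any one user.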
  20. Deep Learning: Some Definitions
  21. Neural Network [diagram: Input, Hidden, Output layers]
  22. Neural Network [diagram: Input, Hidden, Output layers]
  23. Neural Network: Neuron
  24. Neural Network: Neuron [diagram: binary inputs “Lives in NYC? = Yes”, “Orders from Dominos?”, “Works in ad tech?” weighted 0.5, 0.01, 0.7 and combined (e.g. 1.2) into a Yes/No output]
  25. A Deep Neural Network
  26. Neural Network Phases: Training, Inference
  27. NMC high-level architecture: Machine Learning
  28. NMC high-level architecture: Machine Learning (Training), (Inference)
  29. Definition Summary ● Training ● Inference ○ Matrix Multiplication ● Nodes ● Layers ● Network
  30. DNN Architecture
  31. DNN Architecture. Image Processing :: Convolutional Networks. Speech Recognition :: Recurrent Networks. AlphaGo :: Reinforcement Learning
  32. An Architecture Example: Conv Nets
  33. A Fully Connected DNN
  34. A Residual Network
  35. Residual Network Convergence. Figure 1. Convergence of neural network model without forward shortcut (regular net). Figure 2. Convergence of neural network model with forward shortcut (Residual Net).
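The forward shortcut's effect on convergence can be illustrated numerically: when a layer's weights are still near zero early in training, a plain layer nearly erases its input, while a residual layer passes it through unchanged. A minimal NumPy sketch (sizes, seed, and weight scale are illustrative):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
x = rng.standard_normal(4)              # input to the layer
W = rng.standard_normal((4, 4)) * 0.01  # near-zero weights, as early in training

plain = relu(W @ x)         # regular layer: the signal almost vanishes
resid = relu(W @ x) + x     # residual layer: the shortcut carries x through

print(np.linalg.norm(plain))      # small: the plain layer's output is near zero
print(np.linalg.norm(resid - x))  # small: the residual output stays close to x
```

Because the shortcut keeps gradients and activations flowing even through weak layers, deeper stacks remain trainable, which matches the convergence plots on this slide.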
  36. DNN Architecture for Structured Data
  37. Multi-level Hierarchical Classification

      Category           | Segments
      City Prosperity    | World-Class Health, Uptown Elite, Penthouse Chic, Metro High-Flyers
      Prestige Positions | Premium Fortunes, Diamond Days, Alpha Families, Bank of Mum and Dad, Empty-Nest Adventure

  38. Multi-level Hierarchical Classification [diagram: categories C1, C2, C3 with segments S1–S6 beneath them]
  39. Multi-level Hierarchical Classification: Naive Approach [diagram: categories C1–C3 and segments S1–S6]
  40. DNN for Multi-Level Hierarchical Classification
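One way a DNN can respect such a category/segment hierarchy is with two output heads on a shared representation, weighting each segment's score by its parent category's probability. This is an illustrative sketch only, not the architecture from the deck; the hierarchy, sizes, and combination rule are all assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical hierarchy: segment index -> parent category index
# (mirroring the C1-C3 / S1-S6 diagram).
parent = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

rng = np.random.default_rng(1)
x = rng.standard_normal(8)           # shared hidden representation of a user
W_cat = rng.standard_normal((3, 8))  # category output head
W_seg = rng.standard_normal((6, 8))  # segment output head

p_cat = softmax(W_cat @ x)           # level 1: P(category)
seg_scores = softmax(W_seg @ x)      # level 2: raw segment scores

# Combine levels: weight each segment by its parent category's probability.
p_seg = np.array([p_cat[parent[s]] * seg_scores[s] for s in range(6)])
p_seg /= p_seg.sum()
print(p_seg.round(3))
```

The naive approach on the previous slide would instead predict all six segments in a single flat softmax, discarding the category structure.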
  41. GPU vs. CPU
  42. Batch Size & Processing Time
  43. GPU vs CPU. We are not batching matrix algebra operations: NMC Serving operates on one request at a time!
  44. CPU Computational Improvements
  45. Inference on a Layer ~ Matrix Multiplication [diagram: Input and Hidden layers]
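Concretely, passing one request through one fully connected layer is a single matrix-vector multiplication followed by a nonlinearity. A sketch with illustrative layer sizes:

```python
import numpy as np

# One layer of inference: hidden = relu(W @ x).
rng = np.random.default_rng(42)
n_in, n_hidden = 10_000, 5
W = rng.standard_normal((n_hidden, n_in))  # hidden x input weight matrix

x = np.zeros(n_in)
x[[3, 17, 42]] = 1.0             # a sparse user vector: only 3 active features

hidden = np.maximum(0.0, W @ x)  # relu(W @ x): the hidden-layer activations
print(hidden.shape)              # (5,)
```

With a single request (batch size 1), this is one small matrix-vector product, which is exactly the regime where a CPU can compete with a GPU.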
  46. Sparse Matrix Multiplication: 32x inference improvement, sub-millisecond model evaluation
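For NMC-style binary, sparse inputs, the speedup comes from touching only the columns of the weight matrix that correspond to active features, instead of all 100k. A sketch of the idea (sizes and indices are illustrative; the deck's actual implementation is not shown):

```python
import numpy as np

rng = np.random.default_rng(7)
n_in, n_hidden = 100_000, 8
W = rng.standard_normal((n_hidden, n_in))

active = [3, 17, 42]   # indices of the user's active (binary) features

# Dense inference multiplies against all 100k columns of W...
x = np.zeros(n_in)
x[active] = 1.0
dense = W @ x

# ...but for a 0/1 input it suffices to sum only the few active columns.
sparse = W[:, active].sum(axis=1)

print(np.allclose(dense, sparse))  # True
```

With a handful of active features per user, the sparse path does thousands of times fewer multiply-adds per layer, which is the kind of saving behind the quoted 32x improvement.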
  47. Trimming
  48. Trimming. WEAK CONNECTIONS: most connections in a deep neural network are very weak and can be removed. LOW ACCURACY IMPACT: trimming has very little impact on accuracy. COMPRESSED DATA: trimmed models can be described by sparse matrices, so the model data are highly compressed.
  49. Neural Network Without Trimming
  50. Neural Network Trimming
  51. Trimming: Space, Time & Performance. 50x inference improvement in CPU time and storage.

      Model       | Model File Size (MB) | Trimming Threshold | Accuracy | Scoring Time (ms)
      Not trimmed | 108                  | 0.0                | 13.29    | 10.0
      Trimmed     | 2.7                  | 0.001              | 13.30    | 0.22
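Trimming itself is a simple thresholding operation on the weight matrices. A hedged sketch (the threshold and weight distribution below are illustrative, not the deck's values):

```python
import numpy as np

def trim(W, threshold):
    """Zero out weak connections: weights with |w| below the threshold."""
    trimmed = W.copy()
    trimmed[np.abs(trimmed) < threshold] = 0.0
    return trimmed

rng = np.random.default_rng(3)
W = rng.standard_normal((256, 256)) * 0.01  # mostly weak connections
W[0, 0] = 1.5                               # one strong connection to keep

Wt = trim(W, threshold=0.02)
kept = np.count_nonzero(Wt) / W.size
print(f"fraction of weights kept: {kept:.3f}")  # roughly 0.05
```

The resulting matrix is mostly zeros, so it can be stored and multiplied in a sparse format, which is where the file-size and scoring-time reductions in the table come from.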
  52. Key Takeaways. Architecture: ● Residual Networks saved the day ● Leverage the expressive power of DNNs for your data. Inference: ● You might not need a GPU for Deep Learning ● Improvements can be made on sparse matrix algebra libraries ● Use trimming
  53. Thanks!