Ruth Garcia presented on using simple machine learning models in an ads manager. Online advertising spending for mobile has grown significantly, with a 76.8% compound annual growth rate for mobile compared to 15.4% overall. The ads manager aims to balance increasing revenue and engaging users. Various machine learning models were considered for click prediction, including logistic regression, random forests, and neural networks. Challenges with categorical values were addressed through one-hot encoding and the hashing trick. Model performance was evaluated offline using metrics like precision at 1, mean reciprocal rank, and AUC. The talk concluded with lessons on starting lean, communicating machine learning requirements upfront, and balancing exploitation and exploration in ads delivery.
8. Expectation vs. reality
Expectation: TensorFlow
• Optimization techniques
• Embeddings
• Crossed columns
• Hashing
Reality: not flexible, but fast and easier to implement.
Features:
• User history
• User features
• Route features
• Ad features (colors, text)
9. Challenges: Which model to use?
Model possibilities (easy to read in node.js):
• Logistic regression
• Random forest: gets lost
• Neural networks: too slow, hard to put into JSON
Solvers:
• Logistic regression: liblinear, sag (train all data at once)
• SGDClassifier: train data in batches, saves memory
• Grid search for hyperparameters
10. Challenges: Categorical values
One-hot encoding:

Creatives: C1, C2, C3

     C1 C2 C3
C1    1  0  0
C2    0  1  0
C3    0  0  1

Adding a new creative C4 grows the encoding:

     C1 C2 C3 C4
C1    1  0  0  0
C2    0  1  0  0
C3    0  0  1  0
C4    0  0  0  1

Pros:
• No collisions
• Inverse mapping
Cons:
• Need to know all values in advance
• Not good for online learning
• Keep dictionary in prod
11. Challenges: Categorical values
Hashing trick: map data of arbitrary sizes to data of a fixed size.

id   features
123  creative1, advertiser2, mobile, etc.
321  creative2, advertiser4, mobile, etc.

id   Feat_1  Feat_2  Feat_3  ….  Feat_k
123  0.1     0       1       ….  0
321  0.5     0       0       ….  1

Pros:
• Memory efficient
• Online learning
• No dictionary
Cons:
• No inverse mapping
• Hash collisions
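A minimal sketch of the hashing trick, using only the standard library: any string is mapped into one of a fixed number of buckets, so no dictionary is needed and unseen values work out of the box, at the cost of possible collisions and no inverse mapping. The bucket count of 8 is an arbitrary choice for illustration.

```python
# Minimal sketch of the hashing trick: arbitrary values -> fixed-size vector.
import hashlib

N_BUCKETS = 8  # fixed output dimensionality, chosen only for illustration

def hash_feature(value: str) -> list:
    # md5 gives a stable hash across processes (unlike Python's built-in hash)
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % N_BUCKETS  # distinct values can collide here
    vec = [0] * N_BUCKETS
    vec[bucket] = 1
    return vec

print(hash_feature("creative1"))
print(hash_feature("a_value_never_seen_in_training"))  # still works: no dictionary
```

Because the map is one-way, there is no way to recover "creative1" from its bucket, which is the "no inverse mapping" con from the slide.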
12. Machine Learning Performance: offline
Precision at 1: based on target groups.
Mean reciprocal rank: order of ranked ads.
Other metrics:
• AUC: if you care about ranking
• Log-loss: if you care about the value of the CTR
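Two of the ranking metrics above can be sketched on toy data: precision at 1 asks whether the top-ranked ad was clicked, and mean reciprocal rank averages 1/position of the first clicked ad over requests. The click labels below are made-up examples, not the talk's data.

```python
# Minimal sketch of two offline ranking metrics on toy click data.
def precision_at_1(ranked_clicks):
    # ranked_clicks: per request, 0/1 click labels in the order ads were ranked
    return sum(r[0] for r in ranked_clicks) / len(ranked_clicks)

def mean_reciprocal_rank(ranked_clicks):
    total = 0.0
    for r in ranked_clicks:
        # rank (1-based) of the first clicked ad, or None if nothing was clicked
        rank = next((i + 1 for i, c in enumerate(r) if c), None)
        total += 1.0 / rank if rank else 0.0
    return total / len(ranked_clicks)

# Three requests; the clicked ad sits at positions 1, 2, and 3 respectively.
requests = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(precision_at_1(requests))        # 1/3: only one request clicked at rank 1
print(mean_reciprocal_rank(requests))  # (1 + 1/2 + 1/3) / 3
```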
13. Optimizing evaluation metric
Updating the model based on different sampling methods and numbers of training days.
[Figure: histogram of training days (2–7) and AUC over time, 6/4/18–9/3/18, highlighting the best and worst AUC]
14. Satisficing metric: Precision at 1
Choose the best AUC conditioned on precision at 1 being better than random.
[Figure: precision_at_1 vs. random_precision_at_1 over time, 6/4/18–9/3/18]
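The satisficing rule above is a simple filter-then-maximize: discard candidate models whose precision at 1 does not beat the random baseline, then pick the remaining model with the best AUC. The candidate names and scores below are invented for illustration.

```python
# Minimal sketch of a satisficing-metric model selection.
candidates = [
    {"name": "A", "auc": 0.71, "p_at_1": 0.04},  # best AUC, fails the constraint
    {"name": "B", "auc": 0.68, "p_at_1": 0.09},
    {"name": "C", "auc": 0.66, "p_at_1": 0.12},
]
random_precision_at_1 = 0.05  # hypothetical baseline from serving ads at random

# Satisfice: keep only models whose precision at 1 beats random...
feasible = [m for m in candidates if m["p_at_1"] > random_precision_at_1]
# ...then optimize: among those, take the best AUC.
best = max(feasible, key=lambda m: m["auc"])
print(best["name"])  # → B: best AUC among models beating the baseline
```

Note that model A, despite having the highest AUC overall, is rejected because it fails the satisficing constraint.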
15. The road ahead: Balancing exploitation and exploration
Choose an ad based on ONLY CTR vs. choose an ad based on OTHER criteria.
Most common approaches:
• ε-greedy
• ε-decreasing
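The ε-greedy approach above can be sketched in a few lines: with probability ε serve a random ad (explore), otherwise serve the ad with the highest estimated CTR (exploit). The CTR estimates here are placeholders; ε-decreasing is the same idea with ε shrinking over time.

```python
# Minimal sketch of epsilon-greedy ad selection.
import random

def choose_ad(ctr_estimates, epsilon=0.1, rng=random):
    """Pick an ad id: explore with probability epsilon, else exploit."""
    ads = list(ctr_estimates)
    if rng.random() < epsilon:
        return rng.choice(ads)               # explore: any ad, uniformly
    return max(ads, key=ctr_estimates.get)   # exploit: highest estimated CTR

ctrs = {"ad1": 0.02, "ad2": 0.05, "ad3": 0.01}  # placeholder CTR estimates
print(choose_ad(ctrs, epsilon=0.1))
```

With ε = 0 this collapses to "choose the ad based on ONLY CTR"; a nonzero ε keeps showing other ads occasionally, so their CTR estimates stay fresh.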
16. Learnings
1. Start lean to prove the value of your machine learning project.
2. Speak up from the beginning about the benefits and requirements of using ML in the product (talk about time and costs).
3. If you have problems with dimensionality, explore different ways of optimizing your resources, e.g., mini-batches, the hashing trick.
4. Advertising systems are very dynamic, so be aware of how often you need to update the model.