This paper describes Google's approach to detecting adversarial advertisements. They used a combination of automated and semi-automated methods: (1) machine learning models trained on various features to classify ads, (2) an ensemble-aided stratified sampling method to select ads for human experts to label, and (3) leveraging expert knowledge through active learning, rule-based models, and quality monitoring. The goal was to protect users while ensuring online ads remain trustworthy.
3. Introduction
- Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD (2011)
- Situation: Online Advertising System
- Google’s revenue came mainly from advertising (approximately 80%) and grew yearly.
- Types of adversarial advertisement
• Counterfeit goods
• User safety issues
• Phishing
• Unclear or deceptive billing
• Malware
(source) Google (GOOGL) Business Strategy Analysis
4. What problem they solved
Challenges
• High cost of FPs, FNs
• Minority-class and multi-class issues
• Training many models at scale
Goal
• to detect and block adversarial advertisements
• to protect users and ensure that online advertising remains a trustworthy source of commercial information
5. How they solved: automated and semi-automated
(System diagram) An ad-crawl data feed passes through an ensemble of models, each trained and evaluated via MapReduce on labeled ad data (I). Model aggregation produces a combined score: high-confidence ads are allowed to serve or blocked from serving, while low-confidence ads go through ensemble-aided sampling (II) to domain experts, who are supported by exploratory tools, unbiased metrics, and human-expert quality monitoring (III).
6. How they solved (I): Learning methods (1/2)
• Features
- string-based, page type, crawl-based, link-based, non-textual content-based, advertiser account level, policy-specific, etc.
• Minority-class and multi-class issues
- One-vs-Good Multi-Class Classification
- Learning-to-Rank Methods for Classification (ROC-SVM)
- Cascade Models
(Figure 5) Performance of Cascade Models vs. Single Models: improvement in recall at high precision levels.
(Figure 4) Multi-class Cascade Framework; learning-to-rank uses labeled pairwise examples of the form (x₊ − x₋, +1).
(Figure 2) Class Structure.
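The pairwise example (x₊ − x₋, +1) mentioned above can be sketched as follows (a minimal illustration, not the paper's code): each (positive, negative) pair yields a difference vector labeled +1, which reduces ranking to binary classification, as in ROC-SVM-style learning-to-rank.

```python
import numpy as np

def make_pairwise_examples(X_pos, X_neg):
    """Build ROC-SVM style training pairs: each (x+, x-) pair
    yields the difference vector x+ - x- with label +1.
    A linear model that scores these differences positive
    ranks every positive ad above every negative ad."""
    diffs = []
    for xp in X_pos:
        for xn in X_neg:
            diffs.append(xp - xn)
    X = np.array(diffs)
    y = np.ones(len(diffs))  # every difference vector gets label +1
    return X, y

# Tiny illustration: 2 positive and 2 negative ads in a 3-d feature space.
X_pos = np.array([[1.0, 0.0, 2.0], [0.5, 1.0, 1.5]])
X_neg = np.array([[0.0, 0.0, 1.0], [0.2, 0.1, 0.0]])
X_pairs, y_pairs = make_pairwise_examples(X_pos, X_neg)
print(X_pairs.shape)  # → (4, 3): 2 x 2 pairs
```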
7. How they solved (I): Learning methods (2/2)
• Training many models at scale
- Focus on scalability, engineering work
- MapReduce SGD
- Control model size (feature hashing + projected gradient)
(Figure 6) SGD learning via MapReduce: do expensive work in parallel, do cheap work sequentially. Preprocessing is parallelized; training is sequential!
• Model management
- Calibration
- Monitoring
- is model performance good or not (precision/recall; evaluated outside production)
- are input features stable or not (if not, re-tune the model)
- do model output scores drift relative to ground truth y or not (stay aware; re-tune the model)
- system quality, measured through the pipeline (ensemble-aided stratified sampling)
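One simple way to monitor output-score drift (a hypothetical sketch; the paper does not prescribe a specific statistic) is to compare the current score distribution against a reference window and flag the model for re-tuning when the mean shifts too far:

```python
from statistics import mean, stdev

def score_drift(reference_scores, current_scores, z_threshold=3.0):
    """Flag drift when the current mean score moves more than
    z_threshold reference standard deviations from the reference mean."""
    mu, sigma = mean(reference_scores), stdev(reference_scores)
    if sigma == 0:
        return bool(current_scores) and mean(current_scores) != mu
    z = abs(mean(current_scores) - mu) / sigma
    return z > z_threshold

# Hypothetical model scores from a reference window vs. two later windows.
ref = [0.10, 0.20, 0.15, 0.12, 0.18]
print(score_drift(ref, [0.11, 0.19, 0.14]))  # → False: same regime
print(score_drift(ref, [0.85, 0.90, 0.80]))  # → True: scores drifted upward
```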
8. How they solved (II) : Ensemble-aided stratified sampling
(Figure7) Ensemble-Aided Stratified Sampling.
• The multiple needs from hand-labeled data:
- Catching hard adversaries
- Improving learned models
- Detecting new trends
• Ensemble-aided stratification
- The ensemble divides ads into 3 categories
- Scores from the ensemble model are used to divide the ads in each category into score bins containing different numbers of ads.
- How many ads to select depends on the goals above.
• Priority sampling from bins
- Impression counts follow a heavy-tailed distribution
- Priority sampling of ads from bins (Duffield et al.)
- near-optimally low variance
• Increased the effective impact of human experts by 50%
→ selecting from new & all others
→ mid-probability
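Priority sampling from a bin can be sketched as below (a minimal version of Duffield et al.'s scheme, with impression counts as weights; all names are illustrative): each ad gets priority wᵢ/uᵢ for a uniform uᵢ in (0, 1], and the k largest priorities are kept, which gives near-optimally low variance for heavy-tailed weights.

```python
import random

def priority_sample(items, weights, k, rng=random.Random(0)):
    """Duffield-Lund-Thorup priority sampling: draw u_i ~ Uniform(0,1],
    set priority q_i = w_i / u_i, keep the k items with largest q_i.
    Heavy items (large weights) are selected with high probability."""
    priorities = [(w / max(rng.random(), 1e-12), item, w)
                  for item, w in zip(items, weights)]
    priorities.sort(reverse=True)
    # tau = (k+1)-th largest priority; each kept item's weight
    # estimate is max(w, tau), which makes the sample unbiased.
    tau = priorities[k][0] if len(priorities) > k else 0.0
    return [(item, max(w, tau)) for _, item, w in priorities[:k]]

# Hypothetical ads with heavy-tailed impression counts as weights.
ads = ["ad_a", "ad_b", "ad_c", "ad_d", "ad_e"]
impressions = [1_000_000, 10, 5, 500_000, 2]
sample = priority_sample(ads, impressions, k=2)
print([item for item, _ in sample])  # the two heaviest ads dominate
```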
9. How they solved (III) : Leveraging Expert Knowledge &
Data Quality Evaluation
• Active Learning
- Periodically detect new categories of bad ads
- Margin-based uncertainty sampling (crowd-sourcing & experts)
- When to stop?
• Exploring Adversaries
- Information retrieval system
• Rule-Based Model
- Although rule-based models account for only 4% of the overall system impact, they provide an important capability to respond to new adversarial attacks within minutes of discovery.
• Monitor
- Human rater quality
- User experience
Actively select hard samples through algorithms
(source): Active Learning: a solution for reducing the time, space, and economic costs of deep learning
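Margin-based uncertainty sampling, used above to actively select hard samples for labeling, can be sketched as follows (illustrative only; names are not from the paper): score each unlabeled ad under the current model, and send the ads where the top two class probabilities are closest to human raters.

```python
def margin_uncertainty_sample(scores, k):
    """Pick the k examples whose top-two class probabilities are closest,
    i.e. where the model is least certain. `scores` maps example id to
    a list of per-class probabilities."""
    margins = []
    for ex_id, probs in scores.items():
        top2 = sorted(probs, reverse=True)[:2]
        margins.append((top2[0] - top2[1], ex_id))
    margins.sort()  # smallest margin = most uncertain first
    return [ex_id for _, ex_id in margins[:k]]

# Hypothetical per-class probabilities for three unlabeled ads.
scores = {
    "ad_1": [0.95, 0.03, 0.02],  # confident -> skip
    "ad_2": [0.40, 0.38, 0.22],  # uncertain -> label it
    "ad_3": [0.51, 0.48, 0.01],  # uncertain -> label it
}
print(margin_uncertainty_sample(scores, k=2))  # → ['ad_2', 'ad_3']
```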
10. Combining automated and semi-automated effort is powerful
(The system diagram from slide 5 again, with stages (I), (II), (III) highlighted.)
More research is needed on automated classification methods and on system-level challenges.