This document discusses Vpon's mobile advertising system and recommender model. It describes the basic concept, challenges, and infrastructure of Vpon's ad serving platform. It then focuses on the recommender system, outlining the design, implementation, and evaluation process. Key steps include calculating ad and user similarities, predicting user preferences, optimizing ad delivery, and continuously improving based on results. The recommender significantly increased click-through and conversion rates while reducing costs.
2. Outline
Background, challenges and KPIs
Basic concept
Challenges and KPIs
Vpon Ad service infrastructure
AD effectiveness related work
Recommender
System flows
Summary
Q&A
8. Integration
Advertisers
• Placing ads, charged in CPC or CPM
• Criteria: time, locations, app categories, budget
• Ad performance reports
Mobile app publishers
• Apps
• App reports
Mobile app users
9. Vpon AD services backend
User Context
• Runtime information
• User's Ad Requests
Ad Serving
• Scalable AD Serving
• Transaction & Billing
• Real-time Ad Selection
Data Archiving & Analysis
• User Scenario Modeling
• Data Mining
• Ad-hoc Analytics
• Reporting & Data Warehouse
• Platform: MR/Spark, HBase, HDFS
Adaptive AD Distribution System
• Continuous Improvement
• Ad performance
10. Some numbers for Vpon AD Network
• 60+ M Monthly Active Unique Devices
• 200+ M Daily Ad Requests
• 2+ T Ad transaction records over time
• 25+ M Cell Towers/Wi-Fi AP Location Data
• Taipei, Shanghai, HK, Beijing and Tokyo
• 2 IDCs (Taipei, Shanghai) and some Amazon EC2 nodes
11. Vpon AD services backend functions
(Architecture diagram; components recovered from the slide)
Ad serving
• Ad Requests (HTTP POST), HA Proxy, Ad web service, Backend
• Memory cache, In-memory Grid, Ad Request Queue
• User Profiles (CouchDB and HBase)
• Creatives and videos: ad videos and images (HTTP GET) through CDN and Web Proxy + Cache
Data Processing and Archiving
• Message Routing (Apache Kafka), Avro, Rsync
• Hadoop Distributed File System (HDFS), HBase, MapReduce/Spark
Data Analysis
• Python + Pig, Hive, Hadoop Streaming, Spark
• Recommender System, Scenario modeling, other ongoing topics
Reporting system
• AD management and Report UI (Django, SSH) for Advertisers
• Sales Support System, ad-hoc reporting
Operation
• Ganglia, Solr, AD Operation, AD Monitoring System
14. Recommender
Types
User (IMEI) based recommender system
Item (ad) based recommender system
Steps
Step1: Campaign/AD similarity table
Step2: Prediction Phase
Step3: Verification Phase
(Continuous Improvement)
15. Recommender flow
Goal
• Serve ads according to users' preferences
Data Selection
• Select user records of Ad Click/Conversion actions by different kinds of Apps
• Select user logs of Location, Date/Time, Usage Freq., Area, Movement Speed…
Machine Learning (e.g. recommender)
• Identify the relation of conversion types, App info, Ad info and user info to best choose configurations
• Campaign/AD similarity calculation
• User preferences
Prediction
• Advertising in accordance with the identified targeted users
Evaluation
• Feed the AD execution results back into the system to adjust the modeling adaptively
16. Step1: Ads' Similarities
Rows: unique device IDs from the latest K months
Columns: historical and ongoing ads (App downloads as conversions)

          Ad 1   Ad 2   Ad 3   Ad 4   Ad N
User 1    0      0      1      0      0
User 2    1      1      0      1      0
User 3    1      1      1      1      1
User 4    1      1      0      0      0
User N    …      …      …      …      …
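The slides do not say which similarity measure is used for the campaign/ad similarity table. As a minimal sketch, assuming cosine similarity over the binary conversion columns of the matrix above, the table could be computed like this. The names `conversions`, `cosine` and `ad_similarity_table` are illustrative and not taken from the Vpon system.

```python
from itertools import combinations
from math import sqrt

# Toy conversion matrix copied from the slide: one row per user (device ID),
# one column per ad; 1 = the user converted on that ad, 0 = no conversion.
ads = ["ad1", "ad2", "ad3", "ad4", "adN"]
conversions = {
    "user1": [0, 0, 1, 0, 0],
    "user2": [1, 1, 0, 1, 0],
    "user3": [1, 1, 1, 1, 1],
    "user4": [1, 1, 0, 0, 0],
}


def cosine(col_a, col_b):
    """Cosine similarity between two binary ad columns (assumed metric)."""
    dot = sum(a * b for a, b in zip(col_a, col_b))
    # For 0/1 entries the squared norm equals the plain column sum.
    norm = sqrt(sum(col_a) * sum(col_b))
    return dot / norm if norm else 0.0


def ad_similarity_table(matrix, ad_names):
    """Return {(ad_i, ad_j): similarity} for every unordered pair of ads."""
    columns = {
        name: [row[k] for row in matrix.values()]
        for k, name in enumerate(ad_names)
    }
    return {
        (a, b): cosine(columns[a], columns[b])
        for a, b in combinations(ad_names, 2)
    }


similarities = ad_similarity_table(conversions, ads)
```

In this toy data Ad 1 and Ad 2 were converted by exactly the same users, so their similarity is the maximum value of 1.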
17. Step2: Users' Preferences
Rows: unique device IDs from the latest K months
Columns: historical and ongoing ads (App downloads as conversions)

          Ad 1     Ad 2     Ad 3     Ad 4     Ad N
User 1    P(1,1)   P(1,2)   P(1,3)   P(1,4)   P(1,5)
User 2    P(2,1)   P(2,2)   P(2,3)   P(2,4)   P(2,5)
User 3    P(3,1)   P(3,2)   P(3,3)   P(3,4)   P(3,5)
User 4    P(4,1)   P(4,2)   P(4,3)   P(4,4)   P(4,5)
User Z    …        …        …        …        …
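How each P(u, i) is computed is not shown on the slide. One common item-based formulation, assumed here purely for illustration, scores an ad by the similarity-weighted average of the user's conversions on the other ads. The similarity values and the `predict_preference` helper below are hypothetical.

```python
def predict_preference(user_row, target, sim):
    """Similarity-weighted average of the user's conversions on other ads."""
    numerator = 0.0
    denominator = 0.0
    for other, converted in user_row.items():
        if other == target:
            continue  # an ad does not vote for itself
        s = sim[frozenset((target, other))]
        numerator += s * converted
        denominator += abs(s)
    return numerator / denominator if denominator else 0.0


# Hypothetical ad-to-ad similarities (made-up values, symmetric by construction).
sim = {
    frozenset(("ad1", "ad2")): 1.0,
    frozenset(("ad1", "ad3")): 0.5,
    frozenset(("ad2", "ad3")): 0.5,
}

# One user's conversion history: converted on ad1 and ad2, not on ad3.
user4 = {"ad1": 1, "ad2": 1, "ad3": 0}

# One row of the preference matrix: P(user4, ad) for every ad.
scores = {ad: predict_preference(user4, ad, sim) for ad in user4}
```

Under these assumptions ad3 receives a high score even though the user never converted on it, because every ad similar to it was converted.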
18. Step3: Prediction Phase: ADs sorted by preference
User 1: …
User 2: …
…
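Sorting ads by preference can be sketched as a per-user ranking of predicted scores. The scores, the `top_n` cut-off and the tie-breaking rule (ad ID order) below are assumptions for illustration only.

```python
def rank_ads(preferences, top_n=3):
    """Return each user's ad IDs ordered from highest to lowest score."""
    ranked = {}
    for user, scores in preferences.items():
        # Sort by descending score; break ties by ad ID for a stable result.
        ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        ranked[user] = [ad for ad, _ in ordered[:top_n]]
    return ranked


# Made-up preference scores for two users.
preferences = {
    "user1": {"ad1": 0.9, "ad2": 0.7, "ad3": 0.1, "ad5": 0.4},
    "user2": {"ad1": 0.2, "ad2": 0.8, "ad4": 0.6, "ad5": 0.6},
}

ranked = rank_ads(preferences)
```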
19. Step3: Prediction Phase: Serving Ads based on Preferences
(Architecture diagram repeated from slide 11, here also showing a Billing System)
Per-user ad lists produced by the Recommender System
user1   ad1, ad2, ad5
user2   ad2, ad4, ad5
user3   ad4, ad5, ad6, ad8
• Persisted on Apache CouchDB
• Replicated to in-memory grid
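The serving-side lookup can be pictured as follows. This is only a sketch: plain dictionaries stand in for CouchDB and the in-memory grid, and the `AdSelector` class, its fallback list and the notion of "active" ads are hypothetical, not part of the described system.

```python
# Stand-in for the persisted per-user ad lists (CouchDB in the slides).
persisted = {
    "user1": ["ad1", "ad2", "ad5"],
    "user2": ["ad2", "ad4", "ad5"],
    "user3": ["ad4", "ad5", "ad6", "ad8"],
}


class AdSelector:
    """Serves ads from an in-memory copy of the persisted preference lists."""

    def __init__(self, store, default_ads):
        self.cache = {}
        self.default_ads = list(default_ads)
        self.refresh(store)

    def refresh(self, store):
        """Copy the persisted lists into memory (stand-in for replication)."""
        self.cache = {user: list(ads) for user, ads in store.items()}

    def select(self, user_id, active_ads):
        """Return the first preferred ad that is still active, else a default."""
        for ad in self.cache.get(user_id, []):
            if ad in active_ads:
                return ad
        for ad in self.default_ads:
            if ad in active_ads:
                return ad
        return None


selector = AdSelector(persisted, default_ads=["ad5"])
choice = selector.select("user1", active_ads={"ad2", "ad5"})
```

Keeping the lists in memory means an ad request needs no database round trip; users without a stored list fall back to the default ads.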
22. Implementation
Hadoop MapReduce as the computing platform
Using Hadoop Streaming with Python
Map: takes a list of ad pairs as input for similarity calculation
Reduce: simply aggregates the map results
Re-modeling on a daily basis based on results
Will move to Hadoop HDFS + Spark + Python for better performance
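A Hadoop Streaming job of this shape might look like the sketch below. The input format (one tab-separated ad pair per line), the Jaccard measure and the `CONVERTERS` side data are all assumptions; the slides only state that the mapper receives ad pairs and the reducer aggregates the results.

```python
import sys

# Hypothetical side data: ad ID -> set of device IDs that converted on it.
# A real job would ship this to every mapper instead of hard-coding it.
CONVERTERS = {
    "ad1": {"u2", "u3", "u4"},
    "ad2": {"u2", "u3", "u4"},
    "ad3": {"u1", "u3"},
}


def jaccard(a, b):
    """Jaccard similarity of two device-ID sets (assumed metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mapper(lines, converters=CONVERTERS):
    """For each 'adA<TAB>adB' line, emit 'adA<TAB>adB<TAB>similarity'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        ad_a, ad_b = line.split("\t")
        sim = jaccard(converters.get(ad_a, set()), converters.get(ad_b, set()))
        yield f"{ad_a}\t{ad_b}\t{sim:.4f}"


def reducer(lines):
    """Aggregate mapper output into one 'ad<TAB>other:sim,...' line per ad."""
    table = {}
    for line in lines:
        ad_a, ad_b, sim = line.strip().split("\t")
        table.setdefault(ad_a, []).append(f"{ad_b}:{sim}")
        table.setdefault(ad_b, []).append(f"{ad_a}:{sim}")
    for ad in sorted(table):
        yield ad + "\t" + ",".join(sorted(table[ad]))


# Streaming entry point: run the script with the argument "map" or "reduce".
if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1] in ("map", "reduce"):
    step = mapper if sys.argv[1] == "map" else reducer
    for out in step(sys.stdin):
        print(out)
```

Because both stages only read lines from standard input and write lines to standard output, the same script can be tested locally with a shell pipe before being handed to Hadoop Streaming as the mapper and reducer commands. This reducer holds the whole table in memory, which is fine for a sketch but would need rethinking for a large ad inventory.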
24. Summary
Build the infrastructure that shows whether models are effective or not as early as possible
AB testing for new models
Automate as much as possible
Monitoring and measurement
Computing resources
Properly manage product, ad-hoc, and analysis jobs
Optimization does work
Use Python wherever it fits