2. Agenda
• The online advertising prediction problem
• The User-Domain matrix as a graph
• PageRank
• K-Core
• Clustering with community detection
• Conclusions
2
4. The User-Domain Matrix
Represent m users’ visits to n web domains as a matrix
4
a.co
m
b.co
m
… n
1 0 0 … 1
2 0 0 … 0
… … … … …
m 0 1 … 1
m = 1
billion
users
n = 500,000 domains How do we make
sense of user
interactions across
sites?
How do we turn very
sparse 500k domains
into signals useful for
modeling?
10. Conclusions
• Graph-based approaches offer new opportunities
to extract information for online ad targeting
• Dato’s GraphLab makes graph analytics very easy
• GraphLab a good first step and can be extended
with more specialized libraries
10
11. Thank you! Contact me
@n_kowski
kyle@radiumone.com
Learn more
RadiumOne.com
LeanDataScience.com
Work with us
radiumone.com/careers
11
Questions?
Editor's Notes
< bipartite -> unipartite visual>
< Community Detection counts>
< Performance by ML cluster >
< bipartite -> unipartite visual>
< Community Detection counts>
< Performance by ML cluster >