For media organizations an understanding of one's audience is key to delivering optimal content. The path to this understanding often runs through logged behavioral data, but can we leverage methods from the world of polling to generate additional insights?
8. Ultimately ads constitute a large fraction of our revenue!
And clearly we depend heavily on the top 2 to do well.
“We are but barnacles living off the back of a massive ad whale…”
10. There’s actually a non-trivial symbiosis here
Instead of obligate commensalism…
11. It’s an asymmetric relationship, to be sure, but publishers provide the ‘real estate’ over which these big whales make their revenue. And we make their platforms sticky.
Publishers
12. Not all smooth sailing on the back of the whale
• Like barnacles, there’s a lot of competition for space
• The terms of that competition are dictated by black-box algorithms
• The platforms get to keep all that good data
14. Hopeless…?
How do we gain an advantage?
Can we predict our way to a competitive edge?
16. Marketing Approach: Illuminating the structure of social diffusion
To simplify, we remove the leaves: uniques that arrive on our content through shared links and only view. We’re left with just sharers.
In this case 40 sharers generate a cascade bringing in 2886 views, roughly a 70x efficiency (2886 / 40 ≈ 72 views per sharer)!
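The leaf-removal step and the efficiency figure can be sketched in a few lines. This is only an illustration: it assumes the cascade is available as (referrer, visitor) pairs, and the function names and the toy edge list are hypothetical, not taken from the actual pipeline.

```python
from collections import defaultdict


def split_sharers_and_leaves(edges):
    """Split cascade nodes into sharers and leaves.

    edges: iterable of (referrer, visitor) pairs, one per visit that
    arrived through a shared link (an assumed input format).
    A leaf is a node that never referred anyone; a sharer referred
    at least one visitor.
    """
    referred = defaultdict(set)
    nodes = set()
    for referrer, visitor in edges:
        referred[referrer].add(visitor)
        nodes.update((referrer, visitor))
    sharers = {node for node in nodes if referred[node]}
    leaves = nodes - sharers
    return sharers, leaves


def views_per_sharer(total_views, n_sharers):
    """Average number of views brought in per sharer."""
    return total_views / n_sharers


# Toy cascade: a shares to b and c, b shares to d and e, e shares to f.
toy_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("e", "f")]
sharers, leaves = split_sharers_and_leaves(toy_edges)
print(sorted(sharers), sorted(leaves))

# The numbers quoted on the slide: 40 sharers, 2886 views.
print(views_per_sharer(2886, 40))
```

With the slide's numbers the ratio comes out just above 72, which is where the "roughly 70x" figure comes from.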
17. Next Steps!
Knowledge Graph
• Tracks intra-Mashable network of share interactions
• Anonymously tracks browsing behavior and usage attributes
• Allows us to observe and perhaps predict cross network sharing events
19. Social cascades are fascinating…can we get a deeper view?
What can one do with this sort of data?
20. Arriving at a phenomenological model
The above is a cascade generated from a simulation with simple update rules.
It bears a strong resemblance to what we actually see in our share button experiment.
In fact, it turns out that a simple model of leaf growth/viewer rate yields a model of share behavior with predictive power!
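The slides do not spell out the update rules, so the following is only a minimal sketch of what a cascade simulation with simple rules might look like. Everything here is an assumption made for illustration: each sharer exposes a fixed number of people (`reach`), each exposed person views with probability `p_click`, and each viewer becomes a sharer with probability `p_share`.

```python
import random
from collections import deque


def simulate_cascade(p_share=0.02, reach=50, p_click=0.5,
                     max_nodes=10_000, seed=0):
    """Simulate one share cascade under assumed, illustrative rules.

    Returns (parent, sharers): parent maps each viewer id to the id of
    the sharer who referred them (None for the seed viewer), and
    sharers is the set of viewer ids that went on to share.
    """
    rng = random.Random(seed)
    parent = {0: None}   # node 0 is the seed viewer
    sharers = {0}        # assume the seed viewer shares
    frontier = deque([0])
    next_id = 1
    while frontier and next_id < max_nodes:
        sharer = frontier.popleft()
        for _ in range(reach):
            if next_id >= max_nodes:
                break
            if rng.random() < p_click:          # exposed person views
                parent[next_id] = sharer
                if rng.random() < p_share:      # viewer also shares
                    sharers.add(next_id)
                    frontier.append(next_id)
                next_id += 1
    return parent, sharers


parent, sharers = simulate_cascade()
print(len(parent), "views from", len(sharers), "sharers")
```

With these defaults each sharer produces on average 50 × 0.5 × 0.02 = 0.5 new sharers, so cascades die out on their own; `max_nodes` is just a safety cap for parameter choices where they would not.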
21. Velocity
● Discovered right as it was published
● Over 3,000 data points collected
● Several points where story trajectory changed & prediction found & adapted.
● Early projections very accurately modeled each subseries in the total dataset.
Success!
23. • Facebook’s Accuracy: 79.5%
• Velocity’s Accuracy:
  • 75% accurate after 5 min across all content
  • 80% accurate after 5 hrs across all content
  • 95% accurate after 1 day across all content
  • 100% accurate after 5 min for 70% of content!!
28. Admiral Robert FitzRoy: a true Data Scientist
• attacked a problem for which there was no or mostly dirty data - data collection and munging
• formulated/borrowed a model appropriate to the data
• crafted a classifier (for storm prediction) and aimed for increasing accuracy
• over-inflated title - Meteorological Statist…
• under routine evaluation to justify his salary
29. Social cascades are fascinating…can we get a deeper view?
What can one do with this sort of data?
1) Build Velocity
2) Segment your audience by graph properties
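As one possible reading of "segment by graph properties", the sketch below buckets users by how many visitors they referred directly (their out-degree in the share graph). The input format, the segment names, and the threshold are all hypothetical choices for illustration; the slides do not say which properties were actually used.

```python
from collections import Counter


def segment_by_referrals(parent, hub_threshold=10):
    """Label each user by how many visitors they referred directly.

    parent: dict mapping user id -> referrer id (None for cascade roots),
    an assumed representation of the share graph.
    Segment names ("viewer", "sharer", "hub") are illustrative only.
    """
    referrals = Counter(p for p in parent.values() if p is not None)
    segments = {}
    for user in parent:
        count = referrals[user]   # Counter returns 0 for missing keys
        if count == 0:
            segments[user] = "viewer"
        elif count < hub_threshold:
            segments[user] = "sharer"
        else:
            segments[user] = "hub"
    return segments


toy_parent = {"a": None, "b": "a", "c": "a", "d": "b"}
print(segment_by_referrals(toy_parent, hub_threshold=2))
```

Other properties, such as depth in the cascade or total number of descendants, could be swapped in the same way.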
32. Our results!
[Figure: α(t) for Second Largest Cascade. x-axis: nth Engaging Node (5000 to 25000); y-axis: α (0.2 to 1.0)]
Surprising!
33. Knowledge Graph (KG)
Through these sorts of properties, we’re able to:
• Identify community structure by topic
• Make branded content campaigns significantly more efficient
34. Conclusion
Publishers are heavily challenged, but:
• It’s possible to carve out a data advantage
• Turn it into a predictive analytic capability
• And even a bit of competitive analysis