Presented at ad:techSF 2015 by Michael Tiffany, the co-founder and CEO of White Ops, a security company founded in 2013 to break the profit models of cybercriminals, and Brandon Miller, Sr. Engagement Strategist for Carmichael Lynch.
11. Why you care:
1. Your money gets home users hacked.
2. You are being tricked into tracking bots.
12. Why hack home users?
Hint: not to rob their digital funds and identities
(not that they don’t)
13. If you want to get targeted, you (often) need a consumer’s identity.
That can be arranged.
14. False assumptions:
x Bots are afraid of tracking (nope: hacked goods make them seem legitimate)
x Optimizing for performance, or viewability, or conversions squeezes out the bots automatically (nope)
…our findings show otherwise.
15. Bot fraud is the
scalable ad fraud
Yes, you should probably care about pixel stuffing,
ad clutter, ad collision, etc. etc. etc. But those things
don’t happen on expensive placements. Those
things don’t add up to $6.3 billion. Those
things don’t funnel money to organized crime. Your
CFO cares about stopping money going to
organized crime. He may not care about ad clutter.
35. The attackers adapt
Here they come. Turn the bots off!
They’re leaving. Turn the bots back on.
We have a complaint. Clean it up.
Here they come again…
36. There are some interesting patterns…
When advertisers demand more
traffic, the differential between
available humans and advertiser
demand for traffic can be made up
with bots.
Bots will often supply traffic as
needed in bursts – in this case,
every Saturday
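One rough way to look for the weekly burst pattern described above is to average traffic per weekday and flag weekdays that sit well above the overall daily mean. The sketch below is only an illustration: the function names, the 1.5x factor, and the example counts are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekday_means(daily_counts):
    """Average daily count per weekday (0 = Monday ... 6 = Sunday)."""
    buckets = defaultdict(list)
    for day, count in daily_counts.items():
        buckets[day.weekday()].append(count)
    return {weekday: mean(counts) for weekday, counts in buckets.items()}


def bursty_weekdays(daily_counts, factor=1.5):
    """Weekdays whose average exceeds `factor` times the overall daily mean."""
    overall = mean(daily_counts.values())
    per_weekday = weekday_means(daily_counts)
    return sorted(w for w, avg in per_weekday.items() if avg > factor * overall)


# Hypothetical two weeks of traffic: flat on most days, tripled on Saturdays.
start = date(2015, 2, 2)
example = {}
for offset in range(14):
    day = start + timedelta(days=offset)
    example[day] = 300 if day.weekday() == 5 else 100

print(bursty_weekdays(example))
```

A flagged weekday is only a prompt to look closer; legitimate traffic can also be seasonal.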
37. There are some interesting patterns…
Not all botnets are run by geniuses: some bots are too
dumb to keep daylight hours:
38. Old Browsers Are Bot Browsers
Bots both:
Cycle through many fake user-agents (browsers) to hide in the noise
Provide real user-agents, but don’t get auto-updated
Why are we still supporting old browsers?!
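As a minimal sketch of the "old browser" signal, the snippet below pulls a major version out of a user-agent string and compares it with a cutoff. The cutoff table and the function name are hypothetical; real cutoffs would need to follow each browser's release schedule, and nothing here reflects how White Ops actually detects bots.

```python
import re

# Hypothetical minimum major versions, chosen only for illustration.
MIN_MAJOR_VERSION = {"Chrome": 40, "Firefox": 35}

# Matches tokens such as "Chrome/41.0.2272.89" or "Firefox/12.0".
UA_TOKEN = re.compile(r"(Chrome|Firefox)/(\d+)")


def is_outdated(user_agent: str) -> bool:
    """Return True when the user-agent names a known browser below its cutoff."""
    match = UA_TOKEN.search(user_agent)
    if match is None:
        return False  # unknown browser: this sketch makes no claim
    name, major = match.group(1), int(match.group(2))
    return major < MIN_MAJOR_VERSION[name]
```

An outdated version is a weak signal by itself: plenty of real people run old browsers, and a bot can fake a current user-agent just as easily as an old one.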
40. • Taking on all the botnets at once requires
hardcore malware reverse-engineering and
major intelligence operations.
• We’re in an arms race against the world’s
best cybercriminals.
• It’s fun to point out these patterns, but if all
we had to do was find the patterns, this
problem would have been solved already.
42. We all need to work together
to solve the problem of ad fraud.
43. On the Sell Side,
real can’t compete with fake
If the Buy Side
can’t tell the difference
44. In December 2014, on behalf of a large brand,
the ad agency Carmichael Lynch
decided to make an above-average campaign even better.
45.
46. Carmichael Lynch’s
Anti-Fraud Formula:
Monitor for fraud in all the brand’s campaigns
Use continuous monitoring (Detection) to hold all supply
partners accountable and to reward great ones
Take proactive steps (Prevention) only where it makes
sense for the buyer to take that burden
49. Solution: Protect high value media investment –
reduce fraud where it hits the hardest by dollars
Campaign   Human   Bots   Bots %
1*         350M    20M    5%
2*         260M    20M    7%
3*         190M    14M    7%
4          76M     3M     4%
5*         63M     10M    13%
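The point of the table is that the campaign with the highest bot rate is not the one losing the most impressions to bots. The sketch below ranks the campaigns by bot volume, using bot impressions as a stand-in for dollars (an assumption, since the slide gives no spend figures). It also assumes "Bots %" means bots divided by human plus bot impressions. Because the inputs are rounded to the nearest million, computed shares can differ slightly from the slide's column.

```python
# (campaign label, human impressions, bot impressions), in millions,
# copied from the table above; the inputs are already rounded.
CAMPAIGNS = [
    ("1*", 350, 20),
    ("2*", 260, 20),
    ("3*", 190, 14),
    ("4", 76, 3),
    ("5*", 63, 10),
]


def bot_share(human, bots):
    """Bots as a fraction of all impressions (assumed definition of 'Bots %')."""
    return bots / (human + bots)


def rank_by_bot_volume(campaigns):
    """Sort campaigns so the largest absolute bot counts come first."""
    return sorted(campaigns, key=lambda row: row[2], reverse=True)


for label, human, bots in rank_by_bot_volume(CAMPAIGNS):
    print(f"{label:>2}  bots={bots}M  share={bot_share(human, bots):.1%}")
```

Campaign 5 has the highest rate, but campaigns 1 and 2 each lose twice as many impressions to bots, which is why volume drives the priority here.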
50. Top bot problems:
1. Top volume campaigns had expensive bot problems
2. Small but significant bot percentages across too many placements to address manually
52. Solution: Anti-targeting!
In one day, Carmichael Lynch cut the brand’s bot percentage by 43%.
Bot % of total:
2/22, 13 MM: 5.90%
2/23, 15 MM: 7.80%
2/24, 16 MM: 6.70%
2/25, 14 MM: 3.80%
2/26, 13 MM: 3.40%
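The 43% figure reads as a relative reduction rather than a drop in percentage points. As a quick check, the sketch below pairs the five chart values with the five dates in the order they appear (an assumption, since the chart itself is not reproduced here) and finds the largest day-over-day relative drop.

```python
# Daily "Bot % of total" values, paired with dates in the order they appear
# on the slide; the pairing is an assumption.
DAILY_BOT_PCT = [
    ("2/22", 5.90),
    ("2/23", 7.80),
    ("2/24", 6.70),
    ("2/25", 3.80),
    ("2/26", 3.40),
]


def relative_reduction(before, after):
    """Fractional drop from `before` to `after` (negative if it rose)."""
    return (before - after) / before


def largest_one_day_drop(series):
    """Return (from_date, to_date, reduction) for the biggest daily drop."""
    pairs = zip(series, series[1:])
    return max(
        ((d0, d1, relative_reduction(p0, p1)) for (d0, p0), (d1, p1) in pairs),
        key=lambda item: item[2],
    )


start, end, drop = largest_one_day_drop(DAILY_BOT_PCT)
print(start, end, f"{drop:.0%}")
```

Under that pairing, the move from 6.70% on 2/24 to 3.80% on 2/25 is a 2.9-point drop, which is about 43% of the starting rate.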
53. Top bot problems:
1. Top volume campaigns had expensive bot problems
2. Small but significant bot percentages across too many placements to address manually
3. Bot fraud varied by placement by time: being clean today didn’t guarantee being clean tomorrow
54. Solution: Continuous monitoring
In ongoing fraud-cutting activities, Carmichael Lynch improved traffic by cutting or repairing the worst offenders
55. To defend against sophisticated and basic ad fraud attacks:
✓ Authorize and approve third-party traffic validation technology
✓ Be aware and involved
✓ Use third-party monitoring
✓ Budget for security
✓ Protect yourself, your users, and your media from ad fraud
They’re called bots. They’ve always been something of a problem, and they’ve become something more: fake web browsers going to real (or fake) sites, “viewing” real advertisements, and demanding payment for the service
It’s worse than this.
We are highly conservative scientist hackers
The $6.3B we’re asserting is based on the smokiest of guns
That’s OK. $6.3B in yearly losses is bad enough even as an understatement
What’s more significant is
Who
Where
1) This is not a victimless crime
Criminal networks are being paid to hack home users…with your money
Why home users?
Besides the fact that, as long as you’re there, might as well look around for $$$
2) Targeting is getting hacked too
They hack home users because ads don’t target Amazon EC2
It’s not just about the IPs – when you hack a machine, you get its cookies
Intel: When we see sites that exist purely to host advertisements to bots, they sign up for every tracking scheme they can
The worst bot sites run 4x tracking of legitimate/popular sites
They’re not afraid of tracking – they have the goods from the legitimate user
Many significant systems think they’re safe because they assume bots lack magic cookies
The data does not support that.
Fraud is not evenly distributed. Neither is tuberculosis.
Video is (on average) almost 2.5x as botty as display
Almost a quarter of video advertisement went to nobody
Programmatic is 50% bottier, retargeting is 75% bottier than average
“Premium sites” are safer – only 25% of fraud lived on them – but bots make their way to them too
Huge variance in bottiness according to domain categories
Finance/Family/Food: 16-22%
Sport/Science/Info: 2-3%
Programmatic buys can be OK, but are often risky
One of the largest exchanges consistently yielded about 33% bots
We did see programmatic buys sometimes down at the 3% bot level, though, so it’s not universally bad
DSPs allow some really crazy stunts
One publisher funneled over 90% bot traffic through DSPs to half of study participants
Remember, we also have a selection bias in that our 36 participants are some of the largest advertisers in the world
Their ads were still showing up on sites no human would ever visit or appreciate
Very effective, clearly very profitable
“Playing with fire” (but not always burning)
You can’t just target users, let alone “retarget” them
“Oh, I know you’re somebody who browses CNN, I’ll advertise on fakesite”
Almost twice as many bots when retargeting
One case study: 17% bot on overall traffic, 55% bot on the retargeted campaign
Remember, these are where we’re seeing smoking guns, and the numbers are still severe.
“Premium sites” are safer – only 25% of fraud lived on them – but bots make their way to them too
We are able to detect traffic sourcing – when a site pays another site to “send it traffic”
The majority of sourced traffic that we witnessed was obviously botty, even/especially for premium publishers
Actually our single strongest predictor of bottiness
One direct buy, premium set of 60 campaigns was “shuffled”:
30 highly human placements
30 highly botty placements, varying between 16% and 64% bot
Active campaign to “play the game of averages” (17% bot total)
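The "game of averages" works because a blended bot rate can look tolerable while individual placements are badly contaminated. The sketch below computes an impression-weighted blended rate and then lists the placements that exceed a threshold by themselves. The placement names, impression counts, and the 15% threshold are hypothetical illustration values, not figures from the study.

```python
# (placement name, total impressions, bot impressions): hypothetical values.
PLACEMENTS = [
    ("placement-a", 1000, 20),
    ("placement-b", 1000, 30),
    ("placement-c", 500, 200),
    ("placement-d", 500, 150),
]


def blended_bot_rate(placements):
    """Impression-weighted bot rate across all placements."""
    total = sum(impressions for _, impressions, _ in placements)
    bots = sum(bot_count for _, _, bot_count in placements)
    return bots / total


def placements_over(placements, threshold):
    """Names of placements whose own bot rate exceeds `threshold`."""
    return [
        name
        for name, impressions, bot_count in placements
        if bot_count / impressions > threshold
    ]


print(f"blended: {blended_bot_rate(PLACEMENTS):.1%}")
print("over 15%:", placements_over(PLACEMENTS, 0.15))
```

In this made-up example the blended rate stays under the threshold even though half the placements are well above it, which is the reason to check per placement rather than per buy.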
One direct video buy at an unambiguously premium publisher yielded 98% bottiness
How does this happen?
1) There was money on the table
2) They didn’t think they’d get caught
The bad guy is not the advertiser, the agency, the exchanges, or the publishers (usually)
The bad guy is the hacker.
Everybody needs to work together to trace where the hacker is.
Surprisingly, the most common question: “How do the bad guys make money?”
1) Fake sites
2) Real sites that need a little more traffic
Fake sites
Objectively and measurably awful content
Scraped or copied from other sites wholesale
Doesn’t matter, nobody human goes there anyway
Sites host ads, ads generate revenue
75% of bot traffic went to non-premium, mostly fake sites
Definitely a dominant paradigm…
…but does this mean the real sites are cleaner?
…are there real sites?
Difficulties: It’s a huge ecosystem, and everyone profits from bigger numbers but you
They knew we were coming, and turned off the bots
They thought we were leaving, and turned the bots back on
We lied about when we were coming and going
We get to do that
We watched ‘em come and go
(For one particular study participant)
Also watched significant adaptation to complaints
If we called out an ad flow as botty, suddenly it’d be less botty
Somebody knows
Annoying, we sure spend a lot to make sure we’re compatible with old browsers…
Bots either:
A) Fake their user-agent to be some old/random value, so they can’t easily be identified
B) Don’t fake their user-agent, but also don’t get autoupdated by OS/real user browser