Bot detection deck 042514 final
2. A CLOSER LOOK AT BOTS
A Vindico Investigation – 1Q2014
3. What’s a Bot?
• An Internet bot is a software application that runs automated
tasks over the Internet
• Bots can be used for good (search indexing) or bad (ad
impressions, hacking, etc.)
• Reports now indicate there is more bot traffic than human traffic
on the Internet
• There are 3 main ‘types’ of bots:
• Crawler/Spider
• Covert Crawler
• Zombie Computers (Botnet)
• Bad bots are impacting the video advertising industry
4. Bot: Crawler / Spider
• USES: Automated data collection, indexing
• HARDWARE: Typically runs on a cluster of Virtual Machines (VM) on servers located in a
datacenter
• ACTIVITY: Generally just makes ‘GET’ requests to static webpages and analyzes responses
for links, content, etc. Crawler/spiders do not render the webpage in a browser
• EXAMPLES: GoogleBot, BingBot
• DETECTION: These bots usually identify themselves in their user-agent string
• ADS: Typically would not render an ad. In addition, these bots are almost always on the
IAB Bot List and are excluded from impression counts by MRC-accredited ad servers, which
rely on the fact that the bots identify themselves in their user-agent string
• VIEWABILITY: Not Applicable (ads not rendered, impressions filtered)
Benign
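The user-agent filtering described on this slide can be sketched as follows. This is a minimal illustration, not an MRC-accredited implementation: the token list and the impression record layout are assumptions, and a production filter would match against the IAB Bot List rather than two hard-coded names.

```python
# Hypothetical sample of self-identifying crawler tokens (assumption).
# A real ad server would load the IAB Bot List here instead.
KNOWN_BOT_TOKENS = ("googlebot", "bingbot")


def is_declared_bot(user_agent: str) -> bool:
    """Return True if the user-agent string contains a known crawler token."""
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_BOT_TOKENS)


def filter_impressions(impressions):
    """Keep only impressions whose user agent is not a declared bot."""
    return [imp for imp in impressions if not is_declared_bot(imp["user_agent"])]


# Illustrative records; the dict layout is an assumption.
sample = [
    {"id": 1, "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"id": 2, "user_agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 Chrome/34.0 Safari/537.36"},
    {"id": 3, "user_agent": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"},
]
counted = filter_impressions(sample)
```

This only works because benign crawlers announce themselves. It does nothing against bots that send ordinary browser user-agent strings.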
5. Bot: Covert Crawler
• USES: Generally malicious – associated with ad fraud, spam, hacking, scraping
• HARDWARE: Typically runs on a cluster of Virtual Machines (VM) on servers located in a
datacenter
• ACTIVITY: Mimics a human with full browsing and rendering behavior (plugins, cookies,
user-agent, mouse movement, time delays, engagement with pages of the site)
• EXAMPLES: Client Connections Media, VERSA*, DDC*
• ADS: Attempts to trick ad tracking systems so it registers as a true impression. These
crawlers do not identify themselves. In fact, they use a variety of real user-agent strings
that are indistinguishable from those of real users
• VIEWABILITY: Both geometric and browser optimization approaches to viewability will
register the ads as viewable
Generally Malicious
*Source: detailed within this deck
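Because covert crawlers send realistic user-agent strings, a different signal is needed. One option that follows from the HARDWARE note is to check whether an impression's IP address falls inside a known datacenter range. The sketch below assumes such a range list exists. The deck does not describe this as Vindico's method, and the CIDR blocks are RFC 5737 documentation placeholders, not real hosting ranges.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), standing in for a
# hypothetical list of datacenter networks.
DATACENTER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_datacenter_ip(ip: str) -> bool:
    """Return True if the address belongs to any listed datacenter network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DATACENTER_RANGES)


suspect = is_datacenter_ip("192.0.2.17")    # inside a listed range
ordinary = is_datacenter_ip("203.0.113.5")  # not in the list
```

A check like this would not catch traffic that comes from real users' machines and IP addresses.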
6. Bot: Zombie Computer (Botnet)
Real machines ‘infected’ with software (‘virus,’ ‘worm,’ ‘malware’) that allows
a remote party to take control of various parts of the system.
• USES: Malicious – associated with ad fraud, hacking (bank accounts, emails, credit cards),
Bitcoin Mining, Ransomware
• HARDWARE: Can take over any PC, smart phone, or device. Typically created for
Windows (PC) and Android (mobile) environments, but not limited to those
• ACTIVITY: ‘Borrows’ the user’s machine, processing power, or Internet connection / IP as a
proxy to open invisible browser windows, load sites/ads, and snoop on users. Replicates over
the network
• EXAMPLES: CryptoLocker, ZeuS, TDSS, ZeroAccess, ASPROX
• ADS: Attempts to trick ad tracking systems so the operators get paid. Uses real user machines
and inherits real user IP addresses, real user-agent strings, cookies, etc.
• VIEWABILITY: Exploits geometric viewability flaws
Malicious
7. Bots Have Negative Impact on Video Advertising
• Ad fraud has become an incredibly lucrative business for bot operators, especially with
the rise of online video where CPMs are much higher and detection capabilities have
historically been much lower.
• This has driven two major trends in the industry over the past 2 years:
• The number of impressions has skyrocketed
• CPMs have decreased
• The two parties that are negatively impacted the most are advertisers and real
publishers.
• Middle men are still able to make their margin, but lower CPMs force them to use the
(cheaper) fraudulent inventory sources, which therefore continue to feed the beast and
grow the problem.
Source: Vindico Adtricity, Q1 2014; Annual Estimate based on $15 CPM
8. What Vindico Bot Detection Uncovered
Using the Adtricity system, we’ve identified the top 700,000 bots and zombie machines
(botnets) in Q1 2014.
• Initial launch will focus on the top 50% of Bots:
• 11.23% of all Vindico-Adtricity VPAID Imps in Q1
• 7.9B Vindico-Adtricity Bot Impressions in Q1
• $76 million* in fraud in Q1 alone, just in US online video.
• 66% of bot impressions were from ‘zombie computers’; 34% were from
‘covert crawlers’
• Affected Advertisers
• Avg: 10.06% of impressions
• Highest: 52.66% of impressions
• Breakdown by Publishers:
• Highest: 50.9% of impressions
• Media Companies: <2% of impressions
• Networks: 24% of impressions
*Estimate based on $15 CPM
The number of bots and the number of impressions
affected are both rising (see Q1 trend graph above)
9. Exposing Bots: Covert Crawlers
Covert Crawler ‘Versa’
• Stats: 55 million imps / month = $825k / month*
• Total Sites: 5 Core with at least 100 total
• Notes: Sites are same template, fake display ads, tokenized urls, VMs spoofing
user agents, exact amount of caps ads / IP, rotated screen resolutions, etc.
Distributed Data Center
• 150 million imps / month = $2.2 million / month*
• Top Sites: techbrowsing.com (1/2 the size of all Versa), anchorfree.us,
recipeaccess.com
• Total Sites: 15 – 20 Core with at least 100 total
• 7-10 Core datacenters
• Examples: Host Protocol, EGIHosting, MyPrivateProxy.net, GIGLINX, Alentus%,
ManageDNS
Generally Malicious
*Estimate based on $15 CPM
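The monthly dollar figures on this slide follow from the $15 CPM assumption in the footnote, where CPM is the price per 1,000 impressions. A minimal worked sketch (the function name is illustrative):

```python
def estimated_spend(impressions: int, cpm_usd: float = 15.0) -> float:
    """Estimated spend in USD: (impressions / 1,000) * CPM."""
    return impressions / 1000 * cpm_usd


# Monthly impression volumes taken from the slide.
versa_monthly = estimated_spend(55_000_000)   # 825000.0
ddc_monthly = estimated_spend(150_000_000)    # 2250000.0
```

The first result matches the $825k / month stated for Versa. The second comes to $2.25 million, which the slide gives as $2.2 million for the Distributed Data Center.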
10. Exposing Bots: Botnets
The Asprox / Kuluoz Botnet
• Currently this botnet is extremely active
• Current main method of initial infection: malware-phishing emails
• WhatsApp Message (via a link)
• Notice to Appear in Court (via an attachment)
• Once installed, it follows the below chain to PPC networks *:
*Source: techhelplist.com
Malicious
12. How to Fight Bots in Video Advertising
Viewability alone is not enough.
• Bots can fool viewability
• Good viewability vendors will record bot impressions as non-viewable, but some bots can
manipulate viewability metrics for the campaign
Bot filtering alone is not enough.
• 1x1 iframes can still be manipulated
Bot filtering + viewability is not enough.
• Certain sites and measurements can be manipulated (e.g., porn sites, player size, etc.)
A combination of multiple metrics, including viewability, execution,
content, and traffic, is the only way to truly protect ad dollars and grow
the ecosystem to the point where it can truly complement TV for brand
advertisers.
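The argument on this slide is that no single check is sufficient, so an impression should count only when every signal is clean. A minimal sketch under stated assumptions: the field names, the mapping of "execution" to rendered player size, and the 300x250 minimum are all hypothetical and do not come from the deck.

```python
from dataclasses import dataclass

# Hypothetical minimum player size in pixels; the deck gives no threshold.
MIN_WIDTH, MIN_HEIGHT = 300, 250


@dataclass
class Impression:
    flagged_as_bot: bool    # traffic signal (bot filtering)
    viewable: bool          # viewability signal
    player_width: int       # execution signal: rendered player size (assumed mapping)
    player_height: int
    content_approved: bool  # content signal (site or category check)


def passes_all_checks(imp: Impression) -> bool:
    """Count an impression only if every signal is clean, not just one of them."""
    big_enough = imp.player_width >= MIN_WIDTH and imp.player_height >= MIN_HEIGHT
    return (not imp.flagged_as_bot) and imp.viewable and big_enough and imp.content_approved


# A 1x1 iframe that slips past bot filtering and reports itself as viewable.
hidden_iframe = Impression(False, True, 1, 1, True)
normal_player = Impression(False, True, 640, 360, True)
```

Here `hidden_iframe` passes bot filtering and viewability but fails the size check, which is the kind of gap the slide attributes to using bot filtering or viewability alone.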
13. How Vindico Helps
There are 3 strategic components to our Detection System:
1. Data Collection
• 40% of all online videos. More data points than anyone else.
2. Data Processing
• Big Data.
• Adtricity servers process over 1 million events every minute.
• This data has to be logged, loaded, and ready for analysis in real time.
• Even Hadoop, the most well-known Big Data framework, was not
enough.
• Adtricity utilizes a cutting-edge Big Data framework called Spark.
3. Data Analysis
More data than a human could ever analyze.
Adtricity uses cognitive thinking (artificial intelligence) through machine
learning to detect and block bots in real time.
Adtricity is a comprehensive measure of quality
offering a standardized and transparent system of
measurement to the industry. Adtricity brings together
viewability and verification into a single solution.
14. Conclusion
Bots have infiltrated the video advertising industry and are increasing in scale and impressions at an alarming rate.
Vindico’s Bot Detection technology was developed to help advertisers combat fraudulent activity in video advertising. Bot detection is most powerful when
part of a buy-side platform as it is organically integrated from the point of delivery and can be used across the full scope of the advertiser’s buy.