Like most ecommerce sites, StubHub faces competitors that try to scrape its prices and monitor its inventory and customer behavior. Meanwhile, other nefarious actors attempt brute force attacks and transaction fraud. Learn advanced website security and web infrastructure management strategies from StubHub, the world’s largest ticket marketplace, and Distil Networks, the global leader in bot detection and mitigation.
Learn how to:
- Protect prices and product listings from being scraped or monitored by competitors
- Defend your site against brute force login attacks and carding
- Ensure brand secrets and pricing schedules are kept safe
- Increase revenues by ensuring traffic is from legitimate sources
- Protect your brand image, reputation and SEO rankings
3. #RSPS15
Questions, Tweets & Resources
Submit your questions here
Download today’s resources
Join the conversation: #RSPS15
4. #RSPS15
About Retail TouchPoints
Launched in 2007
Over 30,000 retail subscribers
To provide executives with relevant, insightful content across a variety of digital media
Sign up for our weekly newsletter:
www.retailtouchpoints.com/subscribe
6. StubHub’s Field Guide to Preventing Competitor Price
Scraping, Unwanted Transactions, Brute Force Attacks, and
Click Fraud
7. Agenda
The growing bot problem
The impact of bots on e-commerce businesses
How StubHub squashed malicious bots
Selection criteria for a bot detection solution
Q & A
8. What Is Web Scraping?
Web Scraping
Also known as screen scraping, web scraping is the act of
copying large amounts of data from a website – either
manually or with an automated program (Bot)
Legitimate Scraping
Scraping can sometimes be benevolent and totally acceptable, for example when search engine bots index your website
Malicious Scraping
A systematic theft of intellectual property accessible on a
website, including pricing, content, images, and proprietary
data
9. Web Scraping at Large Online Beauty Retailer
Black Friday saw a
100x Increase in
Bad Bots
10. Beauty Retailer Clamps Down on Competitive Data Mining
One of Europe’s largest online beauty retailers.
Challenge: Competitors were scraping product and pricing data, using it to lure customers away
Distil result: Stopped competitors from scraping pricing and product data by blocking bad bots
Challenge: Traffic from malicious bots was consuming server resources and slowing site performance
Distil result: Eliminated bad bot traffic, cutting server resource needs by 22% while improving performance
Challenge: Tracking suspicious IP addresses was a tedious manual process
Distil result: Automated the bot detection and mitigation process, saving valuable IT resources
“We have a handful of competitors that cause us a lot of headaches. With Distil, we’ve stopped them from scraping our data, which protects our competitive advantage. In addition, we’ve reduced the load by 22%, and our customers experience faster response times.”
- Principal Solutions Developer
11. How Big is the Problem?
Up to 60% of traffic on ecommerce websites comes from bad bots
4.2 million IP addresses impacted by “Pushdo” botnet alone
15% bot traffic can equate to hitting each of your pricing pages
30 times per month
12. Why the Massive Increase in Bot Traffic?
Online data has increased in value
Pricing information, product availability, product
descriptions, and vendor reviews are changing
daily and highly valuable to competitors
Anyone can get in the game
Cheap or free virtual servers, bandwidth, easy-to-
use tools, and scrapers for hire
Bots no longer tied to IP addresses
Bots cycle through random IP addresses
Bots hide behind anonymous proxies
Consumer IPs now infected with bot traffic too
13. High Profile Web Scraping in the Ecommerce
Industry
QVC is an American television home shopping
network and online ecommerce site.
Aggressive price and inventory scraping by shopping aggregator app
resulted in the following repercussions for QVC
● Two day website outage
● Loss of $2M in revenue
● Highly publicized lawsuit
● Damage to QVC Brand
14. Bots and Negative SEO Attacks
Bots steal content, product lists, and prices for
duplication elsewhere on the Internet
Duplicated content reduces your company’s
uniqueness and thus quality score
SEO damage may result, especially if
○ Your prices are undercut
○ The content is repurposed on a more popular site
15. Bots and Competitive Data Mining
Duplicating your Product Portfolio
Bots can easily gather product and supplier lists
for replication elsewhere
Undermining your Prices
Bots monitor your prices, ensuring competitors
can undercut with lower price listings
Availability Tracking
Identifying when your supply has been exhausted provides competitors a unique
opportunity to raise the price of their goods.
16. Bots and Security Breaches
Brute Force Account Takeover
Using a bot to try, on your site, usernames and passwords
stolen in breaches at other websites
Newly compromised accounts are then used for various forms
of fraud/theft
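A minimal sketch of one way to spot this pattern: count failed logins per client inside a sliding time window and flag clients that exceed a threshold. The class name, the threshold, and the window size are illustrative assumptions, not StubHub's or Distil's implementation. Because bots rotate IP addresses, the client key would ideally be a device fingerprint rather than an IP.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Flags a client that exceeds max_failures failed logins within window_seconds."""

    def __init__(self, max_failures=5, window_seconds=60.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # client key -> timestamps of that client's recent failed logins
        self._failures = defaultdict(deque)

    def record_failure(self, client_key, timestamp):
        """Record one failed login; return True if the client should be challenged."""
        recent = self._failures[client_key]
        recent.append(timestamp)
        # Drop failures that have aged out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_failures


# Hypothetical traffic: four failures within five seconds from the same client.
monitor = FailedLoginMonitor(max_failures=3, window_seconds=60.0)
results = [monitor.record_failure("client-a", t) for t in (0.0, 1.0, 2.0, 5.0)]
```

Here the fourth failure trips the threshold. A real deployment would also need to expire idle clients and decide what "challenged" means (CAPTCHA, temporary lock, block).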
17. Bots and Transaction Fraud
Carding
Creating micro-transactions with stolen credit cards
against e-commerce sites to test their validity
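As a rough illustration of how carding can show up in transaction data, the sketch below groups small charges by client and flags clients that tried many different cards. The function name, the tuple layout, and both thresholds are assumptions made for this example; real fraud tooling correlates far more signals.

```python
from collections import defaultdict


def find_card_testing_clients(transactions, max_amount=2.00, min_distinct_cards=3):
    """Return client keys that made small charges on many different cards.

    transactions: iterable of (client_key, card_token, amount) tuples.
    """
    small_charge_cards = defaultdict(set)
    for client_key, card_token, amount in transactions:
        if amount <= max_amount:
            small_charge_cards[client_key].add(card_token)
    return sorted(
        client
        for client, cards in small_charge_cards.items()
        if len(cards) >= min_distinct_cards
    )


# Hypothetical data: client-a tests three cards with micro-transactions,
# client-b makes one normal purchase and one small one on the same card.
sample = [
    ("client-a", "card-1", 1.00),
    ("client-a", "card-2", 0.50),
    ("client-a", "card-3", 1.25),
    ("client-b", "card-9", 85.00),
    ("client-b", "card-9", 1.00),
]
suspects = find_card_testing_clients(sample)
```

Only client-a is flagged, since client-b used a single card for its one small charge.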
18. About StubHub
Largest secondary ticket marketplace in the world
An eBay company
Processes nearly 500 transactions per second
StubHub is an online marketplace which provides
services for buyers and sellers of tickets for sports,
concerts, theater and other live entertainment
events.
19. StubHub Bot Challenges
Bot Challenges
○ Bots were used for brute force account takeovers
○ Competitors tried to game the system, scraping prices, and
monitoring inventory and customer behavior
○ Random spikes in bot traffic were causing increased utilization
of resources
○ Tested multiple competitor solutions, but they were difficult to
configure and in some cases broke our website
20. StubHub Bot Selection Criteria
Bot Detection and Mitigation Solution Requirements
○ Block web scrapers without impacting human visitors
○ Accurately identify good bots vs. bad bots
○ Cannot rely solely on a rule-based system; must include automated learning to “self-tune” for defending against emerging and unknown threats
○ Needs to include Distil community to improve accuracy of bot detection
○ Must seamlessly co-exist with existing solutions (SIEM, CDN, WAF, etc.)
21. StubHub Results with Distil Networks
Reduced competitive data mining and fraud
Drastically reduced competitive data mining,
increased SEO rankings, and protected our
marketplace ecosystem
Distil is a key piece of our fraud detection and
prevention suite of tools
22. StubHub Results with Distil Networks
Improved traffic quality and enriched
analytic data
Cut pageviews in half, without impacting
human users or ad deliveries
Quality of traffic has greatly improved by
stopping unwanted bots and limiting site
access for trusted bots
26. The Importance of Accurately Identifying Good vs Bad Bots
Good bots make up over 35% of all traffic to the average website
○ Search engines - Google, Bing, Baidu, etc.
○ Alexa Crawler
○ Pingdom, Keynote, etc.
Effective solutions block bad bots but leave good bots unhindered
Source: Distil Networks, 2015 Bad Bot Landscape Report
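One widely documented way to tell a genuine search engine crawler from a bot that merely claims to be one is a reverse DNS lookup followed by a confirming forward lookup. The sketch below assumes a short, illustrative list of crawler hostname suffixes and accepts injectable lookup functions so it can be exercised offline; it is not a description of how Distil classifies good bots.

```python
import socket

# Illustrative suffixes only; check each search engine's own documentation.
TRUSTED_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def is_verified_crawler(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a trusted crawler host that resolves back to ip."""
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip)
        if not hostname.endswith(TRUSTED_CRAWLER_SUFFIXES):
            return False
        # The forward lookup must include the original IP, otherwise the
        # reverse record could have been set by anyone controlling that IP range.
        return ip in forward_lookup(hostname)
    except OSError:
        return False


# Offline demonstration with stub resolvers (hypothetical records, documentation IP ranges).
ptr_records = {
    "192.0.2.10": "crawl-192-0-2-10.googlebot.com",
    "192.0.2.20": "crawl-192-0-2-20.googlebot.com",
    "198.51.100.5": "scraper.example.net",
}
a_records = {
    "crawl-192-0-2-10.googlebot.com": ["192.0.2.10"],
    "crawl-192-0-2-20.googlebot.com": ["203.0.113.99"],  # forward lookup does not match
}
checks = {
    ip: is_verified_crawler(ip, ptr_records.__getitem__, a_records.__getitem__)
    for ip in ptr_records
}
```

In the stub data only the first address passes both lookups. The User-Agent header is deliberately ignored here because any client can set it to anything.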
27. The Importance of Machine Learning and Self-Tuning
Bot detection should never rely on static signatures or manual rule creation
Automation and machine learning must be performed in real time
Effective bot mitigation solutions
○ Dynamically classify users by correlating dozens of data points as well as behavior patterns
○ Constantly “self-tune” to evolve alongside the morphing threats they encounter and protect against
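To make "self-tuning" concrete, here is a deliberately small sketch that learns a baseline request rate from the traffic it sees and flags values far above it, instead of using a fixed, hand-written limit. It uses Welford's online mean and variance update; the class name, the three-sigma threshold, the warm-up length, and the minimum spread of 1.0 are all assumptions for illustration. Production systems correlate many more signals than a single rate.

```python
import math


class AdaptiveRateBaseline:
    """Learns a typical requests-per-minute level and flags large upward deviations."""

    def __init__(self, threshold_sigmas=3.0, warmup=5):
        self.threshold_sigmas = threshold_sigmas
        self.warmup = warmup
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def observe(self, requests_per_minute):
        """Return True if the value is anomalous; otherwise fold it into the baseline."""
        if self.count >= self.warmup:
            std = math.sqrt(self._m2 / (self.count - 1))
            # Floor the spread at 1.0 so a very quiet baseline is not over-sensitive.
            limit = self.mean + self.threshold_sigmas * max(std, 1.0)
            if requests_per_minute > limit:
                return True  # anomalous values are not learned into the baseline
        # Welford's online update of mean and variance.
        self.count += 1
        delta = requests_per_minute - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (requests_per_minute - self.mean)
        return False


# Hypothetical per-minute request counts for one client; the 400 is a bot-like spike.
baseline = AdaptiveRateBaseline()
flags = [baseline.observe(rate) for rate in (10, 12, 11, 9, 13, 12, 400, 11)]
```

Only the spike is flagged, and because it is excluded from the baseline, the next normal value still passes.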
28. The Importance of a Community-Supported Centralized Threat Database
○ Real-time updates from a centralized violators database help protect
all sites and improve accuracy
○ Data from attacks detected anywhere on the network should be
centralized, correlated, and analyzed by a big data analysis platform
○ Signatures are then constantly updated to
drastically reduce false positives (blocking humans)
and false negatives (missing bad bots)
29. The Importance of Seamless Compatibility
Many organizations have complex web environments, which may include a
multitude of different solutions, including
○ Content Delivery Networks (CDNs)
○ WAFs, FW, IPS
○ SIEMs
○ Load balancers
○ and more
Bot mitigation must be able to deploy seamlessly alongside these
technologies without impacting their performance or usage
30. The First Easy and Accurate Way to Defend
Websites Against Malicious Bots
31. The World’s Most Accurate Bot Detection
System
Inline Fingerprinting
Fingerprints stick to the bot even if it attempts to
reconnect from random IP addresses or hide behind an
anonymous proxy.
Known Violators Database
Real-time updates from the world’s largest Known
Violators Database, which is based on the collective
intelligence of all Distil-protected sites.
Browser Validation
The first solution to disallow browser spoofing: it validates that
each incoming request is what it self-reports to be and
detects all known browser automation tools.
Behavioral Modeling and Machine Learning
Machine-learning algorithms pinpoint behavioral
anomalies specific to your site’s unique traffic patterns.
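The fingerprinting and known-violators ideas above can be sketched together: derive an identifier from request attributes that do not change when a client switches IP address, then share flagged identifiers across sites. Everything below is a simplified assumption for illustration (the chosen headers, the SHA-256 hash, the in-memory set); it is not Distil's fingerprinting method, which the slides do not describe in detail.

```python
import hashlib


def client_fingerprint(headers):
    """Hash request attributes that stay stable when a client changes IP address."""
    signal_names = ("User-Agent", "Accept", "Accept-Language", "Accept-Encoding")
    material = "\n".join(f"{name}:{headers.get(name, '')}" for name in signal_names)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


class KnownViolators:
    """In-memory stand-in for a shared database of flagged fingerprints."""

    def __init__(self):
        self._fingerprints = set()

    def report(self, fingerprint):
        self._fingerprints.add(fingerprint)

    def is_known(self, fingerprint):
        return fingerprint in self._fingerprints


# Hypothetical scenario: one site reports a scraper, another site recognizes it.
bot_headers = {"User-Agent": "ExampleScraper/1.0", "Accept": "*/*"}
violators = KnownViolators()
violators.report(client_fingerprint(bot_headers))
# The same headers arrive from a different IP; the IP never entered the hash.
seen_again = violators.is_known(client_fingerprint(dict(bot_headers)))
```

A header-only fingerprint like this is easy for a bot to change, which is why the slide pairs fingerprinting with browser validation and behavioral modeling.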
32. How Ecommerce Companies Benefit from
Distil
Increase insight & control
over human, good bot &
bad bot traffic
Block 99.9% of
malicious bots without
impacting legitimate
users
Slash the high tax bots
place on internal teams
& web infrastructure
Protect data from web
scrapers, unauthorized
aggregators & hackers
Slide Owner: Rami
QVC Sues Shopping App for Web Scraping That Allegedly Triggered Site Outage - http://newmedialaw.proskauer.com/2014/12/05/qvc-sues-shopping-app-for-web-scraping-that-allegedly-triggered-site-outage/
Slide Owner: Marty
Rami can ask: What have you noticed in terms of trends or changes in the fraud environment?
Ashley Madison hack prompt
What are you seeing out there that concerns you in terms of fraud?
Rami can ask: Marty, what do you mean by trusted bots?
Rami can ask: You’ve got a huge amount of whitelisted traffic. Why do you have to whitelist so much?
Marty can ask Rami: How often do we update our threat database?
The last thing we want to do is to take down a device to update a rule set. It’s nice that this is a totally hands off approach.
Complex environments that include a multitude of security and web infrastructure solutions
Hadoop
Note that this is Marty’s last slide
Transition slide back to Rami
If you’re on this webinar, we’ve got your information and you’re eligible for two months of free service + traffic analysis at no charge.