Ensuring Real Estate Website
Listing Data Security
Avoid Litigation by Protecting Your Listing Data
Before the Theft Occurs
Presenters
Charlie Minesinger
Director of Solution Sales
Distil Networks
Matt Cohen
Chief Technologist
Clareity Consulting
Agenda
Introductions and Background
Trends in Scraping Real Estate Websites
Overview of Study and Findings
Immediate Opportunities and Threats from Scraping
Toward Better Security for Real Estate Data Online
Distil in Real Estate and Premium Brands
Market Leader in Bot Detection and Mitigation
● Only bot detection vendor to be included
in Gartner’s 2015 Online Fraud Detection
Market Guide
● Key Attack Trend: “Fraudsters spreading
their attacks over thousands of IP
addresses”
● Key Inclusion Criteria: “Ability to detect
online fraud as transactions occur in real
time or near real time”
● Interesting to note: No WAF vendors in
this report (as their detection model is
primarily rules-based)
What Is Web Scraping?
Web Scraping
Also known as screen scraping, web scraping is the act of
copying large amounts of data from a website – either
manually or with an automated program (Bot)
Legitimate Scraping
Scraping can sometimes be benign and entirely acceptable, for
example the search engine bots that index your website
Malicious Scraping
A systematic theft of intellectual property accessible on a
website, including pricing, content, images, and proprietary
data
MLSs:
○ Obligation to protect copyright
○ Higher cost to use reactive methods - beacons, legal, etc
○ Duty to enforce NAR Policy (VOWs, so far)
○ Missed revenue opportunities for licensing content
Brokers / Agents:
○Provided content license on listing for specific purpose
○Responsible for NAR Policy (VOWs, so far)
○Stale (scraped) data undermines trust and reputation in brand
○Higher costs - bots drive up costs for online services
Why Bots / Scraping is a Problem in Real Estate
Software Vendors / Publishers:
○ Resource Utilization – more servers and bandwidth costs
○ Poor Website Performance – latency and brownouts, etc.
○ Clean up Marketing Metrics – optimize for humans
○ Ad Fraud – advertisers are not paying for non-human traffic
○ People Resources – keep your team focused on revenue!
Bottom Line
Scrapers scrape because they are making money with your listings!
And the Real Estate industry is left with...
→ Higher costs
→ Lost revenues
Why Bots / Scraping is a Problem in Real Estate
Realtor.org offers free tools to track data - Reactive = expensive
○Checklist for Syndication has many references to data scraping – legal guidance
○NoScrape – aborted project - no update since 2010?
Problem is not going away
Industry Help? ...Way behind on Bad Bots
Ads for Scraping Programs
on Realtor.com!
Realtor.com blog to “deter scraping” relies on
obsolete IP address blocking and expensive IP
litigation
“REALTOR.com® logging, tracking and monitoring
patterns that indicate data is being stolen for these
illegitimate purposes. Once an offender is identified, their
IP address is blocked from accessing the site.”
(Oct 10, 2014)
Scraping as a service sites proliferate – scraping VERY accessible!
o Search for “web data scraping” on elance.com, odesk.com, freelancer.com, etc
o Google Search terms: “scraping real estate data” and “scrape MLS listings”
o Services: Mozenda.com, 80legs.com, webharvey.com, scraping.pro, etc
Problem is not going away
Web Scraping - Cheap, Easy & DIY
Costs of Scraping MLS Data
○ Resource costs - 10% to 40% of server utilization and bandwidth
○ Customer Care - Cost per call from consumer? Calls per month?
○ Website Performance – a brownout results in 3 days of low traffic
○ Ad Fraud - If 30% of ads are seen by bots, are advertisers paying?
○ Lead Gen… $15/mover, $30/storage facility, … $100s per listing
going to third parties, not the broker, not the agent
→ Biggest Losers: MLS and Brokers
Value of solution?
○ Antivirus is $40 to $75 per year per member ( = roughly $3 - $6/month)
○ Anti-scraping protection should be same or less cost
Bottom Line on Scraping
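The monthly figures on this slide follow from dividing the annual antivirus price by twelve. A quick check of that arithmetic in Python is sketched below; the member count is a made-up illustration and is not a figure from the presentation.

```python
def monthly_cost(annual_cost_usd: float) -> float:
    """Convert a per-member annual price to a per-member monthly price."""
    return annual_cost_usd / 12


low = monthly_cost(40)   # lower bound quoted on the slide
high = monthly_cost(75)  # upper bound quoted on the slide
print(f"${low:.2f} to ${high:.2f} per member per month")

# Hypothetical MLS size, used only to show how the benchmark scales.
members = 10_000
print(f"Annual benchmark budget: ${40 * members:,} to ${75 * members:,}")
```

The slide rounds these values down to $3 and $6.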
For now, two surveys:
○ MLS Executives - 100 MLS executives representing MLSs with over 600,000 subscribers.
○ IDX Vendors – 14 vendors representing 400,000 IDX & VOW websites. Others would only speak informally.
Because they manage the largest set of scraping targets
Email invitation, web-survey over several weeks.
Study Methodology
Because they play a part in all scraping contexts – MLS, Publishers, and IDX/VOW.
● Technology Selection. Selects and contracts for the MLS systems.
● Data Licensing. Manages the data license agreements with the Advertising Portals
● Industry Policy. Collectively set IDX / VOW rules
99% say compliance with rules protecting against misuse of MLS data is important
Implementing anti-scraping should be a priority for MLS vendors:
95% agree that IDX sites should be subject to rules specifically mandating
scraping protections. This needs follow-up w/ NAR committees.
59% of respondents do NOT test VOW sites for anti-scraping compliance
Most testing performed is not rigorous
Some rely on self-reporting
98% of respondents want a set of standardized tests to verify
that VOW and syndication sites are protected
MLS Study – Key Results
43% of IDX/VOW vendors were not aware of issue pervasiveness.
62% rate compliance with MLS rules as the most important factor in having
IDX/VOW vendors implement an anti-scraping solution
Other drivers for adoption of anti-scraping protection
○Customer demand for anti-scraping protections
○Cost of infrastructure use/abuse
○Security concerns
○System performance issues
IDX / VOW Study – Key Results
○ 50% of IDX vendor respondents believe 15-30% bot traffic is acceptable
○ 50% believe less than 1% bot traffic is acceptable (more like MLS)
○ Most IDX/VOW vendors are using reactive detection tactics
Log analysis - reactive and labor-intensive monitoring
IP-based methods - ineffective against sophisticated scrapers
Obsolete Preventions - IP-based rate limiting and CAPTCHAs
→ Likely underestimating (missing bots) with these methods!
○ More than half cannot identify the costs of bots to their business...if you
cannot measure it, you cannot manage it, & certainly not budget it
○ While 100% put NAR compliance as a priority, only 25% have budgeted for
services to provide anti-scraping service to comply with VOW rules
IDX / VOW Study - Misaligned, Lacking Key Data
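The point about IP-based rate limiting can be illustrated with a small simulation. This is a minimal sketch, not any vendor's method: the threshold, the addresses (taken from documentation-only ranges), and the `flagged_ips` helper are all assumptions made for illustration.

```python
from collections import Counter


def flagged_ips(requests, max_per_ip):
    """Return the set of IPs whose request count exceeds max_per_ip."""
    counts = Counter(ip for ip, _path in requests)
    return {ip for ip, n in counts.items() if n > max_per_ip}


# Naive scraper: 1,000 listing requests from a single address.
naive = [("203.0.113.7", f"/listing/{i}") for i in range(1000)]

# IP-cycling scraper: the same 1,000 requests, 4 per address, then a new address.
cycling = [(f"198.51.100.{i // 4}", f"/listing/{i}") for i in range(1000)]

LIMIT = 5  # hypothetical per-IP threshold

print(len(flagged_ips(naive, LIMIT)))    # the single noisy address is caught
print(len(flagged_ips(cycling, LIMIT)))  # no address ever crosses the threshold
```

Both scrapers take the same 1,000 listings, but the per-IP counter only sees the first one, which is why these methods likely undercount bots.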
○Scripts, such as cURL or Ruby, making requests at any rate
○Selenium, a fully automated browser making requests at any rate
○Headless browser with or without PhantomJS (fully simulating browser, browser pre-rendering)
○IP cycling using any bot technology at a rate of fewer than 5 requests per IP address before changing IP
○Crawlers - at any speed, even slow crawlers making 10 requests per minute or less
○Anonymized proxy for IP to make requests using any technology or at any rate of requests
○Spoofed bot user-agent, e.g. using fake “googlebot” or “bingbot” as user-agent, IE running on Linux, etc.
○Non-browser user-agent, spoofed user-agents for mobile browsers or mobile applications
○Blocking traffic from data centers and hosting providers (why would consumers be using those IPs?)
○Blocking bots from Consumer ISPs while letting legitimate requests through
It’s An Arms Race … More Detail:
Modern Anti-Scraping Tool Requirements
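A few of the simpler items on this list can be sketched as static request checks. The example below is an illustrative assumption, not Distil's implementation: the network ranges are documentation prefixes standing in for real hosting-provider ranges, and `suspicion_reasons` and the token list are hypothetical names. Checks like these do not catch automated browsers or bots on consumer ISPs, which is why the slide calls for more than this.

```python
import ipaddress

# Placeholder ranges (documentation prefixes), standing in for a real
# list of data-center and hosting-provider networks.
DATACENTER_NETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Substrings that suggest a script rather than a browser (illustrative list).
SCRIPT_TOKENS = ("curl", "ruby", "wget")


def suspicion_reasons(ip: str, user_agent: str) -> list:
    """Return human-readable reasons a request looks automated."""
    reasons = []
    ua = user_agent.lower()
    if any(token in ua for token in SCRIPT_TOKENS):
        reasons.append("script user-agent")
    if "msie" in ua and "linux" in ua:
        reasons.append("impossible platform: IE on Linux")
    if "googlebot" in ua or "bingbot" in ua:
        # The claim alone proves nothing; it must be verified separately.
        reasons.append("claims to be a search-engine bot: verify before trusting")
    if any(ipaddress.ip_address(ip) in net for net in DATACENTER_NETS):
        reasons.append("data-center address")
    return reasons


print(suspicion_reasons("192.0.2.10", "curl/7.88.1"))
print(suspicion_reasons("203.0.113.5", "Mozilla/4.0 (compatible; MSIE 8.0; Linux x86_64)"))
```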
○ 7 of top 10 sources of bots are Consumer ISPs:
(1) Comcast, (2) Time Warner Cable, (3) Verizon FIOS,
(4) Charter, (5) Cox, (6) CenturyLink, and (7) AT&T
Uverse
○ 50% - 75% of bot traffic on RE sites is from Consumer ISPs
○ Most Consumer ISPs had 1,500+ IPs with bot traffic
○ 18-45% Automated browsers - mimicking humans
○ 14-25% in Bot Database - fingerprinted, known bots
○ 16-42% Slow Crawlers - recycling IPs and user agents
Highlights of Bot Sophistication in Real Estate
The Facts on Scraping Real Estate Data
Purpose Built Solution, Not a Feature
Bot Detection is a New Category, NOT a Feature
○ NOT a Content Delivery Service (CDN)
○ NOT a Distributed Denial of Service (DDoS) protection solution
○ NOT a simple IP list or set of scripts
○ NOT a Web Application Firewall (WAF)
A purpose built bot detection solution is always updating and evolving
Catch 99.9% of Malicious Bots with Distil
A Typical WAF Catches 20%
IP block, user agent testing
Distil Catches up to 99.9%
IP analysis, user agent testing, JavaScript test, cookie, Selenium test, browser rate limiting, automated browser, PhantomJS, machine learning, IP cycling
Detect Your Bot Traffic
Control Over Your Bot Traffic
Monitor
Monitor to inspect requests and record
the traffic to Distil and/or your own
server logs
Block
Set to Block to serve the client an
unblock verification form
CAPTCHA
Serve a hardened CAPTCHA to test the
client for verification
Drop
Drop them to present them with an
access denied page
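The four settings above can be pictured as a lookup from action to what the client is served. The sketch below is a hypothetical model for illustration only: the path-prefix rules, the `handle` function, and the default of Monitor are assumptions and do not describe Distil's actual configuration interface.

```python
from enum import Enum


class Action(Enum):
    MONITOR = "monitor"
    BLOCK = "block"
    CAPTCHA = "captcha"
    DROP = "drop"


# What the client is served for each setting, following the slide text.
SERVED = {
    Action.MONITOR: "requested page (request is inspected and logged)",
    Action.BLOCK: "unblock verification form",
    Action.CAPTCHA: "hardened CAPTCHA",
    Action.DROP: "access denied page",
}


def handle(path_rules: dict, path: str, default: Action = Action.MONITOR) -> str:
    """Pick the action for the longest matching path prefix and describe the response."""
    matches = [prefix for prefix in path_rules if path.startswith(prefix)]
    action = path_rules[max(matches, key=len)] if matches else default
    return SERVED[action]


# Hypothetical per-path rules for a listing site.
rules = {"/listings": Action.CAPTCHA, "/listings/export": Action.DROP}
print(handle(rules, "/listings/export/all.csv"))
print(handle(rules, "/about"))
```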
Flexible Deployment Options
Cloud
Deploys in hours
Blazing fast Anycast DNS-based
GeoIP Routing. Automatic
content compression optimizes
for faster delivery
17 datacenters automatically fail
over when a primary location
goes offline
Automatically increases
infrastructure and bandwidth to
accommodate spikes
USER DISTIL CLOUD CDN LOAD BALANCER WEB SERVER
Flexible Deployment Options
Physical or Virtual Appliance(s)
Install on virtualized or Bare Metal
appliance(s)
Deploys in days
High availability configurations
with failover monitoring
Heartbeat up to Distil Cloud
USER INTERNET LOAD BALANCER WEB SERVER
DISTIL APPLIANCE
Best of Breed Solution will Include:
○99% Accuracy, cannot rely on IP address to identify bots or use rate limiting on IP
○Dedicated Service - NOT a button/feature/add-on
○Layers of tactics, multiple detection tactics, with ongoing R&D
○Easy to Implement - deploy in days or weeks
○Real-time detection and mitigation - be proactive to save time and money
○Flexible Configurable options for actions to mitigate bots
○Affordable cost per member, per site, or per MLS - flexible business model
Selection Criteria for Anti-Scraping
www.distilnetworks.com
QUESTIONS….COMMENTS?
1.703.962.1614
http://resources.distilnetworks.com/h/c/175726-real-estate
Call Charlie
CHARLIE@DISTILNETWORKS.COM

Ensuring Real Estate Website Listing Data Security

  • 1. Ensuring Real Estate Website Listing Data Security Avoid Litigation by Protecting Your Listing Data Before the Theft Occurs
  • 2. Presenters Charlie Minesinger Director of Solution Sales Distil Networks Matt Cohen Chief Technologist Clareity Consulting
  • 3. Introductions and Background Trends in Scraping Real Estate Websites Overview of Study and Findings Immediate Opportunities and Threats from Scraping Agenda Toward better Security for Real Estate Data Online
  • 4. Distil in Real Estate and Premium Brands
  • 5. Market Leader in Bot Detection and Mitigation ● Only bot detection vendor to be included in Gartner’s 2015 Online Fraud Detection Market Guide ● Key Attack Trend: “Fraudsters spreading their attacks over thousands of IP addresses” ● Key Inclusion Criteria: “Ability to detect online fraud as transactions occur in real time or near real time” ● Interesting to note: No WAF vendors in this report (as their detection model is primarily rules-based)
  • 6. What Is Web Scraping? Web Scraping Also known as screen scraping, web scraping is the act of copying large amounts of data from a website – either manually or with an automated program (Bot) Legitimate Scraping Scraping can sometimes be benevolent and totally acceptable. For example, the search engine bots that index your website Malicious Scraping A systematic theft of intellectual property accessible on a website, including pricing, content, images, and proprietary data
  • 7. MLSs: ○ Obligation to protect copyright ○ Higher cost to use reactive methods - beacons, legal, etc ○ Duty to enforce NAR Policy (VOWs, so far) ○ Missed revenue opportunities for licensing content Brokers / Agents: ○Provided content license on listing for specific purpose ○Responsible for NAR Policy (VOWs, so far) ○Stale (scraped) data undermines trust and reputation in brand ○Higher costs - bots drive up costs for online services Why Bots / Scraping is a Problem in Real Estate
  • 8. Software Vendors / Publishers: ○ Resource Utilization – more servers and bandwidth costs ○ Poor Website Performance – latency and brownouts, etc. ○ Clean up Marketing Metrics – optimize for humans ○ Ad Fraud – advertisers are not paying for non-human traffic ○ People Resources – keep your team focused on revenue! Bottom Line Scrapers scrape because they are making money with your listings! And the Real Estate industry is left with... → Higher costs → Lost revenues Why Bots / Scraping is a Problem in Real Estate
  • 9. Realtor.org offers free tools to track data - Reactive = expensive ○Checklist for Syndication has many references to data scraping – legal guidance ○NoScrape – aborted project - no update since 2010? Problem is not going away Industry Help? ...Way behind on Bad Bots Ads for Scraping Programs on Realtor.com! Realtor.com blog to “deter scraping” relies on obsolete IP address blocking and expensive IP litigation “REALTOR.com® logging, tracking and monitoring patterns that indicate data is being stolen for these illegitimate purposes. Once an offender is identified, their IP address is blocked from accessing the site.” (Oct 10, 2014)
  • 10. Scraping as a service sites proliferate – scraping VERY accessible! o Search for “web data scraping” on elance.com, odesk.com, freelancer.com, etc o Google Search terms: “scraping real estate data” and “scrape MLS listings” o Services: Mozenda.com, 80legs.com, webharvey.com, scraping.pro, etc Problem is not going away Web Scraping - Cheap, Easy & DIY
  • 11. Costs of Scraping MLS Data ○ Resource costs - 10% to 40% of server utilization and bandwidth ○ Customer Care - Cost per call from consumer? Calls per month? ○ Website Performance – brownouts results in 3 days of low traffic ○ Ad Fraud - If 30% of ads are seen by bots, are advertisers paying? ○ Lead Gen… $15/mover, $30/storage facility, … $100s per listing going to third parties, not the broker, not the agent → Biggest Losers: MLS and Brokers Value of solution? ○ Antivirus is $40 to $75 year per member ( = $3 - $6/month) ○ Anti-scraping protection should be same or less cost Bottom Line on Scraping
  • 12. For now, two surveys: ○MLS Executives - 100 MLS Executives rep. MLSs with over 600,000 subscribers. ○ IDX Vendors – 14 rep. 400,000 IDX & VOW websites. Others would only speak informally. Because they manage the largest set of scraping targets Email invitation, web-survey over several weeks. Study Methodology Because they play a part in all scraping contexts – MLS, Publishers, and IDX/VOW. ● Technology Selection. Selects and contracts for the MLS systems. ● Data Licensing. Manages the data license agreements with the Advertising Portals ● Industry Policy. Collectively set IDX / VOW rules
  • 13. 99% say compliance with rules protecting misuse of MLS data is important Implementing anti-scraping should be a priority for MLS vendors: 95% agree that IDX sites should be subject to rules specifically mandating scraping protections. This needs follow-up w/ NAR committees. 59% of respondents do NOT test VOW sites for anti-scraping compliance Most testing performed is not rigorous Some rely on self-reporting 98% of respondents want a set of standardized tests to verify that VOW and syndication sites are protected MLS Study – Key Results
  • 14. IDX / VOW Study – Key Results
    43% of IDX/VOW vendors were not aware of how pervasive the issue is.
    62% rate compliance with MLS rules as the most important factor in having IDX/VOW vendors implement an anti-scraping solution
    Other drivers for adoption of anti-scraping protection
      ○ Customer demand for anti-scraping protections
      ○ Cost of infrastructure use/abuse
      ○ Security concerns
      ○ System performance issues
  • 15. IDX / VOW Study - Misaligned, Lacking Key Data
      ○ 50% of IDX vendor respondents believe 15-30% bot traffic is acceptable
      ○ 50% believe less than 1% bot traffic is acceptable (more like MLS)
      ○ Most IDX/VOW vendors are using reactive detection tactics
          Log analysis - reactive and labor-intensive monitoring
          IP-based methods - ineffective against sophisticated scrapers
          Obsolete preventions - IP-based rate limiting and CAPTCHAs
          → Likely underestimating (missing bots) with these methods!
      ○ More than half cannot identify the costs of bots to their business... if you cannot measure it, you cannot manage it, and certainly cannot budget for it
      ○ While 100% put NAR compliance as a priority, only 25% have budgeted for anti-scraping services to comply with VOW rules
  • 16. It’s An Arms Race … More Detail: Modern Anti-Scraping Tool Requirements
      ○ Scripts, such as cURL or Ruby, making requests at any rate
      ○ Selenium, a fully automated browser making requests at any rate
      ○ Headless browser with or without PhantomJS (fully simulating browser, browser pre-rendering)
      ○ IP cycling using any bot technology at a rate of fewer than 5 requests per IP address before changing IP
      ○ Crawlers - at any speed, even slow crawlers making 10 requests per minute or less
      ○ Anonymized proxy for IP to make requests using any technology or at any rate of requests
      ○ Spoofed bot user-agent, e.g. using fake “googlebot” or “bingbot” as user-agent, IE running on Linux, etc.
      ○ Non-browser user-agent, spoofed user-agents for mobile browsers or mobile applications
      ○ Blocking traffic from data centers and hosting providers (why would consumers be using those IPs?)
      ○ Blocking bots from Consumer ISPs while letting legitimate requests through
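The IP-cycling item above is the reason per-IP counting falls short. The sketch below is only an illustration under stated assumptions: the function names, the simulated traffic, and the threshold of 5 are hypothetical, and it does not describe any vendor's detection logic. It shows that a naive per-IP rate limit flags a single-IP scraper but never fires on a scraper that changes IP after 4 requests.

```python
from collections import Counter


def flagged_ips(requests, max_per_ip=5):
    """Return the IPs whose request count reaches the per-IP threshold.

    `requests` is a list of (ip, path) tuples, e.g. parsed from a log.
    """
    counts = Counter(ip for ip, _path in requests)
    return {ip for ip, n in counts.items() if n >= max_per_ip}


def cycling_bot(total_requests, per_ip=4):
    """Simulate a scraper that moves to a new IP after `per_ip` requests.

    The 10.0.x.y addresses are synthetic and stand in for rotated proxies.
    """
    traffic = []
    for i in range(total_requests):
        ip_index = i // per_ip
        ip = f"10.0.{ip_index // 256}.{ip_index % 256}"
        traffic.append((ip, f"/listing/{i}"))
    return traffic


if __name__ == "__main__":
    single_ip_bot = [("203.0.113.7", f"/listing/{i}") for i in range(1000)]
    print("single-IP bot flagged:", flagged_ips(single_ip_bot))
    print("cycling bot flagged:  ", flagged_ips(cycling_bot(1000)))
```

Both simulated scrapers fetch the same 1,000 listing pages, yet only the first is caught, which is why the slide lists detection below 5 requests per IP as a requirement.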
  • 17. The Facts on Scraping Real Estate Data
      ○ 7 of the top 10 sources of bots are Consumer ISPs: (1) Comcast, (2) Time Warner Cable, (3) Verizon FIOS, (4) Charter, (5) Cox, (6) CenturyLink, and (7) AT&T U-verse
      ○ 50% - 75% of bot traffic on RE sites is from Consumer ISPs
      ○ Most Consumer ISPs had 1,500+ IPs with bot traffic
    Highlights of Bot Sophistication in Real Estate
      ○ 18-45% Automated browsers - mimicking humans
      ○ 14-25% in Bot Database - fingerprinted, known bots
      ○ 16-42% Slow Crawlers - recycling IPs and user agents
  • 18. Purpose-Built Solution, Not a Feature
    Bot Detection is a New Category, NOT a Feature
      ○ NOT a Content Delivery Network (CDN)
      ○ NOT a Distributed Denial of Service (DDoS) protection solution
      ○ NOT a simple IP list or set of scripts
      ○ NOT a Web Application Firewall (WAF)
    A purpose-built bot detection solution is always updating and evolving
  • 19. Catch 99.9% of Malicious Bots with Distil
    A typical WAF catches 20%; Distil catches up to 99.9%
    [Diagram labels: IP BLOCK USER AGENT TESTING IP ANALYSIS USER AGENT TESTING JAVASCRIPT TEST COOKIE SELENIUM TEST BROWSER RATE LIMITING AUTOMATED BROWSER PHANTOM JS MACHINE LEARNING IP CYCLING]
  • 20. Detect Your Bot Traffic
  • 21. Control Over Your Bot Traffic
      Monitor - Inspect requests and record the traffic to Distil and/or your own server logs
      Block - Serve the client an unblock verification form
      CAPTCHA - Serve a hardened CAPTCHA to test the client for verification
      Drop - Present the client with an access denied page
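One way to picture the four settings is as a lookup from action to response. This is a hypothetical sketch, not Distil's API or configuration format: the `Action` enum, the `handle` function, the 403 status codes, and the choice to log every flagged request are all assumptions made for illustration.

```python
from enum import Enum


class Action(Enum):
    MONITOR = "monitor"
    BLOCK = "block"
    CAPTCHA = "captcha"
    DROP = "drop"


# Hypothetical (status, body) pairs; the status codes are assumptions.
RESPONSES = {
    Action.BLOCK: (403, "unblock verification form"),
    Action.CAPTCHA: (403, "hardened CAPTCHA challenge"),
    Action.DROP: (403, "access denied page"),
}


def handle(request_is_bot, action, log):
    """Return None to pass the request through, or (status, body) to answer it.

    Human traffic always passes. Flagged traffic is recorded in `log`
    (an assumption for all actions; the slide only states it for Monitor).
    """
    if not request_is_bot:
        return None
    log.append(action.value)
    # MONITOR has no entry in RESPONSES: the request is logged and passed on.
    return RESPONSES.get(action)


if __name__ == "__main__":
    log = []
    for action in Action:
        print(action.name, "->", handle(True, action, log))
    print("logged:", log)
```

Keeping Monitor as a log-only mode lets a site owner observe what would be blocked before switching to a stricter action.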
  • 22. Flexible Deployment Options
    Cloud
      Deploys in hours
      Blazing fast Anycast DNS-based GeoIP routing
      Automatic content compression optimizes for faster delivery
      17 datacenters automatically fail over when a primary location goes offline
      Automatically increases infrastructure and bandwidth to accommodate spikes
    [Diagram labels: USER DISTIL CLOUD CDN LOAD BALANCER WEB SERVER]
  • 23. Flexible Deployment Options
    Physical or Virtual Appliance(s)
      Install on virtualized or bare metal appliance(s)
      Deploys in days
      High availability configurations with failover monitoring
      Heartbeat up to Distil Cloud
    [Diagram labels: USER INTERNET LOAD BALANCER WEB SERVER DISTIL APPLIANCE]
  • 24. Selection Criteria for Anti-Scraping
    A best-of-breed solution will include:
      ○ 99% accuracy; cannot rely on IP address to identify bots or use rate limiting on IP
      ○ Dedicated service - NOT a button/feature/add-on
      ○ Layers of multiple detection tactics, with ongoing R&D
      ○ Easy to implement - deploy in days or weeks
      ○ Real-time detection and mitigation - be proactive to save time and money
      ○ Flexible, configurable options for actions to mitigate bots
      ○ Affordable cost per member, per site, or per MLS - flexible business model