Cloud hosting providers, such as Amazon AWS, Google Cloud, DigitalOcean, Microsoft Azure, and many others, have to respond to a regular barrage of abuse complaints from around the world when their customers' virtual private servers are used for malicious activity. This activity can be carried out knowingly by the "renter" of the system or by an attacker if the server becomes infected. Although by no means the be-all and end-all, one way of measuring the trust posture of a cloud hosting provider is to analyze the amount of time between a shared host beginning to attack other hosts on the Internet and the activity ceasing, generally by way of forced decommissioning, quarantining, or remediation of the root cause, such as a malware infection. In this talk, we discuss using the data collected by GreyNoise, a large network of passive collector nodes, to measure the time-to-remediation of infected or malicious machines. We will discuss methodology, results, and actionable takeaways for conference attendees who use shared cloud hosting in their businesses.
3. About Me
Andrew Morris
Founder, interim CEO @ GreyNoise Intelligence
Previously:
- Endgame R&D
- Intrepidus (NCC Group)
- KCG (ManTech)
Twitter: @andrew___morris
Email: andrew@greynoise.io
4. Agenda
• What is GreyNoise
• Background
• Research Questions
• Qualifiers
• Methodologies 1-4
• Results
• Recommendations
5. About GreyNoise
GreyNoise is a gigantic system of globally-distributed passive collector sensors
that monitor, analyze, and contextualize Internet-wide background scan and
attack traffic.
GreyNoise processes tens of millions of packets from hundreds of thousands
of IPs every day
Security professionals use GreyNoise to filter noisy alerts, identify
compromised devices, and observe emerging threats.
Learn more at https://greynoise.io or explore the data at https://viz.greynoise.io/
6. GreyNoise Examples (benign)
• Cliqz
• ShadowServer.org
• Riddler.io
• BinaryEdge.io
• FindMalware
• Quadmetrics.com
• Stretchoid.com
• CCBot
• Internet Census
• Stanford University
• Brown University
• CheckMarkNetwork
• Mojeek
• Cambridge Cybercrime Centre
• Moz DotBot
• Net Systems Research
• Uptime.com
• PDRLabs.net
• Ruhr-Universität Bochum
• MJ12bot
• A10 Networks
• Baidu Spider
• CompanyBook
• Sogou
• PingZapper
• ipip.net
• CyberGreen
• Qwant
• ExposureMonitoring
• IBM oBot
• University of Michigan
• Team Cymru
• Panscient
• MauiBot
• Facebook NetProbe
• Archive.org
• RWTH Aachen University
• Shodan.io
• SiteExplorer
• Kudelski Security
• Project Sonar
• SEMrush
• ProbeTheNet.com
• University of California, Berkeley
• Project25499
• Intrinsec
• LoSec
• DomainCrawler
• BingBot
• GoogleBot
• DataProvider
• Yandex Search Engine
• Statastico
• Censys
• Cloud System Networks
• DomainTools
• Mail.RU
• Talaia
• Ampere Innotech
• aiHit
• ONYPHE
• Ahrefs
• OpenLinkProfiler
• NetCraft
• University of New Mexico
• Seznam
• Coc Coc
• SafeDNS
• Pingdom.com
8. Background
• There are many compromised devices around the Internet
• A subset of these compromised devices become infected by exposing vulnerable services
directly to the rest of the Internet
• These services can be configured to accept default/easily guessable credentials, or they can be
unpatched and vulnerable to a wide range of remote exploits
• Botnets and bad guys constantly scan and crawl the Internet to find and infect these hosts with
unsophisticated malware
• Once infected, many of these compromised devices spread their malware to other hosts
by opportunistically scanning for and attacking other similarly vulnerable devices around
the Internet
• This kind of activity is executed by a specific sub-category of botnet (Mirai, Satori,
Muhstik, etc)
• Most of the attacks an organization sees at its network perimeter originate from these
devices
9. Background (cont’d)
• When a device becomes compromised inside of a cloud provider (such as
AWS, Google Cloud, DigitalOcean, etc) it will often start to attack other hosts
around the Internet
• The cloud provider's abuse contact will start to receive signals that the
device is compromised:
• Abuse complaint emails from other network administrators around the Internet
• A large uptick in traffic to thousands or millions of destination IPs over protocols that
are generally used for authentication (SSH, Telnet, etc)
• Some period of time after the device is infected / attacking other servers
around the Internet (without being remediated by the owner), the cloud
hosting provider abuse team will decommission or quarantine the server
10. Research Questions
1. How much time generally passes between a device in a cloud provider
starting to attack other hosts around the Internet and the activity being
remediated?
2. How much variance is there between different cloud hosting providers,
such as Amazon AWS, Microsoft Azure, Google Cloud, DigitalOcean, etc?
3. How much variance is there internally on a given cloud hosting provider?
4. How many compromised devices are there in each cloud hosting
provider?**
5. What is the longest “total compromised host time” of each
provider?
- …using compromised device data collected and labeled by GreyNoise
** This is a Dumb Question™️
11. Goal
• Walk away with an understanding of how different cloud providers
handle compromised devices within their network
• Walk away with an understanding of how to reproduce this research
with your own data sources
• Walk away with an understanding of GreyNoise
12. Using GreyNoise to Quantify
Response Time of Cloud Provider
Abuse Teams
In other words…
13. Qualifiers
SHAME
• This talk is not intended to shame anyone or compare what organizations are “better”
or “worse”
• We are measuring “response time”, not “effectiveness”
Work sucks
• Security is hard enough without some asshole telling you how to do your job
Trust / Abuse / Fraud is hard
• SOC / trust / abuse / fraud teams are INSANELY overworked
• It’s a hard and thankless job
• I have an ENORMOUS amount of respect for the folks who work these jobs
Grace period
• Every cloud provider has different internal policies on how long to wait before
decommissioning or quarantining a customer server, so as not to disrupt the customer's business
14. Qualifiers (cont’d)
Collection Bias
• There are many gaps in my methodology and it is by no means conclusive. This talk is
one data point from one source
Comparison
• Cloud providers have different problems and different priorities than other hosting
types (ISPs, corporate networks, telecommunications providers)
Customer != Corporate
• These devices are not located in any of these organizations' corporate networks
CYA
• This talk is in no way a reflection of these organizations' security posture. It is a set of
observations based on a specific set of data.
15. Data sources
• Infection status – GreyNoise
• Cloud provider ranges – IPinfo.io
• Cloud provider sizes – IPinfo.io
• I love IPinfo.io with all of my heart
16. Subjects
• Amazon AWS
• Google Cloud
• DigitalOcean
• Microsoft Azure
• Oracle Cloud
• Rackspace
• Linode
• IBM Softlayer
• Vultr
• Tencent
• Ali Cloud
• OVH
• SingleHop
• CenturyLink
17. Methodology 1: Pure quantity**
• How many unique compromised IPs are there inside a given cloud
hosting provider?
• Over a four (4) month sample
• This is a really, really, really stupid way to gauge anything
• This breaks down when “size” is taken into consideration
• A larger cloud hosting provider will simply have more compromised devices
• E.g. if Google Cloud has 100 compromised devices (out of ten million customer machines)
that’s very little
• If a small cloud hosting provider with 1024 IPs has 100 infected devices, that’s *really*
bad
• Recreate:
$ gnql "metadata.organization:$org classification:malicious" | jq '.count'
21. Methodology 2: Ham-fisted cumulative time**
• What is the total time (in hours) between the time GreyNoise first and
last saw all infected IPs in a given cloud hosting provider?
• This is also a really, really bad way to measure anything
• Breaks down when size is taken into consideration
• Also breaks down when you take IP recycling into consideration
• IP re-use between accounts
• IP recycling between “known good” Internet scanners (Shodan, BinaryEdge)
and briefly infected devices
• Multiple infections on the same host over a long period of time (the “state” of
“infectedness”)
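A rough sketch of this calculation in Python (this is not the actual GreyNoise pipeline; the `first_seen`/`last_seen` field names and the timestamps are illustrative):

```python
from datetime import datetime

def cumulative_hours(records):
    """Sum, per IP, the hours between GreyNoise's first and last
    sighting -- the "ham-fisted" total, with no handling of
    IP recycling or overlapping infections."""
    total = 0.0
    for rec in records:
        first = datetime.fromisoformat(rec["first_seen"])
        last = datetime.fromisoformat(rec["last_seen"])
        total += (last - first).total_seconds() / 3600.0
    return total

# Illustrative records for one provider (timestamps made up):
records = [
    {"ip": "192.0.2.1", "first_seen": "2019-01-01T00:00:00",
     "last_seen": "2019-01-02T12:00:00"},   # 36 hours
    {"ip": "192.0.2.2", "first_seen": "2019-01-03T00:00:00",
     "last_seen": "2019-01-03T06:00:00"},   # 6 hours
]
print(cumulative_hours(records))  # 42.0
```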
23. Total time (in hours) between the time GreyNoise first and last
saw all infected IPs in a given cloud hosting provider?
• Amazon : 652.42
• DigitalOcean : 648.64
• Vultr : 1060.51
• Google : 447.66
• Microsoft Corporation : 490.80
• Oracle : 1991.11
• Rackspace : 2694.37
• Linode : 578.67
• Softlayer : 910.00
• Tencent : 1835.27
• Alibaba : 1186.23
• OVH : 2048.07
• SingleHop : 1265.77
• CenturyLink : 1139.16
24. Methodology 3: Cumulative “infected” time
(adjusted)
• How many cumulative “state of infected” hours (per 10,000 IPs) are there for a given
cloud hosting provider?
• Get all IPs that have ever been infected inside a given cloud provider
• Use GreyNoise to find the exact time ranges they were infected
• Account for time interval overlap (this was a lot harder than I thought it would be)
• Add all of the hours together
• Get total size of Cloud provider, in IPs (using IPinfo.io)
• This is harder for some providers than others
• This is also not accurate, but it’s better than nothing
• Divide hours by total IPs/10,000
• This breaks down because it makes smaller, busier hosting providers look slower and
positively biases gigantic hosting providers (by dividing their “hour” amount by a huge
number)
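The interval-overlap step above is the classic merge-intervals problem; a sketch of the adjusted metric (illustrative only, with infection windows expressed as (start, end) hour offsets):

```python
def merged_infected_hours(intervals):
    """Merge overlapping (start, end) intervals and sum their lengths,
    so overlapping infections on one host are not double-counted."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)  # overlap: extend
        else:
            merged.append([start, end])
    return sum(end - start for start, end in merged)

def hours_per_10k(intervals, provider_size_ips):
    """Normalize cumulative infected hours by provider size
    (total IPs, e.g. from IPinfo.io), per 10,000 IPs."""
    return merged_infected_hours(intervals) / (provider_size_ips / 10_000)

# (0-10h) and (5-20h) overlap -> merged to (0-20h); (30-34h) is distinct:
print(merged_infected_hours([(0, 10), (5, 20), (30, 34)]))  # 24
```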
29. Methodology 4: Average time-to-close of an
infected host
• What is the average amount of time between a machine getting
infected (and attacking other hosts) and the infection being
remediated (no longer attacking other hosts)?
• Find and average intervals of time (in hours) between attacks starting
and stopping
• Problem: Just because the machine stops attacking other hosts does
not mean it is no longer infected
• This blind spot applies equally to GreyNoise and to the hosting provider itself
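A sketch of the averaging step, again with attack windows as illustrative (start, end) hour offsets:

```python
def average_time_to_close(intervals):
    """Average length, in hours, of each observed attack window
    (first attack sighting -> last attack sighting per infection)."""
    durations = [end - start for start, end in intervals]
    return sum(durations) / len(durations)

# One 12-hour and one 36-hour attack window:
print(average_time_to_close([(0, 12), (0, 36)]))  # 24.0
```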
32. Bonus Metric
• How many devices are compromised in each provider right now?
$ gnql "metadata.organization:whatever classification:malicious last_seen:>=2019-01-29" | jq '.count'
34. Recommendations
• Be proactive
• Use third-party threat intelligence feeds to identify compromised hosts in your environment
• Trust (believable) abuse complaint reports
• Scrutinize hosts that receive a large amount of abuse complaints
• Ignore reports that come from people who complain when they receive a single port scan
• Start with strict firewall controls
• Deny any inbound traffic beyond what is necessary and force the user to whitelist additional
services
• Consider vulnerability scanning your own environment and relaying concerning
results to your customers
• Google and Vultr do this
• [SHAMELESS SELF PLUG] Ask me about the GreyNoise Cloud Trust Alliance
program