
Leveraging DNS to Surface Attacker Activity

In this training session, two leading security experts review how adversaries use DNS to achieve their mission, how to use DNS data as a starting point for launching an investigation, the data science behind automated detection of DNS-based malicious techniques and how DNS tunneling and DGA machine learning algorithms work.

Watch the presentation with audio here: http://info.sqrrl.com/leveraging-dns-for-proactive-investigations


  1. 1. Leveraging DNS to Surface Attacker Activity March 2017 • Josh Liburdi & Chris McCubbin
  2. 2. Presenters Chris McCubbin Sqrrl Director of Data Science Josh Liburdi Sqrrl Security Technologist
  3. 3. Agenda • Leveraging DNS data for investigations • DNS-based data science techniques • An example of Tunneling and DGA detection
  4. 4. Leveraging DNS Data for Investigations
  5. 5. What is DNS?
     (1) Client needs to connect to: https://www.sqrrl.com
     (2) Client's DNS server doesn't know where sqrrl.com is hosted, forwards query to upstream server
     (3) Upstream DNS server knows sqrrl.com resolves to 104.196.225.76, returns response
     (4) Client's DNS server caches response, sends response to client
     (5) Client connects to https://www.sqrrl.com
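
The forward-and-cache behavior described on this slide can be sketched in a few lines of Python. This is a toy model, not a real resolver: `UpstreamServer` and `CachingResolver` are hypothetical names, there is no network traffic, and cache expiry (TTL) is left out.

```python
from typing import Dict, Optional


class UpstreamServer:
    """Stand-in for the upstream DNS server that knows the answer."""

    def __init__(self, records: Dict[str, str]) -> None:
        self.records = records
        self.queries_seen = 0  # how many queries were forwarded to us

    def resolve(self, name: str) -> Optional[str]:
        self.queries_seen += 1
        return self.records.get(name)


class CachingResolver:
    """Stand-in for the client's DNS server: forward on a miss, then cache."""

    def __init__(self, upstream: UpstreamServer) -> None:
        self.upstream = upstream
        self.cache: Dict[str, str] = {}

    def resolve(self, name: str) -> Optional[str]:
        if name in self.cache:
            return self.cache[name]           # answered locally, nothing forwarded
        answer = self.upstream.resolve(name)  # unknown name: forward upstream
        if answer is not None:
            self.cache[name] = answer         # cache the response for next time
        return answer


# Address taken from the slide; everything else is illustrative.
upstream = UpstreamServer({"sqrrl.com": "104.196.225.76"})
resolver = CachingResolver(upstream)
first = resolver.resolve("sqrrl.com")   # forwarded upstream
second = resolver.resolve("sqrrl.com")  # served from the cache
```

Only the first lookup reaches the upstream server. This matters for the tunneling slides that follow: names that are never repeated can never be answered from a cache, so every query is forwarded.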
  6. 6. How do attackers use DNS? • Attackers target DNS – DNS spoofing – DNS reflection • Attackers utilize DNS – Tunneling – Domain Generation Algorithms (DGA) – Dynamic DNS
  7. 7. Why is DNS data useful? Threat Detection Opportunity for attacker to leave traceable footprints in your network Incident Investigations Keep track of attacker access in your network
  8. 8. DNS Tunneling Overview • Data encoded inside of DNS queries is sent to an attacker-controlled server • Used for command and control, data exfiltration • Bypasses common security controls (firewalls, web proxies) [Diagram: a DNS tunnel client in the local network sends *.tunnel.com queries through the local DNS resolver and an intermediate DNS resolver to the DNS tunnel server for *.tunnel.com in the remote network]
  9. 9. DNS Tunneling Overview • Tunnels produce patterns • Many queries required to transfer moderate amounts of data: a 1 MB transfer would take ~5k domains • Queried DNS domains tend to be unique: assuming no repeats in data, each domain will contain unique labels • Examples: paeqcigq.tunnel.com, pafich3i.tunnel.com, gxqwl0eaytioruga5.tunnel.com
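
The query-count estimate can be checked with simple arithmetic. The sketch below assumes 1 MB means 1,000,000 bytes. It relies on two facts: a DNS name is limited to 253 characters in text form, and a hostname-safe encoding such as base32 carries 5 bits per character. The figure of 240 payload characters per query is an assumption (the rest of the name is the tunnel's base domain and label separators), and both helper functions are hypothetical.

```python
def payload_bytes_per_query(payload_chars: int, bits_per_char: int) -> int:
    """Whole bytes of data carried by the encoded part of one queried name."""
    return payload_chars * bits_per_char // 8


def queries_needed(total_bytes: int, bytes_per_query: int) -> int:
    """Number of queries to move total_bytes, rounding up."""
    return (total_bytes + bytes_per_query - 1) // bytes_per_query


ONE_MB = 1_000_000  # assumption: decimal megabyte

# Assumed: 240 of the 253 available characters carry base32 payload.
base32_bytes = payload_bytes_per_query(payload_chars=240, bits_per_char=5)
base32_estimate = queries_needed(ONE_MB, base32_bytes)

# The slide's ~5k figure corresponds to roughly 200 payload bytes per query.
slide_estimate = queries_needed(ONE_MB, 200)
```

Under these assumptions base32 carries 150 bytes per query, which gives about 6.7k queries per megabyte. That is the same order of magnitude as the slide's figure. The exact number depends on the encoding and on how much of each name the tunnel uses.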
  10. 10. DGA Overview • Method of establishing a connection with a command and control server • Used to protect / hide infrastructure and evade detection • Avoids DNS domain blacklisting • Malware generates DNS domains based on an algorithm and a seed • Seed may be hardcoded or determined dynamically (e.g., current datetime) • Example (en.wikipedia.org/wiki/Domain_generation_algorithm#Example):

      def generate_domain(year, month, day):
          domain = ""
          for i in range(16):
              # Scramble each seed component with shifts and XORs.
              year = ((year ^ 8 * year) >> 11) ^ ((year & 0xFFFFFFF0) << 17)
              month = ((month ^ 4 * month) >> 25) ^ 16 * (month & 0xFFFFFFF8)
              day = ((day ^ (day << 13)) >> 19) ^ ((day & 0xFFFFFFFE) << 12)
              # Map the mixed state to one lowercase letter, 'a' through 'y'.
              domain += chr(((year ^ month ^ day) % 25) + 97)
          return domain
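
Because the seed here is just the date, anyone who knows the algorithm can compute the same domains. The sketch below repeats the slide's `generate_domain` so that it runs on its own and adds a hypothetical helper, `candidate_domains`, that lists the labels for a window of days. The start date is arbitrary.

```python
from datetime import date, timedelta
from typing import List


def generate_domain(year: int, month: int, day: int) -> str:
    """Date-seeded example generator, as shown on the slide."""
    domain = ""
    for _ in range(16):
        year = ((year ^ 8 * year) >> 11) ^ ((year & 0xFFFFFFF0) << 17)
        month = ((month ^ 4 * month) >> 25) ^ 16 * (month & 0xFFFFFFF8)
        day = ((day ^ (day << 13)) >> 19) ^ ((day & 0xFFFFFFFE) << 12)
        domain += chr(((year ^ month ^ day) % 25) + 97)
    return domain


def candidate_domains(start: date, days: int) -> List[str]:
    """Hypothetical helper: one generated label per day, starting at `start`."""
    window = (start + timedelta(days=offset) for offset in range(days))
    return [generate_domain(d.year, d.month, d.day) for d in window]


# Arbitrary example window: one week starting 1 March 2017.
week = candidate_domains(date(2017, 3, 1), 7)
```

The same seed always yields the same label. That determinism lets malware and its operator agree on a rendezvous domain without communicating, and it lets a defender who has recovered the algorithm list upcoming domains in advance.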
  11. 11. DGA Overview • DGAs produce patterns • Visually appear “off”: a human would interpret the domain as strange (pmwtrdsv.ru) or nonsensical (turnipboxsea.com) • Malware may attempt to resolve many unregistered domains • Examples: ci4u0c10b77f5opvn211n5poa3.com, wiqyhl13dkep615aec27ue2t2t.net, mkguv3bd2hi317d9l8vdy4i6m.org, 1xah67i2ayufesns8mh12h1kab.net, 17m4oq6jngoka7zxtoq1taebe1.com • Source: https://johannesbader.ch/2014/12/the-dga-of-newgoz/
  12. 12. DGA Overview
      Malware     | Seed                                | # Domains in wild
      Alureon     | Thread ID + milliseconds since boot | 5/day
      Padcrypt    | Date                                | 24/day or 72/day
      ProsLikeFan | Date, hardcoded                     | 100/day
      Qadars      | Date                                | 200/day
      Qakbot      | Date                                | 5000/day
      Sisron      | Date                                | 4/day
      Source: https://johannesbader.ch/
  13. 13. DNS-Based Data Science Techniques
  14. 14. DNS Data Sources
  15. 15. DNS Tunnel Detection: DNS Data → Filter → DNS Data → Collation → Features → Classifier → Risk → Outliers
  16. 16. DNS Tunnel Detection: DNS Data → Filter → DNS Data → Collation • Collation: IP + Destination → Domain Session • [Plot: number of DNS requests over time, 1 hour buckets]
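
A minimal sketch of the collation step, under one reading of the slide: queries are grouped into a "domain session" keyed by source IP, destination base domain, and 1 hour bucket. The record layout (`DnsRecord`) and the `base_domain` helper are hypothetical. Taking the last two labels as the base domain is a shortcut; a real pipeline would need a public-suffix-aware split.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, NamedTuple, Tuple


class DnsRecord(NamedTuple):
    timestamp: datetime
    source_ip: str
    query: str


SessionKey = Tuple[str, str, datetime]  # (source IP, base domain, hour bucket)


def base_domain(query: str) -> str:
    """Naive base domain: the last two labels (assumption, see lead-in)."""
    return ".".join(query.rstrip(".").lower().split(".")[-2:])


def collate(records: Iterable[DnsRecord]) -> Dict[SessionKey, List[str]]:
    """Group queried names into sessions per IP, base domain, and hour."""
    sessions: Dict[SessionKey, List[str]] = defaultdict(list)
    for record in records:
        bucket = record.timestamp.replace(minute=0, second=0, microsecond=0)
        key = (record.source_ip, base_domain(record.query), bucket)
        sessions[key].append(record.query.rstrip(".").lower())
    return dict(sessions)


# Illustrative records; the tunnel names come from the slides, the rest is made up.
records = [
    DnsRecord(datetime(2017, 3, 1, 10, 5), "10.0.0.5", "paeqcigq.tunnel.com"),
    DnsRecord(datetime(2017, 3, 1, 10, 40), "10.0.0.5", "pafich3i.tunnel.com"),
    DnsRecord(datetime(2017, 3, 1, 11, 2), "10.0.0.5", "gxqwl0eaytioruga5.tunnel.com"),
    DnsRecord(datetime(2017, 3, 1, 10, 15), "10.0.0.7", "www.sqrrl.com"),
]
sessions = collate(records)
```

Each resulting session is the unit that the later feature and classification stages operate on.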
  17. 17. DNS Tunnel Classification Features • Input: IP + Destination → Domain Session • Features: – Number of queries – Number of subdomains – Average subdomain length – Average information content of subdomains
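
The four features can be computed from one session's queried names roughly as follows. The slides do not define "information content"; this sketch assumes per-character Shannon entropy of each subdomain string. The function names and the returned dictionary keys are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(text).values()
    )


def subdomain(query: str, base: str) -> str:
    """Part of the queried name to the left of the base domain, or ''."""
    name = query.rstrip(".").lower()
    suffix = "." + base
    return name[: -len(suffix)] if name.endswith(suffix) else ""


def session_features(queries: List[str], base: str) -> Dict[str, float]:
    """The four features listed on the slide, for a single session."""
    unique = {s for s in (subdomain(q, base) for q in queries) if s}
    count = len(unique)
    return {
        "num_queries": len(queries),
        "num_subdomains": count,
        "avg_subdomain_length": sum(len(s) for s in unique) / count if count else 0.0,
        "avg_subdomain_entropy": (
            sum(shannon_entropy(s) for s in unique) / count if count else 0.0
        ),
    }


# Example names from the slides.
features = session_features(
    ["paeqcigq.tunnel.com", "pafich3i.tunnel.com", "gxqwl0eaytioruga5.tunnel.com"],
    base="tunnel.com",
)
```

Long, high-entropy subdomains that rarely repeat are the pattern the tunneling slides describe. The lessons-learned slide notes that some legitimate services look similar.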
  18. 18. DNS Tunnel Classification: Features → Classifier → Risk → Outliers • Features: – Number of queries – Number of subdomains – Average subdomain length – Average information content of subdomains
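
The slides do not say which classifier turns features into risk scores. As a stand-in only, the sketch below scores one feature with a robust z-score (distance from the median in units of the median absolute deviation) and flags sessions above a threshold. The technique, the threshold, and all of the counts are assumptions for illustration.

```python
from statistics import median
from typing import Dict, List


def robust_scores(values: Dict[str, float]) -> Dict[str, float]:
    """Score each session by its distance from the median, in MAD units."""
    center = median(values.values())
    mad = median(abs(v - center) for v in values.values())
    scale = mad if mad > 0 else 1.0  # avoid dividing by zero on flat data
    return {key: (v - center) / scale for key, v in values.items()}


def outliers(values: Dict[str, float], threshold: float = 5.0) -> List[str]:
    """Sessions whose score is above the (assumed) threshold."""
    return sorted(k for k, score in robust_scores(values).items() if score > threshold)


# Hypothetical unique-subdomain counts per session for one hour.
unique_subdomains = {
    "a.example": 3,
    "b.example": 5,
    "c.example": 4,
    "d.example": 6,
    "tunnel.com": 300,
}
flagged = outliers(unique_subdomains)
```

A production system would combine several features and would likely use a trained model. The sketch only illustrates the feature, risk score, outlier flow named on the slide.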
  19. 19. DNS Tunnel Validation: DNS Data → Filter → DNS Data → Collation • Collation: IP + Destination → Domain Session • Examples: paeqcigq.tunnel.com, pafich3i.tunnel.com, gxqwl0eaytioruga5.tunnel.com
  20. 20. Lessons Learned from testing on Sqrrl DNS data • There are several potential sources of false positives: – CDNs – Anti-virus software – Internal DNS traffic – Popular services (Spotify, Slack, …) • Many of these organize content under long, random-looking subdomain names • Whitelisting can remove some of these false positives • A hard cut requiring > K unique subdomains per user per hour helps significantly
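
The two mitigations on this slide, whitelisting and a hard cut on unique subdomains per user per hour, can be sketched as a single filter. The slide gives no value for K, so the value below is a placeholder. The whitelist entry and session data are likewise illustrative, and `apply_hard_cut` is a hypothetical name.

```python
from typing import Dict, Set, Tuple

SessionKey = Tuple[str, str]  # (user or source IP, base domain) within one hour


def apply_hard_cut(
    sessions: Dict[SessionKey, Set[str]],
    whitelist: Set[str],
    k: int,
) -> Dict[SessionKey, Set[str]]:
    """Keep sessions that are not whitelisted and have more than k unique subdomains."""
    return {
        key: subdomains
        for key, subdomains in sessions.items()
        if key[1] not in whitelist and len(subdomains) > k
    }


# Illustrative hour of traffic: unique subdomains seen per (user, base domain).
hour_sessions = {
    ("10.0.0.5", "tunnel.com"): {"paeqcigq", "pafich3i", "gxqwl0eaytioruga5"},
    ("10.0.0.5", "slack-msgs.com"): {"a1", "b2", "c3", "d4"},
    ("10.0.0.7", "sqrrl.com"): {"www"},
}
kept = apply_hard_cut(hour_sessions, whitelist={"slack-msgs.com"}, k=2)  # k is a placeholder
```

Applying the cut before feature extraction also reduces how many sessions the classifier has to score.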
  21. 21. Sqrrl traffic data feature plots [Plot: average length (y-axis, 0 to 180) vs. number of subdomains (x-axis, 0 to 9000); labeled points: phishing; YouTube, Amazon AWS, CDNs, anti-virus, anti-spam; sqrrl-lab.net; slack-msgs.com]
  22. 22. Sqrrl traffic data feature plots [Two plots: unique queries (y-axis, 0 to 1.25) vs. number of subdomains (x-axis, 0 to 11250 and 0 to 1125); labeled points: eclampsialemontree.net, slack, sqrrl-lab, anti-virus, ad servers]
  23. 23. eclampsialemontree.net • Queries to 284 unique subdomains with names like: – ykzcpj1j4ovv3nc1mcgg27ji7uzf4o, yhgir5h3ts3rppd3j3bph1se4rjqtj, – Pkbenvnzwo2jl2onldka17rv5uu2kd, – Kinkascic, – Kinkascie, – Kinkascig • Most queried just once, a few 2-4 times • Length always a multiple of 3, almost always 30 or 9 characters • Appears to be a malware site that attempts to inject invisible frames into ads
  24. 24. DNS DGA Detection: DNS Data → Filter → DNS Data → Collation → Features → Classifier → Risk → Outliers
  25. 25. DNS DGA Detection: DNS Data → Filter → DNS Data → Collation • Collation: IP → Domain Session • [Plot: requests sent over time for a DNS session]
  26. 26. DNS DGA Classification Features • Input: IP → Domain Session • Features: – Session duration – Number of unique NxDomains – Average information content of subdomains • [Plots: histogram for day of the week (0 to 6); histogram for hour of the day (0 to 24)]
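
A sketch of how these per-session quantities might be computed from DNS responses for a single source IP. Several things are assumptions: the record layout (`DnsResponse`), the use of per-character Shannon entropy on the leftmost label as "information content", and normalizing the histograms to fractions. The slide shows the day-of-week and hour-of-day histograms next to the feature list but does not say whether they feed the classifier.

```python
import math
from collections import Counter
from datetime import datetime
from typing import Dict, List, NamedTuple


class DnsResponse(NamedTuple):
    timestamp: datetime
    query: str
    rcode: str  # e.g. "NOERROR" or "NXDOMAIN"


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def dga_session_features(responses: List[DnsResponse]) -> Dict[str, object]:
    """Session duration, unique NXDOMAIN count, entropy, and time histograms."""
    times = [r.timestamp for r in responses]
    nxdomains = {r.query.lower() for r in responses if r.rcode == "NXDOMAIN"}
    labels = [name.split(".")[0] for name in nxdomains]
    day_hist = [0.0] * 7    # Monday = 0, as in datetime.weekday()
    hour_hist = [0.0] * 24
    for t in times:
        day_hist[t.weekday()] += 1 / len(times)
        hour_hist[t.hour] += 1 / len(times)
    return {
        "duration_seconds": (max(times) - min(times)).total_seconds() if times else 0.0,
        "unique_nxdomains": len(nxdomains),
        "avg_entropy": sum(map(shannon_entropy, labels)) / len(labels) if labels else 0.0,
        "day_of_week": day_hist,
        "hour_of_day": hour_hist,
    }


# Domain names from the slides; timestamps and response codes are made up.
session = [
    DnsResponse(datetime(2017, 3, 1, 2, 0, 0), "ci4u0c10b77f5opvn211n5poa3.com", "NXDOMAIN"),
    DnsResponse(datetime(2017, 3, 1, 2, 0, 30), "wiqyhl13dkep615aec27ue2t2t.net", "NXDOMAIN"),
    DnsResponse(datetime(2017, 3, 1, 2, 1, 0), "ci4u0c10b77f5opvn211n5poa3.com", "NXDOMAIN"),
]
features = dga_session_features(session)
```

A burst of distinct NXDOMAIN answers in a short session matches the earlier observation that malware may attempt to resolve many unregistered domains.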
  27. 27. DNS DGA Classification: Features → Classifier → Risk → Outliers • Features: – Session duration – Number of unique NxDomains – Average information content of subdomains • [Plots: histogram for day of the week (0 to 6); histogram for hour of the day (0 to 24)]
  28. 28. DNS DGA Validation: DNS Data → Filter → DNS Data → Collation • Collation: IP → Domain Session • Examples: ci4u0c10b77f5opvn211n5poa3.com, wiqyhl13dkep615aec27ue2t2t.net, mkguv3bd2hi317d9l8vdy4i6m.org, 1xah67i2ayufesns8mh12h1kab.net, 17m4oq6jngoka7zxtoq1taebe1.com
  29. 29. Combined DGA Risk Score [Plot: Combined Rank Separation; combined rank (y-axis) vs. index (x-axis), both axes from -400 to 1800; series: Normal, DGA]
  30. 30. Example Tunneling and DGA Detection
  31. 31. DNS Tunnel
  32. 32. DGA
  33. 33. Graph Investigation
  34. 34. info.sqrrl.com/download-ueba-ebook User & Entity Behavior Analytics What's included in this • What you need to know about advanced behavioral analytics • How it can automate and revolutionize threat hunting • How to use it for streamlined threat detection practices The Heart of Next-Generation Threat Hunting
  35. 35. Questions
