Threat Hunting with Splunk


Your adversaries continue to attack and get into companies. You can no longer rely on alerts from point solutions alone to secure your network. To identify and mitigate these advanced threats, analysts must become proactive in identifying not just indicators, but attack patterns and behavior. In this workshop we will walk through a hands-on exercise with a real-world attack scenario. The workshop will illustrate how advanced correlations from multiple data sources and machine learning can enhance security analysts' capability to detect and quickly mitigate advanced attacks.

Published in: Technology

Threat Hunting with Splunk

  1. 1. $whoami • René Agüero • > 1 year at Splunk – Security Specialist • Based in Manhattan • 18 years in security – MCSE NT4.0 • CISSP, MSBA – Information Assurance (forensics, auditing and security) • Offensive Security • Exploitation – Metasploit, Web attacks • Rapid7 SE Director
  2. 2. Agenda • Threat Hunting Basics • Threat Hunting Data Sources • Sysmon Endpoint Data • Cyber Kill Chain • Walkthrough of Attack Scenario Using Core Splunk (hands-on) • Enterprise Security Walkthrough • Applying Machine Learning and Data Science to Security
  3. 3. Log In Credentials • User: hunter • Pass: pr3dator • Birth Month: January, February & March • April, May & June • July and August • September and October • November and December
  4. 4. These won’t work…
  5. 5. Am I in the right place? Some familiarity with… ● CSIRT/SOC Operations ● General understanding of Threat Intelligence ● General understanding of DNS, Proxy, and Endpoint types of data
  6. 6. This is a hands-on session. The overview slides are important for building your “hunt” methodology. 10 minutes - Seriously.
  7. 7. Threat Hunting with Splunk 7 Vs.
  8. 8. SANS Threat Hunting Maturity • Ad Hoc Search: 85% • Statistical Analysis: 55% • Visualization Techniques: 50% • Aggregation: 48% • Machine Learning/Data Science: 32% • Source: SANS IR & Threat Hunting Summit 2016
  9. 9. Hunting Tools: Internal Data • IP Addresses: threat intelligence, blacklist, whitelist, reputation monitoring. Tools: Firewalls, proxies, Splunk Stream, Bro, IDS • Network Artifacts and Patterns: network flow, packet capture, active network connections, historic network connections, ports and services. Tools: Splunk Stream, Bro IDS, FPC, Netflow • DNS: activity, queries and responses, zone transfer activity. Tools: Splunk Stream, Bro IDS, OpenDNS • Endpoint – Host Artifacts and Patterns: users, processes, services, drivers, files, registry, hardware, memory, disk activity; file monitoring: hash values, integrity checking and alerts, creation or deletion. Tools: Windows/Linux, Carbon Black, Tanium, Tripwire, Active Directory • Vulnerability Management Data. Tools: Tripwire IP360, Qualys, Nessus • User Behavior Analytics: TTPs, user monitoring, time of day, location, HR watchlist. Tools: Splunk UBA (all of the above)
  10. 10. Endpoint: Microsoft Sysmon Primer ● TA Available on the App Store ● Great Blog Post to get you started ● Increases the fidelity of Microsoft Logging Blog Post:
  11. 11. User: hunter Pass: pr3dator January, February & March April, May & June July and August September and October November and December
  12. 12. Sysmon Event Tags • Maps network communication to process_id • Process_id creation and mapping to parent process_id
  13. 13. sourcetype=X* | search tag=communicate
  14. 14. sourcetype=X* | dedup tag | search tag=process
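The two tagged event types above are what make endpoint attribution possible: a network connection event carries a process ID, and a process creation event ties that process ID to a parent. A minimal Python sketch of that join follows. The event dictionaries and field names (`process_id`, `parent_process_id`, `image`, `user`) are illustrative assumptions, not the exact fields emitted by the Sysmon TA.

```python
# Hypothetical, simplified events; field names are illustrative assumptions,
# not the exact fields produced by the Sysmon TA.
process_events = [
    {"host": "ws01", "process_id": 100, "parent_process_id": 1,
     "image": "reader.exe", "user": "alice"},
    {"host": "ws01", "process_id": 200, "parent_process_id": 100,
     "image": "child.exe", "user": "alice"},
]
network_events = [
    {"host": "ws01", "process_id": 200, "dest_ip": "203.0.113.10", "dest_port": 443},
]


def attribute_connections(network_events, process_events):
    """Attach process details to each network connection by (host, process_id)."""
    by_pid = {(p["host"], p["process_id"]): p for p in process_events}
    attributed = []
    for conn in network_events:
        proc = by_pid.get((conn["host"], conn["process_id"]))
        attributed.append({
            **conn,
            "image": proc["image"] if proc else None,
            "user": proc["user"] if proc else None,
            "parent_process_id": proc["parent_process_id"] if proc else None,
        })
    return attributed


rows = attribute_connections(network_events, process_events)
```

Keying on the host as well as the process ID matters because process IDs are only unique per machine.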
  15. 15. Data Source Mapping
  16. 16. Demo Story - Kill Chain Framework • Successful brute force – download sensitive pdf document • Weaponize the pdf file with Zeus Malware • Convincing email sent with weaponized pdf • Vulnerable pdf reader exploited by malware. Dropper created on machine • Dropper retrieves and installs the malware • Persistence via regular outbound comm • Data Exfiltration • Source: Lockheed Martin
  17. 17. Investigations – choose your data wisely • Servers Storage Desktops Email Web Transaction Records Network Flows DHCP/DNS Hypervisor Custom Apps Physical Access Badges Threat Intelligence Mobile CMDB Intrusion Detection Firewall Data Loss Prevention Anti-Malware Vulnerability Scans Traditional Authentication Stream
  18. 18. Let’s dig in! Please, raise that hand if you need us to hit the pause button
  19. 19. APT Transaction Flow Across Data Sources • Attack flow: attacker hacks website (web portal) and steals .pdf files; attacker creates malware, embeds it in a .pdf, and emails it to the target; target reads the email and opens the attachment; .pdf executes and unpacks malware, overwriting and running “allowed” programs: Calc.exe (dropper), Svchost.exe (malware); http (proxy) session to command & control server; remote control, steal data, persist in company, rent as botnet • Transaction: Gain access to system, Create additional environment, Conduct business • Data sources: Threat Intelligence, Endpoint, Network, Email, Proxy, DNS, and Web • Our investigation begins by detecting high-risk communications through the proxy, at the endpoint, and even a DNS call.
  20. 20. To begin our investigation, we will start with a quick search to familiarize ourselves with the data sources. In this demo environment, we have a variety of security-relevant data including web, DNS, proxy, firewall, endpoint, and email.
  21. 21. Take a look at the endpoint data source. We are using the Microsoft Sysmon TA. We have endpoint visibility into all network communication and can map each connection back to a process. We also have detailed info on each process and can map it back to the user and parent process. Let’s get our day started by using threat intel to prioritize our efforts and focus on communication with known high-risk entities.
  22. 22. This dashboard is based on event data that contains a threat intel based indicator match (IP Address, domain, etc.). The data is further enriched with CMDB based asset/identity information. We have multiple source IPs communicating with high-risk entities identified by these 2 threat sources, and we are seeing high-risk communication from multiple data sources. We see multiple threat intel related events across multiple source types associated with the IP Address of Chris Gilbert. Let’s take a closer look at the IP Address. We can now see the owner of the system (Chris Gilbert) and that it isn’t a PII or PCI related asset, so there are no immediate business implications that would require informing agencies or external customers within a certain timeframe.
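The dashboard logic described on this slide (indicator match first, then asset enrichment) can be sketched in a few lines of Python. Everything below is hypothetical sample data: the indicator values, feed names, asset record, and field names are assumptions for illustration and do not come from the demo environment.

```python
# Hypothetical indicator list and asset inventory, for illustration only.
threat_intel = {
    "198.51.100.7": "example_feed_a",
    "bad.example.net": "example_feed_b",
}
assets = {
    "10.0.0.5": {"owner": "example.user", "pci": False, "pii": False},
}
events = [
    {"sourcetype": "proxy", "src_ip": "10.0.0.5", "dest": "198.51.100.7"},
    {"sourcetype": "dns", "src_ip": "10.0.0.5", "dest": "bad.example.net"},
    {"sourcetype": "proxy", "src_ip": "10.0.0.9", "dest": "intranet.example.com"},
]


def match_and_enrich(events, threat_intel, assets):
    """Keep events whose destination matches an indicator; add asset context."""
    matches = []
    for event in events:
        source = threat_intel.get(event["dest"])
        if source is None:
            continue  # no indicator match, not interesting for this view
        asset = assets.get(event["src_ip"], {})
        matches.append({**event, "threat_source": source, **asset})
    return matches


hits = match_and_enrich(events, threat_intel, assets)
sourcetypes = sorted({h["sourcetype"] for h in hits})
```

Seeing the same source IP in `hits` across several source types is the signal the slide points at: one asset, multiple data sources, multiple indicator matches.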
  23. 23. We are now looking at only threat intel related activity for the IP Address associated with Chris Gilbert and see activity spanning endpoint, proxy, and DNS data sources. These trend lines tell a very interesting visual story. It appears that the asset makes a DNS query involving a threat intel related domain or IP Address. We then see threat intel related endpoint and proxy events occurring periodically and likely communicating with a known Zeus botnet based on the threat intel source (zeus_c2s). Scroll down the dashboard to examine these threat intel events associated with the IP Address.
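The "occurring periodically" observation can be made measurable. One simple approach, sketched below in Python, is to look at the gaps between consecutive events and flag a series whose gaps are nearly constant. The thresholds (`max_cv`, `min_events`) and the sample timestamps are illustrative assumptions, not tuned values.

```python
from statistics import mean, pstdev


def looks_periodic(timestamps, max_cv=0.1, min_events=4):
    """Return True when the gaps between events are nearly constant.

    max_cv is the allowed coefficient of variation (stdev / mean) of the gaps;
    both thresholds are illustrative assumptions, not tuned values.
    """
    if len(timestamps) < min_events:
        return False
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    avg = mean(gaps)
    if avg == 0:
        return False
    return pstdev(gaps) / avg <= max_cv


# Hypothetical event times in seconds.
beacon = [0, 300, 601, 899, 1200, 1502]    # roughly every 5 minutes
browsing = [0, 12, 340, 355, 2000, 2100]   # irregular human activity
```

Real malware often adds jitter to its check-in interval, so a check like this is a starting point for a hunt rather than a detector to rely on by itself.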
  24. 24. Proxy related threat intel matches are important for helping us to prioritize our efforts toward initiating an investigation. Further investigation into the endpoint is often very time consuming and often involves multiple internal hand-offs to other teams or needing to access additional systems. Within the same dashboard, we have access to very high fidelity endpoint data that allows an analyst to continue the investigation in a very efficient manner. It is important to note that near real-time access to this type of endpoint data is not common within the traditional SOC. This encrypted proxy traffic is concerning because of the large amount of data (~1.5MB) being transferred, which is common when data is being exfiltrated. The initial goal of the investigation is to determine whether this communication is malicious or a potential false positive. Expand the endpoint event to continue the investigation. It’s worth mentioning that at this point you could create a ticket to have someone re-image the machine to prevent further damage as we continue our investigation within Splunk.
  25. 25. Exfiltration of data is a serious concern, as is outbound communication to an external entity that has a known threat intel indicator, especially when it is encrypted as in this case. We immediately see that the outbound communication via https is associated with the svchost.exe process on the Windows endpoint. The process id is 4768. Another clue: svchost.exe should be located in a Windows system directory, but this one is being run in the user space. Not good. There is a great deal more information from the endpoint as you scroll down, such as the user ID that started the process and the associated CMDB enrichment information. Let’s continue the investigation.
  26. 26. We have a workflow action that will link us to a Process Explorer dashboard and populate it with the process id extracted from the event (4768).
  27. 27. This has brought us to the Process Explorer dashboard, which lets us view Windows Sysmon endpoint data. Suspected Malware: this process calls itself “svchost.exe,” a common Windows process, but the path is not the normal path for svchost.exe, which is a common trait of malware attempting to evade detection. We also see it making a DNS query (port 53) then communicating via port 443. Suspected Downloader/Dropper: we also can see that the parent process that created this suspicious svchost.exe process is called calc.exe. This is a standard Windows app, but not in its usual directory, telling us that the malware has again spoofed a common file name. This is very consistent with Zeus behavior. The initial exploitation generally creates a downloader or dropper that will then download the Zeus malware. It seems like calc.exe may be that downloader/dropper. Let’s continue the investigation by examining the parent process, as this is almost certainly a genuine threat and we are now working toward a root cause.
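The "right name, wrong directory" check on this slide is easy to automate. The Python sketch below compares a process image path against the directories where a well-known binary is expected to live. The expected-directory table is a short illustrative assumption, not a complete baseline, and the sample paths in the usage note are made up.

```python
import ntpath

# Expected locations for a few common Windows binaries. This short list is an
# illustrative assumption, not a complete baseline.
EXPECTED_DIRS = {
    "svchost.exe": {r"c:\windows\system32", r"c:\windows\syswow64"},
    "calc.exe": {r"c:\windows\system32", r"c:\windows\syswow64"},
}


def is_masquerading(image_path):
    """True if a well-known binary name runs from an unexpected directory."""
    # ntpath handles Windows-style paths regardless of the OS running this code.
    normalized = ntpath.normpath(image_path).lower()
    directory, name = ntpath.split(normalized)
    expected = EXPECTED_DIRS.get(name)
    return expected is not None and directory not in expected
```

For example, `is_masquerading(r"C:\Users\example\AppData\Roaming\svchost.exe")` is true, while the same file name under `C:\Windows\System32` is not flagged. Binaries that are not in the table are ignored rather than flagged.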
  28. 28. The parent process of our suspected downloader/dropper is the legitimate PDF Reader program (suspected vulnerable app). This will likely turn out to be the vulnerable app that was exploited in this attack. We have very quickly moved from threat intel related network and endpoint activity to the likely exploitation of a vulnerable app. Click on the parent process to keep investigating.
  29. 29. We can see that the PDF Reader process has no identified parent and is the root of the infection. Scroll down the dashboard to examine activity related to the PDF reader process.
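The click-through from svchost.exe to calc.exe to the PDF reader is a walk up the parent-process chain until a process with no identified parent is reached. A minimal Python sketch of that walk follows. Only process ID 4768 comes from the walkthrough; the other IDs and the reader file name are hypothetical.

```python
# Hypothetical process table keyed by process ID; only PID 4768 comes from the
# walkthrough, the other IDs and the reader file name are made up.
processes = {
    4768: {"image": "svchost.exe", "parent": 2200},
    2200: {"image": "calc.exe", "parent": 1100},
    1100: {"image": "pdf_reader.exe", "parent": None},
}


def ancestry(pid, processes):
    """Return image names from the given process up to the root of the chain."""
    chain = []
    seen = set()
    while pid is not None and pid in processes and pid not in seen:
        seen.add(pid)  # guard against cycles caused by PID reuse
        chain.append(processes[pid]["image"])
        pid = processes[pid]["parent"]
    return chain
```

Calling `ancestry(4768, processes)` returns the chain from the suspected malware back to the suspected vulnerable app, which is the order the investigation follows.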
  30. 30. Chris opened 2nd_qtr_2014_report.pdf, which was an attachment to an email! We have our root cause! Chris opened a weaponized .pdf file which contained the Zeus malware. It appears to have been delivered via email, and we have access to our email logs as one of our important data sources. Let’s copy the filename 2nd_qtr_2014_report.pdf and search a bit further to determine the scope of this compromise.
  31. 31. Let’s dig a little further into 2nd_qtr_2014_report.pdf to determine the scope of this compromise.
  32. 32. Let’s search through multiple data sources to quickly get a sense for who else may have been exposed to this file. We will come back to the web activity that contains a reference to the pdf file, but let’s first look at the email event to determine the scope of this apparent phishing attack.
  33. 33. We have access to the email body and can see why this was such a convincing attack. The sender apparently had access to sensitive insider knowledge and hinted at quarterly results. There is our attachment. Hold On! That’s not our Domain Name! The spelling is close but it’s missing a “t”. The attacker likely registered a domain name that is very close to the company domain hoping Chris would not notice. This looks to be a very targeted spear phishing attack as it was sent to only one employee (Chris).
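A sender domain that is one missing letter away from the company domain can be caught with an edit-distance comparison. The Python sketch below uses the standard Levenshtein distance. The `max_distance` threshold is an illustrative assumption, and the domains in the usage note are hypothetical, since the transcript does not show the real ones.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def is_lookalike(sender_domain, company_domain, max_distance=2):
    """True for domains that are close to, but not equal to, the company domain."""
    sender, company = sender_domain.lower(), company_domain.lower()
    return sender != company and edit_distance(sender, company) <= max_distance
```

With a hypothetical company domain `exampleindustries.com`, the sender domain `exampleindusries.com` (missing a "t") is flagged, while the exact company domain and unrelated domains are not.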
  34. 34. Root Cause Recap • Attack flow: attacker hacks website (web portal) and steals .pdf files; attacker creates malware, embeds it in a .pdf, and emails it to the target; target reads the email and opens the attachment; .pdf executes and unpacks malware, overwriting and running “allowed” programs: Calc.exe (dropper), Svchost.exe (malware); http (proxy) session to command & control server; remote control, steal data, persist in company, rent as botnet • Transaction: Gain access to system, Create additional environment, Conduct business • Data sources: Threat Intelligence, Endpoint, Network, Email, Proxy, DNS, and Web • We utilized threat intel to detect communication with known high-risk indicators and kick off our investigation, then worked backward through the kill chain toward a root cause. Key to this investigative process is the ability to associate network communications with endpoint process data. This high-value and very relevant ability to work a malware-related investigation through to root cause translates into a very streamlined investigative process compared to the legacy SIEM-based approach.
  35. 35. Let’s revisit the search for additional information on the 2nd_qtr_2014_report.pdf file. We understand that the file was delivered via email and opened at the endpoint. Why do we see a reference to the file in the access_combined (web server) logs? Select the access_combined sourcetype to investigate further.
  36. 36. The results show has accessed this file from the web portal of There is also a known threat intel association with the source IP Address downloading (HTTP GET) the file.
  37. 37. Select the IP Address, left-click, then select “New search”. We would like to understand what else this IP Address has accessed in the environment.
  38. 38. That’s an abnormally large number of requests sourced from a single IP Address in a ~90 minute window. This looks like a scripted action given the constant high rate of requests over the window below. Notice the Googlebot useragent string, which is another attempt to avoid raising attention. Scroll down the dashboard to examine other interesting fields to further investigate.
  39. 39. The requests from are dominated by requests to the login page (wp-login.php). It’s clearly not possible to attempt a login this many times in a short period of time by hand; this is clearly a scripted brute force attack. After successfully gaining access to our website, the attacker downloaded the pdf file, weaponized it with the Zeus malware, then delivered it to Chris Gilbert as a phishing email. The attacker is also accessing admin pages, which may be an attempt to establish persistence via a backdoor into the web site.
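The brute force pattern here comes down to counting login-page requests per source IP and flagging the outliers. A minimal Python sketch follows. The request records, IP addresses, field names (`src_ip`, `uri_path`), and the threshold are illustrative assumptions; only the `wp-login.php` page name comes from the slide.

```python
from collections import Counter


def brute_force_suspects(requests, login_path="/wp-login.php", threshold=100):
    """Return source IPs with at least `threshold` requests to the login page.

    The threshold is an illustrative assumption; tune it to the site's traffic.
    """
    counts = Counter(r["src_ip"] for r in requests if r["uri_path"] == login_path)
    return {ip: n for ip, n in counts.items() if n >= threshold}


# Hypothetical parsed web access log records.
requests = (
    [{"src_ip": "203.0.113.50", "uri_path": "/wp-login.php"}] * 500
    + [{"src_ip": "192.0.2.20", "uri_path": "/wp-login.php"}] * 3
    + [{"src_ip": "192.0.2.20", "uri_path": "/index.php"}] * 40
)
suspects = brute_force_suspects(requests)
```

In practice the count would be taken per time window (the slides mention roughly 90 minutes) so that a busy but legitimate client is not flagged on volume accumulated over days.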
  40. 40. Kill Chain Analysis Across Data Sources • Attack flow: attacker hacks website (web portal) and steals .pdf files; attacker creates malware, embeds it in a .pdf, and emails it to the target; target reads the email and opens the attachment; .pdf executes and unpacks malware, overwriting and running “allowed” programs: Calc.exe (dropper), Svchost.exe (malware); http (proxy) session to command & control server; remote control, steal data, persist in company, rent as botnet • Transaction: Gain access to system, Create additional environment, Conduct business • Data sources: Threat Intelligence, Endpoint, Network, Email, Proxy, DNS, and Web • We began by reviewing threat intel related events for a particular IP address and observed DNS, Proxy, and Endpoint events for a user in Sales. We continued the investigation by pivoting into the endpoint data source and used a workflow action to determine which process on the endpoint was responsible for the outbound communication. We traced the svchost.exe Zeus malware back to its parent process ID, which was the calc.exe downloader/dropper. We traced calc.exe back to the vulnerable application, PDF Reader. We were able to see which file was opened by the vulnerable app and determined that the malicious file was delivered to the user via email. A quick search into the mail logs revealed the details behind the phishing attack and revealed that the scope of the compromise was limited to just the one user. Once our root cause analysis was complete, we shifted our focus to the web logs to determine that the sensitive pdf file was obtained via a brute force attack against the company website. Investigation complete! Let’s get this turned over to the Incident Response team.
  41. 41. Splunk Enterprise Security
  42. 42. Other Items To Note • Items to Note • Navigation - How to Get Here • Description of what to click on • Click
  43. 43. Key Security Indicators (build your own!) • Sparklines • Editable
  44. 44. Security Domains -> Endpoint -> Malware Center • Various ways to filter data • Malware-Specific KSIs and Reports
  45. 45. Under Advanced Threat, select Risk Analysis • Filterable KSIs specific to Risk • Risk assigned to system, user or other
  46. 46. Under Advanced Threat, select Risk Analysis • (Scroll Down) Recent Risk Activity
  47. 47. Under Advanced Threat, select Threat Activity • Filterable, down to IoC • KSIs specific to Threat • Most active threat source • Scroll down…
  48. 48. Under Advanced Threat, select Threat Activity • Specifics about recent threat matches
  49. 49. To add threat intel go to: Configure -> Data Enrichment -> Threat Intelligence Downloads
  50. 50. Click “Threat Artifacts” under “Advanced Threat”
  51. 51. Under Advanced Threat, select Threat Artifacts • Artifact Categories – click different tabs… • STIX feed • Custom feed
  52. 52. Review the Advanced Threat content
  53. 53. Asset Investigator, enter “” • Data from asset framework • Configurable Swimlanes • Darker=more events • All happened around same time • Change to “Today” if needed
  54. 54. Data Science & Machine Learning In Security
  55. 55. Disclaimer: I am not a data scientist
  56. 56. Types of Machine Learning Supervised Learning: generalizing from labeled data
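  57. 57. Supervised Machine Learning

      | Domain Name | TotalCnt | RiskFactor | AGD | SessionTime | RefEntropy | NullUa | Outcome |
      |---|---|---|---|---|---|---|---|
      | | 144 | 6.05 | 1 | 1 | 0 | 0 | Malicious |
      | | 6192 | 5.05 | 0 | 1 | 0 | 0 | Malicious |
      | | 107 | 3 | 0 | 0 | 0 | 0 | Benign |
      | | 111 | 2 | 0 | 1 | 0 | 0 | Benign |
      | | 170 | 2 | 0 | 0 | 0 | 0 | Benign |
      | | 310 | 2 | 0 | 1 | 0 | 0 | Benign |
      | | 107 | 1 | 0 | 0 | 0 | 0 | Benign |
      | | 111 | 1 | 0 | 1 | 0 | 0 | Benign |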
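To make "generalizing from labeled data" concrete, the Python sketch below fits the simplest possible supervised model, a one-feature threshold rule (a decision stump), to the eight labeled rows from this slide. This is a toy learner chosen for readability. It is not the algorithm used by the ML Toolkit or UBA, and eight rows are far too few to trust the result.

```python
# Rows copied from the slide: (TotalCnt, RiskFactor, AGD, SessionTime,
# RefEntropy, NullUa) plus the labeled outcome.
FEATURES = ["TotalCnt", "RiskFactor", "AGD", "SessionTime", "RefEntropy", "NullUa"]
ROWS = [
    ((144, 6.05, 1, 1, 0, 0), "Malicious"),
    ((6192, 5.05, 0, 1, 0, 0), "Malicious"),
    ((107, 3, 0, 0, 0, 0), "Benign"),
    ((111, 2, 0, 1, 0, 0), "Benign"),
    ((170, 2, 0, 0, 0, 0), "Benign"),
    ((310, 2, 0, 1, 0, 0), "Benign"),
    ((107, 1, 0, 0, 0, 0), "Benign"),
    ((111, 1, 0, 1, 0, 0), "Benign"),
]


def predict(index, threshold, x):
    """Rule form assumed here: Malicious when the feature exceeds the threshold."""
    return "Malicious" if x[index] > threshold else "Benign"


def fit_stump(rows):
    """Find the single feature and threshold that classify the most rows correctly.

    Returns (correct_count, feature_index, threshold).
    """
    best = None
    for index in range(len(rows[0][0])):
        for threshold in sorted({x[index] for x, _ in rows}):
            correct = sum(predict(index, threshold, x) == label for x, label in rows)
            if best is None or correct > best[0]:
                best = (correct, index, threshold)
    return best


correct, feature_index, threshold = fit_stump(ROWS)
```

On these rows the search settles on `RiskFactor` with a threshold of 3, which separates all eight examples. A new, unlabeled domain is then scored with `predict(feature_index, threshold, features)`.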
  58. 58. Unsupervised Learning: generalizing from unlabeled data
  59. 59. Unsupervised Machine Learning • No tuning • Programmatically finds trends • UBA is primarily unsupervised • Rigorously tested for fit • Diagram: Raw Security Data, Algorithm, Automated Clustering
  60. 60. 60
  61. 61. ML Toolkit & Showcase • Splunk Supported framework for building ML Apps – Get it for free: • Leverages Python for Scientific Computing (PSC) add-on: – Open-source Python data science ecosystem – NumPy, SciPy, scikit-learn, pandas, statsmodels • Showcase use cases: Predict Hard Drive Failure, Server Power Consumption, Application Usage, Customer Churn & more • Standard algorithms out of the box: – Supervised: Logistic Regression, SVM, Linear Regression, Random Forest, etc. – Unsupervised: KMeans, DBSCAN, Spectral Clustering, PCA, KernelPCA, etc. • Implement one of 300+ algorithms by editing Python scripts
  62. 62. Machine Learning Toolkit Demo
  63. 63. Splunk UBA
  64. 64. Splunk UBA Use Cases • ACCOUNT TAKEOVER: Privileged account compromise, Data exfiltration • LATERAL MOVEMENT: Pass-the-hash kill chain, Privilege escalation • SUSPICIOUS ACTIVITY: Misuse of credentials, Geo-location anomalies • MALWARE ATTACKS: Hidden malware activity • BOTNET, COMMAND & CONTROL: Malware beaconing, Data leakage • USER & ENTITY BEHAVIOR ANALYTICS: Suspicious behavior by accounts or devices • EXTERNAL THREATS • INSIDER THREATS
  65. 65. Splunk User Behavior Analytics (UBA) • ~100% of breaches involve valid credentials (Mandiant Report) • Need to understand normal & anomalous behaviors for ALL users • UBA detects Advanced Cyberattacks and Malicious Insider Threats • Lots of ML under the hood: – Behavior Baselining & Modeling – Anomaly Detection (30+ models) – Advanced Threat Detection • E.g., Data Exfil Threat: – “Saw this strange login & data transfer for user kwestin at 3am in China…” – Surface threat to SOC Analysts
  66. 66. Workflow Raw Events 1 Statistical methods Security semantics 2 Threat Models Lateral movement ML Patterns Sequences Beaconing Land-speed violation Threats Kill chain sequence 5 Supporting evidence Threat scoring Graph Mining 4 Continuous self-learning Anomalies graph Entity relationship graph 3 Anomalies
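Of the threat models named on this slide, the land-speed violation is the easiest to illustrate: two logins by the same user are suspicious if the distance between them could not be covered in the elapsed time. The Python sketch below is a generic version of that idea, not Splunk UBA's implementation. The 1000 km/h limit is an illustrative assumption (roughly airliner speed), and the coordinates in the usage note are approximate.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def land_speed_violation(login_a, login_b, max_kmh=1000.0):
    """True if travel between two logins would exceed max_kmh.

    Each login is (epoch_seconds, latitude, longitude). The 1000 km/h limit is
    an illustrative assumption, roughly airliner speed.
    """
    (t1, lat1, lon1), (t2, lat2, lon2) = sorted([login_a, login_b])
    hours = (t2 - t1) / 3600
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if hours == 0:
        return distance > 0
    return distance / hours > max_kmh
```

A login near New York at `(0, 40.71, -74.01)` followed one hour later by a login near Beijing at `(3600, 39.90, 116.40)` is flagged, while a second login from nearby Newark is not. IP geolocation is imprecise and VPNs move users around, so a hit is a lead to investigate rather than proof of compromise.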
  67. 67. Splunk UBA Demo
  68. 68. 1:1 Security Workshops ● Threat Intelligence Workshop ● CSC 20 Workshop ● SIEM+ ● Splunk UBA Data Science Workshop ● Enterprise Security Benchmark Assessment
  69. 69. Security Workshop Survey