Matthias Vallentin - Towards Interactive Network Forensics and Incident Response, Boundary Tech Talks November 17, 2011

Incident response, post-facto forensics, and network troubleshooting rely on the ability to quickly extract relevant information. To this end, security analysts and network operators need a system that (i) allows for directly expressing a query using domain-specific constructs, (ii) delivers the performance required for interactive analysis, and (iii) is not affected by a continuously arriving stream of semi-structured data.

This talk covers the design and implementation plans of a distributed analytics platform that meets these requirements. Well-proven Google architectures like GFS, Bigtable, Chubby, and Dremel heavily influenced the design of the system, which leverages bitmap indexes to meet the interactive query requirements. The goal is to develop a prototype ready for production usage in the next few months and obtain feedback from using it on various large-scale sites serving tens of thousands of machines.

  1. Towards Interactive Network Forensics and Incident Response
       Matthias Vallentin, UC Berkeley / ICSI, vallentin@icir.org
       Boundary Tech Talk, San Francisco, CA, November 17, 2011
  4. Motivation
       What do the following activities have in common?
         Network troubleshooting
         Incident response
         Network forensics
       → Data-intensive analysis of past activity
       → Interactive response times often critical
       “How to build a platform that efficiently supports these activities?”
  5. Outline
       1. Incident Response and Network Forensics
       2. Operational Network Monitoring using Bro
       3. Building an Interactive Analytics Platform
  6. About
       4th-year PhD student at UC Berkeley, advised by Vern Paxson
       Working with researchers at ICSI/ICIR and the AMPlab
       Interests
         Large-scale network intrusion detection
         High-performance traffic analysis
         Network forensics and incident response
         → with strong operational emphasis
       Projects
         The Bro network security monitor
         VAST: Visibility Across Space and Time
         HILTI: High-Level Intermediate Language for Traffic Inspection
  8. Use Case #1: Classic Incident Response
       Goal: fast and comprehensive analysis of security incidents
       Often begins with an external piece of intelligence
         “IP X serves malware over HTTP”
         “This MD5 hash is malware”
         “Connections to 128.11.5.0/27 at port 42000 are malicious”
       Analysis style: ad-hoc, interactive, several refinements/adaptions
       Typical operations
         Filter: project, select
         Aggregate: mean, sum, quantile, min/max, histogram, top-k, unique
       ⇒ Concrete starting point, then widen scope (bottom-up)
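The filter and aggregate operations named on this slide can be sketched in a few lines of Python. This is an illustration only: the record layout (`orig_h`, `resp_h`, `resp_p`), the sample rows, and the helper names `select` and `top_k` are assumptions, loosely modeled on Bro's connection log fields. The subnet and port are taken from the example intelligence above.

```python
import ipaddress
from collections import Counter

# Hypothetical connection records; field names loosely follow Bro's conn.log.
CONNS = [
    {"orig_h": "10.0.0.5", "resp_h": "128.11.5.3", "resp_p": 42000},
    {"orig_h": "10.0.0.5", "resp_h": "128.11.5.17", "resp_p": 42000},
    {"orig_h": "10.0.0.9", "resp_h": "128.11.5.3", "resp_p": 42000},
    {"orig_h": "10.0.0.9", "resp_h": "128.11.5.3", "resp_p": 80},
    {"orig_h": "10.0.0.7", "resp_h": "128.11.6.1", "resp_p": 42000},
]


def select(conns, subnet, port):
    """Filter: keep connections whose responder lies in `subnet` on `port`."""
    net = ipaddress.ip_network(subnet)
    return [
        c for c in conns
        if ipaddress.ip_address(c["resp_h"]) in net and c["resp_p"] == port
    ]


def top_k(conns, field, k):
    """Aggregate: the k most frequent values of `field`, with their counts."""
    return Counter(c[field] for c in conns).most_common(k)


# Start from the concrete indicator, then widen: who talked to that subnet?
suspicious = select(CONNS, "128.11.5.0/27", 42000)
top_sources = top_k(suspicious, "orig_h", 2)
```

Starting from the indicator and then aggregating over the matching originators mirrors the bottom-up style of this use case.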
  9. Use Case #2: Network Troubleshooting
       Goal: find root cause of component failure
       Often no specific hint, merely symptomatic feedback
         “I can’t access my Gmail”
       Typical operations
         Zoom: slice activity at different granularities
           Time: seconds, minutes, days, ...
           Space: layer 2/3/4/7, host, subnet, port, URL, ...
         Study time series data of activity aggregates
         Find abnormal activity
           “Today we see 20% less outbound DNS compared to yesterday”
         Infer dependency graphs: use joint behavior from the past to assess present impact [KMV+09]
         Judicious machine learning [SP10]
       ⇒ No concrete starting point, narrow scope (top-down)
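Zooming in time and comparing aggregates across days can be illustrated with a small Python sketch. The timestamps are made-up sample data, and `bucket_counts` and `relative_change` are hypothetical helper names, not part of any tool mentioned in the talk.

```python
from collections import Counter

DAY = 86400  # seconds per day


def bucket_counts(timestamps, width):
    """Count events per time bucket of `width` seconds (bucket = start time)."""
    return Counter(int(ts // width) * width for ts in timestamps)


def relative_change(today, yesterday):
    """Relative change of today's count versus yesterday's count."""
    if yesterday == 0:
        raise ValueError("no baseline activity to compare against")
    return (today - yesterday) / yesterday


# Hypothetical outbound DNS request timestamps (seconds since some epoch).
dns_ts = [10, 20, 30, 40, 50, DAY + 5, DAY + 15, DAY + 25, DAY + 35]

per_day = bucket_counts(dns_ts, DAY)         # coarse view: one bucket per day
per_half_minute = bucket_counts(dns_ts, 30)  # same data, finer granularity
change = relative_change(per_day[DAY], per_day[0])
```

With five requests on the first day and four on the second, `change` comes out at roughly -0.2, the kind of "20% less outbound DNS" signal quoted above.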
  10. Use Case #3: Combating Insider Abuse
       Goal: uncover policy violations of personnel
       Analysis procedure: connect the dots
       Insider attack: chain of authorized actions, hard to detect individually
         E.g., data exfiltration
           1. User logs in to internal machine
           2. Copies sensitive document to local machine
           3. Sends document to third party via email
       Typical operations
         Compare activity profiles
           “Jon never logs in to our backup machine at 3am”
           “Seth accessed 10x more files on our servers today”
       ⇒ Relate temporally distant events, behavior-based detection
  12. Basic Network Monitoring
       [Diagram labels: Internet, Tap, Local Network, Monitor]
       Sites
         UC Berkeley (10 Gbps, 50,000 hosts)
         NCSA, IL (8×10 Gbps, 10,000 hosts)
         LBNL, Berkeley (10 Gbps, 12,000 hosts)
         ICSI, Berkeley (100 Mbps, 250 hosts)
         AirJaldi, India (10 Mbps, 500 hosts)
  13. High-Performance Network Monitoring: The NIDS Cluster [VSL+07]
       [Diagram labels: Internet, Tap, Local Network, Frontend, Worker, Proxy, Manager, User; flows: Packets, Logs, State]
  14. The Bro Cluster
       We run it operationally at:
         UC Berkeley (26 workers)
         LBNL (15 workers)
         NCSA (10 4-core workers)
       Runs at numerous large sites:
         Industry
         Academia
         Government
       [Diagram labels: Internet, Tap, Local Network, Frontend, Proxy, Worker, Manager; flows: Packets, Logs, State]
  15. The Bro Network Security Monitor
       Fundamentally different from other IDS
         Real-time network analysis framework
         Policy-neutral at the core
         Highly stateful
       Key components
         1. Event engine
              TCP stream reassembly
              Protocol analysis
              Policy-neutral
         2. Script interpreter
              “Domain-specific Python”
              Generate extensive logs
              Apply site policy
       [Diagram labels: Network, Packets, Event Engine, Events, Script Interpreter, Logs, Notifications, User Interface]
  17. From Packets to High-Level Descriptions of Activity
       Event declaration
         type connection: record { orig: addr, resp: addr, ... }
         event connection_established(c: connection)
         event http_request(c: connection, method: string, URI: string)
         event http_reply(c: connection, status: string, data: string)
       Event instantiation
         connection_established({127.0.0.1, 128.32.244.172, ... })
         http_request({127.0.0.1, 128.32.244.172, ..}, "GET", "/index.html")
         http_reply({127.0.0.1, 128.32.244.172, ..}, "200", "<!DOCTYPE ht..")
         http_request({127.0.0.1, 128.32.244.172, ..}, "GET", "/favicon.ico")
         http_reply({127.0.0.1, 128.32.244.172, ..}, "200", "xBExEFx..")
         connection_established({127.0.0.1, 128.32.112.224, ... })
  18. Event Extraction with Bro
       Event and data model
         Rich-typed: first-class networking types (addr, port, subnet, ...)
         Deep: across the whole network stack
         Fine-grained: detailed protocol-level information
         Expressive: nested data with container types (aka. semi-structured)
       Data unit     Layer            Example events
       Messages      Application      http_request, smtp_reply, ssl_certificate
       Byte stream   Transport        new_connection, udp_request
       Packets       (Inter)Network   new_packet, packet_contents
       Frames        Link             arp_request, arp_reply
  19. After the Fact: Bro Logs
       Policy-neutral by default: no notion of good or bad
         Forensic investigations highly benefit from unbiased information
       Flexible output formats: ASCII, binary, DB, custom

       % more conn.log
       #fields ts id.orig_h id.orig_p id.resp_h id.resp_p proto service duration obytes ..
       1144876741.1198 192.150.186.169 53115 82.94.237.218 80 tcp http 16.14929 435
       1144876612.6063 192.150.186.169 53090 198.189.255.82 80 tcp http 4.437460 8661
       1144876596.5597 192.150.186.169 53051 193.203.227.129 80 tcp http 0.372440 461
       1144876606.7789 192.150.186.169 53082 198.189.255.73 80 tcp http 0.597711 337
       1144876741.4693 192.150.186.169 53116 82.94.237.218 80 tcp http 16.02667 3027
       1144876745.6102 192.150.186.169 53117 66.102.7.99 80 tcp http 1.004346 422
       1144876605.6847 192.150.186.169 53075 207.151.118.143 80 tcp http 0.029663 347

       % more http.log
       #fields ts id.orig_h id.orig_p host uri status_code user_agent ..
       1144876741.6335 192.150.186.169 53116 docs.python.org /lib/lib.css 200 Mozilla/5.0
       1144876742.1687 192.150.186.169 53116 docs.python.org /icons/previous.png 304 Mozilla/5.0
       1144876741.2838 192.150.186.169 53115 docs.python.org /lib/lib.html 200 Mozilla/5.0
       1144876742.3337 192.150.186.169 53116 docs.python.org /icons/up.png 304 Mozilla/5.0
       1144876742.3337 192.150.186.169 53116 docs.python.org /icons/next.png 304 Mozilla/5.0
       1144876742.3337 192.150.186.169 53116 docs.python.org /icons/contents.png 304 Mozilla/5.0
       1144876742.3337 192.150.186.169 53116 docs.python.org /icons/modules.png 304 Mozilla/5.0
       1144876742.3338 192.150.186.169 53116 docs.python.org /icons/index.png 304 Mozilla/5.0
       1144876745.6144 192.150.186.169 53117 www.google.com / 200 Mozilla/5.0
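A log excerpt like the conn.log sample on this slide can be loaded into records with a short Python sketch. The parser below is hypothetical and makes a simplifying assumption: it splits on arbitrary whitespace, which fits the excerpt as shown on the slide, whereas real log files may use a specific separator and carry further header lines. Only the first three rows of the excerpt are included, and the trailing `..` elision in the header is left out.

```python
def parse_bro_log(lines):
    """Parse whitespace-separated log lines with a '#fields' header into dicts."""
    fields = []
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#fields"):
            fields = line.split()[1:]  # column names follow the marker
            continue
        if line.startswith("#"):
            continue  # skip any other header/comment line
        records.append(dict(zip(fields, line.split())))
    return records


# First rows of the conn.log excerpt shown on the slide.
CONN_LOG = """\
#fields ts id.orig_h id.orig_p id.resp_h id.resp_p proto service duration obytes
1144876741.1198 192.150.186.169 53115 82.94.237.218 80 tcp http 16.14929 435
1144876612.6063 192.150.186.169 53090 198.189.255.82 80 tcp http 4.437460 8661
1144876596.5597 192.150.186.169 53051 193.203.227.129 80 tcp http 0.372440 461
"""

records = parse_bro_log(CONN_LOG.splitlines())
total_obytes = sum(int(r["obytes"]) for r in records)
longest = max(records, key=lambda r: float(r["duration"]))
```

All values stay strings until converted, so typed access (`int`, `float`) happens at the point of use.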
  20. After the Fact: Bro Logs
  22. Log Analysis
       What do we do with Bro logs?
         Process (ad-hoc analysis)
         Summarize (time series data, histogram/top-k, quantile)
         Correlate (machine learning, statistical tests)
         Age (elevate old data into higher levels of abstraction)
         Visualize
       How do we do it?
         All eggs in one basket
           SIEM: Splunk, ArcSight, NarusInsight, ... ($$$)
           VAST
         In-situ processing
           Tools of the trade (awk, sort, uniq, ...)
           MapReduce / Hadoop
  24. From Ephemeral to Persistent Activity
       Bro events
         Policy-neutral activity
         Ephemeral
         Only inside the Bro process
         → Can I haz access?
       Broccoli
         Send/Receive Bro events
         Written in C
         Language bindings
           Ruby
           Python
           Perl
       → Send-them-while-they-are-hot
       (Broccoli = Bro client communications library)
       [Diagram labels: Network, Packets, Event Engine, Events, Comm, Script Interpreter, Logs, Notifications, User Interface, Broccoli, 3rd-party Application]
  25. From Ephemeral to Persistent Activity
       [Diagram labels: Bro, Apache, OpenSSH, Broccoli, Events, Query, Result, User]
  26. Today’s Open-Source Solutions for Analytics
  27. Caveats in Real-Time Analytics
       1. Getting poor performance
            Batch processing (MapReduce)
            Architectural flaws (inflexible MQ)
            Bloated runtime (Java)
       2. Losing domain-specific context
            Typing
            Nesting
            Causality
       “Can we do better?”
  28. Inspiration
       1. Dremel
            Columnar storage
            Nested data model
       2. Bigtable
            Sharding: distributed tablets
       3. GFS
            Single master with meta data
            Locate chunks via master
       4. Sawzall
            Aggregators: collection, sample, sum, maximum, quantile, top-k, unique
       5. FastBit
            Bitmap indexes “work” for high-cardinality attributes
  31. Design Philosophy Touch Stones [Lam11]
       Storage
         Keep data sorted → reduce seeks, easy random entry
         Shard with access locality → minimize involved nodes
         Store data in columns → don’t waste I/O
         Use append-only disk format → avoid expensive index updates
       Compute
         Use disk appropriately → large sequential reads
         Trade CPU for I/O → type-specific, aggressive compression
         Use pipelined parallelism → hide latency
         Ship compute to data → aggregation serving tree
       Query
         Make it user-friendly → declarative query interface
         Provide query hooks → support complex analysis
  32. VAST: Visibility Across Space and Time
       Visibility
         Deep understanding of the data
         Visualization: you know how to do that already...
       Across space: unify heterogeneous data formats
         One query language
         Apache logs, SSH logs, Bro events, sensor data, ...
       Across time:
         1. From the ancient past (old historical data)
         2. To subscribing to data that may arrive in the future
  33. Queries
       Two types
         1. Search: historical query
         2. Feed: live query
         → use case: crawl archive first, then make query permanent
       Unify two ends of a spectrum
                          Live             Historical
         Operation        Push             Pull
         Latency          O(|X_result|)    O(|X_data|)
         Data location    In-memory        Disk (ideally cached)
         Flexibility      Predefined       Ad-hoc, adjustable
         Cost             Pay-as-you-go    Lump sum
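The idea of using one query for both ends of the spectrum can be sketched as follows. This is a toy, single-process illustration, not the design of VAST: the class `QueryEngine`, its methods, and the dictionary events are all assumptions made for the example. The same predicate is first pulled over the archive (search) and then registered so that future matches are pushed to a callback (feed).

```python
class QueryEngine:
    """Toy engine: one predicate serves both historical search and live feed."""

    def __init__(self):
        self.archive = []  # events seen so far (historical data)
        self.feeds = []    # (predicate, callback) pairs for live queries

    def search(self, predicate):
        """Historical query: pull matching events from the archive."""
        return [e for e in self.archive if predicate(e)]

    def feed(self, predicate, callback):
        """Live query: push future matching events to `callback`."""
        self.feeds.append((predicate, callback))

    def ingest(self, event):
        """Store a new event and notify every live query it matches."""
        self.archive.append(event)
        for predicate, callback in self.feeds:
            if predicate(event):
                callback(event)


def is_ssh(event):
    return event["port"] == 22


engine = QueryEngine()
engine.ingest({"host": "a", "port": 80})
engine.ingest({"host": "b", "port": 22})

past = engine.search(is_ssh)      # crawl the archive first ...
live = []
engine.feed(is_ssh, live.append)  # ... then make the query permanent
engine.ingest({"host": "c", "port": 22})
engine.ingest({"host": "d", "port": 443})
```

The search cost grows with the archive size, while the feed only does work per arriving event, which matches the latency row of the table above.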
  34. VAST: Architecture Overview
       Distributed architecture
         Elasticity via MQ middle layer
         Few component dependencies
       DFS: fault-tolerance, replication
       Archive: key-value store
         Contains serialized events
       Index: sharded column-store
         Compressed bitmap indexes
       In-memory store
         Caches tablets (LRU)
         Flushes in batches
       [Diagram labels: Ingest, Query, Store, Archive, Index, DFS]
  35. VAST: Ingestion Architecture
       1. Events arrive at Event Router
            1.1 Assign UUID x
            1.2 Put (x, event) in archive
            1.3 Forward event to Indexer
       2. Indexer writes event into tablet and updates indexes
       3. Tablet Manager flushes “ripe” tablets
            Capacity (space/rows)
            Lifetime
       [Diagram labels: Store, Event Router, Indexer, Tablets, Tablet Manager, Archive, Index, DFS; edges: write, put, ripe?, flush]
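The ingestion steps above can be sketched in Python under strong simplifications. Everything here is hypothetical: the class `Ingest` collapses Event Router, Indexer, and Tablet Manager into one object, a list stands in for tablets written to the DFS, index updates are omitted, and ripeness is judged by row count only (the lifetime criterion is left out).

```python
import uuid


class Ingest:
    """Toy ingestion path: route -> archive -> tablet -> flush when ripe."""

    def __init__(self, max_rows):
        self.max_rows = max_rows  # ripeness threshold (rows per tablet)
        self.archive = {}         # key-value store: UUID -> event
        self.tablet = []          # current in-memory tablet
        self.flushed = []         # stands in for tablets written to the DFS

    def route(self, event):
        """Event Router: assign a UUID, archive the event, forward it."""
        x = uuid.uuid4()
        self.archive[x] = event
        self.index(x, event)
        return x

    def index(self, x, event):
        """Indexer: write the event into the tablet, then check ripeness."""
        self.tablet.append((x, event))
        if len(self.tablet) >= self.max_rows:
            self.flush()

    def flush(self):
        """Tablet Manager: move the ripe tablet out of memory."""
        self.flushed.append(self.tablet)
        self.tablet = []


ing = Ingest(max_rows=2)
ids = [ing.route({"n": i}) for i in range(5)]
```

After five events with a two-row threshold, two tablets have been flushed and one event still sits in memory.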
  36. VAST: Query Architecture
       1. User or NIDS issues query
       2. Query Manager distributes it to relevant nodes
       3. Tablet Manager loads tablets
       4. Query Proxy hits index
            a. Returns direct result
            b. Returns set of UUIDs
       [Diagram labels: Store, Query Manager, Query Proxy, Tablets, Tablet Manager, Archive, Index, DFS; edges: query, get, LRU, load, flush]
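The two outcomes of an index hit can be illustrated with a small sketch. It rests on assumptions: the class `Store` and its methods are invented for the example, small integers stand in for UUIDs, and a dictionary of sets replaces the real index. A count can be answered from the index alone (direct result), while fetching full events needs the identifiers to be resolved against the archive.

```python
from collections import defaultdict


class Store:
    """Toy store: an archive of events plus a (field, value) -> ids index."""

    def __init__(self):
        self.archive = {}              # id -> event
        self.index = defaultdict(set)  # (field, value) -> set of ids

    def put(self, event_id, event):
        self.archive[event_id] = event
        for field, value in event.items():
            self.index[(field, value)].add(event_id)

    def count(self, field, value):
        """Direct result: answered from the index without touching events."""
        return len(self.index.get((field, value), ()))

    def lookup(self, field, value):
        """Index returns a set of ids, which are then fetched from the archive."""
        ids = self.index.get((field, value), set())
        return [self.archive[i] for i in sorted(ids)]


store = Store()
store.put(1, {"host": "docs.python.org", "status_code": 200})
store.put(2, {"host": "docs.python.org", "status_code": 304})
store.put(3, {"host": "www.google.com", "status_code": 200})

n_ok = store.count("status_code", 200)
python_hits = store.lookup("host", "docs.python.org")
```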
  37. Bitmap Indexes
       Column cardinality: # distinct values
       One bitmap bi for each value i
       Sparse, but compressible
         WAH [WOSN01]
         COMPAX [FSV10]
         Concise [CDP10]
       Can operate on compressed bitmaps
         No need to decompress

       Data   Bitmap Index
              b0  b1  b2  b3
       2      0   0   1   0
       1      0   1   0   0
       2      0   0   1   0
       0      1   0   0   0
       0      1   0   0   0
       1      0   1   0   0
       3      0   0   0   1
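The example column on this slide can be turned into a bitmap index with a few lines of Python. This is a minimal sketch that uses plain Python integers as uncompressed bitmaps; the compression schemes named above (WAH, COMPAX, Concise) are not implemented, and the helper names are made up for the illustration.

```python
def build_bitmap_index(column):
    """Return {value: bitmap}; bit r of a bitmap is set iff row r holds value."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows(bitmap):
    """Row numbers whose bit is set in `bitmap`."""
    return [r for r in range(bitmap.bit_length()) if (bitmap >> r) & 1]


COLUMN = [2, 1, 2, 0, 0, 1, 3]  # the data column from the slide

index = build_bitmap_index(COLUMN)

# A disjunction over values becomes a bitwise OR over their bitmaps.
one_or_three = index[1] | index[3]
```

Each distinct value gets exactly one bitmap, so the number of bitmaps equals the column cardinality (four here), and predicates combine through cheap bitwise operations.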
  38. Conclusion
       1. Motivation: incident response, network troubleshooting, insider abuse
       2. The Bro network security monitor
            High-performance network monitoring
            Expressive representation of activity
            Publish/subscribe event model
       3. Design sketch of a distributed analytics platform
  39. Questions?
  40. References I
       [CDP10] A. Colantonio and R. Di Pietro. Concise: Compressed ’n’ Composable Integer Set. Information Processing Letters, 110(16):644–650, 2010.
       [FSV10] Francesco Fusco, Marc Ph. Stoecklin, and Michail Vlachos. NET-FLi: On-the-fly Compression, Archiving and Indexing of Streaming Network Traffic. Proceedings of the VLDB Endowment, 3:1382–1393, September 2010.
       [KMV+09] Srikanth Kandula, Ratul Mahajan, Patrick Verkaik, Sharad Agarwal, Jitendra Padhye, and Paramvir Bahl. Detailed Diagnosis in Enterprise Networks. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication, SIGCOMM ’09, pages 243–254, New York, NY, USA, 2009. ACM.
  41. References II
       [Lam11] Andrew Lamb. Building Blocks for Large Analytic Systems. In 5th Extremely Large Databases Conference, XLDB ’11, Menlo Park, California, October 2011.
       [SP10] Robin Sommer and Vern Paxson. Outside the Closed World: On Using Machine Learning for Network Intrusion Detection. In Proceedings of the 2010 IEEE Symposium on Security and Privacy, SP ’10, pages 305–316, Washington, DC, USA, 2010. IEEE Computer Society.
  42. References III
       [VSL+07] Matthias Vallentin, Robin Sommer, Jason Lee, Craig Leres, Vern Paxson, and Brian Tierney. The NIDS Cluster: Scalably Stateful Network Intrusion Detection on Commodity Hardware. In Proceedings of the 10th International Conference on Recent Advances in Intrusion Detection, RAID’07, pages 107–126. Springer-Verlag, September 2007.
       [WOSN01] Kesheng Wu, Ekow J. Otoo, Arie Shoshani, and Henrik Nordberg. Notes on Design and Implementation of Compressed Bit Vectors. Technical Report LBNL-3161, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, 94720, 2001.
