Anomaly Detection at Scale

Jeff Henrikson presents at the first annual O'Reilly Security Conference, in New York City, 2016

Published in: Technology

  1. 1. ANOMALY DETECTION AT SCALE: A CYBERSECURITY STREAMING DATA PIPELINE USING KAFKA AND AKKA CLUSTERING O'Reilly Security Conference NYC, November 2, 2016 Jeff Henrikson Groovescale http://www.groovescale.com
  2. 2. OUTLINE: framing; problem statement; streaming tech concepts; outline of solution; architecture, learnings
  3. 3. FRAMING
  4. 4. Why build predictive models? Models continue to do useful work after humans stop looking. Models are based on assumptions. Only humans can make assumptions.
  5. 5. INTRUSION DETECTION 1) Log Data 2) Configure rules 3) Human awareness examines alarms and logs 4) Quick action taken (e.g. deauthorize) 5) Re-authorize once human awareness deems longer-term mitigation is adequate Sometimes for high-confidence rules we allow 2) to trigger 4) without human intervention
  6. 6. HOW CAN A SKILLED PERSON'S AWARENESS BE MORE EFFECTIVELY GUIDED? 1) Matching of network behavior against localized rules 2) Predictive modeling of the aggregate network behavior
  7. 7. HOW CAN A SKILLED PERSON'S AWARENESS BE MORE EFFECTIVELY GUIDED? 1) Matching of network behavior against localized rules 2) Predictive modeling of the aggregate network behavior. Hypothesis: Let's see if 2 is better.
  8. 8. AI: Artificial Intelligence. "IA": Intelligence Augmented. From "Building practical AI systems", Adam Cheyer (Siri, Sentient, and Viv Labs), Strata 2016
  9. 9. INTRUSION DETECTION TOOLS AS "INTELLIGENCE AUGMENTED" Intruders are trying to evade detection. Let's not worry about making the human protector of the network go away. That is probably not possible, given the evasive response.
  10. 10. PROBLEM STATEMENT
  11. 11. NETWORK PACKET BROKER
  12. 12. CAPTURE SERVER dumpcap (from Wireshark)
  13. 13. NETFLOW (V5) BASICS Attributes: source/destination IP, source/destination port, input interface. Metrics: number of packets, sum of bytes, start time, end time. IPv4 only. https://nsrc.org/workshops/2015/sanog25-nmm-tutorial/materials/netflow.pdf
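As a rough illustration of the record shape this slide describes, the Python sketch below folds individual packets into flow records keyed by the listed attributes and accumulates the listed metrics. The names `FlowKey`, `Packet`, `Flow`, and `aggregate` are hypothetical and chosen only for this example; they are not taken from the talk, and real NetFlow v5 records carry additional fields not shown here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, NamedTuple


class FlowKey(NamedTuple):
    """Attributes from the slide that identify a flow (hypothetical type)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    input_if: int


class Packet(NamedTuple):
    """A single captured packet, reduced to what the sketch needs."""
    key: FlowKey
    timestamp: float  # capture time in seconds
    length: int       # packet size in bytes


@dataclass
class Flow:
    """Metrics from the slide, accumulated per flow."""
    packet_count: int = 0
    byte_count: int = 0
    start_time: float = float("inf")
    end_time: float = float("-inf")


def aggregate(packets: Iterable[Packet]) -> Dict[FlowKey, Flow]:
    """Group packets by flow key and accumulate the per-flow metrics."""
    flows: Dict[FlowKey, Flow] = {}
    for p in packets:
        flow = flows.setdefault(p.key, Flow())
        flow.packet_count += 1
        flow.byte_count += p.length
        flow.start_time = min(flow.start_time, p.timestamp)
        flow.end_time = max(flow.end_time, p.timestamp)
    return flows
```

A real exporter would also expire flows on timeouts; this sketch keeps every flow in memory for simplicity.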
  14. 14. Functional Requirements: Produce netflow from PCAP. Score netflow for anomalies. Control the number of anomalous events brought to the human expert's attention.
  15. 15. Nonfunctional Requirements: Process line rate 10 Gb/s. Be within 2x perf of tcpdump. Be within 4x of netflow latency. Do not add single points of failure.
  16. 16. SOLUTION OUTLINE
  17. 17. OVERVIEW OF SERVICES
  18. 18. EXTERNAL DESIGN
  19. 19. EXTERNAL DESIGN System coupling: do not prescribe deploying Kafka upstream or downstream (Which Kafka version? Which language binding?). External APIs: ingress via HTTP POST, octet encoding; egress via HTTP GET, long polling.
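To make the external API shape concrete, here is a minimal client-side sketch in Python using only the standard library. It is an assumption-laden illustration, not the system's actual interface: the base URL, the `/ingress` and `/egress` paths, and the `application/octet-stream` content type are all hypothetical choices standing in for "HTTP POST octet encoding" and "HTTP GET long polling".

```python
import urllib.request

# Hypothetical endpoint; the talk does not give real URLs or paths.
BASE_URL = "http://localhost:8080"


def build_ingress_request(pcap_block: bytes,
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Build an HTTP POST carrying one block of raw pcap bytes."""
    return urllib.request.Request(
        url=f"{base_url}/ingress",
        data=pcap_block,
        # Assumed content type for "octet encoding".
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def build_egress_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the HTTP GET that a long-polling consumer would issue."""
    return urllib.request.Request(url=f"{base_url}/egress", method="GET")


def long_poll(request: urllib.request.Request,
              timeout_s: float = 60.0) -> bytes:
    """Block until the server answers or the timeout expires.

    With long polling the server holds the GET open until it has data,
    so the client uses a generous timeout and re-issues the request
    after each response.
    """
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read()
```

Because both sides speak plain HTTP, neither the producer nor the consumer needs a Kafka client library, which is the decoupling this slide argues for.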
  20. 20. INTERNAL DESIGN
  21. 21. INTERNAL DESIGN Record state only in: Kafka; pcap temporary files on the local fs. Need to write block id to EFH and dedupe for sums to be correct in the presence of retries. Prefer late delivery to dropping data. Prefer reading capture time in the data stream to wall-clock time.
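The point about block ids and retries can be sketched as follows. This is a minimal Python illustration, assuming each block of data carries a unique id and may be delivered more than once; the class name `DedupingSum` and its methods are hypothetical and not from the talk.

```python
from typing import Set


class DedupingSum:
    """Sum byte counts per block while ignoring redelivered blocks."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()  # block ids already counted
        self.total_bytes = 0

    def add_block(self, block_id: str, byte_count: int) -> bool:
        """Count a block once; return False if it was a duplicate."""
        if block_id in self._seen:
            return False  # a retry delivered this block again
        self._seen.add(block_id)
        self.total_bytes += byte_count
        return True
```

Without the id check, a retried block would be added twice and the sum would overstate traffic. The set of seen ids grows without bound in this sketch; a longer-running version would need some way to expire old ids.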
  22. 22. Akka-cluster in one slide: Framework for actor-based concurrency. Program in Scala or Java. Akka-cluster is more general than map-reduce and data pipelines. Makes using local and remote resources work the same way.
  23. 23. MINIMUM VIABLE PREDICTIVE MODEL 1) Take Netflow metrics: sum(bytes), sum(packets), count 2) For each metric, compute mean and variance 3) Emit an "anomaly" when signal exceeds (mean + 3.0*sqrt(variance)) Meets minimum requirement: controls the number of events brought to the human expert's attention
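The three steps on this slide can be sketched in Python as below. The slide does not say how the mean and variance are computed; this example assumes a running (online) estimate using Welford's method, and the names `RunningStats`, `is_anomaly`, and `score_stream` are hypothetical.

```python
import math
from typing import Iterable, List


class RunningStats:
    """Online mean and variance (Welford's method, an assumed choice)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Population variance of the values seen so far.
        return self._m2 / self.n if self.n > 0 else 0.0


def is_anomaly(stats: RunningStats, x: float, k: float = 3.0) -> bool:
    """True when x exceeds mean + k * sqrt(variance), as on the slide."""
    return stats.n > 1 and x > stats.mean + k * math.sqrt(stats.variance)


def score_stream(values: Iterable[float], k: float = 3.0) -> List[float]:
    """Return the values flagged as anomalies, scoring before updating."""
    stats = RunningStats()
    flagged = []
    for x in values:
        if is_anomaly(stats, x, k):
            flagged.append(x)
        stats.update(x)  # note: flagged values still feed the statistics
    return flagged
```

For example, `score_stream([10, 11, 9, 10, 12, 10, 11, 9, 10, 100])` flags only the final value. Raising `k` lowers the number of events shown to the human expert, which is the control this slide describes. One instance of `RunningStats` would be kept per metric: sum(bytes), sum(packets), and count.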
  24. 24. EXERCISE FOR THE READER Model for periodicity: Ihler et al, Adaptive Event Detection with Time–Varying Poisson Processes, ACM SIGKDD 2006 http://www.datalab.uci.edu/papers/event_detection_kdd06.pdf
  25. 25. DEPLOYMENT Symmetrical mapping of docker containers to hosts
  26. 26. RESULTS Qualitatively, users can find relevant anomalies in a reasonably sized stream. System operates reliably. Numbers are correct within assumptions.
  27. 27. ARCHITECTURE, LEARNINGS
  28. 28. SO WHY KAFKA VS ANY OTHER STREAMING COMPONENT? https://databaseline.wordpress.com/2016/03/12/an-overview-of-apache-streaming-technologies/comment-page-1/
  29. 29. HOW DOES YOUR ORGANIZATION PICK COMPONENTS?
  30. 30. STREAMING DATA LITERATURE: "A data entity is created by one module, is passed from module to module until it is no longer needed and is then destroyed. . . . Punched card accounting systems exemplify this environment." J. P. Morrison, "Data Stream Linkage Mechanism", IBM Systems Journal, 1978. http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=45DED06EC91474F5938A9E05CC3D5A61?doi=10.1.1.89.2601&rep=rep1&type=pdf
  31. 31. BIND ARCHITECTURAL COUPLINGS EARLY SO THAT ARCHITECTURAL COMPONENTS CAN BE CHOSEN WITH AMPLE EVIDENCE Examples of components: which database; which streaming engine. Examples of couplings: format of data (e.g. newline delimited json); how to notify; how to checkpoint.
  32. 32. HTTP COUPLING: WINS Win #1: Can't get access to pcap over API Win #2: Only RHEL-distributed reqs (perl-core, curl) required for ingress Win #3: Upgrade kafka when improved
  33. 33. HTTP COUPLING: WIN #3: UPGRADE WHEN READY
      Kafka version                 0.9.0   0.10.0.1   0.10.1.0
      Partition by hash             x       x          x
      Write timestamp to message            x          x
      Read seek by timestamp                           x
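The "partition by hash" feature in the table above can be illustrated with a short Python sketch: a record's key is hashed and reduced modulo the partition count, so records with the same key always land in the same partition. This is only an illustration of the idea. Kafka's own default partitioner uses a murmur2 hash; `zlib.crc32` is swapped in here as a stand-in, so the partition numbers will not match Kafka's. The function name `partition_for` is hypothetical.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index by hashing.

    crc32 is a stand-in hash for illustration, not Kafka's murmur2.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key) % num_partitions
```

Keying on something like a source IP keeps all records for that address in one partition, so a single consumer can aggregate them in order.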
  34. 34. LEARNING #1 https://github.com/akka/reactive-kafka Using this library in place of KafkaConsumer
  35. 35. LEARNING #2, HIDING IN PLAIN SIGHT http://www.reactive-streams.org/
  36. 36. FAVOR INTEGRATION TESTING OVER UNIT TESTING Ingress and egress have an optional flag placebo={true,false}, which defaults to true. Every deployment simulates low-volume placebo sinks and sources. Transmit heartbeats when each component is sure to have made forward progress.
  37. 37. ON EVALUATING FAULT TOLERANCE AND SCALABILITY My smart buddy LinkedIn runs it in production The NSA Can we do better?
  38. 38. ON EVALUATING FAULT TOLERANCE AND SCALABILITY: The idea: Create linked containers for the app. Use tc (with the netem queueing discipline) to drop and/or delay packets. Run a simulated data source.
  39. 39. ON EVALUATING FAULT TOLERANCE AND SCALABILITY:
      Hands on create container:
        docker run -it --rm ubuntu:14.04.2 bash
      Hands on with the container:
        root@07e330775e98:/# apt-get update && apt-get install -y ethtool
        root@07e330775e98:/# ethtool -S eth0
        NIC statistics:
             peer_ifindex: 875
      Hands on with the host (docker-machine's boot2docker has tc built-in):
        # Extract just the interface name for ifindex 875 (the slide grepped the whole line)
        dev=$(ip link | awk -F': ' '/^875:/ {print $2}' | cut -d@ -f1)
        tc qdisc change dev $dev root netem delay 100ms 20ms distribution normal
        tc qdisc change dev eth0 root netem loss 0.1%
  40. 40. Myth: Code should always go into docker containers through an image
  41. 41. Myth: Code should always go into docker containers through an image. Alternative:
      docker run -v $dirSrc:$dirSrc   # to convey source code
      docker exec                     # to restart program
  42. 42. Myth: A docker image is something that came from a Dockerfile:
  43. 43. Myth: A docker image is something that came from a Dockerfile. Alternative:
      docker run
      ansible-playbook -c local
      docker commit
  44. 44. ACKNOWLEDGEMENTS Ilya Levner Gunjan Gupta, Lightsphere AI Trey Blalock, Firewall Consulting
  45. 45. RECOMMENDED READING
      I Heart Logs, Jay Kreps (creator of Kafka) https://www.amazon.com/Heart-Logs-Stream-Processing-Integration/dp/1491909382
      Akka in Action, Roestenburg et al. Released Sept 30, 2016. https://www.amazon.com/Akka-Action-Raymond-Roestenburg/dp/1617291013
      Scala for the Impatient, 1e, Cay Horstmann. Second edition coming December 2016. https://www.amazon.com/Scala-Impatient-Cay-S-Horstmann/dp/0321774094
  46. 46. READINGS ON LOW LATENCY DATA ENGINEERING (ORGANIZED BY COMMUNITY)
      Reactive:
        The Reactive Manifesto http://www.reactivemanifesto.org/
        Reactive Streams http://www.reactive-streams.org/
      Kafka:
        I Heart Logs, Jay Kreps, 2014 https://www.amazon.com/Heart-Logs-Stream-Processing-Integration/dp/1491909382
        Kafka: The Definitive Guide, prerelease/2017 https://www.amazon.com/Kafka-Definitive-Real-time-stream-processing/dp/1491936169
      NiFi:
        The core concepts of NiFi http://nifi.apache.org/docs/nifi-docs/html/overview.html#the-core-concepts-of-nifi
      Flow Based Programming:
        Flow-Based Programming, J. Paul Morrison, 2010 https://www.amazon.com/Flow-Based-Programming-2nd-Application-Development/dp/1451542321
      Storm:
        Big Data, Nathan Marz, 2015 https://www.amazon.com/Big-Data-Principles-practices-scalable/dp/1617290343
  47. 47. QUESTIONS?
