Battling Unknown Malware with Machine Learning

Learn about the first signature-less engine to be integrated into VirusTotal. In this CrowdCast deck, CrowdStrike’s Chief Scientist Dr. Sven Krasser offers an exclusive look “under the hood” of this unique machine learning engine, revealing how it works, how it differs from all other signature-based engines integrated into VirusTotal to date, and how it fits into the larger ecosystem of techniques used by CrowdStrike Falcon to keep endpoints and environments safe.

Topics will include:

- What CrowdStrike Falcon machine learning is and how it works
- How to interpret results of machine learning-based threat detection
- How users can benefit from the CrowdStrike Falcon machine learning engine
- How this cutting-edge technology fits into the CrowdStrike Falcon breach prevention platform

Published in: Technology

  1. BATTLING UNKNOWN MALWARE WITH MACHINE LEARNING. DR. SVEN KRASSER, CHIEF SCIENTIST, @SVENKRASSER
  2. FALCON ON VIRUSTOTAL. 2016 CROWDSTRIKE, INC. ALL RIGHTS RESERVED.
  3. SUBMITTING TO VIRUSTOTAL
  4. SCAN RESULTS
  5. SCAN RESULTS
  6. SCAN RESULTS
  7. MACHINE LEARNING PRIMER. More on this: watch http://tinyurl.com/MLcrowdcast
  8. Some Data to Get Started: 1988 ANTHROPOMETRIC SURVEY OF ARMY PERSONNEL. Source: http://mreed.umtri.umich.edu/mreed/downloads.html#anthro
  9. DATA
     • Over 4000 soldiers surveyed
     • Over 100 measurements
     • Reported by gender
  10. FIRST LOOK (plot: density vs. height [mm])
     • Difference in distribution
     • Significant overlap
  11. SECOND DIMENSION (plot: weight [10⁻¹ kg] vs. height [mm])
     • Correlation
     • Overlap
  12. FEATURE SELECTION (plot: weight [10⁻¹ kg] vs. “buttock circumference” [mm])
     • Correlation
     • Reduced overlap
     • Selection of features matters
  13. LET’S CLASSIFY (plot: weight [10⁻¹ kg] vs. “buttock circumference” [mm])
     • Let’s assume we want to detect males (blue)
     • I.e. “blue” is our positive class
     • TP: classify blue as blue
     • Note some misclassifications
     • FP: classify red as blue
     • FN: classify blue as red
  14. LET’S CLASSIFY (plot: weight [10⁻¹ kg] vs. “buttock circumference” [mm])
  15. LET’S CLASSIFY (plot: weight [10⁻¹ kg] vs. “buttock circumference” [mm])
  16. LET’S CLASSIFY (plot: weight [10⁻¹ kg] vs. “buttock circumference” [mm])
  17. LET’S CLASSIFY (plot: weight [10⁻¹ kg] vs. “buttock circumference” [mm])
  18. LET’S CLASSIFY (plot: weight [10⁻¹ kg] vs. “buttock circumference” [mm])
     • Get more “blue” right (true positives)
     • Get more “red” wrong (false positives)
  19. RECEIVER OPERATING CHARACTERISTICS CURVE (plot: true positive rate vs. false positive rate)
     • Detect more by accepting more false positives
  20. MORE DIMENSIONS
  21. MISSION ACCOMPLISHED: WE JUST ADD MORE DIMENSIONS… RIGHT?
  22. CURSE OF DIMENSIONALITY
     • REDUCED predictive performance
     • INCREASED training time
     • SLOWER classification
     • LARGER memory footprint
  23. (Image) Source: https://commons.wikimedia.org/w/index.php?curid=2257082
  24. (Image) Source: https://commons.wikimedia.org/w/index.php?curid=2257082
  25. DIMENSIONALITY AND SPARSENESS (plot: weight [10⁻¹ kg] vs. height [mm])
  26. DIMENSIONALITY AND SPARSENESS (plot: weight [10⁻¹ kg] vs. height [mm])
  27. LET’S APPLY THIS TO SECURITY
  28. FILE ANALYSIS, AKA Static Analysis
     • THE GOOD
       – Relatively fast
       – Scalable
       – No need to detonate
       – Platform independent, can be done at gateway
     • THE BAD
       – Limited insight due to narrow view
       – Different file types require different techniques
       – Different subtypes need special consideration: packed files, .NET, installers, EXEs vs DLLs
       – Obfuscations (yet good if detectable)
       – Ineffective against exploitation and malware-less attacks
       – Asymmetry: a fraction of a second to decide for the defender, months to craft for the attacker
  29. FILE CONTENT
  30. EXAMPLE FEATURES: 32/64 BIT EXECUTABLE; GUI SUBSYSTEM; COMMAND LINE SUBSYSTEM; FILE SIZE; TIMESTAMP; DEBUG INFORMATION PRESENT; PACKER TYPE; FILE ENTROPY; NUMBER OF SECTIONS; NUMBER WRITABLE; NUMBER READABLE; NUMBER EXECUTABLE; DISTRIBUTION OF SECTION ENTROPY; IMPORTED DLL NAMES; IMPORTED FUNCTION NAMES; COMPILER ARTIFACTS; LINKER ARTIFACTS; RESOURCE DATA; EMBEDDED PROTOCOL STRINGS; EMBEDDED IPS/DOMAINS; EMBEDDED PATHS; EMBEDDED PRODUCT META DATA; DIGITAL SIGNATURE; ICON CONTENT; …
  31. COMBINING FEATURES (plot: executable section size-based feature vs. string-based feature)
  32. COMBINING FEATURES (plot: subspace projection B vs. subspace projection A)
  33. ARMY DATA ROC CURVE (plot: true positive rate vs. false positive rate)
     • Detect more by accepting more false positives
  34. ML MALWARE DETECTION ROC CURVE (plot: true positive rate vs. false positive rate)
  35. APTS & 99% OF MALWARE DETECTED… (plot: chance of at least one success for adversary vs. number of attempts; annotated values: 1%, >99%, 500 attempts)
  36. STOPPING MALWARE IS NOT ENOUGH (diagram: threat sophistication, low to high, vs. harder to prevent & detect, low to high; malware: 40%)
  37. YOU NEED COMPLETE BREACH PREVENTION (diagram: same axes; malware: 40%; non-malware attacks: 60%; actors shown: nation-states, organized criminal gangs, hacktivists/vigilantes, terrorists, cyber-criminals)
  38. Next-Generation Endpoint Protection. Cloud Delivered. Enriched by Threat Intelligence
     • MANAGED HUNTING
     • ENDPOINT DETECTION AND RESPONSE
     • NEXT-GEN ANTIVIRUS
  39. ML SETTINGS WITHIN FALCON HOST
  40. ML PREVENTION IN ACTION
  41. KEY POINTS
     • Machine learning is an effective tool against unknown malware
     • Try it out on VirusTotal
     • Detection means trading off true positives against false positives
     • Detecting 99% of malware still gives an APT a close to 100% chance of getting malware into your environment, given enough attempts
     • The majority of intrusions are not malware-based
     • Avoid silent failure
     • Use a comprehensive array of techniques
  42. www.crowdstrike.com
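
The trade-off on slides 13 to 19 (getting more "blue" right at the cost of getting more "red" wrong) can be sketched in a few lines of Python. This is a minimal illustration, not the classifier from the talk: the scores below are made up, and the names `rates` and `roc_points` are hypothetical helpers introduced only for this example.

```python
# Hypothetical classifier scores; higher means "more likely positive (blue)".
# These numbers are invented for illustration and are not from the deck.
positives = [0.9, 0.8, 0.7, 0.6, 0.4]  # true class: blue
negatives = [0.7, 0.5, 0.3, 0.2, 0.1]  # true class: red


def rates(threshold, positives, negatives):
    """Return (true positive rate, false positive rate) at a score threshold."""
    tp = sum(score >= threshold for score in positives)  # blue flagged as blue
    fp = sum(score >= threshold for score in negatives)  # red flagged as blue
    return tp / len(positives), fp / len(negatives)


def roc_points(positives, negatives):
    """Sweep thresholds from strict to lenient and collect (FPR, TPR) points."""
    thresholds = sorted(set(positives + negatives), reverse=True)
    points = [(0.0, 0.0)]  # a threshold above every score flags nothing
    for t in thresholds:
        tpr, fpr = rates(t, positives, negatives)
        points.append((fpr, tpr))
    return points


if __name__ == "__main__":
    for fpr, tpr in roc_points(positives, negatives):
        print(f"FPR={fpr:.1f}  TPR={tpr:.1f}")
```

Lowering the threshold never reduces either rate, which is why the points trace a curve from (0, 0) to (1, 1): detecting more always means accepting at least as many false positives.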
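
The sparseness point on slides 22, 25 and 26 can be made concrete with a rough bound. The sketch below assumes each measurement is cut into 10 bins (an arbitrary choice for illustration) and uses the roughly 4000 soldiers from slide 9; `max_occupied_fraction` is a hypothetical helper name, and the result is only an upper bound on how much of the feature space the data can cover.

```python
def max_occupied_fraction(samples: int, bins_per_axis: int, dimensions: int) -> float:
    """Upper bound on the fraction of grid cells that can contain a sample.

    Each sample falls into exactly one cell, so at most `samples` cells are
    occupied out of bins_per_axis ** dimensions cells in total.
    """
    cells = bins_per_axis ** dimensions
    return min(1.0, samples / cells)


if __name__ == "__main__":
    # 4000 samples (slide 9) and 10 bins per measurement (assumed).
    for d in (1, 2, 3, 4, 6, 10):
        print(f"{d:2d} dimensions: at most {max_occupied_fraction(4000, 10, d):.2e} of cells occupied")
```

With a fixed number of samples, the share of the space that can hold any data shrinks by a factor of ten for every added dimension once the number of cells exceeds the number of samples.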
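
Slide 30 lists file entropy among the example features. As a hedged illustration of what such a feature could look like, the sketch below computes the Shannon entropy of a byte string in bits per byte. The deck does not say how the Falcon engine computes its features, so `byte_entropy` is a hypothetical stand-in rather than the engine's implementation.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0  # convention chosen here for empty input
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1 / p) over the byte values that actually occur.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    print(byte_entropy(b"AAAAAAAA"))         # one repeated byte value
    print(byte_entropy(bytes(range(256))))   # every byte value exactly once
```

Compressed or encrypted content tends to score close to 8 bits per byte, while plain text and sparse data score lower, which is one reason entropy is a common input for static file analysis.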
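
The arithmetic behind slide 35 and the 99% key point on slide 41 can be checked directly. The sketch below assumes every attempt is independent and is caught with the same probability, which is a simplification; the function names are hypothetical.

```python
import math


def chance_of_at_least_one_success(detection_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` samples evades detection.

    Assumes independent attempts, each detected with probability detection_rate.
    """
    return 1.0 - detection_rate ** attempts


def attempts_needed(detection_rate: float, target: float) -> int:
    """Smallest number of attempts whose success chance reaches `target`.

    Both arguments must lie strictly between 0 and 1.
    """
    return math.ceil(math.log(1.0 - target) / math.log(detection_rate))


if __name__ == "__main__":
    for n in (1, 100, 500):
        p = chance_of_at_least_one_success(0.99, n)
        print(f"{n:3d} attempts: {p:.2%} chance of at least one miss")
    print("attempts for a 99% chance:", attempts_needed(0.99, 0.99))
```

Under these assumptions a single attempt succeeds 1% of the time, while 500 attempts push the adversary's chance above 99%, matching the values annotated on the slide.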
