1 | © 2018 Interset Software
How to Operationalize Big Data Security Analytics
Stephan Jou
Chief Technology Officer
Interset.AI
Welcome
About Interset
• 75 employees & growing
• 450% ARR growth
• Data science & analytics focused on cybersecurity
• 100 person-years of Anomaly Detection R&D
• Offices in Ottawa, Canada & Newport Beach, California
Partners
About Me
• Data mining scientist since 2006
• 4+ years building machine learning systems for threat hunting
• 8 years of experience using Hadoop for large-scale advanced analytics
Field Data Scientist
• Identify valuable data feeds
• Optimize system for use cases
We uncover the threats that matter!
What is AI-Based Security Analytics About?
Advanced analytics to help you catch the bad guys
Bringing Together a Fragmented Landscape
Fragmented security landscape
Integrated view of security data
Increasing Threat Hunting Efficiency
Low Success Rate SOC Cycle
Generate Highly Anomalous Threat Leads
Increasing Visibility by Augmenting Existing Tools
[Diagram: SECURITY ANALYTICS augmenting existing tools and data sources: SIEM, IAM, ENDPOINT, NETWORK, DLP, BUSINESS APPLICATIONS, CUSTOM DATA]
Case Study #1: Every Interset Customer
Billions of events analyzed with machine learning
Anomalies discovered by data science
High quality “most wanted” list
Analyzes the intersection of data from users, machines, files, projects, servers, sharing behavior, resources, websites, IP addresses and more
5,210,465,083
Lesson #1: Fewer Alerts, Not More
• Solution should help you deal with fewer alerts, not more
• Solution should leverage sound statistical methods to reduce false positives and noise
• Solution should allow you to do more with the limited resources you have
Recommendations
Measure and quantify the work effort required with and without the Security Analytics system
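One way to act on this recommendation is a back-of-the-envelope comparison of analyst hours before and after deployment. The sketch below is illustrative only: the function name, alert counts, and triage times are hypothetical placeholders, not figures from the talk.

```python
def weekly_triage_hours(alerts_per_week: int, minutes_per_alert: float) -> float:
    """Analyst hours spent triaging alerts in one week."""
    return alerts_per_week * minutes_per_alert / 60.0


# Hypothetical inputs: replace with numbers measured in your own SOC.
baseline_hours = weekly_triage_hours(alerts_per_week=1200, minutes_per_alert=10)
with_analytics_hours = weekly_triage_hours(alerts_per_week=150, minutes_per_alert=20)

hours_saved = baseline_hours - with_analytics_hours
percent_saved = 100.0 * hours_saved / baseline_hours

print(f"Without analytics: {baseline_hours:.1f} h/week")
print(f"With analytics:    {with_analytics_hours:.1f} h/week")
print(f"Effort saved:      {hours_saved:.1f} h/week ({percent_saved:.0f}%)")
```

The example deliberately lets each remaining alert take longer to investigate: fewer, higher-quality leads can still reduce total effort even when each one gets more attention.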
Field Examples
Telecom
• Potential Data Staging/Theft
• Account Compromise
• Lateral Movement Indicators
Healthcare
• Data Theft
Case Study #2: Large Telco
The Situation
• Highly secure & diverse environment – protected by multiple security products
The Challenge
• Large rule/policy set developed
• Too many indicators to optimize threat leads
• Inefficient SOC cycle
The Solution
• Surface mathematically valid leads – “legit anomalies”
• Unique normal baselines – removes threshold/rule limitations
Google Drive
• Permissive controls
• Personal/external sharing
Authentication
• Sudden change in workstation access
• Odd working hours
USB
• Sudden increase in file copy volumes
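To illustrate why a per-entity baseline can remove the limits of a fixed threshold, here is a minimal sketch that scores today's USB file-copy count against each user's own history. It uses a plain z-score as a stand-in technique; this is an assumption for illustration, not Interset's actual model, and the user names and counts are hypothetical.

```python
from statistics import mean, stdev


def anomaly_score(history: list[float], today: float) -> float:
    """Z-score of today's value against this entity's own history."""
    if len(history) < 2:
        return 0.0  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly constant history: any deviation is maximally unusual.
        return 0.0 if today == mu else float("inf")
    return (today - mu) / sigma


# Hypothetical daily USB file-copy counts per user.
usb_copies = {
    "alice": [2, 3, 1, 2, 4, 3, 2],              # rarely uses USB storage
    "bob": [180, 220, 200, 210, 190, 205, 195],  # copies files as part of the job
}
today = {"alice": 150, "bob": 215}

for user, history in usb_copies.items():
    score = anomaly_score(history, today[user])
    print(f"{user}: today={today[user]}, z={score:.1f}")
```

A static rule such as "alert above 100 copies" would fire on bob every day. Scored against their own baselines, only alice's sudden increase stands out.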
Lesson #2: The Math Matters – Test It
Recommendations
• Agree on the use cases in advance
• Use a proof-of-concept with historical/existing data to test the security analytics (SA) system’s math
• Engage red team or pen testing if available
• Evaluate the results: Do they support the use cases?
Google Drive
• Permissive controls
• Personal/external sharing
USB
• Sudden increase in file copy volumes
Authentication
• Sudden change in workstation access
• Odd working hours
• Data Theft
• Data Staging
• Lateral Movement
• Account Compromise
Case Study #3: Healthcare Records & Payments
• Profile: 6.5 billion transactions annually, 750+ customers, 500+ employees
• Team of 7: CISO, 1 security architect, 3 security analysts, 2 network security
• Analytics surfaced (for example) an employee who attempted to move “sensitive data” from endpoint to personal Dropbox
• Employee was arrested and prosecuted using incident data
Focused and prioritized incident responses
Incident alert accuracy increased from 28% to 92%
Incident mitigation coverage doubled from 70 per week to 140
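The two figures above can be turned into relative improvements with simple arithmetic. The sketch below assumes that "alert accuracy" means the share of alerts that turned out to be real incidents; the slide does not define the term, so treat that reading as an assumption.

```python
def relative_change(before: float, after: float) -> float:
    """Ratio of the new value to the old one (2.0 means doubled)."""
    return after / before


accuracy_before, accuracy_after = 0.28, 0.92  # from the slide
mitigated_before, mitigated_after = 70, 140   # incidents per week, from the slide

# Assumption: accuracy = real incidents / total alerts, so the remainder
# is the share of alerts that wasted analyst time.
wasted_before = 1 - accuracy_before
wasted_after = 1 - accuracy_after

print(f"Accuracy improved {relative_change(accuracy_before, accuracy_after):.1f}x")
print(f"Mitigations improved {relative_change(mitigated_before, mitigated_after):.1f}x")
print(f"Share of unproductive alerts: {wasted_before:.0%} -> {wasted_after:.0%}")
```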
Lesson #3: Meaningful Metrics
Hawthorne Effect: Whatever gets measured gets optimized
Recommendations
• Define meaningful operational metrics (not just “false positives”)
• Build a process for measuring and quantifying over time, not just during a pilot
• Ensure the Security Analytics system supports a feedback process to adjust the analytics to support your target metrics
What Have We Learned?
Lessons Learned
• The Math Matters – Test It
• Fewer Alerts, Not More
• Automated, Measured Responses
• Meaningful Metrics
Recommendations
• Agree on the use cases in advance
• Evaluate results with and without the security analytics system
• Assess risk level, not just a binary alert
• Ensure integrated feedback and automated response
QUESTIONS?
Roy Wilds – Field Data Scientist
@roywilds
Learn more at Interset.AI
How Millions of Events Become Qualified Threat Leads
• ACQUIRE DATA: Gather the raw materials. Broad data collection from DLP, endpoint, business apps, custom data, network, and IAM.
• CREATE UNIQUE BASELINES: Determine what is normal.
• DETECT, MEASURE AND SCORE ANOMALIES: Find the behavior that matters.
• HIGH QUALITY THREAT LEADS: Contextual views. Drill-down and cyber-hunting. Workflow engine for incident response.
Threat lead categories: internal recon, infected host, data staging & theft, compromised account, lateral movement, account misuse, fraud, custom
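The last step of this flow, turning many scored anomalies into a short list of threat leads, can be sketched as a per-entity aggregation followed by a ranking. The additive scoring, the cap at 100, and all entity names and scores below are hypothetical choices for illustration; the slides do not describe how scores are actually combined.

```python
from collections import defaultdict

# Hypothetical scored anomalies: (entity, threat category, risk contribution).
anomalies = [
    ("user:jdoe", "DATA STAGING & THEFT", 40),
    ("user:jdoe", "ACCOUNT MISUSE", 35),
    ("host:ws-114", "INFECTED HOST", 20),
    ("user:asmith", "LATERAL MOVEMENT", 15),
    ("user:jdoe", "LATERAL MOVEMENT", 30),
    ("host:ws-114", "INTERNAL RECON", 10),
]


def rank_threat_leads(events, top_n=2):
    """Sum anomaly scores per entity, cap at 100, and return the top_n entities."""
    totals = defaultdict(float)
    for entity, _category, score in events:
        totals[entity] += score
    capped = {entity: min(total, 100.0) for entity, total in totals.items()}
    return sorted(capped.items(), key=lambda item: item[1], reverse=True)[:top_n]


for entity, risk in rank_threat_leads(anomalies):
    print(f"{entity}: risk {risk:.0f}")
```

Ranking by entity rather than by individual event is what shrinks the output: six anomalies collapse into two leads for an analyst to review.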
More Case Studies
Digital Advertising Agency
Prove it was not the cause of the Star Wars leak
“Interset’s unique ability to analyze billions of events and distill them into high-confidence security intelligence allows us to quickly separate threats from the noise. This visibility and focus allows our security practitioners to quickly investigate and remediate threats even as they become increasingly pervasive and sophisticated.”
MSSP
“SOC in Box”
“By embedding Interset’s security analytics into our Prescriptive Security managed service, we provide our customers with very rapid reaction to threats and improved Security Operations Center efficiency.” - Stephen Shibel, head of Big Data & Cybersecurity for Atos North American Operations
Case Study: Defense Contractor
High Probability Anomalous Behavior Models
• Detected large copies to the portable hard drive, at an unusual time of day
• Bayesian models to measure and detect highly improbable events
High Risk File Models
• Detected high risk files, including PowerPoints collecting large amounts of inappropriate content
• Risk aggregation based on suspicious behaviors and unusual derivative movement
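As a rough illustration of how a Bayesian model can put a number on "an unusual time of day", the sketch below estimates the probability of activity in a given hour from a user's past activity, using a symmetric Dirichlet prior (Laplace smoothing). This is a textbook technique chosen for illustration, not the model used in the case study, and the activity history is hypothetical.

```python
from collections import Counter


def hour_probability(past_hours: list[int], hour: int, prior: float = 1.0) -> float:
    """Posterior predictive probability of activity in `hour` (0-23).

    A symmetric Dirichlet prior over the 24 hours gives unseen hours a
    small non-zero probability instead of zero.
    """
    counts = Counter(past_hours)
    return (counts[hour] + prior) / (len(past_hours) + 24 * prior)


# Hypothetical history: hours at which one user copied files to removable media.
history = [9, 10, 10, 11, 14, 15, 15, 16, 10, 9, 14, 11] * 5  # 60 events

p_usual = hour_probability(history, 10)  # a typical working hour
p_odd = hour_probability(history, 3)     # 3 a.m., never seen before

print(f"P(copy at 10:00) = {p_usual:.3f}")
print(f"P(copy at 03:00) = {p_odd:.3f}")
```

A low probability alone is weak evidence. Combining it with other signals, such as an unusually large copy volume, is what makes an event "highly improbable" in the sense the slide describes.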
Lesson: Automated, Measured Responses
• Security Analytics system should allow you to quantify risk, not just a binary alert
• Consider how to automate responses to low, medium, high and extreme risk scenarios
• Where does security analytics fit into your existing runbook?
Recommendations
• Ensure the Security Analytics system has the ability to output a risk assessment level or score, not just a binary alert
• Ensure the Security Analytics system can integrate with downstream systems
• Evaluate the solution with automated response systems as part of the deployment
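A risk score only becomes operational once it maps to a response. The sketch below shows one way to tier a 0-100 score into the low, medium, high and extreme levels mentioned above and to look up a response for each. The thresholds and response actions are hypothetical and would come from your own runbook.

```python
# Hypothetical runbook: one response per risk tier.
RESPONSES = {
    "low": "log for trend analysis",
    "medium": "open a ticket for analyst review",
    "high": "require step-up authentication and notify the SOC",
    "extreme": "suspend the account and page the on-call responder",
}


def risk_tier(score: float) -> str:
    """Map a 0-100 risk score to a tier (thresholds are illustrative)."""
    if not 0 <= score <= 100:
        raise ValueError(f"risk score out of range: {score}")
    if score >= 90:
        return "extreme"
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"


for entity, score in [("user:jdoe", 96), ("host:ws-114", 45), ("user:asmith", 12)]:
    tier = risk_tier(score)
    print(f"{entity}: score={score} tier={tier} -> {RESPONSES[tier]}")
```

Keeping the tier thresholds and the response table as data makes them easy to tune as part of the feedback process, without touching the integration code.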
About Interset.AI
SECURITY ANALYTICS LEADER
PARTNERS
ABOUT US
Data science & analytics focused on cybersecurity
100 person-years of security analytics and anomaly detection R&D
Offices in Ottawa, Canada; Newport Beach, CA
Interset.AI

Operationalizing Big Data Security Analytics - IANS Forum Toronto Keynote


Editor's Notes

  • #5 Reactive; scattered; overwhelming; 60–80% false positives; not enough data for visibility; not enough staff.
  • #13 Use cases: insider threat, account compromise, fraud, HIPAA compliance. Data sources: endpoints, AD via Splunk, fileshare logs, EMR application logs, prescription logs.
  • #17 4 key components you need for an effective security analytics solution:
    - You need to compute unique normal.
    - You need unsupervised machine learning, making no assumptions about the behavior or distribution of the data. In fact, the datasets involved in insider attacks rarely have much metadata describing the data itself.
    - You need a Big Data infrastructure: the ability to compute at scale in a cost-effective manner.
    - You need a mathematical framework to ingest billions of events every day and reduce them to a handful of real threat leads.
    Also, the ability to integrate into your security ecosystem is critical, so the solution is completely API driven.