Deep Dive Into Deep Learning: How AI is Powering the Future of Endpoint Security

Greg Iddon, Senior Product Manager
April 2018

The Threat Landscape Has Shifted
• Exploits: Most organizations have no exploit prevention^
• Advanced Threats: 83% agree it has become more difficult to stop threats^
• Ransomware: 54% of organizations hit, twice on average, in 2017^
^Source: The State of Endpoint Security Today Survey
[Chart, source: SophosLabs. Categories: Advanced Malware, Ransomware, Email Malware, Web Malware, Generic Malware, Cryptocurrency. Values: 26%, 20%, 20%, 12%, 12%, 8%]

Vulnerabilities Waiting to Be Exploited
Software Vulnerabilities Reported by Year
2010: 4,639
2011: 4,150
2012: 5,288
2013: 5,187
2014: 7,937
2015: 6,487
2016: 6,447
2017: 14,643
2018: 5,456 and 16,368 (two values shown on the chart)
Source: NIST National Vulnerability Database as of 6th January 2018, https://nvd.nist.gov/vuln/search/statistics

The Age of Single-Use / Unseen Malware
• 75% of the malicious files SophosLabs detects are found only within a single organization.
• SophosLabs receives and processes 400,000 previously unseen malware samples each day.

"Recognize Dog Deep Learning Training Set" - https://github.com/yskmt/dog_recognition

Artificial Intelligence / Machine Learning
[Diagram: Artificial Intelligence and Machine Learning, with the techniques K-Means, Hidden Markov, Nearest Neighbor, Clustering, Statistical distribution, Deep Learning, Decision Trees]

A Simple Algorithm
[Scatter plot: Height (cm), 70 to 160, against Age (years), 1 to 12]

A Simple Algorithm
[Scatter plot: Height (cm), 70 to 160, against Age (years), 1 to 12, with fitted trend line y = 6.8242x + 70.653]
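A trend line like the one on this slide is what an ordinary least-squares fit produces. The sketch below is only an illustration, not the presenter's method: the function names are hypothetical, and because the slide's underlying measurements are not given, the sample heights are generated from the slide's own equation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_height(age_years, slope=6.8242, intercept=70.653):
    """Height in cm predicted by the slide's trend line."""
    return slope * age_years + intercept


# Assumption: the real measurements are not in the slides, so these points
# are placed exactly on the slide's line for illustration.
ages = list(range(1, 13))
heights = [predict_height(a) for a in ages]

slope, intercept = fit_line(ages, heights)
print(f"y = {slope:.4f}x + {intercept:.3f}")
```

With real, noisy measurements the same function would return whichever slope and intercept minimize the squared vertical distance between the line and the points.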

Machine Learning vs Signatures
• Machine learning’s job is to place the blue line in the best place possible
• Human analysts do the same thing (e.g. defining that if file size > 2000000 and compression level > 0.5, it’s malware)
[Scatter plot: Compression Level (0 to 1.2) against File Size (0 to 3,000,000), with a blue decision line and an unclassified "?" sample]
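The contrast between an analyst's rule and a learned line can be sketched in a few lines. This is a sketch under stated assumptions: the training samples are invented, the function names are hypothetical, and a nearest-centroid classifier is swapped in as about the simplest learner that yields a straight decision line. It is not the product's algorithm.

```python
def signature_rule(file_size, compression_level):
    """The analyst-written rule quoted on the slide."""
    return file_size > 2_000_000 and compression_level > 0.5


def fit_centroid_line(benign, malware):
    """Learn a straight decision line: the perpendicular bisector
    between the mean benign point and the mean malware point."""
    def mean(points):
        n = len(points)
        return [sum(p[i] for p in points) / n for i in range(2)]

    mb, mm = mean(benign), mean(malware)
    weights = [mm[i] - mb[i] for i in range(2)]
    midpoint = [(mm[i] + mb[i]) / 2 for i in range(2)]
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias


def learned_model(file_size, compression_level, weights, bias):
    # File size is rescaled to millions of bytes so both features
    # have a comparable range.
    x = (file_size / 1_000_000, compression_level)
    return sum(w * v for w, v in zip(weights, x)) + bias > 0


# Hypothetical training samples:
# (file size in millions of bytes, compression level)
BENIGN = [(0.3, 0.2), (0.8, 0.3), (1.2, 0.1), (0.5, 0.4)]
MALWARE = [(2.2, 0.7), (2.6, 0.9), (2.4, 0.6), (2.8, 0.8)]

WEIGHTS, BIAS = fit_centroid_line(BENIGN, MALWARE)

# A sample just under the analyst's size threshold.
print(signature_rule(1_900_000, 0.9))
print(learned_model(1_900_000, 0.9, WEIGHTS, BIAS))
```

The hand-written rule has hard, axis-aligned thresholds, while the learned line can tilt to follow the data, which is why the two can disagree on samples near the boundary.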

Overfitting
• Limited data when training a model can result in overfitting
• False positives are hard to avoid with generic machine learning algorithms
[Scatter plot: Compression Level (0 to 1.2) against File Size (0 to 3,000,000)]
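One way to see overfitting turn into a false positive is to compare a model that memorizes every training sample with a much simpler one. The sketch below assumes an invented eight-sample training set and hypothetical function names. A 1-nearest-neighbor model stands in for the overfitted model and a nearest-centroid model for the simpler one.

```python
import math

# Hypothetical training set:
# ((file size in millions of bytes, compression level), label)
TRAIN = [
    ((0.3, 0.2), "benign"), ((0.8, 0.3), "benign"),
    ((1.2, 0.1), "benign"), ((0.5, 0.4), "benign"),
    ((2.2, 0.7), "malware"), ((2.6, 0.9), "malware"),
    ((2.4, 0.6), "malware"),
    ((0.6, 0.3), "malware"),  # one unusual sample sitting among benign files
]


def one_nearest_neighbor(point):
    """Memorizes the training set: returns the label of the closest sample."""
    return min(TRAIN, key=lambda item: math.dist(item[0], point))[1]


def centroid(label):
    """Mean point of all training samples carrying the given label."""
    pts = [p for p, lab in TRAIN if lab == label]
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def nearest_centroid(point):
    """A simpler model: picks the class whose mean point is closer."""
    return min(("benign", "malware"),
               key=lambda lab: math.dist(centroid(lab), point))


def training_accuracy(model):
    return sum(model(p) == label for p, label in TRAIN) / len(TRAIN)


new_benign_file = (0.65, 0.32)  # unseen, ordinary-looking file

print(training_accuracy(one_nearest_neighbor))
print(training_accuracy(nearest_centroid))
print(one_nearest_neighbor(new_benign_file))
print(nearest_centroid(new_benign_file))
```

In this toy setup the memorizing model scores perfectly on its own training data yet flags the new benign file, because it has fitted itself around a single unusual sample. The simpler model gets that one training sample wrong but classifies the new file correctly.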

Adding dimensions: A classifier in three dimensions
• The blue plane is the machine learning model, defined by a simple equation
• Humans can still write a rule that expresses the same basic idea (e.g. if file size > 2000000 and compression level > 0.5 and number of strings > 1000, it’s malware)
[3D scatter plot with a blue plane; labeled axis: File size]
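With three features the "simple equation" is a plane, w1*x1 + w2*x2 + w3*x3 + b = 0, and a file is classified by which side of it the file falls on. The sketch below places the slide's three-condition rule next to such a plane. The weights and bias are hypothetical values chosen for illustration, where a trained model would learn them from data, and the function names are assumptions.

```python
def signature_rule_3d(file_size, compression_level, num_strings):
    """The three-condition rule quoted on the slide."""
    return (file_size > 2_000_000
            and compression_level > 0.5
            and num_strings > 1000)


# Hypothetical plane coefficients: per million bytes, per unit of
# compression level, per thousand strings.
WEIGHTS = (1.0, 2.0, 1.0)
BIAS = -4.0


def plane_model(file_size, compression_level, num_strings):
    """Classify by the side of the plane w1*x1 + w2*x2 + w3*x3 + b = 0."""
    features = (file_size / 1_000_000, compression_level, num_strings / 1000)
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return score > 0


print(signature_rule_3d(2_500_000, 0.8, 1500))
print(plane_model(2_500_000, 0.8, 1500))
```

The same pattern extends to any number of features, though past three dimensions the boundary can no longer be drawn and a hand-written rule becomes harder to maintain.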

Supervised vs. Unsupervised
[Diagram: K-Means, Hidden Markov, Nearest Neighbor, Clustering, Statistical distribution, Deep Learning and Decision Trees grouped into Supervised and Unsupervised]
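The supervised and unsupervised split comes down to whether labels are supplied with the training data. A minimal sketch, using invented one-dimensional data and hypothetical function names: nearest neighbor classification needs labeled samples, while k-means groups unlabeled values on its own.

```python
def nearest_neighbor(labeled, x):
    """Supervised: predicts using labels supplied with the training data."""
    return min(labeled, key=lambda item: abs(item[0] - x))[1]


def k_means_1d(values, k=2, iterations=10):
    """Unsupervised: groups unlabeled values around k centroids (k >= 2)."""
    # Deterministic start: spread the initial centroids across the sorted data.
    ordered = sorted(values)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


# Hypothetical compression levels, with labels only for the supervised case.
LABELED = [(0.1, "benign"), (0.2, "benign"), (0.3, "benign"),
           (0.7, "malware"), (0.8, "malware"), (0.9, "malware")]
UNLABELED = [value for value, _ in LABELED]

print(nearest_neighbor(LABELED, 0.72))
print(k_means_1d(UNLABELED))
```

The k-means result is only a pair of cluster centers. Deciding which cluster corresponds to malware still requires outside knowledge, which is the practical difference from the supervised case.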

"Deep Neural Networks are the top performing classifiers, highlighting the added value of Deep Neural Networks over other more conventional methods. Moreover, [Deep Neural Networks] performed significantly better at almost one standard deviation higher than the mean performance."
Source: "Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set", Eelke B. Lenselink, Niels ten Dijke, Brandon Bongers, George Papadatos, Herman W. T. van Vlijmen, Wojtek Kowalczyk, Adriaan P. IJzerman and Gerard J. P. van Westen

Machine Learning vs. Deep Learning
[Diagram: MACHINE LEARNING (Decision Tree and Random Forest, each mapping INPUT to OUTPUT) compared with DEEP LEARNING (Interconnected Layers of Neurons, Each Identifying More Complex Features, mapping INPUT to OUTPUT)]

Deep Learning Neural Network
Faster
o DL detections in 20-100 milliseconds per file
o Traditional ML 100-500 milliseconds per file
Smaller
o Deep learning models are about 10-20 MB
o Traditional ML models can get huge: 500 MB-10 GB
Smarter
o Deep learning provides proven higher detection rates that improve with more data
o Traditional ML has lower detection rates and diminishing returns with more data

Deep Learning Neural Networks
[Diagram: DEEP LEARNING network of Interconnected Layers of Neurons (nodes numbered 1 to 10), Each Identifying More Complex Features, between INPUT and OUTPUT]
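The idea of layers building more complex features from simpler ones can be shown with a very small network. The sketch below uses hand-picked weights rather than trained ones, and the function names are hypothetical. It computes XOR, a pattern that no single straight decision line can separate, by combining two simple hidden-layer features.

```python
def relu(x):
    """Rectified linear unit: passes positive values, blocks negative ones."""
    return max(0.0, x)


def tiny_network(x1, x2):
    """Two-layer network with hand-picked (not trained) weights computing XOR."""
    # Hidden layer: each neuron detects a simple feature of the input.
    h1 = relu(x1 + x2)        # active when at least one input is on
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are on
    # Output layer: combines the simple features into a more complex one.
    return h1 - 2.0 * h2


for a in (0, 1):
    for b in (0, 1):
        print(a, b, tiny_network(a, b))
```

A production network differs mainly in scale and in how the weights are obtained: it has many more neurons and layers, and its weights are learned from labeled samples instead of being set by hand.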

Unprecedented Synergies of Man and Machine
• LABS: Source 100s of millions of samples for the best possible predictions
• LABS: Use established Labs systems and processes to ensure labeling precision
• DATA SCIENCE: Create the most efficient algorithms for solving hard cybersecurity problems
• DATA SCIENCE + LABS: Continuously incorporate feedback to improve system accuracy and predictive power
Only Sophos has this critical combination of Labs Research and Data Science
For the first time ever, we can memorize the entire observable threat universe.
