8. Discoveries of the Hubble Telescope
• Scientific Papers Written: 14,658 (and Counting)
• The Age of the Universe
• The Existence of Dark Energy
• Supermassive Black Holes
• The Existence of Earth-like Planets
9. Visibility into the Known Universe
[Chart: scale markers of 7, 12, 13, 13.3, and 13.5 billion against REDSHIFT values of 1, 4, 7, 10, and >20, up to the present]
• 1917: Hooker Telescope (and ground-based observatories)
• 1990: Hubble Space Telescope (and 2004 upgrade)
• 2018: James Webb Next-Gen Space Telescope (Next-Gen Visibility)
10. Response Time to Known Universe of Threats
[Chart: response time (MONTHS, WEEKS, DAYS, HOURS, MINUTES, SECONDS, MILLISECONDS) against THREAT SOPHISTICATION (x, x * 10, x * 10^2, x * 10^3, x * 10^4, x * 10^5)]
• 1990: Stop Malware by Name (i.e., Signatures)
• 2004: Abstractions and Generalizations like Heuristics, HIPS, Dynamic, Behavioral Analysis
• FUTURE: Next-Gen Machine Learning, Deep Learning Applied to Cybersecurity, Neural Networks, Big Data Analytics, Synchronized (Next-Gen Cybersecurity)
11. Sophos Mission
Our mission is to be the best in the
world at delivering innovative, simple,
and highly-effective cybersecurity
solutions to IT professionals and the
channel that serves them.
12. It’s Working! Sophos FY17 Results
[Chart: INDUSTRY growth figures of +7%, +4%, and +6%. Segments shown: NETWORK, EMEA, ENDUSER]
13. It’s Working! Sophos FY17 Results
[Chart: growth figures of +22%, +31%, and +26%, shown against INDUSTRY growth figures of +7%, +4%, and +6%. Segments shown: NETWORK, EMEA, ENDUSER]
14. Sophos Central and Intercept X
[Chart, SOPHOS CENTRAL: CUSTOMERS (0 to 50,000) and BILLINGS ($0 to $40,000,000) by quarter, Q2 FY14 through Q1 FY18]
[Chart, INTERCEPT X: CUSTOMERS (0 to 20,000) and BILLINGS ($0 to $20,000,000) by quarter, Q2 FY17 through Q1 FY18]
* Intercept X billings also include EXP
15. Sophos and the Channel: A True Partnership
• Network Security Winner 2017
• Security Vendor of the Year
• Security Vendor of the Year
• Endpoint Vendor of the Year
• 5 Star Partner Program
• Best UTM Vendor
• Best Security Hardware
• Best Partner Portal
• Security-Email Winner: Sophos Email
17. Getting to the Podium
Endpoint | Mobile | Encryption | Server | Web | Wireless | Email | Sophos Central | Firewall
18. Analyst Validation: Endpoint
Gartner Magic Quadrant 2017 (As of January 2017)
Axes: COMPLETENESS OF VISION, ABILITY TO EXECUTE
Quadrants: CHALLENGERS, LEADERS, NICHE PLAYERS, VISIONARIES
Annotation: Distance and Direction of Movement
Vendors shown: Intel Security (McAfee), Microsoft, Sophos, Symantec, Kaspersky Labs, Trend Micro, SentinelOne, CrowdStrike, Invincea, Cylance, Carbon Black, ESET, 360 Enterprise Security Group, AhnLab, G Data Software, Comodo, Malwarebytes, Webroot, F-Secure, Panda Security, Palo Alto Networks, Bitdefender
19. Analyst Validation: Endpoint
Forrester Wave October 2016
Axes: CURRENT OFFERING (WEAK to STRONG), STRATEGY (WEAK to STRONG)
Bands: CONTENDERS, STRONG PERFORMERS, LEADERS
Vendors shown: Bromium, Kaspersky Labs, Trend Micro, Symantec, Sophos, McAfee, ESET, SentinelOne, Palo Alto Networks, CrowdStrike, Cylance, Carbon Black
Sophos’ road map to develop strong signatureless prevention and detection capabilities should make the product highly competitive over the long term.
20. Analyst Validation: UTM
Gartner Magic Quadrant (As of August 2016)
Axes: COMPLETENESS OF VISION, ABILITY TO EXECUTE
Quadrants: CHALLENGERS, LEADERS, NICHE PLAYERS, VISIONARIES
Vendors shown: Cisco, Dell SonicWALL, Juniper Networks, Huawei, Aker Security Systems, Untangle, Rohde & Schwarz, Stormshield, Barracuda, Hillstone Networks, Venustech, WatchGuard, Sophos, Check Point, Fortinet
Sophos is a Leader because it continues to grow its market share based on features, support services and customer trust in its UTM roadmap…Sophos UTMs continue to be rated higher at ease of management…
23. What Is Machine Learning?
[Diagram: INPUT (FEATURES: Color, Shape, Size, Mass; LABELS: Vegetable, Fruit) feeds TRAINING THE MODEL. The LEARNER checks the model’s PREDICTIONS against the ANSWERS. The PRODUCTION MODEL then outputs a PREDICTION: Fruit.]
26. Dog or Mop?
“Recognize Dog Deep Learning Training Set” - https://github.com/yskmt/dog_recognition
27. What Is Machine Learning?
The more samples and attributes that are used to train,
the better the prediction. Unfortunately, traditional
machine learning is limited in the numbers of samples
and attributes that it can efficiently process.
28. Machine Learning vs. Deep Learning
MACHINE LEARNING
• Decision Tree (INPUT to OUTPUT)
• Random Forest (INPUT to OUTPUT)
DEEP LEARNING
• Interconnected Layers of Neurons, Each Identifying More Complex Features (INPUT to OUTPUT)
29. Deep Learning in Intercept X
| MACHINE LEARNING | DEEP LEARNING
AUTOMATION | Manually extracted attributes of the target data | Automatically learns to optimize attributes
SCALE | Difficulty scaling to 10s of millions | Elegantly scales to 100s of millions
EFFICIENCY | Requires huge model sizes | Highly compressed model sizes
Example attributes: Ht: 5’ 10”, Wt: 175 lbs, Hair: Brown, Eye: Brown, Age: 32, Right-handed, …
Better Protection | Better Performance | Better Accuracy
30. Unprecedented Synergies of Man and Machine
• LABS: Source 100s of millions of samples for the best possible predictions
• LABS: Use established Labs systems and processes to ensure labeling precision
• DATA SCIENCE: Create the most efficient algorithms for solving hard cybersecurity problems
• DATA SCIENCE + LABS: Continuously incorporate feedback to improve system accuracy and predictive power
Only Sophos has this critical combination of Labs Research and Data Science
For the first time ever, we can memorize the entire observable threat universe.
33. Sophos XG Firewall
Unrivalled security, simplicity, and insight
Exposes Hidden Risks
• Apps, Users, Payloads, Threats
• Traffic-light dashboard indicators
• Comprehensive On-Box Reporting
• Prevent issues from becoming problems
Automatically Responds to Incidents
• Unique Security Heartbeat™
• Integrates EP Health into rules
• Instantly ID compromised systems
• Automatically isolate them
Blocks Network Threats
• Full suite of protection
• IPS, APT, Sandboxing
• Web and App Control
• Easily managed from a single screen
34. Synchronized App Control
A breakthrough in network visibility and control
What Firewalls See Today: All firewalls today depend on static application signatures to identify apps. But those don’t work for most custom, obscure, or evasive apps, or for any apps using generic HTTP or HTTPS. You can’t control what you can’t see.
What XG Firewall Sees: XG Firewall utilizes Synchronized Security to automatically identify, classify, and control all unknown applications, so you can easily block the apps you don’t want and prioritize the ones you do.
35. The Road Ahead for Sophos Next-Gen Firewall
• Synchronized App Control: Automatic classification and control of unknown apps
• IoT Identification: Automatically identify and classify IoT devices
• Device Access and Provisioning: Control access and provision agent before accessing network resources
• Superior SSL Inspection: Ultra-high performance SSL inspection
• CASB: Cloud Access Security Broker for visibility into cloud apps
• Lateral Movement Detection: Preventing lateral movement on the same segment
• Up to 200% Performance Boost: Next-gen hardware to include custom ASICs for further network acceleration
• 1000+ Seat Enterprise Scaling Features: Data plane stacking, advanced switching and routing, expanded protocols, resiliency, actionable log mgmt, APIs
• N-Node Elegant Clustering: Multi-node simple clustering
• Deep Learning based malware protection: Using Deep Learning to more efficiently and effectively protect against unknown malware attacks
36. Synchronized Security
Cloud Intelligence
Sophos Labs
Analytics | Analyze data across all of Sophos’ products to create simple, actionable insights and automatic resolutions
| 24x7x365, multi-continent operation |
Malware Identities | URL Database | Machine Learning | Threat Intelligence | Genotypes | Reputation |
Behavioral Rules | APT Rules | App Identities | Anti-Spam | DLP | SophosID | Sandboxing | API Everywhere
Sophos Central
Admin | Self Service | Partner | Manage All Sophos Products | User Customizable Alerts | Management of Customer Installations
In Cloud | On Prem
Next-Gen Endpoint | Mobile | Server | Encryption | UTM/Next-Gen Firewall | Wireless | Email | Web
Editor's Notes
Our efforts around XG have been recognized by several leading organizations. NSS Labs, the industry leader in network security testing, has rated Sophos one of the top vendors in their most recent tests on security effectiveness.
[Simon]
In short, we cannot be complacent that the methods we use today will be good enough for tomorrow. As it has been over the last 20+ years, SophosLabs needs to be forward-looking, continue to innovate, and look for disruptive methods to apply to threat protection.
[Handover to Joe]
For a long time, defenders in cybersecurity have felt that the attacker had an asymmetric advantage. This was expressed in the saying “the defender has to be right every time, the attacker has to be right only once.” Defenders have been keeping ahead of the wave, but it was gaining on them. Our goal is to turn the tables. And machine learning is helping us to do that…
Before I tell you about the unique ways that Sophos is applying machine learning to cybersecurity, I thought it would be useful to start with a conceptual overview of how machine learning works. First question to answer is “what is the difference between machine learning and artificial intelligence?” Very simply, ML is a modern form of artificial intelligence, and while one is an instance of the other, you’ll often hear the terms used interchangeably. In this overview, we won’t be doing any matrix multiplication, but it will be more than enough for an interesting cocktail party conversation.
There are two main styles of machine learning: supervised and unsupervised.
Unsupervised machine learning refers to algorithms that operate on unlabeled data, where the answers are not known in advance. It is primarily used to help discover underlying structures in the data, for tasks such as grouping or clustering.
Supervised machine learning refers to algorithms that operate on labelled data, where answers are known in advance. It is primarily used to make predictions, which is the distillation of what we want to do in cybersecurity: we want to predict if the things we encounter in our information systems are good or bad.
We start by training our model with a labelled training set. In our example here, we’re using fruits and vegetables. The model operates on a set of attributes, commonly referred to as features, which are characteristics that are useful in helping distinguish one class from another. Here, we are looking at such features as the color, shape, size, and mass of the fruits and vegetables, but in the case of detecting malware, we might look at things like the header of the file, the amount of file entropy, or string sequences within the file. Our example here shows 4 features, but in practice, it’s common for the top performing models to operate on thousands of features.
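As one illustration of turning a file into a numeric feature, here is a minimal sketch of the file entropy mentioned above, computed as Shannon entropy over byte values. The function name `byte_entropy` and the sample inputs are assumptions made for this example; they are not taken from any Sophos product.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1/p) over every byte value that occurs in the data.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Repetitive content scores low; evenly spread byte values score high.
low = byte_entropy(b"A" * 1024)
high = byte_entropy(bytes(range(256)) * 4)
print(low, high)
```

A value near 8 bits per byte often indicates compressed or encrypted content, which is why entropy can be a useful signal, though never a verdict on its own.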
Our model starts making predictions based on the input data. When it first gets started, it will make a lot of mistakes, so we use a learner to help correct its errors. The learner is equipped with the answers, and adjusts parameters of the model to help its predictions more accurately match the answers. This goes through hundreds, thousands, or even millions of iterations, typically inside computers powered by GPUs, which are exceptionally good at this kind of math.
Once it’s performing well on the training data, we then measure its view of reality against a portion of our labelled data that we held aside, called the test set. If the model performs well here, then it’s ready for production.
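The train, correct, and test cycle described above can be sketched in a few lines. This is a toy perceptron, chosen here only because it is short; it is not the model discussed in the talk. The two features (`sweetness`, `firmness`) and all data values are made up for illustration.

```python
import random


def predict(weights, bias, features):
    """Model: weighted sum of the features, thresholded at zero."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(samples, epochs=1000, rate=0.1):
    """Learner: compare each prediction with the answer and nudge the parameters."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, answer in samples:
            error = answer - predict(weights, bias, features)  # -1, 0, or +1
            if error:
                mistakes += 1
                weights = [w + rate * error * x for w, x in zip(weights, features)]
                bias += rate * error
        if mistakes == 0:  # every training sample is now predicted correctly
            break
    return weights, bias


def accuracy(weights, bias, samples):
    """Fraction of samples whose prediction matches the known answer."""
    hits = sum(predict(weights, bias, f) == a for f, a in samples)
    return hits / len(samples)


# Made-up data: (sweetness, firmness) in [0, 1]; label 1 = fruit, 0 = vegetable.
rng = random.Random(0)
data = [((rng.uniform(0.6, 1.0), rng.random()), 1) for _ in range(50)]
data += [((rng.uniform(0.0, 0.4), rng.random()), 0) for _ in range(50)]
rng.shuffle(data)

train_set, test_set = data[:80], data[80:]  # hold 20% aside as the test set
weights, bias = train(train_set)
print("training accuracy:", accuracy(weights, bias, train_set))
print("test accuracy:", accuracy(weights, bias, test_set))
```

The test-set score is the one that matters: the model never saw those samples while the learner was adjusting its parameters.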
Examples of production systems would include self-driving cars, next-gen anti-virus software for your computer, or computer vision systems.
Speaking of computer vision systems, I thought it might be interesting to show you some of the actual training data that is used to produce a machine learning vision system designed to identify pictures of dogs.
Sure, these are kind of humorous, but they illustrate a very good point: sometimes the thing you are trying to classify can be frustratingly ambiguous, or even outright evasive.
The better your model is at operating in the face of such ambiguity, the better its performance will be in the real world. We measure this with the true positive and false positive rates of the model’s predictive power. In this sense, adversity in the training environment makes for a more robust model.
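For reference, the two rates can be computed directly from a list of known answers and a list of predictions. The function name `rates` and the sample labels below are invented for this example; 1 stands for the positive class (say, "dog" or "malicious") and 0 for the negative class.

```python
def rates(answers, predictions):
    """Return (true positive rate, false positive rate) for 0/1 labels."""
    pairs = list(zip(answers, predictions))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # positives caught
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # positives missed
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarms
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # negatives left alone
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


answers = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
tpr, fpr = rates(answers, predictions)
print(tpr, fpr)
```

Here three of the four positives are caught (a true positive rate of 0.75) and one of the six negatives is wrongly flagged (a false positive rate of 1/6).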
Coming back for a moment to the overview, I’d like to make the point that not all machine learning approaches are equal. It is well established that the ability for a model to make predictions is greatly influenced by the number of features and the size of the training set. And this was one of the limiting factors that held machine learning back for a very long time: for a model to perform well enough for use in the real-world, it would simply be too big, too slow, or outright untrainable in the first place.
If you’ve spent any time on the topic of machine learning, you’ve probably heard of methods like decision trees, random forests, Naïve Bayes, and SVMs being used to make predictions. With all of these options (and there are many more), it’s reasonable to ask the question: “which algorithm is right for the task?” As a general rule, it is best to employ Occam’s Razor, or the law of parsimony, in designing any technology; machine learning is no exception.
A very well-known and commonly used method of machine learning is the decision tree. As the name implies, it is a tree structure that makes decisions based on the attributes of the data it is evaluating. You can think of it as an automated form of “20 questions”, where the output will be a verdict such as fruit or vegetable, dog or bagel, malicious or benign. Decision trees are useful for very simple problems, such as creating rules for your email inbox, but have too many limitations for more complicated tasks.
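The “20 questions” picture can be made concrete with a hand-built tree. In practice the questions are learned from training data; here they are written by hand, and the questions, thresholds, and sample values are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Node:
    """One question in the tree, with the branch to follow for each answer."""
    question: Callable[[dict], bool]
    yes: Union["Node", str]
    no: Union["Node", str]


def classify(node, sample):
    """Walk from the root, answering questions until a leaf (verdict) is reached."""
    while isinstance(node, Node):
        node = node.yes if node.question(sample) else node.no
    return node


# Hand-written toy rules, not learned from data.
tree = Node(
    question=lambda s: s["color"] in {"red", "yellow", "orange"},
    yes=Node(question=lambda s: s["shape"] == "round", yes="fruit", no="vegetable"),
    no=Node(question=lambda s: s["mass_g"] > 1000, yes="fruit", no="vegetable"),
)

apple = {"color": "red", "shape": "round", "mass_g": 180}
carrot = {"color": "orange", "shape": "long", "mass_g": 60}
watermelon = {"color": "green", "shape": "round", "mass_g": 4000}
print(classify(tree, apple), classify(tree, carrot), classify(tree, watermelon))
```

The limitation is easy to see: a round orange pumpkin or a small green lime would be misclassified, and patching each case by hand quickly makes the tree unmanageable.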
Another popular method is random forest. Random forests address some of the problems of decision trees, such as their tendency to overfit (a term used to describe a model’s overreaction to variance in the training data) which results in high error rates, but in order to achieve acceptable performance with complicated tasks, you must sacrifice compactness and run-time performance.
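To show the mechanics, here is a deliberately tiny forest. Each "tree" is a single-question stump trained on a random resample of the data and a randomly chosen feature, and the forest's verdict is the majority vote. Real random forests grow much deeper trees; the function names and the data below are assumptions made for this sketch.

```python
import random
from collections import Counter


def fit_stump(samples, feature):
    """Pick the threshold on one feature that misclassifies the fewest samples."""
    best = None
    for threshold in sorted({f[feature] for f, _ in samples}):
        for above in (0, 1):  # label predicted when the value exceeds the threshold
            errors = sum(
                (above if f[feature] > threshold else 1 - above) != label
                for f, label in samples
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, above)
    return best[1:]


def stump_predict(stump, features):
    feature, threshold, above = stump
    return above if features[feature] > threshold else 1 - above


def fit_forest(samples, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]  # resample with replacement
        feature = rng.randrange(len(samples[0][0]))         # random feature per tree
        forest.append(fit_stump(bootstrap, feature))
    return forest


def forest_predict(forest, features):
    """Majority vote across all trees."""
    votes = Counter(stump_predict(stump, features) for stump in forest)
    return votes.most_common(1)[0][0]


# Made-up two-feature data: label 1 = fruit, 0 = vegetable.
fruit = [((0.60 + 0.03 * i, 0.90 - 0.02 * i), 1) for i in range(10)]
vegetables = [((0.10 + 0.03 * i, 0.40 - 0.02 * i), 0) for i in range(10)]
forest = fit_forest(fruit + vegetables)
print(forest_predict(forest, (0.95, 0.95)), forest_predict(forest, (0.05, 0.05)))
```

Because every tree sees a slightly different sample, no single tree's quirks dominate the vote, which is how the ensemble reduces overfitting. The cost is also visible: the model is now 25 trees instead of one.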
Deep Learning is a kind of artificial neural network. As the name implies, its design is inspired by our modern understanding of the hierarchical structure of the human brain. The same way that we understand that our visual system works by processing a progression of structures, beginning with simple edges and blobs, then mouths and eyes, and then finally faces, Deep Learning works through a layered progression of structures to produce models with more accurate representations of reality than have ever been possible before.
And as the first method of machine learning to demonstrate real predictive performance on complex problems, Deep Learning adheres to Occam’s razor.
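The layered idea can be shown at its smallest scale with a two-layer network that computes XOR, something no single neuron can do. This is only a sketch of layering: the weights are set by hand rather than learned, and real deep learning models have many more layers and learned parameters.

```python
def step(x):
    """Threshold activation: the neuron fires (1) only for a positive input."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """One layer of neurons: each takes a weighted sum of the inputs, then thresholds it."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def network(a, b):
    # Hidden layer detects two simple features: "a OR b" and "a AND b".
    hidden = layer([a, b], weights=[[1, 1], [1, 1]], biases=[-0.5, -1.5])
    # Output layer combines them into a more complex one: OR but not AND (XOR).
    output = layer(hidden, weights=[[1, -1]], biases=[-0.5])
    return output[0]


results = [network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(results)
```

Each layer builds on the features found by the layer before it, which is the same progression from edges to eyes to faces described above, in miniature.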
Joe walked you through our differentiated approach to machine learning with our Deep Learning. What that provides us in the product is Better Protection, Better Performance and Better Accuracy.
Sophos is in a unique position in the industry because of how we are integrating our Labs and Data Science operations. With our size, breadth, and depth of expertise, SophosLabs is perhaps one of a half-dozen “tier one” Labs operations in the industry.
Our Labs experts help us to source massive data sets that we use for training, augmented with invaluable real-time telemetry from the hundred million end-users we are protecting.
We then use automated systems in Labs, designed over many years by our human experts, to ensure that the training data is precisely labeled. This is critical to producing high-performance, accurate detectors. As we say in the industry: GIGO – garbage in, garbage out. Our curation systems and processes are mature and scalable.
Our Data Science team then get to focus their energies on developing the most efficient algorithms and models to solve a broad set of cybersecurity problems.
And then each learns from the other. As Yogi Berra (or perhaps Niels Bohr) once said: “it’s hard to make predictions, especially about the future.” Neither man nor machine can be perfect when making predictions. The joint operation we’ve built learns from our mistakes, driving continuous improvement.
The result is unprecedented synergies of man and machine. It is a virtuous circle.
Using this manifesto we created our own next generation endpoint. We’ve created an innovative ensemble of protections: stopping attacks on the endpoint before they can execute, with technologies like our Deep Learning, and catching malware that may have evaded that layer with post-execution protections such as exploit protection and CryptoGuard, our ransomware protection.
Which is why we have XG Firewall. It really has been designed from the ground up to solve today’s top problems with existing firewalls…
XG exposes hidden risks, blocks network threats, including advanced threats like WannaCry, and through Synchronized Security automatically responds to incidents.
XG Firewall is about to take application visibility and control to a whole new level.
XG Firewall utilizes Synchronized Security to ask the endpoint what application is generating unknown traffic to automatically identify, classify, and control all unknown applications. Synchronized App Control will automatically classify unknown apps into categories which will automatically enforce policies on these newly identified apps.
You can block the apps you don’t want and prioritize the ones you do.
It’s a breakthrough in application visibility and control and something no other firewall can do.
V17 is just a stepping stone on our relentless journey of innovation in next generation firewall. Our future roadmap contains many innovative and exciting investments.
We will work to improve our leading performance by as much as 200%. We will integrate our deep learning protections for more efficient and effective protection from unknown malware. We will provide protections for new segments of technology like IoT and applications that run and reside in the cloud with Cloud Access Security Brokers.