SlideShare a Scribd company logo
© Copyright Fortinet Inc. All rights reserved.
AI – A Self Evolving Detection System
Q3 / 2018
2Fortinet - Confidential
Introduction
21 years military
»USAF – nuclear weapons
»USA – FC/test pilot, threat officer, various leadership roles
20 years security management and consulting
»NASA JPL, World Bank, CME, Delta Air Lines, Humana, HECO, Honda
NA, USAA, Corning, many others
»Consulting, Practice Manager, Director of Security, CISO
Currently: Strategist at Fortinet
»Technology + resource management + business needs = CISO success
3Fortinet - Confidential
Bingo Time!
Deep Analytics
Crowdculture
Virtual / Augmented Reality
Big / Real Big / Way Big Data
Holistic Eco_______
Artificial Intelligence
Machine Learning
Deep learning
4Fortinet - Confidential
LEVEL II
» Content Pattern Recognition
» Looks at wrappers and payload for repeats
» Handles large volumes of permutations
» Proactive in nature, but manually created
Malicious File Packed/Encrypted Run time/memory
Pack Run
Headers
1111010101010
Code
0010101010101
1010101010101
10111101010111
Data
1010101010111
1010101010101
1010101010101
Headers
1111010101010
Code
0010101010101
1010101010101
10111101010111
Data
1010101010111
1010101010101
1010101010101
Headers
1111010101010
Code
0010101010101
1010101010101
10111101010111
Data
1010101010111
1010101010101
1010101010101
Antivirus Evolution
LEVEL I
» Simple MD5 / SHA 56 computations
» Resulted in large DBs for file comparisons
» One signature – one piece of malware
» Reactive and non-responsive to mutations
C:Md5sum malware.exe
5e3830ee3282a53920e00784fec44cfd (malware.exe)
Cfac6385a0cdd5f09b2e38c833c93c9d
5ae8c55fbc7b8f5bafa1af1675478cba
1af8e09e41fc850e15ffc4ea0be68c21
ce1ff097a3f0afec3bd5c5f0fb57cfda
80f27e4d562dc4f55e38f4088251e83c
bf6ba9baa2e0dcb8d175a4ff594dccd9
5e3830ee3282a53920e00784fec44cfd
Malware Found
5Fortinet - Confidential
AI Visions
Facebook ‘emergency shutdown
Fear over planet being “Zuckerburged”’
Reality Was different
6Fortinet - Confidential
Alan Turing called an infant’s mind an
‘unorganized machine’ in 1930s
Created early definitions of machine learning
» First type (A) consists of simple NAND
(negative – AND gates
» Second type (B) is combination of A types with modifiers
added – results in weighted input/variable output method
» Saw the need for:
 Seeded solution set of accurate or known potential output
 Population of variably weighted pieces or functions
 A method for removing the worst solutions while retaining the best
Early AI Defined
Major inhibitor of his research – was far ahead of available capabilities in
terms of computing power.
7Fortinet - Confidential
Artificial Intelligence – Nearing a Century
History of AI Outside Cyber Security Industry
1930s
Turing, Kleene and
Church propose
machine learning
solution
1943
McCullouch and
Pitts create formal
design of Turing’s
‘artificial neurons’
1956
AI research
formally founded as
a discipline at
Dartmouth
College
1960s
AI research
heavily funded
by the U. S.
military
1974
The First AI Winter
Difficulty resulted in
funding cuts in US
and Britain
1980s
Proliferation of
Expert Systems
Lisp vs. PC
1990s
AI applied to
data mining,
medical
diagnosis with
increased CPU
power
1997
IBM Deep Blue
beats Grand
Master Kasparov
in chess
2003
Deep learning is
achieved using
faster computing,
large data
structures
2011
IBM Watson
defeats Jeopardy
champions Brad
Rutter and Ken
Jennings
2013
IBM Watson
application for
management
decisions of lung
cancer treatment
2015
2,700+ AI projects
in place at
Google
2017
Elon Musk
calls for the
regulation of AI
before we hit
100 years
8Fortinet - Confidential
 Supervised Learning – Using known solution sets
to embed proper functions and create proper output.
» Reinforcement – action on an environment
triggers an observation resulting in a defined state.
 Unsupervised Learning – unknown solution sets
» Clustering – group according to similarities.
» Dimensionality Reduction – deductive reasoning.
» Structured Prediction – random fields are analyzed to
predict according to defined output probabilities.
» Anomaly Detection – input does not match expectations.
Types of Problem Solving
Artificial Neural Networks (ANNs)
Large collections of simple interconnected nodes (neurons), each with a weighted input and output value.
?!
9Fortinet - Confidential
 Consists of three or more layers
» Input layer
» One or more hidden layers
» Output layer
 Layers are made of up nodes
» Connected to every node in the previous
and subsequent layer
» Provide discrete processing of input information
(files and features)
» Produces an output value based on inputs, function,
and weighted valuation
Type of AI – Artificial Neural Network (Multilayer Perceptron)
The Multilayer Perceptron approach provides deep
machine learning capabilities.
MP behavior is similar to human neurons - if
input is strong enough, signal is passed
according to weighted value
Input layer
Hidden layer 1 Hidden layer 2
Output layer
Inputs Weights Sum Output
Input 1
Input 2
Input 3
YES/NO
decision
∑
10Fortinet - Confidential
 Point observable characteristics based on code blocks
» Files are parsed when entering the system
 1 : 1 relationship with nodes
 Features are maintained in a knowledgebase
repository (features, pattern recognition sigs, other
information)
 Quality is critical
» Provides more accurate determination of file status
» We leveraged internal legacy samples (~.5PB) to create
features from samples
 Each feature is weighted to assist in decisions
 Feature weighting can change over time
Features
f - feature
w - weight
Func(f1*w1+f2*w2+...+fn*wn) -> {0,1}
Feature/Node Algorithm
11Fortinet - Confidential
 System is fed initial data sets for analysis
» Supervised machine learning approach
 Files are broken into code blocks
 Information (features) extracted during the
learning phase.
» Binary present within the blocks
» Represents behavioral activation
 The system learns by weight features based on
» Surety of indicators (‘tells’)
» Frequency of observation
Layers + Nodes + Features = Learning
Source accuracy of the population input A computer program is said to learn from
experience E with respect to some class of tasks T and performance measure P if
its performance at tasks in T, as measured by P, improves with experience E.
12Fortinet - Confidential
Features, Nodes & Weights – Single Instance
70
FEATURE
100 wt.
NODE
+90
FEATURE
100 wt.
NODE
-20File Result
INPUT LAYER HIDDEN LAYER HIDDEN LAYER OUTPUT LAYER
File
1. We start with an input file – malicious or clean
2. Feature presence is calculated, re-weighted and passed forward to the next node
3. The analysis is repeated using the next layer feature, then passed to the next node
4. Result – the overall probability based on a score of feature presence
13Fortinet - Confidential
Features, Nodes & Weights – Multiple Instance
FEATURES FEATURES
Result
INPUT LAYER MALICIOUS LAYER CLEAN LAYER OUTPUT LAYER
File
14Fortinet - Confidential
AI Operational Example - Machine vs. Malware
1 = process the input file
2 = 2.3 billion nodes analyzing potential malicious
features
3 = 3.2 Billion nodes analyzing files for clean features
4 = output or decision layer (1 = malicious , 0 = clean)
4 Layer Architecture
Consists of separate layers for either malicious or clean feature processing
Mathematical models compare samples and features to decide output
Input layer
Malicious Layer Clean Layer
Output layer
FortiGuard AI
Output is a result of 2.3B x 3.2B individual node computations.
Initial capability – 58 samples per second
15Fortinet - Confidential
System Training
FEATURES
REPOSITORY
TRAINING
FILES
1. We start with AI and an
empty feature repository
2. A training set of files are
input, consisting of clean and
malicious files. Files are
labeled for initial training
3. Files are input and
broken into code blocks.
AI logic builds a set of
features.
4. Features are modified as the
system learns (weighting values,
next phase)
AI
16Fortinet - Confidential
System Testing
MALICIOUS
CLEAN
FEATURES
REPOSITORY
TESTING
FILES
OUTPUT1. Test samples are selected
and input to the system
2. Using the feature repository,
samples are analyzed
3. As this occurs, existing
features may get modified
or others added
4. The system determines clean
or malicious output
5. Output is compared to expected
results. If not accurate, reset to
known point and retrain.
RESULTS
EXAMINED
AI
17Fortinet - Confidential
AI in Operation
MALICIOUS
CLEAN
OUTPUT
 
INPUT
RAW SAMPLES
Feature Set Improvements
 Quality
 Stabilized Number
 Weighting Confidence
Continued Accuracy
to a High Degree of
Confidence
FEATURESQuantity
Quality
18Fortinet - Confidential
 Has to integrate to support current systems
 Must use features (code blocks), create signatures
 Continuous, seamless operational support
Managing New Malware
?!
 Problem - how to introduce new samples and create new features
 New features are naturally regressed in weight – no history
 Requires false weighting to have value in decisions
 Cannot subvert existing weighted features
 Cannot use single samples / false weight
 Would cause heavy skewing of results
 Introduces potentially large errors in main system
19Fortinet - Confidential
New Malware Management System – Part I
AI
130+ Threat Feeds
Global Sensors
CTA
1
New Files
2 Sandbox
Non-matching Feature Sets
(all code blocks)
 A ‘rolling’ input is used – last batch + new
 AI is fast – millions per day
 Sandbox is slower – 10s/K per day
 Features created as in original training
 Features are relative to AI – consist of code
blocks, which are also used by CPR
AI
3 4
20Fortinet - Confidential
New Malware Management System – Part II (AI Side)
1
2
3
4
5
6
Knowledgebase Repository
Production
Feature Set
21Fortinet - Confidential
New Malware Management System – Part III
2 4
5
Knowledgebase Repository
Production CPR
Signature Set
CB
CB
CB
CB
CB
CB
CB
CB
CB
CB
7Customer Enterprise
1 3
6
CPR
Engine
8
22Fortinet - Confidential
Results
• Significantly Increases cybercriminal cost burden for limited return
• Obtain unsurpassed accuracy not possible through manual methods
• Eliminate manual labor/potential errors
• Create autonomous system – operational and continued learning
23Fortinet - Confidential
Take-Aways
• AI as a business solution is not
trivial
• Large effort to implement
• Example ~ 8 years to date
• AI should ease technical resource
demands
• Reallocate to more critical tasks
• Hybrid approaches cause more labor
Anyone pitching AI without some level of
detail is probably BS Bingo selling
• Should not have to watch it
• Tell it right from wrong?
• Lack of trust = poor implementation
Thanks for your time!

More Related Content

Similar to Global CCISO Forum 2018 | AI vs Malware 2018

AI for Cybersecurity Innovation
AI for Cybersecurity InnovationAI for Cybersecurity Innovation
AI for Cybersecurity Innovation
Pete Burnap
 
2020-04-29 SIT Insights in Technology - Serguei Beloussov
2020-04-29 SIT Insights in Technology - Serguei Beloussov2020-04-29 SIT Insights in Technology - Serguei Beloussov
2020-04-29 SIT Insights in Technology - Serguei Beloussov
Schaffhausen Institute of Technology
 
Understand How Machine Learning Defends Against Zero-Day Threats
Understand How Machine Learning Defends Against Zero-Day ThreatsUnderstand How Machine Learning Defends Against Zero-Day Threats
Understand How Machine Learning Defends Against Zero-Day Threats
Rahul Mohandas
 
Understand How Machine Learning Defends Against Zero-Day Threats
Understand How Machine Learning Defends Against Zero-Day ThreatsUnderstand How Machine Learning Defends Against Zero-Day Threats
Understand How Machine Learning Defends Against Zero-Day Threats
Rahul Mohandas
 
Nota Padat ICT SPM
Nota Padat ICT SPMNota Padat ICT SPM
Nota Padat ICT SPM
D.J Md Thani
 
Ai ml dl_bct and mariners
Ai  ml dl_bct and marinersAi  ml dl_bct and mariners
Ai ml dl_bct and mariners
cmmindia2017
 
Ai ml dl_bct and mariners-1
Ai  ml dl_bct and mariners-1Ai  ml dl_bct and mariners-1
Ai ml dl_bct and mariners-1
cmmindia2017
 
Ai ml dl_bct and mariners
Ai  ml dl_bct and marinersAi  ml dl_bct and mariners
Ai ml dl_bct and mariners
cmmindia2017
 
SplunkLive! Customer Presentation – HCA
SplunkLive! Customer Presentation – HCASplunkLive! Customer Presentation – HCA
SplunkLive! Customer Presentation – HCA
Stephanie Bies
 
Customer Presentation with a Healthcare Company
Customer Presentation with a Healthcare CompanyCustomer Presentation with a Healthcare Company
Customer Presentation with a Healthcare Company
Splunk
 
20181212 ibm aot
20181212 ibm aot20181212 ibm aot
20181212 ibm aot
Hiroshi Maruyama
 
AI/ML/DL/BCT A Revolution in Maritime Sector
AI/ML/DL/BCT A Revolution in Maritime SectorAI/ML/DL/BCT A Revolution in Maritime Sector
AI/ML/DL/BCT A Revolution in Maritime Sector
Infinity Tech Solutions
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
abh.arya
 
Security Analytics for Data Discovery - Closing the SIEM Gap
Security Analytics for Data Discovery - Closing the SIEM GapSecurity Analytics for Data Discovery - Closing the SIEM Gap
Security Analytics for Data Discovery - Closing the SIEM Gap
Eric Johansen, CISSP
 
Microsoft Dryad
Microsoft DryadMicrosoft Dryad
Microsoft Dryad
Colin Clark
 
Threat Intelligence Ops In-Depth at Massive Enterprise
Threat Intelligence Ops In-Depth at Massive EnterpriseThreat Intelligence Ops In-Depth at Massive Enterprise
Threat Intelligence Ops In-Depth at Massive Enterprise
Jeremy Li
 
Ansaldo STS at CPExpo 2013: "Risks and Security Management in Logistics and ...
Ansaldo STS at CPExpo 2013:  "Risks and Security Management in Logistics and ...Ansaldo STS at CPExpo 2013:  "Risks and Security Management in Logistics and ...
Ansaldo STS at CPExpo 2013: "Risks and Security Management in Logistics and ...
Leonardo
 
Cyber Security in Railways Systems, Ansaldo STS experience
Cyber Security in Railways Systems, Ansaldo STS  experienceCyber Security in Railways Systems, Ansaldo STS  experience
Cyber Security in Railways Systems, Ansaldo STS experience
Community Protection Forum
 
Machine_Learning_with_MATLAB_Seminar_Latest.pdf
Machine_Learning_with_MATLAB_Seminar_Latest.pdfMachine_Learning_with_MATLAB_Seminar_Latest.pdf
Machine_Learning_with_MATLAB_Seminar_Latest.pdf
Carlos Paredes
 
Industrial training (Artificial Intelligence, Machine Learning & Deep Learnin...
Industrial training (Artificial Intelligence, Machine Learning & Deep Learnin...Industrial training (Artificial Intelligence, Machine Learning & Deep Learnin...
Industrial training (Artificial Intelligence, Machine Learning & Deep Learnin...
APJ ABDUL KALAM TECHNICAL UNIVERSITY
 

Similar to Global CCISO Forum 2018 | AI vs Malware 2018 (20)

AI for Cybersecurity Innovation
AI for Cybersecurity InnovationAI for Cybersecurity Innovation
AI for Cybersecurity Innovation
 
2020-04-29 SIT Insights in Technology - Serguei Beloussov
2020-04-29 SIT Insights in Technology - Serguei Beloussov2020-04-29 SIT Insights in Technology - Serguei Beloussov
2020-04-29 SIT Insights in Technology - Serguei Beloussov
 
Understand How Machine Learning Defends Against Zero-Day Threats
Understand How Machine Learning Defends Against Zero-Day ThreatsUnderstand How Machine Learning Defends Against Zero-Day Threats
Understand How Machine Learning Defends Against Zero-Day Threats
 
Understand How Machine Learning Defends Against Zero-Day Threats
Understand How Machine Learning Defends Against Zero-Day ThreatsUnderstand How Machine Learning Defends Against Zero-Day Threats
Understand How Machine Learning Defends Against Zero-Day Threats
 
Nota Padat ICT SPM
Nota Padat ICT SPMNota Padat ICT SPM
Nota Padat ICT SPM
 
Ai ml dl_bct and mariners
Ai  ml dl_bct and marinersAi  ml dl_bct and mariners
Ai ml dl_bct and mariners
 
Ai ml dl_bct and mariners-1
Ai  ml dl_bct and mariners-1Ai  ml dl_bct and mariners-1
Ai ml dl_bct and mariners-1
 
Ai ml dl_bct and mariners
Ai  ml dl_bct and marinersAi  ml dl_bct and mariners
Ai ml dl_bct and mariners
 
SplunkLive! Customer Presentation – HCA
SplunkLive! Customer Presentation – HCASplunkLive! Customer Presentation – HCA
SplunkLive! Customer Presentation – HCA
 
Customer Presentation with a Healthcare Company
Customer Presentation with a Healthcare CompanyCustomer Presentation with a Healthcare Company
Customer Presentation with a Healthcare Company
 
20181212 ibm aot
20181212 ibm aot20181212 ibm aot
20181212 ibm aot
 
AI/ML/DL/BCT A Revolution in Maritime Sector
AI/ML/DL/BCT A Revolution in Maritime SectorAI/ML/DL/BCT A Revolution in Maritime Sector
AI/ML/DL/BCT A Revolution in Maritime Sector
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
 
Security Analytics for Data Discovery - Closing the SIEM Gap
Security Analytics for Data Discovery - Closing the SIEM GapSecurity Analytics for Data Discovery - Closing the SIEM Gap
Security Analytics for Data Discovery - Closing the SIEM Gap
 
Microsoft Dryad
Microsoft DryadMicrosoft Dryad
Microsoft Dryad
 
Threat Intelligence Ops In-Depth at Massive Enterprise
Threat Intelligence Ops In-Depth at Massive EnterpriseThreat Intelligence Ops In-Depth at Massive Enterprise
Threat Intelligence Ops In-Depth at Massive Enterprise
 
Ansaldo STS at CPExpo 2013: "Risks and Security Management in Logistics and ...
Ansaldo STS at CPExpo 2013:  "Risks and Security Management in Logistics and ...Ansaldo STS at CPExpo 2013:  "Risks and Security Management in Logistics and ...
Ansaldo STS at CPExpo 2013: "Risks and Security Management in Logistics and ...
 
Cyber Security in Railways Systems, Ansaldo STS experience
Cyber Security in Railways Systems, Ansaldo STS  experienceCyber Security in Railways Systems, Ansaldo STS  experience
Cyber Security in Railways Systems, Ansaldo STS experience
 
Machine_Learning_with_MATLAB_Seminar_Latest.pdf
Machine_Learning_with_MATLAB_Seminar_Latest.pdfMachine_Learning_with_MATLAB_Seminar_Latest.pdf
Machine_Learning_with_MATLAB_Seminar_Latest.pdf
 
Industrial training (Artificial Intelligence, Machine Learning & Deep Learnin...
Industrial training (Artificial Intelligence, Machine Learning & Deep Learnin...Industrial training (Artificial Intelligence, Machine Learning & Deep Learnin...
Industrial training (Artificial Intelligence, Machine Learning & Deep Learnin...
 

More from EC-Council

CyberOm - Hacking the Wellness Code in a Chaotic Cyber World
CyberOm - Hacking the Wellness Code in a Chaotic Cyber WorldCyberOm - Hacking the Wellness Code in a Chaotic Cyber World
CyberOm - Hacking the Wellness Code in a Chaotic Cyber World
EC-Council
 
Cloud Security Architecture - a different approach
Cloud Security Architecture - a different approachCloud Security Architecture - a different approach
Cloud Security Architecture - a different approach
EC-Council
 
Phases of Incident Response
Phases of Incident ResponsePhases of Incident Response
Phases of Incident Response
EC-Council
 
Weaponizing OSINT – Hacker Halted 2019 – Michael James
 Weaponizing OSINT – Hacker Halted 2019 – Michael James  Weaponizing OSINT – Hacker Halted 2019 – Michael James
Weaponizing OSINT – Hacker Halted 2019 – Michael James
EC-Council
 
Hacking Your Career – Hacker Halted 2019 – Keith Turpin
Hacking Your Career – Hacker Halted 2019 – Keith TurpinHacking Your Career – Hacker Halted 2019 – Keith Turpin
Hacking Your Career – Hacker Halted 2019 – Keith Turpin
EC-Council
 
Hacking Diversity – Hacker Halted . 2019 – Marcelle Lee
Hacking Diversity – Hacker Halted . 2019 – Marcelle LeeHacking Diversity – Hacker Halted . 2019 – Marcelle Lee
Hacking Diversity – Hacker Halted . 2019 – Marcelle Lee
EC-Council
 
Cloud Proxy Technology – Hacker Halted 2019 – Jeff Silver
Cloud Proxy Technology – Hacker Halted 2019 – Jeff SilverCloud Proxy Technology – Hacker Halted 2019 – Jeff Silver
Cloud Proxy Technology – Hacker Halted 2019 – Jeff Silver
EC-Council
 
DNS – Strategies for Reducing Data Leakage & Protecting Online Privacy – Hack...
DNS – Strategies for Reducing Data Leakage & Protecting Online Privacy – Hack...DNS – Strategies for Reducing Data Leakage & Protecting Online Privacy – Hack...
DNS – Strategies for Reducing Data Leakage & Protecting Online Privacy – Hack...
EC-Council
 
Data in cars can be creepy – Hacker Halted 2019 – Andrea Amico
Data in cars can be creepy – Hacker Halted 2019 – Andrea AmicoData in cars can be creepy – Hacker Halted 2019 – Andrea Amico
Data in cars can be creepy – Hacker Halted 2019 – Andrea Amico
EC-Council
 
Breaking Smart [Bank] Statements – Hacker Halted 2019 – Manuel Nader
Breaking Smart [Bank] Statements – Hacker Halted 2019 – Manuel NaderBreaking Smart [Bank] Statements – Hacker Halted 2019 – Manuel Nader
Breaking Smart [Bank] Statements – Hacker Halted 2019 – Manuel Nader
EC-Council
 
Are your cloud servers under attack?– Hacker Halted 2019 – Brian Hileman
Are your cloud servers under attack?– Hacker Halted 2019 – Brian HilemanAre your cloud servers under attack?– Hacker Halted 2019 – Brian Hileman
Are your cloud servers under attack?– Hacker Halted 2019 – Brian Hileman
EC-Council
 
War Game: Ransomware – Global CISO Forum 2019
War Game: Ransomware – Global CISO Forum 2019War Game: Ransomware – Global CISO Forum 2019
War Game: Ransomware – Global CISO Forum 2019
EC-Council
 
How to become a Security Behavior Alchemist – Global CISO Forum 2019 – Perry ...
How to become a Security Behavior Alchemist – Global CISO Forum 2019 – Perry ...How to become a Security Behavior Alchemist – Global CISO Forum 2019 – Perry ...
How to become a Security Behavior Alchemist – Global CISO Forum 2019 – Perry ...
EC-Council
 
Introduction to FAIR Risk Methodology – Global CISO Forum 2019 – Donna Gall...
Introduction to FAIR Risk Methodology – Global CISO Forum 2019  –  Donna Gall...Introduction to FAIR Risk Methodology – Global CISO Forum 2019  –  Donna Gall...
Introduction to FAIR Risk Methodology – Global CISO Forum 2019 – Donna Gall...
EC-Council
 
Alexa is a snitch! Hacker Halted 2019 - Wes Widner
Alexa is a snitch! Hacker Halted 2019 - Wes WidnerAlexa is a snitch! Hacker Halted 2019 - Wes Widner
Alexa is a snitch! Hacker Halted 2019 - Wes Widner
EC-Council
 
Hacker Halted 2018: Don't Panic! Big Data Analytics vs. Law Enforcement
Hacker Halted 2018: Don't Panic! Big Data Analytics vs. Law EnforcementHacker Halted 2018: Don't Panic! Big Data Analytics vs. Law Enforcement
Hacker Halted 2018: Don't Panic! Big Data Analytics vs. Law Enforcement
EC-Council
 
Hacker Halted 2018: HACKING TRILLIAN: A 42-STEP SOLUTION TO EXPLOIT POST-VOGA...
Hacker Halted 2018: HACKING TRILLIAN: A 42-STEP SOLUTION TO EXPLOIT POST-VOGA...Hacker Halted 2018: HACKING TRILLIAN: A 42-STEP SOLUTION TO EXPLOIT POST-VOGA...
Hacker Halted 2018: HACKING TRILLIAN: A 42-STEP SOLUTION TO EXPLOIT POST-VOGA...
EC-Council
 
Hacker Halted 2018: Breaking the Bad News: How to Prevent Your IR Messages fr...
Hacker Halted 2018: Breaking the Bad News: How to Prevent Your IR Messages fr...Hacker Halted 2018: Breaking the Bad News: How to Prevent Your IR Messages fr...
Hacker Halted 2018: Breaking the Bad News: How to Prevent Your IR Messages fr...
EC-Council
 
Hacker Halted 2018: From CTF to CVE – How Application of Concepts and Persist...
Hacker Halted 2018: From CTF to CVE – How Application of Concepts and Persist...Hacker Halted 2018: From CTF to CVE – How Application of Concepts and Persist...
Hacker Halted 2018: From CTF to CVE – How Application of Concepts and Persist...
EC-Council
 
Hacker Halted 2018: SE vs Predator: Using Social Engineering in ways I never ...
Hacker Halted 2018: SE vs Predator: Using Social Engineering in ways I never ...Hacker Halted 2018: SE vs Predator: Using Social Engineering in ways I never ...
Hacker Halted 2018: SE vs Predator: Using Social Engineering in ways I never ...
EC-Council
 

More from EC-Council (20)

CyberOm - Hacking the Wellness Code in a Chaotic Cyber World
CyberOm - Hacking the Wellness Code in a Chaotic Cyber WorldCyberOm - Hacking the Wellness Code in a Chaotic Cyber World
CyberOm - Hacking the Wellness Code in a Chaotic Cyber World
 
Cloud Security Architecture - a different approach
Cloud Security Architecture - a different approachCloud Security Architecture - a different approach
Cloud Security Architecture - a different approach
 
Phases of Incident Response
Phases of Incident ResponsePhases of Incident Response
Phases of Incident Response
 
Weaponizing OSINT – Hacker Halted 2019 – Michael James
 Weaponizing OSINT – Hacker Halted 2019 – Michael James  Weaponizing OSINT – Hacker Halted 2019 – Michael James
Weaponizing OSINT – Hacker Halted 2019 – Michael James
 
Hacking Your Career – Hacker Halted 2019 – Keith Turpin
Hacking Your Career – Hacker Halted 2019 – Keith TurpinHacking Your Career – Hacker Halted 2019 – Keith Turpin
Hacking Your Career – Hacker Halted 2019 – Keith Turpin
 
Hacking Diversity – Hacker Halted . 2019 – Marcelle Lee
Hacking Diversity – Hacker Halted . 2019 – Marcelle LeeHacking Diversity – Hacker Halted . 2019 – Marcelle Lee
Hacking Diversity – Hacker Halted . 2019 – Marcelle Lee
 
Cloud Proxy Technology – Hacker Halted 2019 – Jeff Silver
Cloud Proxy Technology – Hacker Halted 2019 – Jeff SilverCloud Proxy Technology – Hacker Halted 2019 – Jeff Silver
Cloud Proxy Technology – Hacker Halted 2019 – Jeff Silver
 
DNS – Strategies for Reducing Data Leakage & Protecting Online Privacy – Hack...
DNS – Strategies for Reducing Data Leakage & Protecting Online Privacy – Hack...DNS – Strategies for Reducing Data Leakage & Protecting Online Privacy – Hack...
DNS – Strategies for Reducing Data Leakage & Protecting Online Privacy – Hack...
 
Data in cars can be creepy – Hacker Halted 2019 – Andrea Amico
Data in cars can be creepy – Hacker Halted 2019 – Andrea AmicoData in cars can be creepy – Hacker Halted 2019 – Andrea Amico
Data in cars can be creepy – Hacker Halted 2019 – Andrea Amico
 
Breaking Smart [Bank] Statements – Hacker Halted 2019 – Manuel Nader
Breaking Smart [Bank] Statements – Hacker Halted 2019 – Manuel NaderBreaking Smart [Bank] Statements – Hacker Halted 2019 – Manuel Nader
Breaking Smart [Bank] Statements – Hacker Halted 2019 – Manuel Nader
 
Are your cloud servers under attack?– Hacker Halted 2019 – Brian Hileman
Are your cloud servers under attack?– Hacker Halted 2019 – Brian HilemanAre your cloud servers under attack?– Hacker Halted 2019 – Brian Hileman
Are your cloud servers under attack?– Hacker Halted 2019 – Brian Hileman
 
War Game: Ransomware – Global CISO Forum 2019
War Game: Ransomware – Global CISO Forum 2019War Game: Ransomware – Global CISO Forum 2019
War Game: Ransomware – Global CISO Forum 2019
 
How to become a Security Behavior Alchemist – Global CISO Forum 2019 – Perry ...
How to become a Security Behavior Alchemist – Global CISO Forum 2019 – Perry ...How to become a Security Behavior Alchemist – Global CISO Forum 2019 – Perry ...
How to become a Security Behavior Alchemist – Global CISO Forum 2019 – Perry ...
 
Introduction to FAIR Risk Methodology – Global CISO Forum 2019 – Donna Gall...
Introduction to FAIR Risk Methodology – Global CISO Forum 2019  –  Donna Gall...Introduction to FAIR Risk Methodology – Global CISO Forum 2019  –  Donna Gall...
Introduction to FAIR Risk Methodology – Global CISO Forum 2019 – Donna Gall...
 
Alexa is a snitch! Hacker Halted 2019 - Wes Widner
Alexa is a snitch! Hacker Halted 2019 - Wes WidnerAlexa is a snitch! Hacker Halted 2019 - Wes Widner
Alexa is a snitch! Hacker Halted 2019 - Wes Widner
 
Hacker Halted 2018: Don't Panic! Big Data Analytics vs. Law Enforcement
Hacker Halted 2018: Don't Panic! Big Data Analytics vs. Law EnforcementHacker Halted 2018: Don't Panic! Big Data Analytics vs. Law Enforcement
Hacker Halted 2018: Don't Panic! Big Data Analytics vs. Law Enforcement
 
Hacker Halted 2018: HACKING TRILLIAN: A 42-STEP SOLUTION TO EXPLOIT POST-VOGA...
Hacker Halted 2018: HACKING TRILLIAN: A 42-STEP SOLUTION TO EXPLOIT POST-VOGA...Hacker Halted 2018: HACKING TRILLIAN: A 42-STEP SOLUTION TO EXPLOIT POST-VOGA...
Hacker Halted 2018: HACKING TRILLIAN: A 42-STEP SOLUTION TO EXPLOIT POST-VOGA...
 
Hacker Halted 2018: Breaking the Bad News: How to Prevent Your IR Messages fr...
Hacker Halted 2018: Breaking the Bad News: How to Prevent Your IR Messages fr...Hacker Halted 2018: Breaking the Bad News: How to Prevent Your IR Messages fr...
Hacker Halted 2018: Breaking the Bad News: How to Prevent Your IR Messages fr...
 
Hacker Halted 2018: From CTF to CVE – How Application of Concepts and Persist...
Hacker Halted 2018: From CTF to CVE – How Application of Concepts and Persist...Hacker Halted 2018: From CTF to CVE – How Application of Concepts and Persist...
Hacker Halted 2018: From CTF to CVE – How Application of Concepts and Persist...
 
Hacker Halted 2018: SE vs Predator: Using Social Engineering in ways I never ...
Hacker Halted 2018: SE vs Predator: Using Social Engineering in ways I never ...Hacker Halted 2018: SE vs Predator: Using Social Engineering in ways I never ...
Hacker Halted 2018: SE vs Predator: Using Social Engineering in ways I never ...
 

Recently uploaded

UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
Dinusha Kumarasiri
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Jeffrey Haguewood
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
Shinana2
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
Intelisync
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 

Recently uploaded (20)

UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 

Global CCISO Forum 2018 | AI vs Malware 2018

  • 1. © Copyright Fortinet Inc. All rights reserved. AI – A Self Evolving Detection System Q3 / 2018
  • 2. 2Fortinet - Confidential Introduction 21 years military »USAF – nuclear weapons »USA – FC/test pilot, threat officer, various leadership roles 20 years security management and consulting »NASA JPL, World Bank, CME, Delta Air Lines, Humana, HECO, Honda NA, USAA, Corning, many others »Consulting, Practice Manager, Director of Security, CISO Currently: Strategist at Fortinet »Technology + resource management + business needs = CISO success
  • 3. 3Fortinet - Confidential Bingo Time! Deep Analytics Crowdculture Virtual / Augmented Reality Big / Real Big / Way Big Data Holistic Eco_______ Artificial Intelligence Machine Learning Deep learning
  • 4. 4Fortinet - Confidential LEVEL II » Content Pattern Recognition » Looks at wrappers and payload for repeats » Handles large volumes of permutations » Proactive in nature, but manually created Malicious File Packed/Encrypted Run time/memory Pack Run Headers 1111010101010 Code 0010101010101 1010101010101 10111101010111 Data 1010101010111 1010101010101 1010101010101 Headers 1111010101010 Code 0010101010101 1010101010101 10111101010111 Data 1010101010111 1010101010101 1010101010101 Headers 1111010101010 Code 0010101010101 1010101010101 10111101010111 Data 1010101010111 1010101010101 1010101010101 Antivirus Evolution LEVEL I » Simple MD5 / SHA 56 computations » Resulted in large DBs for file comparisons » One signature – one piece of malware » Reactive and non-responsive to mutations C:Md5sum malware.exe 5e3830ee3282a53920e00784fec44cfd (malware.exe) Cfac6385a0cdd5f09b2e38c833c93c9d 5ae8c55fbc7b8f5bafa1af1675478cba 1af8e09e41fc850e15ffc4ea0be68c21 ce1ff097a3f0afec3bd5c5f0fb57cfda 80f27e4d562dc4f55e38f4088251e83c bf6ba9baa2e0dcb8d175a4ff594dccd9 5e3830ee3282a53920e00784fec44cfd Malware Found
  • 5. 5Fortinet - Confidential AI Visions Facebook ‘emergency shutdown Fear over planet being “Zuckerburged”’ Reality Was different
  • 6. 6Fortinet - Confidential Alan Turing called an infant’s mind an ‘unorganized machine’ in 1930s Created early definitions of machine learning » First type (A) consists of simple NAND (negative – AND gates » Second type (B) is combination of A types with modifiers added – results in weighted input/variable output method » Saw the need for:  Seeded solution set of accurate or known potential output  Population of variably weighted pieces or functions  A method for removing the worst solutions while retaining the best Early AI Defined Major inhibitor of his research – was far ahead of available capabilities in terms of computing power.
  • 7. 7Fortinet - Confidential Artificial Intelligence – Nearing a Century History of AI Outside Cyber Security Industry 1930s Turing, Kleene and Church propose machine learning solution 1943 McCullouch and Pitts create formal design of Turing’s ‘artificial neurons’ 1956 AI research formally founded as a discipline at Dartmouth College 1960s AI research heavily funded by the U. S. military 1974 The First AI Winter Difficulty resulted in funding cuts in US and Britain 1980s Proliferation of Expert Systems Lisp vs. PC 1990s AI applied to data mining, medical diagnosis with increased CPU power 1997 IBM Deep Blue beats Grand Master Kasparov in chess 2003 Deep learning is achieved using faster computing, large data structures 2011 IBM Watson defeats Jeopardy champions Brad Rutter and Ken Jennings 2013 IBM Watson application for management decisions of lung cancer treatment 2015 2,700+ AI projects in place at Google 2017 Elon Musk calls for the regulation of AI before we hit 100 years
  • 8. 8Fortinet - Confidential  Supervised Learning – Using known solution sets to embed proper functions and create proper output. » Reinforcement – action on an environment triggers an observation resulting in a defined state.  Unsupervised Learning – unknown solution sets » Clustering – group according to similarities. » Dimensionality Reduction – deductive reasoning. » Structured Prediction – random fields are analyzed to predict according to defined output probabilities. » Anomaly Detection – input does not match expectations. Types of Problem Solving Artificial Neural Networks (ANNs) Large collections of simple interconnected nodes (neurons), each with a weighted input and output value. ?!
  • 9. 9Fortinet - Confidential  Consists of three or more layers » Input layer » One or more hidden layers » Output layer  Layers are made of up nodes » Connected to every node in the previous and subsequent layer » Provide discrete processing of input information (files and features) » Produces an output value based on inputs, function, and weighted valuation Type of AI – Artificial Neural Network (Multilayer Perceptron) The Multilayer Perceptron approach provides deep machine learning capabilities. MP behavior is similar to human neurons - if input is strong enough, signal is passed according to weighted value Input layer Hidden layer 1 Hidden layer 2 Output layer Inputs Weights Sum Output Input 1 Input 2 Input 3 YES/NO decision ∑
  • 10. 10Fortinet - Confidential  Point observable characteristics based on code blocks » Files are parsed when entering the system  1 : 1 relationship with nodes  Features are maintained in a knowledgebase repository (features, pattern recognition sigs, other information)  Quality is critical » Provides more accurate determination of file status » We leveraged internal legacy samples (~.5PB) to create features from samples  Each feature is weighted to assist in decisions  Feature weighting can change over time Features f - feature w - weight Func(f1*w1+f2*w2+...+fn*wn) -> {0,1} Feature/Node Algorithm
  • 11. 11Fortinet - Confidential  System is fed initial data sets for analysis » Supervised machine learning approach  Files are broken into code blocks  Information (features) extracted during the learning phase. » Binary present within the blocks » Represents behavioral activation  The system learns by weight features based on » Surety of indicators (‘tells’) » Frequency of observation Layers + Nodes + Features = Learning Source accuracy of the population input A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
  • 12. 12Fortinet - Confidential Features, Nodes & Weights – Single Instance 70 FEATURE 100 wt. NODE +90 FEATURE 100 wt. NODE -20File Result INPUT LAYER HIDDEN LAYER HIDDEN LAYER OUTPUT LAYER File 1. We start with an input file – malicious or clean 2. Feature presence is calculated, re-weighted and passed forward to the next node 3. The analysis is repeated using the next layer feature, then passed to the next node 4. Result – the overall probability based on a score of feature presence
  • 13. 13Fortinet - Confidential Features, Nodes & Weights – Multiple Instance FEATURES FEATURES Result INPUT LAYER MALICIOUS LAYER CLEAN LAYER OUTPUT LAYER File
  • 14. 14Fortinet - Confidential AI Operational Example - Machine vs. Malware 1 = process the input file 2 = 2.3 billion nodes analyzing potential malicious features 3 = 3.2 Billion nodes analyzing files for clean features 4 = output or decision layer (1 = malicious , 0 = clean) 4 Layer Architecture Consists of separate layers for either malicious or clean feature processing Mathematical models compare samples and features to decide output Input layer Malicious Layer Clean Layer Output layer FortiGuard AI Output is a result of 2.3B x 3.2B individual node computations. Initial capability – 58 samples per second
  • 15. 15Fortinet - Confidential System Training FEATURES REPOSITORY TRAINING FILES 1. We start with AI and an empty feature repository 2. A training set of files are input, consisting of clean and malicious files. Files are labeled for initial training 3. Files are input and broken into code blocks. AI logic builds a set of features. 4. Features are modified as the system learns (weighting values, next phase) AI
  • 16. 16Fortinet - Confidential System Testing MALICIOUS CLEAN FEATURES REPOSITORY TESTING FILES OUTPUT1. Test samples are selected and input to the system 2. Using the feature repository, samples are analyzed 3. As this occurs, existing features may get modified or others added 4. The system determines clean or malicious output 5. Output is compared to expected results. If not accurate, reset to known point and retrain. RESULTS EXAMINED AI
  • 17. 17Fortinet - Confidential AI in Operation MALICIOUS CLEAN OUTPUT   INPUT RAW SAMPLES Feature Set Improvements  Quality  Stabilized Number  Weighting Confidence Continued Accuracy to a High Degree of Confidence FEATURESQuantity Quality
  • 18. 18Fortinet - Confidential  Has to integrate to support current systems  Must use features (code blocks), create signatures  Continuous, seamless operational support Managing New Malware ?!  Problem - how to introduce new samples and create new features  New features are naturally regressed in weight – no history  Requires false weighting to have value in decisions  Cannot subvert existing weighted features  Cannot use single samples / false weight  Would cause heavy skewing of results  Introduces potentially large errors in main system
  • 19. 19Fortinet - Confidential New Malware Management System – Part I AI 130+ Threat Feeds Global Sensors CTA 1 New Files 2 Sandbox Non-matching Feature Sets (all code blocks)  A ‘rolling’ input is used – last batch + new  AI is fast – millions per day  Sandbox is slower – 10s/K per day  Features created as in original training  Features are relative to AI – consist of code blocks, which are also used by CPR AI 3 4
  • 20. 20Fortinet - Confidential New Malware Management System – Part II (AI Side) 1 2 3 4 5 6 Knowledgebase Repository Production Feature Set
  • 21. 21Fortinet - Confidential New Malware Management System – Part III 2 4 5 Knowledgebase Repository Production CPR Signature Set CB CB CB CB CB CB CB CB CB CB 7Customer Enterprise 1 3 6 CPR Engine 8
  • 22. 22Fortinet - Confidential Results • Significantly Increases cybercriminal cost burden for limited return • Obtain unsurpassed accuracy not possible through manual methods • Eliminate manual labor/potential errors • Create autonomous system – operational and continued learning
  • 23. 23Fortinet - Confidential Take-Aways • AI as a business solution is not trivial • Large effort to implement • Example ~ 8 years to date • AI should ease technical resource demands • Reallocate to more critical tasks • Hybrid approaches cause more labor Anyone pitching AI without some level of detail is probably BS Bingo selling • Should not have to watch it • Tell it right from wrong? • Lack of trust = poor implementation

Editor's Notes

  1. We present a hybrid data mining approach to detect malicious executables. In this approach we identify important features of the malicious and benign executables. These features are used by a classifier to learn a classification model that can distinguish between malicious and benign executables. We construct a novel combination of three different kinds of features: binary n-grams, assembly n-grams, and library function calls. Binary features are extracted from the binary executables, whereas assembly features are extracted from the disassembled executables. The function call features are extracted from the program headers. We also propose an efficient and scalable feature extraction technique. We apply our model on a large corpus of real benign and malicious executables. We extract the abovementioned features from the data and train a classifier using Support Vector Machine. This classifier achieves a very high accuracy and low false positive rate in detecting malicious executables. Our model is compared with other feature-based approaches, and found to be more efficient in terms of detection accuracy and false alarm rate.
  2. Watson used 200 million pages of structured & unstructured content, 4 terabytes
  3. Clustering - Task based, used to define probabilities. Walls + air + constant temperature = room. Air + flat surface does not equal room. Dimensionality - determine solution by eliminating noise. Structured prediction - Used for language modeling Anomaly detection - Think bank fraud detection techniques. Reinforcement learning - Used extensively in game theory. Good action = good result, bad action = bad result.
  4. Different file types are sent to different nodes within each layer, for instance a pdf file will be sent to a different layer than a exe. If we are dealing with EXE files, one of the features that we extract is the code block (assembly code snippets) which should then be normalised (eg extracting patterns that could change between malwares) before hashing them (reduced representation of features)
  5. Different file types are sent to different nodes within each layer, for instance a pdf file will be sent to a different layer than a exe. If we are dealing with EXE files, one of the features that we extract is the code block (assembly code snippets) which should then be normalised (eg extracting patterns that could change between malwares) before hashing them (reduced representation of features)
  6. Different file types are sent to different nodes within each layer, for instance a pdf file will be sent to a different layer than a exe. If we are dealing with EXE files, one of the features that we extract is the code block (assembly code snippets) which should then be normalised (eg extracting patterns that could change between malwares) before hashing them (reduced representation of features)
  7. Training Phase – Malicious and Clean files , flagged as so, are passed through classifier in a feature extraction (collection and selection). We have two sets of data, one that is used for training and one other for testing Both should be the most accurate representation of malware families and behavior to start off with a good initial footprint.
  8. We ingest new samples from a large array of sources, to include 120 threat feeds (such as STIX, FS-ISAC, VirusTotal, CTA, etc.) and our more than 3 million deployed global sensors. We follow the same methodology as when we trained the larger system by breaking files into code blocks using technologies similar to IDA. The difference is that now we will feed them into 2 smaller AI systems with much smaller nodes and feature set capacity. The first system is imply an AI and we get features created as we did when we trained the larger system. This particular instance is very fast – it can process millions of files per day. The second system uses a sandbox to detonate the file and associated code blocks to analyze behavioral results. The features are built much as they are in the simpler AI system, with the exception that the malware was activated resulting in different features than the non-detonated sample. Basically we do this for each new sample we want to analyze. The feature sets fill within both systems, but they are different.
  9. We first look at the AI side. The features we created are compared with the larger production feature set to determine if there are duplicates. If so, the production feature is weighted due to the hit, then removed from the smaller AI feature set. Next, all feature weights are added together, and the sum is divided by the number of features. This provides an average. The features below the average are removed, those above the average are retained. The AI features now only have the best of what was created during this filtering process. Now we reintroduce the malware and test the features to determine performance. We note which features produce the best results as far as speed, malware identification, and zero false positives. Only those features that contribute significantly are retained. We then artificially pad the new features so they will be significant within the production feature set. We did not lose significant features when we did this – after 5 years the system still has hundreds of thousands of features that are insignificant. As the new features are integrated into the analysis process, they are treated the same as any other feature – their weight value may increase or decrease over time. The features we previously extracted have a weighted value at this point. We then check the current production features located in the knowledgebase repository. If duplicates exist, we add some weighting to the production features and remove them from the smaller AI feature set. We then add the AI feature weights in and divide that by the number of features. This gives us an average weight. We then cull out the lower ones and retain the higher weighted features. The system is reset and the are new features are placed in it. The malware is untagged (so the system does not know) and fed into the system. The features that provided the most speed, accuracy of detection, and lowest false positives are retained. The features are given additional weight value The features are added to the knowledge base where the production AI uses them to continue analysis of incoming files
  10. Next we look at the CPRL side, which is very similar to the smaller AI but with a couple of key differences. As in the AI version, the features we previously extracted have a weighted value at this point. We then check the current production features located in the knowledgebase repository. If duplicates exist, we add some weighting to the production features and remove them from the smaller sandbox/AI feature set. We add the sandbox/AI feature weights in and divide that by the number of features. This gives us an average weight. We then cull out the lower ones and retain the higher weighted features. The system now has the best of the features loaded in the feature set. The malware is untagged (so the system does not know) and fed into the system. The features that provided the most speed, accuracy of detection are transformed back into the code block they originated from. The code blocks are clustered according to how well all possible combinations did during testing. CPRL signatures are created and uploaded to the knowledgebase repository. The signatures are distributed to Fortinet customer Secure Fabric implementations.
  11. Next we look at the CPRL side, which is very similar to the smaller AI but with a couple of key differences. As in the AI version, the features we previously extracted have a weighted value at this point. We then check the current production features located in the knowledgebase repository. If duplicates exist, we add some weighting to the production features and remove them from the smaller sandbox/AI feature set. We add the sandbox/AI feature weights in and divide that by the number of features. This gives us an average weight. We then cull out the lower ones and retain the higher weighted features. The system now has the best of the features loaded in the feature set. The malware is untagged (so the system does not know) and fed into the system. The features that provided the most speed, accuracy of detection are transformed back into the code block they originated from. The code blocks are clustered according to how well all possible combinations did during testing. CPRL signatures are created and uploaded to the knowledgebase repository. The signatures are distributed to Fortinet customer Secure Fabric implementations.
  12. Next we look at the CPRL side, which is very similar to the smaller AI but with a couple of key differences. As in the AI version, the features we previously extracted have a weighted value at this point. We then check the current production features located in the knowledgebase repository. If duplicates exist, we add some weighting to the production features and remove them from the smaller sandbox/AI feature set. We add the sandbox/AI feature weights in and divide that by the number of features. This gives us an average weight. We then cull out the lower ones and retain the higher weighted features. The system now has the best of the features loaded in the feature set. The malware is untagged (so the system does not know) and fed into the system. The features that provided the most speed, accuracy of detection are transformed back into the code block they originated from. The code blocks are clustered according to how well all possible combinations did during testing. CPRL signatures are created and uploaded to the knowledgebase repository. The signatures are distributed to Fortinet customer Secure Fabric implementations.