Machine Learning for
Security Analysts
Out of the Buzzword and into the Mainstream
1
$ whoami
Name: GTKlondike
(Independent security researcher)
(Consulting is my day job)
Passionate about network security
(Attack and Defense)
NetSec Explained: A passion project and YouTube
channel that covers intermediate- and advanced-level
network security topics in an easy-to-understand way.
I hate these pages
2
What Is Machine Learning?
3
What is it we’re trying to do?
What Is Machine Learning?
4
AI, ML, and deep learning
What Is Machine Learning?
Machine Learning is a set of statistical techniques
that enable information mining, pattern
discovery, and drawing inferences from data.
Machine Learning uses algorithms to “learn” from
past data to predict future outcomes.
5
What is it we’re trying to do?
Machine Learning Examples
6
Domain Generation Algorithms
Machine Learning Examples
7
Web Application Firewall
Machine Learning Examples
8
Network Anomaly Detection
Why This Talk?
Today, 25% of security products for detection have
some form of machine learning.
To properly deploy and manage machine learning
products, you need to understand how they
operate so you can confirm they are working efficiently.
9
In the future, we are all Skynet
Source: Gartner Core Security; 2016
7-Step Machine Learning Process
1. Gather the Data
2. Prepare the Data
3. Choose a Model
4. Train the Model
5. Evaluate the Model
6. Hyperparameter Tuning
7. Deploy
10
Gather, Build, Train, Test, Deploy
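As a rough illustration of how the seven steps fit together, the sketch below walks through them for a toy spam filter in plain Python. The eight example messages, the function names (`tokenize`, `fit`, `predict`, `accuracy`, `is_spam`), and the candidate smoothing values are all made up for illustration; the model is a small multinomial Naive Bayes classifier of the kind this deck goes on to build.

```python
import math
from collections import Counter

# 1. Gather the data: a tiny hypothetical labeled set (1 = spam, 0 = ham).
DATA = [
    ("win free money now", 1),
    ("free prize claim now", 1),
    ("cheap meds free shipping", 1),
    ("meeting notes attached", 0),
    ("lunch at noon tomorrow", 0),
    ("project status meeting tomorrow", 0),
    ("claim your free money", 1),
    ("notes from the project lunch", 0),
]

# 2. Prepare the data: tokenize, then hold out the last two rows for testing.
def tokenize(text):
    return text.lower().split()

rows = [(tokenize(text), label) for text, label in DATA]
train, test = rows[:6], rows[6:]

# 3. Choose a model: multinomial Naive Bayes with additive smoothing (alpha).
# 4. Train the model: count words and documents per class.
def fit(rows):
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for tokens, label in rows:
        counts[label].update(tokens)
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab

def predict(model, tokens, alpha=1.0):
    counts, docs, vocab = model
    best_label, best_score = None, -math.inf
    for label in (0, 1):
        total = sum(counts[label].values())
        # Work in log space so long messages do not underflow to zero.
        score = math.log(docs[label] / sum(docs.values()))
        for tok in tokens:
            score += math.log((counts[label][tok] + alpha) / (total + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# 5. Evaluate the model: accuracy on the held-out rows.
def accuracy(model, rows, alpha=1.0):
    hits = sum(predict(model, tokens, alpha) == label for tokens, label in rows)
    return hits / len(rows)

model = fit(train)

# 6. Hyperparameter tuning: try a few smoothing values and keep the best.
#    (A real project would tune on a separate validation split, not the test set.)
best_alpha = max([0.1, 0.5, 1.0], key=lambda a: accuracy(model, test, a))

# 7. Deploy: wrap the trained model behind a single function.
def is_spam(text):
    return predict(model, tokenize(text), best_alpha) == 1

print(accuracy(model, test), is_spam("free money now"))
```

With so little data every candidate `alpha` scores the same, so the tuning step is only there to show where it sits in the process.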
Machine Learning, Head First
We’re going to start by building a Spam Filter
(Something we’re all familiar with)
Input: Emails
Output: Determine if this is Spam or not
11
Building it from scratch
Machine Learning, Head First
12
But first, a little background
Text                              Category
“A great game”                    Sports
“The election was over”           Not sports
“Very clean match”                Sports
“A clean but forgettable game”    Sports
“It was a close election”         Not sports
Source: Applying Multinomial Naïve Bayes
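To make the table concrete, here is a minimal sketch that loads the five labeled sentences into Python and derives the quantities a Naive Bayes classifier needs from them: the class priors, the number of words per class, and the vocabulary size. The variable names are illustrative, and lowercasing the text is an assumption about preprocessing.

```python
from collections import Counter
from fractions import Fraction

# The five training sentences from the table, lowercased.
TRAINING = [
    ("a great game", "Sports"),
    ("the election was over", "Not sports"),
    ("very clean match", "Sports"),
    ("a clean but forgettable game", "Sports"),
    ("it was a close election", "Not sports"),
]

# Class priors: the fraction of training sentences carrying each label.
label_counts = Counter(label for _, label in TRAINING)
priors = {label: Fraction(n, len(TRAINING)) for label, n in label_counts.items()}

# Total number of words seen in each class.
words_per_class = Counter()
for text, label in TRAINING:
    words_per_class[label] += len(text.split())

# Vocabulary: every distinct word across both classes.
vocab = {word for text, _ in TRAINING for word in text.split()}

print(priors["Sports"], priors["Not sports"])
print(words_per_class["Sports"], words_per_class["Not sports"], len(vocab))
```

The totals of 11 and 9 words per class and the 14 distinct words are the same numbers that appear in the denominators of the probability table later in the deck.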
Machine Learning, Head First
13
Bayes’ Theorem
Source: Applying Multinomial Naïve Bayes
Machine Learning, Head First
14
Bayes’ Theorem
Source: Applying Multinomial Naïve Bayes
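For reference, Bayes' theorem applied to this task can be written as below. The sentence "a very close game" is assumed here as the text to classify, since those are the four words tabulated later in the deck.

```latex
P(\text{Sports} \mid \text{a very close game})
  = \frac{P(\text{a very close game} \mid \text{Sports}) \; P(\text{Sports})}
         {P(\text{a very close game})}
```

The denominator is the same for both categories, so comparing the numerators for "Sports" and "Not sports" is enough to pick a label.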
Machine Learning, Head First
15
Another look at the table
Text                              Category
“A great game”                    Sports
“The election was over”           Not sports
“Very clean match”                Sports
“A clean but forgettable game”    Sports
“It was a close election”         Not sports
Source: Applying Multinomial Naïve Bayes
Machine Learning, Head First
16
But wait, what if this happens?
Source: Applying Multinomial Naïve Bayes
Machine Learning, Head First
17
But wait, what if this happens?
Source: Applying Multinomial Naïve Bayes
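One problem this question plausibly points at is the zero-count case: a word in the new sentence that never occurs in one of the classes. The sketch below assumes the sentence "a very close game" and uses raw, unsmoothed word frequencies for the Sports class; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# The three training sentences labeled Sports, lowercased.
SPORTS_TEXTS = ["a great game", "very clean match", "a clean but forgettable game"]

counts = Counter(word for text in SPORTS_TEXTS for word in text.split())
total = sum(counts.values())  # 11 words in the Sports class

# Multiply the raw per-word frequencies, with no smoothing.
likelihood = Fraction(1)
for word in "a very close game".split():
    likelihood *= Fraction(counts[word], total)

# "close" never appears in a Sports sentence, so one factor is 0
# and the whole product collapses to 0.
print(likelihood)
```

A single unseen word wipes out the evidence from every other word, which is why the counts in the probability table each have 1 added to them.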
Machine Learning, Head First
18
Multinomial Naive Bayes
Source: Applying Multinomial Naïve Bayes
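A standard way to write the multinomial Naive Bayes estimate with add-one (Laplace) smoothing is sketched below. The symbols follow the deck's later notation: $d$ is the number of unique words, $N_c$ the total number of words in class $c$, and $X_{i,c}$ the count of word $x_i$ in class $c$; the subscript form is an assumption.

```latex
P(x_1, \dots, x_n \mid c) \approx \prod_{i=1}^{n} P(x_i \mid c),
\qquad
P(x_i \mid c) = \frac{X_{i,c} + 1}{N_c + d}
```

The first expression is the "naive" independence assumption; the second keeps every word probability above zero even when a word was never seen in class $c$.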
Machine Learning, Head First
19
Calculate the probabilities
Word     P(word | Sports)       P(word | Not sports)
A        (2 + 1) / (11 + 14)    (1 + 1) / (9 + 14)
Very     (1 + 1) / (11 + 14)    (0 + 1) / (9 + 14)
Close    (0 + 1) / (11 + 14)    (1 + 1) / (9 + 14)
Game     (2 + 1) / (11 + 14)    (0 + 1) / (9 + 14)
Source: Applying Multinomial Naïve Bayes
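The entries in this table can be reproduced with a few lines of Python. This is a minimal sketch under the assumption that text is lowercased and split on whitespace; `smoothed` is an illustrative helper name, not something from the cited article.

```python
from collections import Counter
from fractions import Fraction

TRAINING = [
    ("a great game", "Sports"),
    ("the election was over", "Not sports"),
    ("very clean match", "Sports"),
    ("a clean but forgettable game", "Sports"),
    ("it was a close election", "Not sports"),
]

# Per-class word counts.
word_counts = {"Sports": Counter(), "Not sports": Counter()}
for text, label in TRAINING:
    word_counts[label].update(text.split())

# Number of distinct words across both classes (14 for this data).
vocab_size = len(set(word_counts["Sports"]) | set(word_counts["Not sports"]))

def smoothed(word, label):
    """Add-one smoothed estimate of P(word | label)."""
    n_words = sum(word_counts[label].values())
    return Fraction(word_counts[label][word] + 1, n_words + vocab_size)

for word in ["a", "very", "close", "game"]:
    print(word, smoothed(word, "Sports"), smoothed(word, "Not sports"))
```

`Fraction` prints each value in lowest terms, so (2 + 1) / (11 + 14) appears as 3/25 and (1 + 1) / (9 + 14) as 2/23.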
Machine Learning, Head First
20
Let’s finish it up
Source: Applying Multinomial Naïve Bayes
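The final step can be sketched as follows, assuming the sentence being classified is "a very close game" and taking the per-word probabilities from the table on the previous slide. The priors of 3/5 and 2/5 are derived from the training table (three of the five sentences are Sports); the variable names are illustrative.

```python
from fractions import Fraction
from math import prod

# Smoothed P(word | class) for "a", "very", "close", "game", from the table.
SPORTS = [Fraction(3, 25), Fraction(2, 25), Fraction(1, 25), Fraction(3, 25)]
NOT_SPORTS = [Fraction(2, 23), Fraction(1, 23), Fraction(2, 23), Fraction(1, 23)]

# Score = prior * product of word likelihoods (the shared denominator is dropped).
score_sports = Fraction(3, 5) * prod(SPORTS)
score_not_sports = Fraction(2, 5) * prod(NOT_SPORTS)

label = "Sports" if score_sports > score_not_sports else "Not sports"
print(float(score_sports), float(score_not_sports), label)
```

The Sports score comes out several times larger than the Not sports score, so the sentence is labeled Sports.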
Machine Learning, Head First
(d) - The total number of unique words
(N)spam - The total number of words in Spam
(N)ham - The total number of words in Ham
(Xi)spam - The count of each word in Spam
(Xi)ham - The count of each word in Ham
21
What we need to keep track of
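One possible way to hold these five quantities in code is sketched below. The `train` function, the dictionary keys, and the three toy emails are hypothetical; the only thing taken from the slide is the list of values to track.

```python
from collections import Counter

def train(emails):
    """Build the counts from an iterable of (tokens, label) pairs,
    where label is "spam" or "ham"."""
    word_counts = {"spam": Counter(), "ham": Counter()}  # (Xi)spam and (Xi)ham
    for tokens, label in emails:
        word_counts[label].update(tokens)
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return {
        "d": len(vocab),                              # unique words overall
        "N_spam": sum(word_counts["spam"].values()),  # total words in Spam
        "N_ham": sum(word_counts["ham"].values()),    # total words in Ham
        "X_spam": word_counts["spam"],                # per-word counts in Spam
        "X_ham": word_counts["ham"],                  # per-word counts in Ham
    }

# Toy, made-up emails that are already tokenized.
stats = train([
    (["free", "money", "now"], "spam"),
    (["fonts", "in", "lenny"], "ham"),
    (["free", "fonts"], "ham"),
])
print(stats["d"], stats["N_spam"], stats["N_ham"])
```

Because `Counter` returns 0 for missing keys, looking up a word that never appeared in a class needs no special handling before smoothing is applied.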
Machine Learning, Head First
Re: Re: East Asian fonts in Lenny. Thanks for your support.
Installing unifonts did it well for me. ;)
Nima
--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
22
Let’s look at one of the emails
Machine Learning, Head First
re: re: east asian fonts in lenny. thanks for your support.
installing unifonts did it well for me. ;)
nima
--
to unsubscribe, email to debian-user-request@lists.debian.org
with a subject of "unsubscribe". trouble? contact listmaster@lists.debian.org
23
Remove punctuation and stopwords
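The cleanup steps (lowercase, strip punctuation, drop stopwords) might look like the sketch below. The `STOPWORDS` set is a short hypothetical list chosen for this example, and `preprocess` is an illustrative name; a real filter would typically use a fuller stopword list.

```python
import string

# Hypothetical, deliberately short stopword list for illustration.
STOPWORDS = {"a", "did", "for", "in", "it", "me", "of", "to", "with", "your"}

# Map every ASCII punctuation character to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def preprocess(text):
    """Lowercase, replace punctuation with spaces, split, and drop stopwords."""
    cleaned = text.lower().translate(PUNCT_TO_SPACE)
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]

print(preprocess("Re: Re: East Asian fonts in Lenny. Thanks for your support."))
# ['re', 're', 'east', 'asian', 'fonts', 'lenny', 'thanks', 'support']
```

Replacing punctuation with spaces rather than deleting it means an address such as listmaster@lists.debian.org is split into separate tokens; whether that is desirable depends on how much signal the addresses carry.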
References
Gartner Core Security
– The Fast-Evolving State of Security Analytics; April 2016
Applying Multinomial Naïve Bayes
– Applying Multinomial Naive Bayes to NLP Problems: A Practical Explanation; July 2017
AI Village
– https://aivillage.org/
Machine Learning and Security
– By Clarence Chio & David Freeman
24
And further reading
Thank You!
Email: GTKlondike@gmail.com
YouTube: Netsec Explained
Website: NetsecExplained.com
Github: github.com/NetsecExplained
25
