AI in security

Subrat Kumar Panda
AI First Thought Leader,
Director of Engineering, AI and Data Sciences,
Capillary Technologies
Bangalore

Agenda
● AI and Industry 4.0
● Brief intro AI, ML, IoT
● Security Evolution (AI related)
● Era of Data
● AI use cases in security
● Building and deploying an Intelligent Security Product

Brief Introduction about me
● BTech (2002), PhD (2009) – CSE, IIT Kharagpur
● Synopsys (EDA), IBM (CPU), NVIDIA (GPU), Taro (Full Stack Engineer), Capillary (Principal Architect - AI)
● Applying AI to Retail
● Co-Founded IDLI (for social good) with Prof. Amit Sethi (IIT Bombay), Jacob Minz (Synopsys) and Biswa Gourav Singh (Capillary)
● https://www.facebook.com/groups/idliai/
● LinkedIn - https://www.linkedin.com/in/subratpanda/
● Facebook - https://www.facebook.com/subratpanda
● Twitter - @subratpanda

Industry 4.0
https://en.wikipedia.org/wiki/Industry_4.0
1. Interoperability
2. Information transparency
3. Technical assistance
4. Decentralized decisions

Knowledge is Power - Sir Francis Bacon
- Industry 4.0 enabled by IoT, BigData and AI
- IoT is the intelligent sensor
- BigData will enable processing huge volumes of data
- AI will make sense of the data in decision making
- AI helps transform raw data into power - AI will transform businesses for sure
- Primarily Machine Learning and then the deeper aspects with Deep Learning
AI is the bedrock on which Industry 4.0 relies.

Machine Learning – http://techleer.com

What AI Can and Cannot Do Today
https://hbr.org/2016/11/what-artificial-intelligence-can-and-cant-do-right-now

Supervised Learning
1. Being able to input A and output B will transform many industries.
2. The technical term for building this A→B software is supervised learning.
3. The best solutions today are built with a technology called deep learning or deep neural networks, which were loosely inspired by the brain.
4. Labelled data is the most important requirement for supervised learning.
"If a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future." - Andrew Ng
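The A→B idea can be sketched in a few lines. This is a minimal illustration, not anything taken from the talk: it uses a 1-nearest-neighbour rule rather than deep learning, and the feature names (failed logins per hour, megabytes uploaded) and labels are hypothetical. The point is only that the mapping is learned from labelled (A, B) pairs.

```python
from math import dist

def fit_nearest_neighbour(examples):
    """Build a predictor from labelled (input A, output B) pairs."""
    def predict(a):
        # Return the label of the training input closest to `a`.
        _, label = min(examples, key=lambda pair: dist(pair[0], a))
        return label
    return predict

# Hypothetical labelled data:
# (failed logins per hour, megabytes uploaded) -> label
labelled = [
    ((1.0, 5.0), "benign"),
    ((2.0, 8.0), "benign"),
    ((40.0, 300.0), "suspicious"),
    ((55.0, 250.0), "suspicious"),
]

predict = fit_nearest_neighbour(labelled)
print(predict((3.0, 6.0)))
print(predict((50.0, 280.0)))
```

Without the labels in `labelled` there is nothing to learn from, which is why labelled data is the key requirement.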

Transfer Learning - http://ruder.io/transfer-learning/

Machine Learning Tasks
● Regression (or prediction) - predicting a continuous value, for example the next value of a series based on the previous values.
● Classification - separating things into different, known categories.
● Clustering - similar to classification, but the classes are unknown; things are grouped by their similarity.
● Association rule learning (or recommendation) - recommending something based on previous experience.
● Dimensionality reduction (or generalization) - finding the common and most important features across multiple examples.
● Generative models - creating something new based on previous knowledge of the data distribution.
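To make the difference between classification and clustering concrete, here is a small sketch of clustering on unlabelled data. It is an assumption-laden toy: a one-dimensional k-means loop with k fixed at 2, and the request rates are invented for illustration. No labels are given; the groups emerge from similarity alone.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop (k=2)."""
    centers = [min(values), max(values)]  # simple deterministic start
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for v in values:
            # Assign each value to the nearer of the two centers.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

# Hypothetical unlabelled data: requests per minute observed from several hosts.
rates = [3, 4, 5, 4, 90, 95, 110]
low, high = two_means(rates)
print(low, high)
```

A classifier would need someone to label hosts as normal or abnormal first; the clustering step only says that two distinct groups exist.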

AI Funding in Cybersecurity
https://www.ciab.com/resources/artificial-intelligence-cybersecurity/

Trends to Watch
https://www.ciab.com/resources/artificial-intelligence-cybersecurity/

Future of AI
https://threatpost.com/artificial-intelligence-a-cybersecurity-tool-for-good-and-sometimes-bad/137831/

AI powered Information Security
https://blog.capterra.com/artificial-intelligence-in-cybersecurity/

Awesome ML Papers and Code for Cyber Security
- https://github.com/jivoi/awesome-ml-for-cybersecurity
- Datasets
- Papers
- Books
- Talks
- Tutorials
- Courses

Applications
https://ccdcoe.org/uploads/2018/10/Art-19-On-the-Effectiveness-of-Machine-and-Deep-Learning-for-Cyber-Security.pdf

Malware Detection
https://arxiv.org/pdf/1904.02441.pdf

Malware Detection Methodology
- Problem Formulation - Binary Classification Problem
- Dataset
- Feature Extraction
- Dimensionality Reduction
- Model Building and Analysis

Datasets
- Malicia Project data
- Class imbalance: far more malware samples (11,308) than benign executables (2,819)
- Oversampling, undersampling and cluster-based sampling help
- Generalizability achieved by K-fold Cross Validation
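Two of the ideas on this slide, random oversampling and K-fold cross-validation, can be sketched with the standard library alone. This is an illustrative sketch, not the paper's code: the helper names `oversample_minority` and `k_fold_indices` and the ten toy samples are hypothetical. One design choice worth noting is that oversampling is applied only to the training part of each fold, so duplicated samples never leak into the test part.

```python
import random

def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for y, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([y] * target)
    return out_samples, out_labels

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds over n items."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

# Hypothetical imbalanced data set: 8 malware, 2 benign.
labels = ["malware"] * 8 + ["benign"] * 2
samples = [f"sample_{i}" for i in range(10)]

# Shuffle once so folds are not ordered by class.
order = list(range(len(samples)))
random.Random(1).shuffle(order)
samples = [samples[i] for i in order]
labels = [labels[i] for i in order]

for train_idx, test_idx in k_fold_indices(len(samples), k=5):
    x_train, y_train = oversample_minority(
        [samples[i] for i in train_idx], [labels[i] for i in train_idx])
    # A model would be fitted on (x_train, y_train) and scored on test_idx here.
    print(len(train_idx), "->", len(x_train), "training samples after oversampling")
```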

Feature Extraction
- Decoding the executables
- Literature shows that various static attributes such as Windows API calls, strings, opcodes, and control flow graphs make good feature vectors
- They used opcode frequency as a discriminatory feature
- Dimensionality Reduction
- Variance Threshold
- Autoencoders
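Opcode-frequency features and variance-threshold reduction can be sketched as follows. The opcode sequences and vocabulary are invented for illustration, and the helper names are hypothetical; a real pipeline would obtain the sequences by disassembling the executables. The variance rule here drops any feature whose variance does not exceed the threshold.

```python
from collections import Counter
from statistics import pvariance

def opcode_frequencies(opcodes, vocabulary):
    """Turn one opcode sequence into relative frequencies over a fixed vocabulary."""
    counts = Counter(opcodes)
    total = len(opcodes) or 1  # avoid division by zero for empty sequences
    return [counts[op] / total for op in vocabulary]

def variance_threshold(rows, threshold=0.0):
    """Return indices of the columns whose variance exceeds `threshold`."""
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]

# Hypothetical disassembled opcode sequences, one per executable.
programs = [
    ["mov", "push", "call", "mov", "ret"],
    ["mov", "xor", "xor", "jmp", "ret"],
    ["mov", "push", "call", "call", "ret"],
]
vocabulary = ["mov", "push", "call", "xor", "jmp", "ret"]

features = [opcode_frequencies(p, vocabulary) for p in programs]
kept = variance_threshold(features, threshold=0.0)
reduced = [[row[i] for i in kept] for row in features]
print([vocabulary[i] for i in kept])
```

In this toy data `ret` occurs with the same frequency in every program, so it carries no discriminating information and is removed.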

Building the Learning Model
- Exploration/Ensemble of multiple models
- Random Forest
- DNN-2L
- DNN-4L
- DNN-7L
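The core idea behind a random forest, many simple trees trained on bootstrap samples with randomly chosen features and combined by majority vote, can be illustrated with a toy ensemble. This sketch is deliberately simplified and is not the model from the paper: each "tree" is a single-split decision stump, the feature rows are hypothetical two-column opcode frequencies, and all function names are made up for the example. In practice a library implementation such as scikit-learn's `RandomForestClassifier` would be used.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest samples."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, (feature, t, left_label, right_label))
    return best[1]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one randomly chosen feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feature = rng.randrange(n_features)          # random feature for this tree
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]

# Hypothetical opcode-frequency features for six executables.
rows = [(0.10, 0.02), (0.12, 0.03), (0.11, 0.01),
        (0.40, 0.20), (0.45, 0.25), (0.42, 0.22)]
labels = ["benign"] * 3 + ["malware"] * 3

forest = fit_forest(rows, labels)
print(predict(forest, (0.09, 0.02)), predict(forest, (0.50, 0.30)))
```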

Results
- Achieved the highest accuracy of 99.78% with random forest and variance threshold, an improvement of 1.26% over the previously reported best accuracy.
- In feature reduction, variance threshold outperformed auto-encoders in improving model performance.
- The best result did not come from any of the deep learning models.
- DL was overkill for the Malicia dataset.

Hardware Based Malware detector
https://cse.iitk.ac.in/users/spramod/papers/date17.pdf

Feature Sets
https://cse.iitk.ac.in/users/spramod/papers/date17.pdf
