Draft document to present findings of exploratory work on the incorporation of machine learning and AI into an existing data security product. The project was abandoned due to conflicting work done by product management.
2. Table of Contents
Introduction
Cognitive Security Basics
Supporting the Cybersecurity Operations Center
Powering ML/AI with CEM
Incorporating DCD
Research Activities
3. Introduction
Data security’s ever-changing nature requires security measures to evolve quickly in ways that reduce
the workload of security specialists while ensuring their judgments remain reliable.
Cognitive security systems are the latest evolutionary step, pairing self-learning systems with security
specialists.
Cybersecurity Operations Centers could benefit greatly from a cognitive security solution due to their
need to remain flexible, operate proactively, and respond quickly when security breaches occur.
4. Introduction
By 2020, the worldwide market for ML- and AI-based security solutions will be $10B.
Within the next 5 years, ML/AI will shift from focusing on algorithms to focusing on high-value data.
Early adopters of AI-based products will see a boost in their profits that could translate to a 10% revenue
gain for providers of those products.
Per multiple market reports, analysts project CA to remain a key player in the cognitive security market
until at least 2025.
5. Introduction
CA possesses all of the major ingredients for developing a cognitive security solution across the Access
Control, Enterprise Data Protection, and Mainframe Operations value streams.
Access Control provides raw data to supplement data collected via Enterprise Data Protection products.
Enterprise Data Protection products, especially CEM, offer a strong backbone upon which to construct a
prediction and analytics solution to support cognitive security goals.
Mainframe Operations offers a model for surfacing intelligent system recommendations, and potentially
providing feedback to such systems.
6. Introduction
Special thanks from the author go out to the following Mainframe Security Team members:
Mitch Rozonkiewiecz, Senior Principal Architect, for his assistance in identifying opportunities for
bringing ML/AI to CEM, and understanding the features of security data collected by CEM.
Jim Broadhurst, CEM Product Owner, for his encouragement and support of the partnership
between UX and engineering teams.
7. Cognitive Security Basics
Understanding a Rising Trend
What Is Cognitive Security?
Structure of a Cognitive Security Solution
Benefits of Cognitive Security
Drawbacks of Cognitive Security
Use Cases
8. What Is Cognitive Security?
Cognitive security solutions rely upon the pairing of a human expert with an intelligent, self-learning
system.
Self-learning systems use machine learning, natural language processing, and data mining to analyze
large volumes of data to synthesize knowledge that supports continuous improvement.
Analyzing across a variety of data sources and interacting with human specialists ensures that cognitive
security solutions operate as accurately as possible within a relevant context.
9. Structure of a Cognitive Security Solution
Perimeter Security
Awareness of an organization’s security landscape, and the measures taken to secure access points, such as:
● Passwords
● Masking
● Encryption
● Access Permissions
Analysis & Prediction
Across security data from multiple sources, the system performs analyses to uncover trends it surfaces to human partners to support their decision-making and task completion.
Interactive Self-Learning Layer
Understand unstructured data sources for a particular set of knowledge, reason through the contents of the data, and learn continuously via training from a human partner and from a constant stream of data.
10. Benefits of Cognitive Security
Proposes a human-machine partnership that offloads very simple and repetitive tasks, and very complex
tasks, to a self-learning system to empower security specialists to perform other mission-critical
activities.
Enables a shift from reactivity to proactivity.
Analyzes data and synthesizes knowledge faster than a human.
Allows for a multi-dimensional view on security that enables analysis of subtle trends beyond the obvious
ones due to anomalies, malicious insiders, and malware.
11. Drawbacks of Cognitive Security
A cognitive security system relies on data and human interaction to learn and become effective, and
these two dependencies are also its potential weaknesses.
Data quality affects the quality of the knowledge a cognitive system synthesizes, with lower quality
requiring more training interactions to compensate.
During training interactions, users can transmit their own biases (due to a lack of knowledge,
misunderstandings, or particular focus) to cognitive systems.
12. Use Cases
Cognitive Security to date has 5 clear use cases, all of which rely heavily on raw and processed data.
Four of the use cases, though important in their own right, roll up into supporting a primary use case:
Support the Cybersecurity Operations Center (CSOC).
CEM’s access to data and reporting capabilities present it as a solid candidate to address the needs
associated with each use case.
13. Use Cases: Support the CSOC
A Cybersecurity Operations Center is a multi-tiered collection of security specialists responsible for an
organization’s security.
Cognitive systems process large volumes of data very quickly, giving CSOC specialists
knowledge faster than they could acquire it on their own:
Builds the knowledge bases of lower-level security analysts quickly to improve their performance.
14. Use Cases: Leveraging External Knowledge
Cognitive systems can tap into external expert knowledge sources to supplement the organization’s
security data.
External insights can provide better context for how to address security incidents.
Combined with the speed at which cognitive systems process data and synthesize knowledge from them,
effective resolutions to security incidents can be found sooner rather than later.
15. Use Cases: Threat Identification via Advanced Analytics
Data analysis in cognitive systems proceeds via a combination of machine learning algorithms and data
mining.
16. Use Cases: Improve Application Security
Advanced analytics can also provide deeper insights and better context into security events.
17. Use Cases: Improve Enterprise Risk Levels
Combining internal data analytics with external expert data sources can help organizations understand
how far they have to go to comply with data security regulations.
19. Cybersecurity Operations Center Overview
A Cybersecurity Operations Center is a multi-tiered collection of security specialists who monitor a data
environment’s activities and seek to keep it secure, protected, and in compliance with regulations.
Blends people, process, and technology to enable a security strategy created in response to the
ever-changing security landscape.
CSOC is designed to operate quickly, flexibly, and proactively in response to security threats.
20. CSOC Structure
The CSOC organizes security
specialists in a way that enables
them to detect and neutralize
threats as early as possible. As
threats become more serious, they
escalate to another tier to maximize
response effectiveness.
21. CSOC Models
Fully Outsourced
Managed Security Service Provider (MSSP) fills all CSOC roles.
Communication to client is limited to:
● Incident Response.
● Queries about CSOC standards & procedures.
Hybrid: Internal+External Specialists
Model 1: 8x5 Business Hour Coverage
Employees fill CSOC roles during regular business hours.
MSSP fills all CSOC roles for non-business hours.
Model 2: Support In-House CSOC
Employees fill key CSOC roles.
MSSP fulfills roles dedicated to vigilance activities (monitoring & detecting).
Fully Internal
Model 1: 8x5 Business Hour Coverage
Employees fill CSOC roles during regular business hours.
Leverages technology to automate escalation and notifications per Security Architect recommendations.
Model 2: 24x7 Coverage
Employees fill CSOC roles on a schedule.
Automates features to reduce operations cost.
22. Supporting the CSOC
CSOC teams face an ever-growing and ever-changing mountain of security threats, and they rely upon
processes and technologies to help them address those threats.
Key challenges CSOC teams face include:
Faster responsiveness.
Rising security infrastructure costs.
Better security analytics.
Inconsistent skills, or lack of skills, within security teams.
23. Supporting the CSOC
Addressing security challenges brings to light 3 gaps that must also be addressed in order for a CSOC to
remain effective:
Intelligence, focused on threat research and keeping current on threats and vulnerabilities.
Speed, referring to faster threat response and resolution.
Accuracy, optimizing security alerts and improving threat recommendation.
These gaps can be closed by a cognitive security solution.
24. Supporting the CSOC: Intelligence Gap
Cognitive security solutions provide less knowledgeable CSOC members with insights typically requiring
years of experience.
Improved presentation of analytics to assist with judgments, decision-making, and escalations.
Leverage external intelligence from expert sources to understand trends and how to respond to
them.
25. Supporting the CSOC: Speed Gap
Advanced analysis methods provide insights faster, allowing for faster detection and identification of
threats, and resulting in faster resolutions.
Cognitive security solutions scan and analyze data, and synthesize knowledge faster than humans.
Automation of mundane and repetitive tasks allows security specialists to focus on mission-critical
security tasks, and also provides more time to address threats.
26. Supporting the CSOC: Accuracy Gap
Automation of data gathering across multiple sources and reasoning through the findings provides
greater context around security incidents.
Cognitive security solutions blend external expert knowledge sources with internal security data
to provide the most relevant and accurate interpretations of observed security events to specialists.
Good cognitive security solutions act as teammates, performing low-level work that frees up
security specialists’ time for things like training to keep skillsets current or to expand them.
28. Powering ML/AI with CEM
Evaluating the landscape for a CEM + ML/AI Strategy
Why Choose CEM for ML/AI?
Product Features
Applying ML/AI to CEM
29. Why Choose CEM for ML/AI?
CEM has the potential to fit into a cognitive security strategy based upon product features and how they
can be leveraged to support changes in:
Core functionality
Data collected
Security team structure
Security technology trends
30. Product Features: Core Functionality
CEM provides a valuable combination of data collection and descriptive data analysis:
Real-time data collection on security events from the policies set up in the application.
Reporting allows CEM users to combine multiple data points to describe compliance with policies.
Connections with SIEM applications enable more detailed analysis of security data points.
31. Product Features: Data Collected
Security is a data-rich domain, and between the ESM in use at an organization and CEM, a machine
learning and AI-based solution would not face a shortage of training data.
ESMs feed data constantly to the SMF on z/OS that includes I/O activity, network activity, and
errors.
Per engineering input, SMF data could be pulled for use via a utility.
CEM interacts with all 3 ESMs to capture security event data in real time.
32. Applying ML/AI to CEM
Machine learning combines well with CEM’s real-time data to provide a clear way to baseline security
events and detect anomalies.
As pointed out by a CA architect from Team Woz, SMF contains a record of all security events on a
mainframe, meaning baselines could be set up in a relatively short time.
Preprocessed data from CEM & CIA reports could also be analyzed via machine learning.
Report analysis could enable more effective proactive responses to security concerns, such as
recommendations for policy changes or creation.
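As a minimal sketch of the baselining idea above, assume security events have already been exported as hourly counts. The counts, the function names, and the threshold of 3 standard deviations are all hypothetical and are not taken from CEM or SMF. A simple baseline is the mean and standard deviation of historical counts, with new counts flagged when they deviate too far from it:

```python
from statistics import mean, stdev


def build_baseline(history):
    """Return (mean, sample standard deviation) of historical hourly event counts."""
    return mean(history), stdev(history)


def is_anomalous(count, baseline, threshold=3.0):
    """Flag a count whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any different value is a deviation.
        return count != mu
    return abs(count - mu) / sigma > threshold


# Hypothetical hourly counts of access violations (illustrative data only).
history = [12, 15, 11, 14, 13, 12, 16, 14]
baseline = build_baseline(history)

# Two ordinary hours followed by a sharp spike.
new_counts = [13, 15, 42]
flags = [is_anomalous(c, baseline) for c in new_counts]
print(flags)
```

A z-score check is only one of many possible approaches. It is shown here because it needs nothing beyond a history of counts, which matches the point that baselines could be established quickly from existing records.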
33. Cognitive Security & CEM
Cognitive security refers to continuously learning systems that rely upon a combination of machine
learning, natural language processing, data mining, and human interaction to develop hypotheses about
what is happening on a system.
It combines two concepts:
Deploying cognitive systems for the analysis of structured and unstructured security data to
uncover insights and provide actionable recommendations for proactive security and business
activities.
Supporting cognitive systems with technologies, techniques, people, and processes to provide
relevant context for data and ensure accuracy of analyses and recommendations.
35. ML/AI in DCD
DCD employs machine learning as part of scanning (soon to be renamed policies).
Machine learning produces the following information:
Form of data source (structured vs. unstructured)
Data source metadata (column header, column data type, comments)
Breakdown of sensitive data matches (location, combinations)
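The three kinds of information listed above could be represented roughly as follows. This is an illustrative sketch only: the class names, field names, and sample values are hypothetical and do not reflect DCD's actual output format.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMetadata:
    """Metadata for one column of a scanned data source (hypothetical shape)."""
    header: str
    data_type: str
    comment: str = ""


@dataclass
class SensitiveMatch:
    """One sensitive data match: where it was found and what was found together."""
    location: str
    categories: tuple


@dataclass
class ScanSummary:
    """Summary of a scan: source form, metadata, and sensitive data matches."""
    structured: bool
    columns: list = field(default_factory=list)
    matches: list = field(default_factory=list)

    def locations_with_combinations(self):
        """Return locations where more than one sensitive category co-occurs."""
        return [m.location for m in self.matches if len(m.categories) > 1]


# Hypothetical scan of a structured source (illustrative values only).
summary = ScanSummary(
    structured=True,
    columns=[
        ColumnMetadata("CUST_NAME", "CHAR"),
        ColumnMetadata("SSN", "CHAR", "masked in reports"),
    ],
    matches=[
        SensitiveMatch("SSN", ("national_id",)),
        SensitiveMatch("NOTES", ("name", "national_id")),
    ],
)
combined = summary.locations_with_combinations()
print(combined)
```

Grouping matches by location and by combination in this way makes it straightforward to surface the places where several sensitive data types appear together.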
36. Benefits of Current ML Implementation
Currently, ML in DCD provides insight into features of scanned data, such as its structure and
surrounding data points.
Elucidating data features, as DCD does, helps build data literacy among users.
Gartner, Forbes, InfoWorld, and other major research firms and publications have listed data
literacy as a top challenge for companies within the next few years.
Improved data literacy enables companies to maximize the value of their analytics and reporting.
38. Data Literacy
Data literacy refers to knowing what data are, and having an awareness of their collection, analysis,
visualization, and use with respect to data security and data privacy (Crusoe, 2016).
Two personas to consider within the concept of data literacy:
Subjects, the people data is about, and who may have varying levels of data literacy.
Stewards, the people responsible for the security and privacy of collected data, and who may also
have varying levels of data literacy.
39. Data Literacy
Within the context of CEM, the Steward persona will be of primary interest, as they are the end users
who enact the policy on mainframe data.
Data are meaningless until they are analyzed, highlighting the importance of knowing what data are
available, how they are being analyzed and presented, and how users intend to use them.
Data quality affects the analysis and presentation of the resulting output.
40. Data Literacy
Research on data literacy of CEM users should focus on the following topics:
What data does CEM collect?
What is the value of the collected CEM data?
What are the features of the CEM data that enable you to use it?
43. Thank You!
For questions or comments about this exploration document, please contact:
Leslie A. McFarlin, Senior UX Designer
leslie.mcfarlin@ca.com