Cyber security companies collect massive amounts of heterogeneous data from a huge number of sources. This data describes hundreds of different data types, such as vulnerabilities, observables, incidents, and malware. While the data is highly complex (with many types of relations, type hierarchies, and rules), its structure doesn't change significantly between organisations. Without a publicly available data model, however, organisations end up modelling the same data in different ways: reinventing the wheel and wasting resources. This modelling complexity makes scaling cyber security applications extremely difficult.
That's why efforts are underway to provide ready-made solutions for typical cyber security use cases, with the flexibility to expand to the specific requirements of individual setups. Together, these efforts have created many inter-related knowledge silos (e.g. CVE, CAPEC, CWE, CVSS, Cocoa, MITRE, VERIS, STIX, MAEC). To unify these silos, researchers have proposed various ontologies at different levels of granularity, from specific use cases like defence exercises to more comprehensive ones like the UCO project.
During this talk, you’ll learn about the OmnibusCyber Project, an open-source, ready-made solution built on TypeDB that aggregates cyber security knowledge silos. TypeDB offers the expressivity, safety, and inference properties required to implement a knowledge graph without the complexity associated with the OWL/RDF semantic frameworks.
Knowledge Graphs for Supply Chain Operations (Vaticle)
Agility in supply chain operations has never been more important, especially in today's nonlinear and complex world. That is why companies with supply chains need knowledge graphs.
So how do enterprises unleash the power of their own supply chain data to make smarter decisions? This is where bops comes into play. Bops activates supply chain data from existing operating systems (ERPs, POS, OMS, etc.), simplifying how operators optimize working capital in every decision.
In this session, bops will showcase a few use cases that portray the power of a knowledge graph to represent a supply chain network: an end-to-end product flow driven by actions among plants, customers, and suppliers.
Supply chain operations visibility:
- Story of a Product and an SKU: from raw materials to finished goods, track and trace, and bill-of-materials deviations
- Story of a Supplier – risk assessments – “the most influential supplier”
- Story of a Process – anomaly detection – “what went wrong?”
Join us for a lively discussion to learn how using knowledge graphs is already helping supply chain companies to better collect, unify, and activate their data.
Speaker: Jorge Risquez
Jorge is the Co-founder and CEO of bops, a headless supply chain intelligence platform helping manufacturers and distributors source, make, and deliver their products, and unlock working capital. Previously, Jorge spent a decade as a Supply Chain Consultant for Deloitte, where he worked with Fortune 500 companies such as Tyson and Cargill. In his spare time, he enjoys going for a run in Central Park and spending time with family and friends.
Building a Cyber Threat Intelligence Knowledge Graph (Vaticle)
Knowledge of cyber threats is a key focus in cyber security. In this talk, we present TypeDB CTI, an open-source threat intelligence platform to store and manage such knowledge. It enables Cyber Threat Intelligence (CTI) professionals to bring their disparate CTI information together in one platform, making it easier to manage the data and discover new insights about cyber threats.
We will describe how we use TypeDB to represent STIX 2.1, the most widely used language and serialization format for exchanging cyber threat intelligence. We cover how we leverage TypeDB's modelling constructs, such as type hierarchies, nested relations, hyper-relations, unique attributes, and logical inference, to build this threat intelligence platform.
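TypeDB schemas themselves are written in TypeQL; as a rough, language-neutral illustration of what a type hierarchy over STIX 2.1 objects looks like (this is a hypothetical sketch, not Vaticle's actual schema), consider:

```python
from dataclasses import dataclass

# Illustrative only: a tiny slice of STIX 2.1 domain objects modelled as a
# type hierarchy, with relationships reified as first-class objects.
# Class and field names mirror STIX conventions but are a sketch.

@dataclass
class StixObject:
    stix_id: str              # unique attribute, e.g. "malware--<uuid>"
    spec_version: str = "2.1"

@dataclass
class Malware(StixObject):
    name: str = ""
    is_family: bool = False

@dataclass
class Indicator(StixObject):
    pattern: str = ""

@dataclass
class Relationship(StixObject):
    relationship_type: str = ""   # e.g. "indicates"
    source_ref: str = ""
    target_ref: str = ""

emotet = Malware(stix_id="malware--0001", name="Emotet", is_family=True)
ioc = Indicator(stix_id="indicator--0002",
                pattern="[file:hashes.MD5 = 'abc']")
link = Relationship(stix_id="relationship--0003",
                    relationship_type="indicates",
                    source_ref=ioc.stix_id, target_ref=emotet.stix_id)
```

In TypeDB the same structure is declared once in the schema, and logical inference rules can then derive new relationships (e.g. transitive attribution) at query time.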
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
The Intuit Data Ecosystem supports unique consumer and small-business assets at scale and handles petabytes of customer data. We have 8M active small-business customers and 16M paid workers who use Intuit QuickBooks and QuickBooks Payroll products. A huge customer base and large volumes of data constantly challenge data teams on the freshness and correctness of data. This presentation covers such problems we faced at Intuit, along with the data observability model we follow to detect, cure, and prevent data issues. We would like to provide deep insights into the implementations and the impact of some of the great work done by Intuit in this direction.
Unifying Space Mission Knowledge with NLP & Knowledge Graph (Vaticle)
Synopsis
The number of space missions being designed and launched worldwide is growing exponentially. Information on these missions, such as their objectives, orbit, or payload, is scattered across various documents and datasets. Facilitating access to this information is key to accelerating the design of future missions, enabling experts to link an application to a mission, and following various stakeholders' activities.
This presentation introduces recent research done at the ESA to combine the latest Language Models with Knowledge Graphs, unifying our knowledge on space missions. Language Models such as GPT-3 and BERT are trained to understand the patterns of human (natural) language. These models have revolutionised the field of NLP, the branch of AI enabling machines to understand human language in all its complexity. In this work, key information on a mission is parsed from documents with the GPT-3 model, and the parsed data is then migrated to a TypeDB Knowledge Graph to be easily queried. Although this work focuses on an application in the space sector, the method can be transferred to other engineering fields.
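The pipeline described above can be caricatured as: a language model turns free text into structured fields, and those fields are then loaded into the graph. A minimal Python sketch of that parse-then-load pattern (the extractor is stubbed; in the ESA work this role is played by GPT-3, and the output would be inserted into TypeDB rather than a Python list):

```python
# Sketch of the parse-then-load pattern. `extract_mission` stands in for
# an LLM call; the field names and example values are hypothetical.

def extract_mission(text: str) -> dict:
    # Stub: a real implementation would prompt a model to fill these fields.
    return {"mission": "Sentinel-2", "orbit": "sun-synchronous",
            "payload": "MSI multispectral imager"}

def to_edges(record: dict) -> list:
    # Turn the structured record into graph edges (subject, relation, object).
    mission = record["mission"]
    return [(mission, "has_orbit", record["orbit"]),
            (mission, "carries", record["payload"])]

doc = "Sentinel-2 flies in a sun-synchronous orbit carrying the MSI..."
edges = to_edges(extract_mission(doc))
```

Once in the graph, the same fields extracted from many documents become queryable together, which is the point of the unification step.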
Presenters
Dr. Audrey Berquand is a Research Fellow at the ESA. Her research aims at enhancing space mission design and knowledge management with text mining, NLP, and Knowledge Graphs. She was awarded her PhD in 2021 from the University of Strathclyde (Scotland) for her thesis on “Text Mining and Natural Language Processing for the Early Stages of Space Mission Design”. Audrey has a background in space systems engineering: she holds an MSc in Aerospace Engineering from the Royal Institute of Technology KTH (Sweden) and a diplôme d'ingénieur from the EPF Graduate School of Engineering (France). Before diving into the world of AI, she spent 3 years at ESA involved in the early design phases of future Earth Observation missions.
Ana Victória Ladeira works with Knowledge Management at the ESA, using automated methods to exploit the information contained in the piles and piles of documents that ESA generates every day. With a Masters degree in Data Science from Maastricht University, Ana is particularly excited about how NLP methods can help large organizations connect different documents and highlight the bigger picture over a big universe of data sources, as well as using Knowledge Graphs to help connect people to the expertise and information they need.
Optimizing Your Supply Chain with the Neo4j Graph (Neo4j)
With the world’s supply chain system in crisis, it’s clear that better solutions are needed. Digital twins built on knowledge graph technology allow you to achieve an end-to-end view of the process, supporting real-time monitoring of critical assets.
The perfect couple: Uniting Large Language Models and Knowledge Graphs for En... (Neo4j)
Large Language Models are amazing, but they are also black-box models that often fail to capture and accurately represent factual knowledge. Knowledge graphs, by contrast, are structural knowledge models that explicitly represent knowledge and even allow us to detect implicit relationships. In this talk we will demonstrate how LLMs can be improved by Knowledge Graphs, and how LLMs can augment Knowledge Graphs. A perfect couple!
“Artificial Intelligence” covers a wide range of technologies today, including those that enable machine vision, affective computing, deep learning, and natural language processing. As advances increase, so do expectations. We now see a rush to add “AI inside” to applications and appliances in almost every domain. The reality is that some firms will have mega-hits with AI-enabled applications, and many more will suffer setbacks based on flawed adoption strategies.
This webinar will present an assessment of key AI technologies today, and help participants identify promising applications based on matching requirements to mature-enough technologies.
MLOps and Data Quality: Deploying Reliable ML Models in Production (Provectus)
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
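The Data QA components in the agenda above boil down to automated checks that gate a pipeline run on incoming data. A minimal illustration of such a validation step (the column names and threshold here are made up, not Provectus's framework):

```python
# Toy data-quality gate: fail a pipeline run when a batch violates
# simple expectations. Columns and thresholds are hypothetical.

def validate_batch(rows, required=("user_id", "amount"), max_null_rate=0.05):
    issues = []
    for col in required:
        nulls = sum(1 for r in rows if r.get(col) is None)
        if nulls / max(len(rows), 1) > max_null_rate:
            issues.append(f"{col}: null rate too high ({nulls}/{len(rows)})")
    return issues

batch = [{"user_id": 1, "amount": 9.5},
         {"user_id": 2, "amount": None},
         {"user_id": 3, "amount": 4.0}]
problems = validate_batch(batch)   # 1/3 nulls in "amount" exceeds 5%
```

In a real MLOps pipeline the same idea is expressed with a dedicated validation library and the check runs before training or serving ever sees the data.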
Observability for Data Pipelines With OpenLineage (Databricks)
Data is increasingly becoming core to many products, whether through recommendations for users, insights into how they use the product, or machine learning to improve the experience. This creates a critical need for reliable data operations and an understanding of how data flows through our systems. Data pipelines must be auditable, reliable, and run on time. This proves particularly difficult in a constantly changing, fast-paced environment.
Collecting this lineage metadata as data pipelines run provides an understanding of the dependencies between the many teams consuming and producing data, and of how constant changes impact them. It is the underlying foundation that enables the many use cases related to data operations. The OpenLineage project is an API that standardizes this metadata across the ecosystem, reducing complexity and duplicate work in collecting lineage information. It enables many projects and consumers of lineage in the ecosystem, whether they focus on operations, governance, or security.
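Concretely, OpenLineage standardizes a JSON event emitted when a run starts, completes, or fails. A hand-rolled sketch of the event shape (field names follow the OpenLineage spec at a high level, but the producer URL and IDs are placeholders; real code should use an official client and the published JSON schema):

```python
import json
from datetime import datetime, timezone

# Minimal OpenLineage-style run event, built by hand for illustration only.

def run_event(event_type, run_id, job_name, inputs, outputs):
    return {
        "eventType": event_type,                      # START / COMPLETE / FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": "my-pipeline", "name": job_name},
        "inputs": [{"namespace": "warehouse", "name": n} for n in inputs],
        "outputs": [{"namespace": "warehouse", "name": n} for n in outputs],
        "producer": "https://example.com/my-scheduler",  # hypothetical
    }

event = run_event("COMPLETE", "run-0001", "daily_orders",
                  inputs=["raw.orders"], outputs=["mart.orders_daily"])
payload = json.dumps(event)   # what would be POSTed to the lineage backend
```

Because every tool emits the same shape, a backend such as Marquez can stitch events from schedulers, warehouses, and Spark jobs into one dependency graph.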
Marquez is an open-source project, part of the LF AI & Data Foundation, which instruments data pipelines to collect lineage and metadata and enable those use cases. It implements the OpenLineage API and provides context by making dependencies across organizations and technologies visible as they change over time.
MITRE ATT&CK is quickly gaining traction and is becoming an important standard for assessing the overall cyber security posture of an organization. Tools like ATT&CK Navigator facilitate corporate adoption and allow for a holistic overview of attack techniques and of how the organization is preventing and detecting them. Furthermore, many vendors, technologies, and open-source initiatives are aligning with ATT&CK. Join Erik Van Buggenhout in this presentation, where he will discuss how MITRE ATT&CK can be leveraged in the organization as part of your overall cyber security program, with a focus on adversary emulation.
Erik Van Buggenhout is the lead author of SANS SEC599 - Defeating Advanced Adversaries - Purple Team Tactics & Kill Chain Defenses. Next to his activities at SANS, Erik is also a co-founder of NVISO, a European cyber security firm with offices in Brussels, Frankfurt and Munich.
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CK (MITRE ATT&CK)
From ATT&CKcon 3.0
By Haylee Mills, Splunk
Having ATT&CK to identify threats, prioritize data sources, and improve security posture has been a huge step forward for our industry, but how do we actualize those insights for better detection and alerting? By shifting to observations of behavior over one-to-one direct alerts, noisy datasets become valuable treasure troves with ATT&CK metadata. Additionally, we can begin to look at detection and threat hunting on behavior instead of users or systems. In this presentation, Haylee will discuss the shift in mindset and the nuts and bolts of detections that leverage this metadata in Splunk, but the concept can be applied with custom tools to any valuable security dataset.
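The core of the risk-based alerting idea is that each noisy, low-fidelity observation contributes a small risk score tagged with ATT&CK metadata, and an alert fires only when an entity accumulates enough diverse behavior. A toy version of that aggregation (technique IDs are real ATT&CK identifiers; the scores and threshold are invented, and a Splunk deployment would express this as searches over a risk index):

```python
from collections import defaultdict

# Toy risk-based alerting: observations add risk to an entity; alert when
# accumulated risk crosses a threshold across multiple distinct techniques.

observations = [
    {"host": "wks-12", "technique": "T1059", "risk": 30},  # scripting interpreter
    {"host": "wks-12", "technique": "T1053", "risk": 25},  # scheduled task/job
    {"host": "wks-12", "technique": "T1105", "risk": 30},  # ingress tool transfer
    {"host": "srv-01", "technique": "T1059", "risk": 30},
]

def entities_to_alert(obs, threshold=70):
    risk = defaultdict(int)
    techniques = defaultdict(set)
    for o in obs:
        risk[o["host"]] += o["risk"]
        techniques[o["host"]].add(o["technique"])
    # require both accumulated risk and more than one distinct technique
    return [h for h, r in risk.items()
            if r >= threshold and len(techniques[h]) > 1]

alerts = entities_to_alert(observations)   # only wks-12 qualifies
```

A single noisy hit never pages anyone; a host exhibiting several ATT&CK behaviors in a window does.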
From ATT&CKcon 3.0
By Fred Frey and Jonathan Mulholland, SnapAttack
Atomic Red Team and Sigma are the largest open-source attack simulation and analytic projects. Many organizations use one or both internally for security controls validation or to supplement their detections and alerts. Building on the work of these two great communities, we smashed (scientific term) the attacks and analytics together and applied data science to analyze the results. We'll describe our methodology and testing framework, show the real-world MITRE ATT&CK coverage and gaps, and discuss our algorithms for calculating analytic similarity, identifying log sources for a technique, and determining the best analytics to deploy to maximize ATT&CK coverage.
This project aims to:
- Bring a measurable testing rigor to community analytics to improve adoption
- Test every analytic against every attack, validating the true positive detection
- Understand the log sources required to detect specific attack techniques
- Apply data science to identify analytic similarity (reduce community duplication)
- Identify gaps between the projects: analytics without attack simulations, attack simulations without detections, missing or incorrect MITRE ATT&CK labels, etc.
- Automate the process so insights can stay up to date with new attack/analytic contributions over time
- Share our analysis back to the community to improve these projects
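One of the aims above, identifying analytic similarity to reduce duplication, can be approximated by comparing the sets of attack tests each analytic fires on. A hedged sketch using Jaccard similarity (the analytic and test names are invented, and the project's actual algorithm may differ):

```python
# Toy analytic-similarity measure: two detections that fire on the same
# attack simulations are candidates for de-duplication. Jaccard over
# hit-sets is one simple way to quantify this.

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

hits = {  # analytic -> attack tests it detected (hypothetical data)
    "sigma_powershell_encoded": {"atomic_T1059_001_a", "atomic_T1059_001_b"},
    "sigma_ps_b64_cmdline":     {"atomic_T1059_001_a", "atomic_T1059_001_b"},
    "sigma_schtasks_create":    {"atomic_T1053_005_a"},
}

sim = jaccard(hits["sigma_powershell_encoded"], hits["sigma_ps_b64_cmdline"])
# identical hit-sets -> similarity 1.0, flag for review
```

The same hit matrix also answers the coverage question: the union of all hit-sets versus the full catalog of tests is the project's measured ATT&CK coverage.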
AI offers enormous potential for improving the effectiveness and efficiency of robots. In recent years, data-driven AI has achieved remarkable success in specialised tasks such as speech recognition, machine translation, and object detection. Despite these successes, there are also clear signs of its limitations.
In search of a solution to these limitations, we study the following three challenges:
1) How may robots operate under real-world conditions, which are dynamic and packed with unknown objects and situations?
2) How may robots be able to execute multiple tasks, instead of just one?
3) How can robots cooperate with other robots and with human team-mates?
In this talk, the first two challenges will be addressed. We will also show how TypeDB's knowledge base enables us to tackle such challenges.
Speaker: Joris Sijs, Scientist @ TNO
Joris is a team lead at TNO, where he develops and integrates software modules for the perception, awareness, and planning of autonomous systems and robots. He recently started extending this work with the development of knowledge graphs (or cognitive databases), and with combining this type of AI with machine- and deep-learning solutions.
Slides: Knowledge Graphs vs. Property Graphs (DATAVERSITY)
We are in the era of graphs. Graphs are hot. Why? Flexibility is one strong driver: Heterogeneous data, integrating new data sources, and analytics all require flexibility. Graphs deliver it in spades.
Over the last few years, a number of new graph databases came to market. As we start the next decade, dare we say “the semantic twenties,” we also see vendors that never before mentioned graphs starting to position their products and solutions as graphs or graph-based.
Graph databases are one thing, but “Knowledge Graphs” are an even hotter topic. We are often asked to explain Knowledge Graphs.
Today, there are two main graph data models:
• Property Graphs (also known as Labeled Property Graphs)
• RDF Graphs (Resource Description Framework) aka Knowledge Graphs
Other graph data models are possible as well, but over 90 percent of the implementations use one of these two models. In this webinar, we will cover the following:
I. A brief overview of each of the two main graph models noted above
II. Differences in Terminology and Capabilities of these models
III. Strengths and Limitations of each approach
IV. Why Knowledge Graphs provide a strong foundation for Enterprise Data Governance and Metadata Management
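The practical difference between the two models shows up in how a single fact with metadata is encoded. A rough Python illustration of both shapes (not a formal serialization of either model; the names are hypothetical):

```python
# The same fact -- "Alice knows Bob since 2019" -- in both graph models.

# Property graph: the edge itself carries properties.
pg_edge = {"from": "alice", "type": "KNOWS", "to": "bob",
           "properties": {"since": 2019}}

# RDF: plain triples cannot attach data to an edge directly, so the
# relationship is reified as a statement node (one common workaround).
rdf_triples = [
    ("ex:stmt1", "rdf:subject",   "ex:alice"),
    ("ex:stmt1", "rdf:predicate", "ex:knows"),
    ("ex:stmt1", "rdf:object",    "ex:bob"),
    ("ex:stmt1", "ex:since",      "2019"),
]
```

Property graphs make edge attributes cheap and direct; RDF trades that convenience for global identifiers, standard semantics, and interoperability, which is part of why the two ecosystems feel so different in practice.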
Introduction to DataOps and AIOps (or MLOps) (Adrien Blind)
This presentation introduces the audience to the DataOps and AIOps practices. It deals with organizational and tech aspects, and provides hints to start your data journey.
Driving Intelligence with MITRE ATT&CK: Leveraging Limited Resources to Build... (MITRE ATT&CK)
From ATT&CKcon 4.0
By Scott Roberts, Interpres Security
"Building threat intelligence is challenging, even under the most ideal circumstances. But what if you are even more limited in your resources? You are part of a small (but skilled) team, with high expectations, and people are relying on you to make business-critical decisions…all the time! What do you do in that situation? Turn a Toyota Tercel into a tank, of course.
The Interpres Security threat intelligence team found itself in that exact situation. Wanting to leverage the MITRE ATT&CK catalog in creating a comprehensive and timely threat intelligence repository, the Interpres team built a series of tools, processes, and paradigms that we call Intelligence Engineering. In this talk, we’ll examine how we combined ATT&CK, STIX2, the Vertex Project’s open-source intelligence platform, Synapse, and custom code to deliver meaningful, rapid, verifiable intelligence to our customers. We’ll share lessons learned on automation, how to run multiple ATT&CK libraries side-by-side, and making programmatic intelligence delivery scalable and effective – just like building a tank out of an imported sedan."
Knowledge for the masses: Storytelling with ATT&CK (MITRE ATT&CK)
From ATT&CKcon 3.0
By Ismael Valenzuela and Jose Luis Sanchez Martinez, Trellix
The Trellix team believes that creating and sharing compelling stories about cyber threats, with ATT&CK, is a powerful way to raise awareness and enable actionability against cyber threats.
In this talk the team will share their experiences leveraging ATT&CK to disseminate threat knowledge to different audiences (software development teams, managers, threat detection engineers, threat hunters, cyber threat analysts, support engineers, upper management, etc.).
They will show concrete examples and representations created with ATT&CK to describe the threats at different levels, including: 1) an Attack Path graph that shows the overall flow of the attack; 2) Tactic-specific TTP summary tables and graphs; 3) very detailed, step-by-step description of the attacker's behaviors.
Property graph vs. RDF Triplestore comparison in 2020 (Ontotext)
This presentation goes all the way from an introduction to what graph databases are, to a table comparing RDF vs. property graphs, plus two diagrams presenting the market circa 2020.
It's just a jump to the left (of boom): Prioritizing detection implementation... (MITRE ATT&CK)
From ATT&CKcon 3.0
By Lindsay Kaye and Scott Small, Recorded Future
Many organizations ask: "Where do I start, and where do I go next" when prioritizing implementation of behavior-based detections? We often hear "use threat intelligence!" but your goals must be qualified and quantified in order to properly prioritize the most relevant TTPs. A wealth of open-sourced, ATT&CK-mapped resources now exists, giving security teams greater access to both detections and red team tests they can implement, but intelligence (also aligned with ATT&CK), is essential to provide necessary context to ensure that detection efforts are focused effectively.
This session will discuss a new approach to the prioritization challenge, starting with an analysis of the current defensive landscape, as measured by ATT&CK coverage for more than a dozen detection repositories and technologies, and guidance on sourcing TTP intelligence. The team will then show how real-world defensive strategies can be strengthened by encompassing a full-spectrum view of threat detection, including the implementation of YARA, Sigma, and Snort in security appliances. Critically, alignment of both intelligence and defenses with ATT&CK enables defenders to move the focus of detection efforts to indications of malicious behavior before the final payload is deployed, where controls are most effective at preventing serious damage to the organization.
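The prioritization step the speakers describe amounts to set arithmetic over ATT&CK technique IDs: rank the TTPs your intelligence says are most relevant, then subtract what your detections already cover. A minimal sketch (the technique IDs are real ATT&CK identifiers, but the counts and coverage set are invented):

```python
# Toy TTP prioritization: relevant-but-uncovered techniques, ordered by
# how often threat intel reports them. All data here is hypothetical.

intel_ttp_counts = {"T1059": 14, "T1566": 11, "T1053": 6, "T1105": 4}
covered = {"T1059", "T1105"}   # techniques with existing detections

def prioritize(intel, covered):
    gaps = {t: n for t, n in intel.items() if t not in covered}
    return sorted(gaps, key=gaps.get, reverse=True)

todo = prioritize(intel_ttp_counts, covered)   # highest-value gaps first
```

Because both the intelligence and the detection repositories are mapped to ATT&CK, this comparison stays mechanical even as both sides change.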
The Apache Solr Semantic Knowledge Graph (Trey Grainger)
What if instead of a query returning documents, you could alternatively return other keywords most related to the query: i.e. given a search for "data science", return me back results like "machine learning", "predictive modeling", "artificial neural networks", etc.? Solr’s Semantic Knowledge Graph does just that. It leverages the inverted index to automatically model the significance of relationships between every term in the inverted index (even across multiple fields) allowing real-time traversal and ranking of any relationship within your documents. Use cases for the Semantic Knowledge Graph include disambiguation of multiple meanings of terms (does "driver" mean truck driver, printer driver, a type of golf club, etc.), searching on vectors of related keywords to form a conceptual search (versus just a text match), powering recommendation algorithms, ranking lists of keywords based upon conceptual cohesion to reduce noise, summarizing documents by extracting their most significant terms, and numerous other applications involving anomaly detection, significance/relationship discovery, and semantic search. In this talk, we'll do a deep dive into the internals of how the Semantic Knowledge Graph works and will walk you through how to get up and running with an example dataset to explore the meaningful relationships hidden within your data.
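Under the hood, the Semantic Knowledge Graph scores a candidate term by how much more often it occurs in the foreground set (documents matching the query) than chance would predict from the whole corpus. A simplified z-score illustration of that idea (not Solr's exact relatedness formula; the document counts are invented):

```python
import math

# Simplified relatedness: z-score of a term's foreground document frequency
# against its background rate. Solr's SKG uses a related but not identical
# statistic computed directly from the inverted index.

def relatedness(fg_docs_with_term, fg_docs, bg_docs_with_term, bg_docs):
    p_bg = bg_docs_with_term / bg_docs              # background probability
    expected = fg_docs * p_bg                       # expected fg occurrences
    sd = math.sqrt(fg_docs * p_bg * (1 - p_bg)) or 1.0
    return (fg_docs_with_term - expected) / sd

# foreground = 1,000 docs matching "data science", corpus = 1,000,000 docs
ml = relatedness(400, 1000, 5_000, 1_000_000)      # "machine learning"
the = relatedness(990, 1000, 980_000, 1_000_000)   # stopword-ish term
# ml >> the: "machine learning" is strongly related to the query,
# while the ubiquitous term scores near zero despite appearing everywhere
```

This is why the graph surfaces "machine learning" rather than common words: relevance is over-representation relative to the background, not raw frequency.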
We discuss the different ways models can be served with MLflow, covering both open-source MLflow and Databricks-managed MLflow. We cover the basic differences between batch scoring and real-time scoring, with special emphasis on the upcoming production-ready Databricks model serving.
Building Biomedical Knowledge Graphs for In-Silico Drug Discovery (Vaticle)
The rapid development and spread of analytical tools in the biomedical sciences has produced a variety of information about all sorts of biological components and their functions. Though important individually, their biological characteristics need to be understood in relation to the interactions they have with other biological components, which requires the integration of vast amounts of complex, semantically rich, heterogeneous data.
Traditional systems are inadequate at accurately modelling and handling data at this scale and complexity, making solutions that speed up the integration and querying of such data a necessity.
In this talk, we present various approaches being used in organisations to build biomedical computational pipelines to address these problems using tools such as Machine Learning and TypeDB. In particular, we discuss how to create an accurate and scalable semantic representation of molecular level biomedical data by presenting examples from drug discovery, precision medicine and competitive intelligence.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance, and robotics.
Tsunami of Technologies. Are we prepared?
Slide from workshop with open source community in Malaysia.
"Bengkel Bersama Komuniti Sumber Terbuka Bilangan 1 Tahun 2020" in De Baron Resort, Langkawi, Kedah, Malaysia
MLOps and Data Quality: Deploying Reliable ML Models in Production (Provectus)
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
Observability for Data Pipelines With OpenLineage (Databricks)
Data is increasingly becoming core to many products, whether it is providing recommendations for users, getting insights into how they use the product, or using machine learning to improve the experience. This creates a critical need for reliable data operations and an understanding of how data flows through our systems. Data pipelines must be auditable, reliable, and run on time. This proves particularly difficult in a constantly changing, fast-paced environment.
Collecting this lineage metadata as data pipelines run provides an understanding of the dependencies between the many teams consuming and producing data, and of how constant changes impact them. It is the underlying foundation that enables the many use cases related to data operations. The OpenLineage project is an API that standardizes this metadata across the ecosystem, reducing the complexity and duplicate work of collecting lineage information. It enables the many consumers of lineage in the ecosystem, whether they focus on operations, governance, or security.
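As a rough illustration of the standard, an OpenLineage run event is a JSON document that ties a job run to its input and output datasets; the namespaces, names, and IDs below are invented for the example:

```json
{
  "eventType": "COMPLETE",
  "eventTime": "2021-11-03T10:53:52.427Z",
  "run": { "runId": "d46e465b-d358-4d32-83d4-df660ff614dd" },
  "job": { "namespace": "example-namespace", "name": "daily_orders_load" },
  "inputs": [ { "namespace": "warehouse", "name": "raw.orders" } ],
  "outputs": [ { "namespace": "warehouse", "name": "analytics.daily_orders" } ],
  "producer": "https://github.com/OpenLineage/OpenLineage"
}
```

Because every tool emits the same event shape, a consumer can stitch runs from different schedulers and processing engines into one lineage graph.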
Marquez is an open source project, part of the LF AI & Data Foundation, which instruments data pipelines to collect lineage and metadata and enable those use cases. It implements the OpenLineage API and provides context by making dependencies across organizations and technologies visible as they change over time.
MITRE ATT&CK is quickly gaining traction and is becoming an important standard to use to assess the overall cyber security posture of an organization. Tools like ATT&CK Navigator facilitate corporate adoption and allow for a holistic overview on attack techniques and how the organization is preventing and detecting them. Furthermore, many vendors, technologies and open-source initiatives are aligning with ATT&CK. Join Erik Van Buggenhout in this presentation, where he will discuss how MITRE ATT&CK can be leveraged in the organization as part of your overall cyber security program, with a focus on adversary emulation.
Erik Van Buggenhout is the lead author of SANS SEC599 - Defeating Advanced Adversaries - Purple Team Tactics & Kill Chain Defenses. Next to his activities at SANS, Erik is also a co-founder of NVISO, a European cyber security firm with offices in Brussels, Frankfurt and Munich.
Tracking Noisy Behavior and Risk-Based Alerting with ATT&CK (MITRE ATT&CK)
From ATT&CKcon 3.0
By Haylee Mills, Splunk
Having ATT&CK to identify threats, prioritize data sources, and improve security posture has been a huge step forward for our industry, but how do we actualize those insights for better detection and alerting? By shifting to observations of behavior over one-to-one direct alerts, noisy datasets become valuable treasure troves with ATT&CK metadata. Additionally, we can begin to look at detection and threat hunting on behavior instead of users or systems. In this presentation, Haylee will discuss the shift in mindset and the nuts and bolts of detections that leverage this metadata in Splunk, but the concept can be applied with custom tools to any valuable security dataset.
From ATT&CKcon 3.0
By Fred Frey and Jonathan Mulholland, SnapAttack
Atomic Red Team and Sigma are the largest open-source attack simulation and analytic projects. Many organizations utilize one or both internally for security controls validation or supplementing their detections and alerts. Building on the work from these two great communities, we smashed (scientific-term) the attacks and analytics together and applied data science to analyze the results. We'll describe our methodology and testing framework, show the real-world MITRE ATT&CK coverage and gaps, discuss our algorithms for calculating analytic similarity, identifying log sources for a technique, and determining the best analytics to deploy that maximizes ATT&CK coverage.
This project aims to:
- Bring a measurable testing rigor to community analytics to improve adoption
- Test every analytic against every attack, validating the true positive detection
- Understand the log sources required to detect specific attack techniques
- Apply data science to identify analytic similarity (reduce community duplication)
- Identify gaps between the projects: analytics without attack simulations, attack simulations without detections, missing or incorrect MITRE ATT&CK labels, etc.
- Automate the process so insights can stay up to date with new attack/analytic contributions over time
- Share our analysis back to the community to improve these projects
AI offers enormous potential in terms of improving the effectiveness and efficiency of robots. In recent years, data-driven AI has achieved remarkable success in specialised tasks such as speech recognition, machine translation and object detection. Despite these successes, there are also clear signs of its limitations.
On finding a solution to these limitations, we study the following three challenges:
1) How can robots operate under real-world conditions, which are dynamic and packed with unknown objects and situations?
2) How can robots execute multiple tasks, instead of just one?
3) How can robots cooperate with other robots and with human team-mates?
In this talk, the first two challenges will be addressed. We will also show how TypeDB's knowledge base enables us to tackle such challenges.
Speaker: Joris Sijs, Scientist @ TNO
Joris is a team-lead at TNO, where he develops and integrates software modules for the perception, awareness and planning of autonomous systems and autonomous robots. He recently started to extend this work with the development of knowledge graphs (or cognitive databases), exploring how to combine this type of AI with machine- and deep-learning solutions.
Slides: Knowledge Graphs vs. Property Graphs (DATAVERSITY)
We are in the era of graphs. Graphs are hot. Why? Flexibility is one strong driver: Heterogeneous data, integrating new data sources, and analytics all require flexibility. Graphs deliver it in spades.
Over the last few years, a number of new graph databases came to market. As we start the next decade, dare we say “the semantic twenties,” we also see vendors that never before mentioned graphs starting to position their products and solutions as graphs or graph-based.
Graph databases are one thing, but “Knowledge Graphs” are an even hotter topic. We are often asked to explain Knowledge Graphs.
Today, there are two main graph data models:
• Property Graphs (also known as Labeled Property Graphs)
• RDF Graphs (Resource Description Framework) aka Knowledge Graphs
Other graph data models are possible as well, but over 90 percent of the implementations use one of these two models. In this webinar, we will cover the following:
I. A brief overview of each of the two main graph models noted above
II. Differences in Terminology and Capabilities of these models
III. Strengths and Limitations of each approach
IV. Why Knowledge Graphs provide a strong foundation for Enterprise Data Governance and Metadata Management
Introduction to DataOps and AIOps (or MLOps) (Adrien Blind)
This presentation introduces the audience to the DataOps and AIOps practices. It deals with organizational and tech aspects, and provides hints to start your data journey.
Driving Intelligence with MITRE ATT&CK: Leveraging Limited Resources to Build... (MITRE ATT&CK)
From ATT&CKcon 4.0
By Scott Roberts, Interpres Security
"Building threat intelligence is challenging, even under the most ideal circumstances. But what if you are even more limited in your resources? You are part of a small (but skilled) team, with high expectations, and people are relying on you to make business-critical decisions…all the time! What do you do in that situation? Turn a Toyota Tercel into a tank, of course.
The Interpres Security threat intelligence team found itself in that exact situation. Wanting to leverage the MITRE ATT&CK catalog in creating a comprehensive and timely threat intelligence repository, the Interpres team built a series of tools, processes, and paradigms that we call Intelligence Engineering. In this talk, we’ll examine how we combined ATT&CK, STIX2, the Vertex Project’s open-source intelligence platform, Synapse, and custom code to deliver meaningful, rapid, verifiable intelligence to our customers. We’ll share lessons learned on automation, how to run multiple ATT&CK libraries side-by-side, and making programmatic intelligence delivery scalable and effective – just like building a tank out of an imported sedan."
Knowledge for the masses: Storytelling with ATT&CK (MITRE ATT&CK)
From ATT&CKcon 3.0
By Ismael Valenzuela and Jose Luis Sanchez Martinez, Trellix
The Trellix team believes that creating and sharing compelling stories about cyber threats -with ATT&CK- is a powerful way for raising awareness and enabling actionability against cyber threats.
In this talk the team will share their experiences leveraging ATT&CK to disseminate Threat knowledge to different audiences (Software Development teams, Managers, Threat detection engineers, Threat hunters, Cyber Threat Analysts, Support Engineers, upper management, etc.).
They will show concrete examples and representations created with ATT&CK to describe the threats at different levels, including: 1) an Attack Path graph that shows the overall flow of the attack; 2) Tactic-specific TTP summary tables and graphs; 3) very detailed, step-by-step description of the attacker's behaviors.
Property graph vs. RDF Triplestore comparison in 2020 (Ontotext)
This presentation goes all the way from an introduction to what graph databases are, to a table comparing RDF vs. property graphs, plus two diagrams presenting the market circa 2020.
It's just a jump to the left (of boom): Prioritizing detection implementation... (MITRE ATT&CK)
From ATT&CKcon 3.0
By Lindsay Kaye and Scott Small, Recorded Future
Many organizations ask: "Where do I start, and where do I go next" when prioritizing implementation of behavior-based detections? We often hear "use threat intelligence!" but your goals must be qualified and quantified in order to properly prioritize the most relevant TTPs. A wealth of open-sourced, ATT&CK-mapped resources now exists, giving security teams greater access to both detections and red team tests they can implement, but intelligence (also aligned with ATT&CK), is essential to provide necessary context to ensure that detection efforts are focused effectively.
This session will discuss a new approach to the prioritization challenge, starting with an analysis of the current defensive landscape, as measured by ATT&CK coverage for more than a dozen detection repositories and technologies, and guidance on sourcing TTP intelligence. The team will then show how real-world defensive strategies can be strengthened by encompassing a full-spectrum view of threat detection, including the implementation of YARA, Sigma, and Snort in security appliances. Critically, alignment of both intelligence and defenses with ATT&CK enables defenders to move the focus of detection efforts to indications of malicious behavior before the final payload is deployed, where controls are most effective at preventing serious damage to the organization.
The Apache Solr Semantic Knowledge Graph (Trey Grainger)
What if instead of a query returning documents, you could alternatively return other keywords most related to the query: i.e. given a search for "data science", return me back results like "machine learning", "predictive modeling", "artificial neural networks", etc.? Solr’s Semantic Knowledge Graph does just that. It leverages the inverted index to automatically model the significance of relationships between every term in the inverted index (even across multiple fields) allowing real-time traversal and ranking of any relationship within your documents. Use cases for the Semantic Knowledge Graph include disambiguation of multiple meanings of terms (does "driver" mean truck driver, printer driver, a type of golf club, etc.), searching on vectors of related keywords to form a conceptual search (versus just a text match), powering recommendation algorithms, ranking lists of keywords based upon conceptual cohesion to reduce noise, summarizing documents by extracting their most significant terms, and numerous other applications involving anomaly detection, significance/relationship discovery, and semantic search. In this talk, we'll do a deep dive into the internals of how the Semantic Knowledge Graph works and will walk you through how to get up and running with an example dataset to explore the meaningful relationships hidden within your data.
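In recent Solr versions this capability is exposed through the `relatedness()` aggregation of the JSON Facet API: a foreground query defines the concept of interest and a background query provides the baseline against which term significance is scored. A sketch of such a request (the field names and queries here are illustrative):

```json
{
  "query": "*:*",
  "params": {
    "fore": "body:\"data science\"",
    "back": "*:*"
  },
  "facet": {
    "related_terms": {
      "type": "terms",
      "field": "body",
      "limit": 10,
      "sort": { "relatedness": "desc" },
      "facet": {
        "relatedness": "relatedness($fore,$back)"
      }
    }
  }
}
```

Posted against a query handler, this returns the terms in the `body` field ranked by how semantically related they are to "data science", rather than a list of matching documents.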
We discuss the different ways models can be served with MLflow, covering both open source MLflow and Databricks-managed MLflow. We will cover the basic differences between batch scoring and real-time scoring, with special emphasis on the new, upcoming Databricks production-ready model serving.
Building Biomedical Knowledge Graphs for In-Silico Drug Discovery (Vaticle)
The rapid development and spread of analytical tools in the biomedical sciences has produced a variety of information about all sorts of biological components and their functions. Though important individually, their biological characteristics need to be understood in relation to the interactions they have with other biological components, which requires the integration of vast amounts of complex, semantically-rich, heterogeneous data.
Traditional systems are inadequate at accurately modelling and handling data at this scale and complexity, making solutions that speed up the integration and querying of such data a necessity.
In this talk, we present various approaches being used in organisations to build biomedical computational pipelines to address these problems using tools such as Machine Learning and TypeDB. In particular, we discuss how to create an accurate and scalable semantic representation of molecular level biomedical data by presenting examples from drug discovery, precision medicine and competitive intelligence.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Tsunami of Technologies. Are we prepared?
Slide from workshop with open source community in Malaysia.
"Bengkel Bersama Komuniti Sumber Terbuka Bilangan 1 Tahun 2020" in De Baron Resort, Langkawi, Kedah, Malaysia
Top 10 Software to Detect & Prevent Security Vulnerabilities from BlackHat US... (Mobodexter)
BlackHat USA 2015 recently concluded, and we heard a bunch of news about how BlackHat brought to light various security vulnerabilities in day-to-day life, such as the ZigBee protocol, a device for stealing keyless cars, and ATM card skimmers. However, the presenters, who are also ethical hackers, also shared a bunch of tools to help the software community detect and prevent security holes in hardware and software before a product is released. We have reviewed all the presentations from the conference and give you here a list of the top 10 tools/utilities that help in security vulnerability detection and prevention.
Edge computing and the Internet of Things bring great promise, but often just getting data from the edge requires moving mountains. Let's learn how to make edge data ingestion and analytics easier using StreamSets Data Collector Edge, an ultralight, platform-independent, small-footprint open source solution written in Go for streaming data from resource-constrained sensors and personal devices (like medical equipment or smartphones) to Apache Kafka, Amazon Kinesis and many others. This talk includes an overview of the SDC Edge main features, supported protocols and available processors for data transformation, insights on how it solves some challenges of traditional approaches to data ingestion, pipeline design basics, a walk-through of some practical applications (Android devices and Raspberry Pi) and its integration with other technologies such as StreamSets Data Collector, Apache Kafka, Apache Hadoop, InfluxDB and Grafana. The goal here is to make attendees ready to quickly become IoT data intake and SDC Edge ninjas.
Speaker
Guglielmo Iozzia, Big Data Delivery Manager, Optum (United Health)
Open Source Insight: SCA for DevOps, DHS Security, Securing Open Source for G... (Black Duck by Synopsys)
It’s an acronym-filled issue of Open Source Insight, as we look at the question of SCA (software composition analysis) and how it fits into the DevOps environment. The DHS (Department of Homeland Security) has concerning security gaps, according to its OIG (Office of Inspector General). Can the CVE (Common Vulnerabilities and Exposures) gap be closed? The GDPR (General Data Protection Regulation) is bearing down on us like a freight train, and it’s past time to include open source security into your GDPR plans.
Plus, an intro to the Open Hub community, looking at security for blockchain apps, and best practices for open source security in container environments are all featured in this week’s cybersecurity and open source security news.
Product security by Blockchain, AI and Security Certs (LabSharegroup)
Three themes You need to think about Product Security — and some tips for How to Do It
I have been working with software security laboratories and IT security firms for years. I have talked with clients, read and watched dozens of articles/videos and talked with several experts about product security themes, future, technologies.
The three themes are:
Is the blockchain the new technology of trust?
Blockchain has the potential to transform industries. However, security experts have raised questions: if blockchain is broadly used in technology solutions, will security standards be adopted? How do we protect the cryptographic keys that allow access to blockchain applications? It is true that the potential is huge: securing IoT nodes and edge devices with authentication, improving confidentiality and data integrity, disrupting current PKI systems, reducing DDoS attacks, and more.
AI (Machine Learning, Deep Learning, Reinforcement Learning algorithm) potential in Product Security
Machine learning can help in creating products that analyse threats and respond to attacks and security incidents. There are several repositories on GitHub, as well as open-source code from IBM, available to developers. Deep learning networks are growing rapidly thanks to cheap cloud GPU services, and after reinforcement learning's recent successes, nobody knows the upper limit.
Product Security by International security standards and practices
The present, future, and developmental orientations of the independent third-party certification industry. How can international standards keep up with the rapid growth of new technologies and keep applications secure in IoT, blockchain, or AI-driven industries?
Are IT products reliable, secure and will they stay that way?
I would like to explain Product Security in a simple way. My goal is the introduction of product security for Tech startups, fast-growing Tech firms. Furthermore, I would like to emphasize the benefits of product security certification.
Open Source Insight: Black Duck Announces OpsSight for DevOps Open Source Sec... (Black Duck by Synopsys)
Continuing a month of major announcements, Black Duck launched its new product, OpsSight — comprehensive, automated open source container security for production environments — at its FLIGHT 2017 user conference in Boston this week. Targeting the production phase of the software development life cycle, the initial release of OpsSight is optimized for Red Hat’s OpenShift Container Platform.
If you missed FLIGHT 2017, you can read all the news about OpsSight below, as well as stories on FLIGHT keynoters Charlie Miller and Chris Valasek’s presentation on why IoT insecurity is here to stay; the top 5 cybersecurity mistakes you need to avoid; the SEC prepares new cybersecurity guidelines; and security for the connected car
Log Standards & Future Trends, by Dr. Anton Chuvakin
The presentation will discuss how to bring order (in the form of standards!) to the chaotic world of logging.
It will give a brief introduction to logs and logging and explain how and why logs grew so chaotic and disorganized.
Next it will cover why log standards are sorely needed.
It will offer a walk-through that highlights the critical areas of log standardization, and current standardization efforts will be discussed.
Finally, the presentation will cover a few of the emerging and yet-to-emerge trends related to logging and log management.
- Understanding what IoT security is
- The scope of IoT security
- Uses of IoT and where we see it in our daily life
- Possible attack surfaces and the likelihood of IoT-related attacks
- IoT-specific security assessment (the approach, IoT protocols, and how it combines different types of assessment)
- The myths of IoT security, how it has progressed in the past few years, and how far-fetched it can be
- Available resources and tools
OASIS: open source and open standards: internet of things (Jamie Clark)
How FOSS projects and open ICT standards often interact in a virtuous cycle. Recent examples, and a list of IoT-relevant open standards projects at OASIS. Feb 2014
Loading a lot of data into a graph database is not a trivial exercise. TypeDB Loader (formerly known as GraMi) was developed to allow large-scale data import into TypeDB, a strongly-typed database. Recent improvements have immensely simplified the configuration interface to allow for easier data importing, while maintaining features and the promise of loading huge amounts of data into TypeDB as fast as possible.
Natural Language Interface to Knowledge Graph (Vaticle)
Natural language interfaces (NLIs) offer end-users an easy and convenient way to query ontology-based knowledge graphs. They automatically generate database queries from natural language inputs, avoiding the need for the end user to learn different query languages. NLIs can be used with REST APIs to facilitate and enrich interactions with knowledge graphs, in domains such as interactive root cause analysis (RCA), dynamic dashboard generation, and Online Transaction Processing (OLTP).
In this talk, you'll learn about a natural language interface built with a TypeDB server running on Raspberry Pi4. This application offers a conversational bot assistant with Cisco Webex for an efficient and flexible way to facilitate human-machine interactions. In particular, this talk will demonstrate how natural language inputs are translated into TypeQL queries using Abstract Syntax Trees that represent the syntactic structure discovered during the Named Entity Recognition (NER) analysis of the textual inputs provided by Rasa 2.X running on an Intel Celeron J3455 miniPC.
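To make the translation step concrete: given a hypothetical network-monitoring schema, a question such as "Which servers run nginx?" could be turned by such a pipeline into a TypeQL query along these lines (the entity, attribute, and role names are invented for illustration):

```typeql
match
$s isa server, has hostname $h;
$sw isa software, has name "nginx";
(host: $s, installed: $sw) isa deployment;
get $h;
```

The NER step would recognise "servers" and "nginx" as a type and an attribute value, and the abstract syntax tree would determine which relation pattern connects them.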
Talk Summary:
State-of-the-art AI approaches can struggle to create solutions that provide accurate results and stand the test of time. They are also plagued by problems such as bias and a lack of explainability. Causal AI addresses these key problems and is at the center of the Geminos Causeway platform, which is built on TypeDB.
This webinar will give you an introduction to why causal AI is so important, and how you can start to use it to drive more value for your organisation.
Speaker: Stuart Frost
Stu is the CEO and founder of Geminos. Their focus is on building AI-driven solutions for mid-sized smart manufacturing and logistics companies that are frustrated by their inability to digitalize their operations at a sensible cost. Stu has 30 years' experience founding and leading successful data management and analytics startups, starting at 26 when he founded SELECT Software Tools and led the company to a NASDAQ IPO in 1996. He then founded DATAllegro in 2003, which was acquired by Microsoft.
Building a Distributed Database with Raft (Vaticle)
Applications running in production have much higher requirements. Not only do they need to be correct, they also need to be "always-on", handle a much bigger user load, and be secure.
Meet TypeDB Cluster, the TypeDB database built for production scale using the Raft replication algorithm. Join us for a walk through the underlying architecture and the value it brings to developers running an application at scale.
Speaker: Ganeshwara Henanda
Ganesh leads the development of TypeDB Cluster while also managing other aspects such as infrastructure and project management. His day-to-day work involves building concurrent and distributed algorithms such as Raft and the Actor Model.
He graduated with an MSc in Grid Computing from the University of Amsterdam, and has built several large-scale distributed and real-time systems throughout his career.
Enabling the Computational Future of Biology (Vaticle)
Computational biology has revolutionised biomedicine. The volume of data it is generating is growing exponentially. This requires tools that enable computational and non-computational biologists to collaborate and derive meaningful insights. However, traditional systems are inadequate to accurately model and handle data at this scale and complexity.
In this talk, we discuss how TypeDB enables biologists to build a deeper understanding of life, and increase the probability of groundbreaking discoveries, across the life sciences.
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cybersecurity and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
Build your skills and learn how TypeDB's native inference engine works.
Good for:
- Beginners to TypeDB and TypeQL
- Those who have been using TypeDB and want a refresher on inference in TypeDB
- Experienced software engineers
- Those who want to better represent their domain in a model that allows for logical reasoning at the database level
Description:
TypeDB is capable of reasoning over data via pre-defined rules. TypeQL rules look for a given pattern in the database and when found, infer the given queryable fact. The inference provided by rules is performed at query (run) time. Rules not only allow shortening and simplifying of commonly-used queries, but also enable knowledge discovery and implementation of business logic at the database level.
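As a minimal sketch of what such a rule looks like (the `location-hierarchy` types are invented for illustration), a TypeQL rule inferring transitive location containment could read:

```typeql
define

rule transitive-location:
when {
    # if x is inside y, and y is inside z ...
    (subordinate: $x, superior: $y) isa location-hierarchy;
    (subordinate: $y, superior: $z) isa location-hierarchy;
} then {
    # ... then x is also inside z
    (subordinate: $x, superior: $z) isa location-hierarchy;
};
```

A `match` query over `location-hierarchy` would then return both stored and inferred relations at query time, without the client having to spell out the transitivity itself.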
Takeaways:
- Understanding of fundamental components of TypeDB's inference engine and how to write rules for your domain
- Write at least 1 rule for your use case
- Utilise the rule you wrote in a query
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Join the TypeDB community to learn how we think about data modelling, and how TypeDB's expressivity allows you to model your domain based on logical and object-oriented programming principles.
Good for:
- Engineers, scientists, and technical executives
- Those in a technical field working with complex datasets, and building intelligent systems
- Anyone curious to learn about the expressive power of TypeDB's data model
Description:
We open this training with an exploration into what a schema looks like in TypeDB, starting with clarifying the motivation for the conceptual model in TypeDB, and its relationship to the Enhanced Entity-Relationship model.
Then we break things down a bit more philosophically, delving into what it means to represent data in TypeDB, and how TypeDB allows you to think at a higher level, rather than in terms of join tables, columns, documents, vertices, edges, and properties.
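A minimal sketch of what such a TypeQL schema can look like, using a hypothetical employment domain (the names are illustrative, not taken from the training itself):

```typeql
define

# attributes carry the values
full-name sub attribute, value string;
name sub attribute, value string;

# entities own attributes and play roles in relations
person sub entity,
    owns full-name,
    plays employment:employee;
company sub entity,
    owns name,
    plays employment:employer;

# relations declare the roles their players fill
employment sub relation,
    relates employee,
    relates employer;
```

Note that there are no foreign keys or join tables here: the `employment` relation and its named roles are first-class concepts in the model.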
Takeaways:
- Be able to articulate why TypeDB's data model is so beneficial for complex data, and why we use it to build intelligent systems
- Write a TypeDB schema in TypeQL
- Practice modelling one of your own domains
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Using SQL to query relational databases is easy. As a declarative language, it makes it straightforward to write queries and build powerful applications. However, relational databases struggle when working with complex data: challenges arise especially in how that data is modelled and queried.
For example, the large number of necessary JOINs forces us to write long, verbose queries that are difficult to write and prone to mistakes.
TypeQL is the query language used in TypeDB: just as SQL is the standard query language for relational databases, TypeQL is TypeDB's. It is a declarative language, and allows us to model, query and reason over our data.
In this talk, we will look at how TypeQL compares to SQL. Why and when should you use TypeQL over SQL? How do we do outer/inner joins in TypeQL? We'll look at the common concepts, but mostly talk about the differences between the two.
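To give a flavour of the comparison (the schema and names here are hypothetical), a lookup that needs two JOINs in SQL reads as a single pattern in TypeQL:

```typeql
# Rough SQL equivalent:
#   SELECT p.full_name FROM person p
#   JOIN employment e ON e.person_id = p.id
#   JOIN company c ON c.id = e.company_id
#   WHERE c.name = 'Acme';
match
$p isa person, has full-name $n;
$c isa company, has name "Acme";
(employee: $p, employer: $c) isa employment;
get $n;
```

The join logic is absorbed into the relation pattern, so the query stays the same shape however many hops are involved.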
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cybersecurity and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
TypeDB Academy: Getting Started with Schema Design (Vaticle)
In this TypeDB Academy, we start by gaining an understanding of the fundamental components of TypeDB's type system and what makes it unique. We will see how we can download, install, and run TypeDB, and learn to perform basic database operations.
We'll then explore what a schema looks like in TypeDB, starting with the motivation for a schema, the conceptual schema of TypeDB, and its relationship to the Enhanced Entity-Relationship model.
Good for:
- Beginners to TypeDB and TypeQL
- Those who have been using TypeDB and want a refresher on schema and TypeQL
- Experienced database administrators and software engineers
Takeaways:
- Understanding of fundamental components of TypeDB
- How to download, install, and run TypeDB on your computer
- Be able to articulate why schema is so beneficial when using TypeDB, why we use one, and how it enables a more expressive model
- Write a TypeDB schema in TypeQL
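To make the last takeaway concrete, here is a minimal sketch of what a TypeDB schema looks like in TypeQL (the type names are illustrative, not from the session):

```typeql
define
  name sub attribute, value string;
  person sub entity,
    owns name,
    plays employment:employee;
  company sub entity,
    owns name,
    plays employment:employer;
  employment sub relation,
    relates employee,
    relates employer;
```

Entities, relations, and attributes are declared as first-class types, which is what connects the schema directly to the Enhanced Entity-Relationship model discussed in the session.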
Comparing Semantic Web Technologies to TypeDB (Vaticle)
Semantic Web technologies enable us to represent and query very complex and heterogeneous datasets. We can add semantics and reason over large bodies of data on the web. However, despite the wealth of educational material available, they have failed to achieve mass adoption outside academia.
TypeDB works at a higher level of abstraction and enables developers to be more productive when working with complex data. TypeDB is easier to learn, reducing the barrier to entry and enabling more developers to access semantic technologies. Instead of using a myriad of standards and technologies, we just use one language - TypeQL.
In this talk we will:
- look at how TypeQL compares to Semantic Web standards, specifically RDF, SPARQL, RDFS, OWL and SHACL.
- cover questions such as: how do we represent hyper-relations in TypeDB? How does one use rdfs:domain and rdfs:range in TypeDB? And how do the modelling philosophies compare?
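As a rough sketch of the kind of comparison covered (the type names are illustrative): where RDFS constrains a property with rdfs:domain and rdfs:range, TypeQL constrains a relation through its roles and the types permitted to play them:

```typeql
define
  # An "employs" property with rdfs:domain company and
  # rdfs:range person becomes a relation whose roles
  # only the corresponding entity types may play.
  employment sub relation,
    relates employer,
    relates employee;
  company sub entity,
    plays employment:employer;
  person sub entity,
    plays employment:employee;
```

Unlike rdfs:domain and rdfs:range, which are used for inference in RDFS, these role constraints are validated at write time in TypeDB.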
Speaker: Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
How might we utilise an actor-based execution model to build a powerful yet elegant reasoning engine?
Actors are an asynchronous, inherently parallel framework that forms the basis of some of the most computationally heavy systems in the world. By leveraging this in an event-driven model, we can build an execution engine that makes efficient use of all available hardware resources to answer your reasoning queries.
We'll visit the key ideas behind actors, and then walk through how we break reasoning into neat, actor-sized building blocks. As we do this, it will become clear how our marriage of reasoning and actors naturally produces a scalable and elegant execution engine. By examining the problem of reasoning from an actor-based lens, we'll be able to better understand the complexities of reasoning and visualise bottlenecks and optimisations.
Intro to TypeDB and TypeQL | A strongly-typed database (Vaticle)
TypeDB is a strongly-typed database. It provides a rich and logical type system that breaks down complex problems into meaningful, well-structured models, using TypeQL as its query language.
TypeDB allows you to model your domain based on logical and object-oriented principles. Composed of entity, relationship, and attribute types, as well as type hierarchies, roles, and rules, TypeDB allows you to think higher-level, as opposed to join-tables, columns, documents, vertices, and edges.
Types describe the logical structures of your data, allowing TypeDB to validate that your code inserts and queries data correctly. Query validation goes beyond static type-checking: it also logically validates queries, rejecting meaningless ones. With strict type-checking errors, you have a dataset that you can trust.
Finally, TypeDB encodes your data for logical interpretation by its reasoning engine. It enables type-inference and rule-inference, which create logical abstractions of data. This allows for the discovery of facts and patterns that would otherwise be too hard to find.
With these abstractions, queries in the tens to hundreds of lines in SQL or NoSQL databases can be written in just a few lines in TypeQL – collapsing code complexity by orders of magnitude.
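As one hedged example of the rule inference described above (the rule and type names are illustrative, not from the talk), a TypeDB rule that makes a containment relation transitive looks like:

```typeql
define
  rule transitive-containment:
  when {
    (container: $a, contained: $b) isa containment;
    (container: $b, contained: $c) isa containment;
  } then {
    (container: $a, contained: $c) isa containment;
  };
```

At query time, the reasoning engine applies such rules automatically, so the derived relations appear in match results without ever being inserted.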
Join Tomás from the Vaticle team where he'll discuss the origins of TypeDB, the impetus for inventing a new query language, TypeQL, and why we are so excited about the future of software and intelligent systems.
Tomás Sabat:
Tomás is the Chief Operating Officer at Vaticle, dedicated to building a strongly-typed database for intelligent systems. He works directly with TypeDB's open source and enterprise users so they can fulfil their potential with TypeDB and change the world. He focuses mainly on life sciences, cyber security, finance and robotics.
Graph Databases vs TypeDB | What you can't do with graphs (Vaticle)
Developing with graph databases has a number of challenges, such as the modelling of complex schemas, and maintaining data consistency in your database.
In this talk, we discuss how TypeDB addresses these challenges, as well as how it compares to property graph databases. We’ll look at how to read and write data, how to model complex domains, and TypeDB’s ability to infer new data.
The main differences between TypeDB and graph databases can be summarised as:
1. TypeDB provides a concept-level schema with a type system that fully implements the Entity-Relationship (ER) model. Graph databases, on the other hand, use vertices and edges without integrity constraints imposed in the form of a schema
2. TypeDB contains a built-in inference engine - graph databases don’t provide native inferencing capabilities
3. TypeDB is an abstraction over a graph, and leverages a graph database under the hood to create a higher-level model, while graph databases work at different levels of abstraction
Tomás Sabat
Tomás is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
In this seminar we use TypeDB to open a window on the Pandora Papers, a massive 'data tsunami' based on 11.9 million leaked source documents obtained by the International Consortium of Investigative Journalists (ICIJ).
We will use an automated query builder to get an initial set of results, and then hop from node to node, exploring neighbours and mapping out a suspicious-looking network of offshore shell companies, officers and intermediaries.
Speaker: Jon Thompson
Jon has an MSc in Applied Mathematics and has worked for several years as a Data Scientist in high-throughput biological sequencing. He is the founder of Nodelab, which is on a mission to provide a fully-featured graphical user interface experience for TypeDB.
Heterogeneous data holds significant inherent context. We would like our machine learning models to understand this context, and utilise this ancillary but critical information to improve the accuracy and versatility of our models.
How can we systematically make use of context in Machine Learning?
We delve into the knowledge modelling techniques which, applied with the right ML strategies, give us a promising approach for robustly handling heterogeneous data in large knowledge models. We aim to do this in a way that allows us to build any machine learning model, including graph learning models like our KGCN.
Speaker: James Fletcher, Vaticle
James comes from a background of Computer Vision, specialising in automated diagnostics. As Principal Scientist at Vaticle, his mission is to demonstrate to the world how traditional symbolic approaches to AI, built into TypeDB, can be combined with present-day research in machine learning.
Combining Causal and Knowledge Modeling for Digital Transformation (Vaticle)
Geminos has created a low-code digital transformation platform that combines causal and knowledge modeling. It uses TypeDB as its internal repository. Initial projects are in supply chain and smart manufacturing, with a focus on sustainability.
Speakers: Stuart Frost (CEO), Owen Frost (Analyst)
Stu is the CEO and founder of Geminos. Their focus is on building AI-driven solutions for mid-sized Smart Manufacturing and Logistics companies that are frustrated by their inability to digitalize their operations at a sensible cost.
A Knowledge Graph is as valuable as the insights we can derive from it. So, what do we do when our Knowledge Graph doesn’t contain the answers? We need to complete it.
We know that Grakn's logical reasoner can help us to deduce insights. However, when our answers can't be deduced, we need to turn to statistical methods to infer new facts, making predictions inductively, by example. These could be relations, attributes, or even rules.
In this talk, we will delve into the advanced graph learning systems that we can construct and use on top of Grakn to create intelligent systems. This is the core of the research that we conduct at Grakn Labs - all of which is made available in KGLIB.
Text is the medium used to store the tremendous wealth of scientific knowledge regarding the world we live in. However, with its ever-increasing magnitude and throughput, analysing this unstructured data has become a tedious task. This has led to the rise of Natural Language Processing (NLP), as the go-to for examining and processing large amounts of natural language data.
This involves the automatic extraction of structured semantic information from unstructured machine-readable text. The identification of these explicit concepts and relationships helps in discovering multiple insights contained in text in a scalable and effective way.
A major challenge is the mapping of unstructured information from raw texts into entities, relationships and attributes in the knowledge graph. In this talk, we demonstrate how Grakn can be used to create a text mining knowledge graph capable of modelling, storing, and exploring beneficial information extracted from medical literature.
Introduction to Knowledge Graphs with Grakn and Graql (Vaticle)
Cognitive/AI systems process knowledge that is far too complex for current databases. They require an expressive data model and an intelligent query language to perform knowledge engineering over complex datasets.
In this talk, we will discuss how Grakn, a database to organise complex networks of data and make it queryable, provides the knowledge graph foundation for intelligent systems to manage complex data.
We will discuss how Graql, Grakn's reasoning (through OLTP) and analytics (through OLAP) query language, provides the tools required to do the job: a knowledge schema, a logical inference language, and a distributed analytics framework.
And finally, we will discuss how Graql serves as a unified representation of data for cognitive systems.
We explain how we use Grakn as part of a wider solution to deliver next generation Data Operations (Data Ops) tooling, enabling us to deliver sophisticated "Run Graph Analytics".
The Run Graph is a component to passively track and trace our data assets as they move across the organisation, and is used to quickly reverse engineer our global flows of data to better plan change and understand hidden dependencies. When operational failures do arise, we demonstrate how Grakn quickly allows us to assess the inferred impacts downstream, and to prioritise and communicate the impacts of outages to stakeholders.
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio, using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Neuro-symbolic is not enough, we need neuro-*semantic* (Frank van Harmelen)
Neuro-symbolic (NeSy) AI is on the rise. However, simply doing machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as "predictable inference".
All of this is illustrated with link prediction over knowledge graphs, but the argument is general.
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation introduction
UI automation sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell us all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details of how to best design a sturdy architecture within ODC.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes real work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
"Impact of front-end architecture on development cost", Viktor Turskyi (Fwdays)
I have heard many times that architecture is not important for the front-end. I have also often seen developers implement front-end features just by following the standard rules of a framework, thinking that this is enough to successfully launch the project, and then the project fails. How can we prevent this, and which approach should we choose? I have launched dozens of complex projects, and during this talk we will analyze which approaches have worked for me and which have not.
A Data Modelling Framework to Unify Cyber Security Knowledge
1. A Data Modelling Framework to Unify Cyber Security Knowledge
OmnibusCyber
Authors: Dr. Paolo Di Prodi, Dr. Brett Forbes
2. About Me
Paolo Di Prodi
PhD in Machine Learning
Software and automation engineer
Worked for Microsoft, now at Fortinet
Mostly data science in cyber security; prior to that, malware reversing
3. Problem we have right now!
External: Threat Intelligence Exchange
Internal: Any cyber data
4. The Sheriff of Data Modelling
• The classic drama: buy vs build vs reuse
• Buy is not an option
• Build is usually the option
• How can we avoid typical mistakes?
• Can we provide a basic structure?
• With the ability to extend it for each company?
5. External Threat Intelligence
• STIX and TAXII are standards developed in an effort to improve the prevention and mitigation of cyber-attacks. STIX states the "what" of threat intelligence, while TAXII defines "how" that information is relayed. Unlike previous methods of sharing, STIX and TAXII are machine-readable and therefore easily automated.
6. STIX and TAXII
STIX, short for Structured Threat Information eXpression, is a standardized language developed by MITRE and the OASIS Cyber Threat Intelligence (CTI) Technical Committee for describing cyber threat information.
TAXII, short for Trusted Automated eXchange of Intelligence Information, defines how cyber threat information can be shared via services and message exchanges. It is designed specifically to support STIX information.
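For illustration, a minimal machine-readable STIX indicator object looks like the following (the field values here are made up; the shape follows the STIX 2.1 specification):

```json
{
  "type": "indicator",
  "spec_version": "2.1",
  "id": "indicator--d81f86b9-975b-4c0b-875e-810c5ad45a4f",
  "created": "2022-01-01T10:00:00.000Z",
  "modified": "2022-01-01T10:00:00.000Z",
  "name": "Known malicious IP",
  "pattern": "[ipv4-addr:value = '198.51.100.1']",
  "pattern_type": "stix",
  "valid_from": "2022-01-01T10:00:00Z"
}
```

Because every object is plain JSON with a typed `id`, bundles of such objects can be exchanged over TAXII and ingested automatically.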
11. God created silos on the last day
[Diagram: separate query silos for the IPS, EDR, AV, FIM, and ZTN product databases, surrounded by questions such as "Where are my CVEs?", "What is a vulnerability?", "What is the context?", and "Where is my OLAP?"]
• Each product has its own syntax, taxonomies, and ontologies
• Building a federated DB is a big challenge
• Just look at the SIEM vendor space…
15. Unified Cyber Ontology (UCO)
• A foundation for standardized information representation across the cyber security domain/ecosystem
• Last version: 0.9.0 on 16 June 2022
• First version: 0.1.0 on 5 Jan 2017
• Based on: OWL, Java 11
• Key stats: 418 classes, 707 properties, 11,812 triples
• RDF adoption; focus on observables
16. Open Cybersecurity Schema Framework (OCSF)
• The Open Cybersecurity Schema Framework is an open-source project delivering an extensible framework for developing schemas, along with a vendor-agnostic core security schema. Vendors and other data producers can adopt and extend the schema for their specific domains.
• OCSF is intended to be used by both products and devices which produce log events, analytic systems, and logging systems which retain log events.
• First version: 14 July 2022
• Schema: JSON
• Loose inheritance; there is no reference database implementation.
17. Our Advantages
• Extensibility: base schema, inheritance
• Reference implementation: TypeDB, toolbox
• ER: entity-relationships, URIs
• Sharing: native STIX import/export
19. STIX Databases and Extensions
• Section 7.3: Extension Definition Policy (JSON schema)
• Section 11: Custom Object Extensions (deprecated)
• A work in progress for now, in cooperation with OASIS
• Is it possible? Yes
20. Omnibus Design
Layered schemas: Prod schema → Corp schema → Base schema
• Basic pattern: inherit and extend
• The base schema contains the main concepts:
  • CVE/CVSS/CWE/CAPEC
  • MAEC
  • COCOA
  • ATT&CK, D3FEND, Attack Flow, etc.
  • VERIZON/VERIS
• The specialized schema contains the business logic:
  • Sensor facts
  • Incident response playbooks
22. Example for IPS Packet
Inherit and expand
• It's excellent for additions
• The example here is to derive the CVE entity:
  • Add a relation to the device object
  • Add a relation to a volume count
• A simple YAML config drives the specific schema
• Sample record: Hostxyz|2022-01-01T10:00:00|CVE-2022-1234|1000
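A hedged sketch of this inherit-and-extend pattern in TypeQL, assuming the base schema already defines cve and device entity types (the relation and attribute names below are hypothetical):

```typeql
define
  # Extend the base schema: record per-device CVE sightings
  # from the IPS feed, with a timestamp and a volume count.
  volume-count sub attribute, value long;
  observed-at sub attribute, value datetime;
  cve-sighting sub relation,
    owns volume-count,
    owns observed-at,
    relates observed-cve,
    relates observing-device;
  cve plays cve-sighting:observed-cve;
  device plays cve-sighting:observing-device;
```

The base cve and device types are left untouched; the specialized schema only adds the relation and attributes it needs, which is what keeps the base schema shareable across companies.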
23. Cool Benefits
Auto enrichment
• Each entity could have an authoritative source
• This means auto-enrichment in real time, if required
Demo example
• Let's enrich the CVE data stream
• The source is the NVD database
• https://youtu.be/R0fyiBZCEyg
Project: https://github.com/priamai/omnicyberdb/tree/experimental