FireEye believes in intelligence-driven cyber security. Their legacy system used PostgreSQL with a custom graph database system to store and facilitate analysis of threat intelligence data. As their user base increased, they ran into scaling issues that required a system redesign on a new platform.
This presentation will focus on the backend systems and the migration path to a new technology stack using JanusGraph running on top of Scylla plus Elasticsearch.
Using ScyllaDB turned out to be a game-changer in terms of performance and the types of analysis our application is able to do effortlessly.
8. FireEye Threat Intelligence
A portfolio of subscriptions and services designed to address all aspects of an
organization’s intelligence needs.
■ Intelligence Subscriptions
■ Intelligence Enablement
■ Intelligence Capability Development
■ Digital Threat Monitoring
■ Advanced Intelligence Access
9. Application Use Case
■ Homegrown custom graph database on Postgres
■ Centralizes, organizes and processes cyber threat intelligence data
■ Tracks threat groups by recording all of the analytic correlations
■ Provides analytic results by processing and analysing historical data
■ Data Objects - DNS data, RSS feeds, file md5s, FQDNs and URLs
■ Data Size: Nodes ~500M and Edges ~1.5B
10. Existing System as Graph DB
Structure of the Graph
■ Stores data as “nodes” or “edges”
■ Also allows storing tags
Nodes
■ Each node represents a single object, event or evidence
■ E.g. Organizations, actors, hosts, files and FQDNs are represented as nodes in graph
Edges
■ Edges represent the relationships between nodes.
■ E.g. an edge exists from a threat actor to their location
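The structure described on this slide can be sketched in a few lines of Python. This is only an illustration of the node / edge / tag idea, not the actual FireEye schema: the class names, field names, the "located_in" label and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single object, event or piece of evidence (e.g. an actor or an FQDN)."""
    node_id: int
    kind: str                                  # e.g. "actor", "fqdn", "file"
    value: str
    tags: set = field(default_factory=set)     # analyst-defined tags


@dataclass
class Edge:
    """A directed relationship between two nodes."""
    source: int                                # node_id of the origin node
    target: int                                # node_id of the destination node
    label: str                                 # hypothetical relationship name
    tags: set = field(default_factory=set)


class Graph:
    """Minimal in-memory container for nodes and edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node):
        self.nodes[node.node_id] = node
        return node

    def add_edge(self, source, target, label):
        # Refuse edges whose endpoints have not been added yet.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        edge = Edge(source, target, label)
        self.edges.append(edge)
        return edge


# Example from the slide: an edge from a threat actor to their location.
graph = Graph()
actor = graph.add_node(Node(1, "actor", "example-actor"))
place = graph.add_node(Node(2, "location", "example-location"))
located = graph.add_edge(actor.node_id, place.node_id, "located_in")
actor.tags.add("tracked")
```

Tags are kept as a set on both nodes and edges because the slide says tags can be stored alongside either.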
12. Challenges of Existing System
Limitations :
■ Slow performance
■ Not easily scalable
■ Not stable
■ Not highly available
■ Not distributed
Objectives:
■ Replace the current system with a new scalable, highly available,
distributed system.
13. Tech Evaluation for Graph DB
Evaluation Targets - Multiple Graph DBs
■ OrientDB
■ Synapse
■ AWS Neptune
■ JanusGraph
Evaluation Criteria - Based on MoSCoW Model
■ Functional
■ Non-Functional
■ Supportability
14. Why JanusGraph?
Opinionated Selection Criteria for JanusGraph:
■ Indexing capabilities that can be controlled by the user.
■ Free / Full Text search
■ Embedded as well as Server mode setup capability
■ Schema Management
■ Triggers
■ OLAP Capabilities - Distributed Graph Processing
Result:
■ Based on our requirements, tech evaluation and test results, we selected JanusGraph.
17. Why ScyllaDB?
Based on tech evaluations and tests we determined ScyllaDB is the right
backend storage.
Features :
■ Easy Cluster setup
■ Self Tuning
■ Equal Load distribution
■ Easy to Manage On Cloud
■ Less Administration
■ No GC
■ Compression
19. ScyllaDB Usage for Threat Analysis
■ Since data represents threat activity, we can get answers to questions about:
● Threat actors
● Malware
● Threat activity
● Victims
● Various other things.
■ Graph DB tells a story about data by connecting dots
24. Configurations
■ Running on AWS Cloud
■ Single Region (Multi AZ) deployment
■ Using EC2 instances
■ AWS Instance - i3.8xlarge
■ Each Cluster has 7 nodes
■ Clusters - DEV, QA, STAGING, PROD.
H/W         Per Node   Per Cluster
CPU         32         224
RAM (GB)    244        1708
Disk (TB)   16         112
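The per-cluster figures follow from multiplying the per-node figures by the 7 nodes in each cluster. A short check of that arithmetic, using only the numbers on this slide (the variable names are illustrative):

```python
NODES_PER_CLUSTER = 7

# Per-node resources as listed on the slide.
per_node = {"CPU": 32, "RAM (GB)": 244, "Disk (TB)": 16}

# Cluster totals are the per-node values scaled by the node count.
per_cluster = {name: amount * NODES_PER_CLUSTER for name, amount in per_node.items()}

for name, total in per_cluster.items():
    print(f"{name}: {per_node[name]} per node, {total} per cluster")
```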
30. FireEye Traversing with ScyllaDB
■ Very good experience and results observed so far
■ Cost Effective
■ Admin Friendly
■ Superfast
■ Looking at potential opportunities to use ScyllaDB in other projects
31. Thank You All ..!!
■ FireEye
● Architects
● Engineers: Developers, DevOps & QA
● Project and Program Managers
■ JanusGraph
■ ScyllaDB
● Scylla University
● Community
● Summit Organisers
32. Thank you Stay in touch
Any questions?
Rahul Gaikwad
rahul.gaikwad@fireeye.com
Krishna Palati
krishna.palati@FireEye.com
linkedin.com/in/rahul-gaikwad-2712b02a
linkedin.com/in/krishnapalati
Editor's Notes
KP: Hello everyone, hope you are enjoying the CA weather. As you heard in the introduction video, today we will talk about how we at FireEye used ScyllaDB to redesign an existing product and build a new solution for our Intel product portfolio.
KP: I am Krishna Palati. I manage the DevOps team for Solutions Engineering, which comprises Intel, Managed Defense and Incident Response at FireEye. We are responsible for core DevOps, cloud infrastructure operations and database systems. In this presentation we will talk about how we used Scylla to implement a solution that is critical for our Intelligence product portfolio.
RG: Hello, I am Rahul Gaikwad. I am a Staff DevOps Engineer at FireEye. I am responsible for continuous integration and deployment, administration of several databases and cloud operations. I came from India to speak at Scylla Summit about how we do Intel threat analysis using a graph database. We will be talking about the challenges with the existing system and how ScyllaDB helps us solve some of these challenges.
KP
KP
KP:
FireEye is a unique cyber security company in the sense that we bring our security appliances and intelligence capabilities together for our customers.
Appliances can be physical or virtual and include a range of products such as Endpoint (HX), Network (NX) and Email (ETP). Solutions include Intel, Managed Defense and Incident Response.
KP: According to the Forrester report, FireEye is the leader in cyber threat intelligence offerings, both for current content and for our strategy.
We focus on Intel here because the rest of this presentation discusses the problems we encountered with the current technology and the solutions we implemented to address them.
KP: As is evident here, we are an industry-recognized thought leader in cyber intelligence and are often called upon to provide our analysis and thoughts on this topic.
KP
Subscription: Access to published intelligence reports
Enablement: Includes onboarding and provisioning, API integration with your security systems, analyst access, workshops.
Digital Threat Monitoring: Tailored, proactive monitoring and analysis of threats to your brand, your VIPs.
Advanced Intelligence Access: This capability enables direct queries into global visibility, insights and intelligence from FireEye.
https://www.fireeye.com/content/dam/fireeye-www/products/pdfs/pf/intel/ds-fireeye-threat-intelligence.pdf
KP: Now that we have gone through the business aspects of why and how we do Threat Intel, let's briefly talk about our current application and what it does at a very high level.
RG: Our customized graph system stores data as “nodes” or “edges”. It also allows analysts to define and apply tags to nodes and edges; these can be thought of as attributes or characteristics.
Each node represents a single object, event or evidence.
For example, organizations, actor, hacker, host computers, files, and FQDNs are all represented as nodes in the graph database.
Edges represent the relationships between nodes. For example, an edge exists from a threat actor to their location.
RG :
In the above diagram, blue circles indicate nodes, green arrows are edges, red labels are properties, and orange labels are aspects
node 1 - email - sender mail id
node 2 – filemd5 - email content message / file attachment
node 3 - email – receiver mail id
node 4 - ipv4addr – IP address of filemd5 node
SenderEmail-ID (node) sent filemd5 email to ReceiverEmail-Id
Each node has properties in our intel system. For example:
The SenderEmail-Id is associated with APT3 actor - a known hacking group.
Filemd5 has been associated with an email phishing campaign.
ReceiverEmail-id is tagged as a victim.
Filemd5 is associated with the IP address from which such phishing campaigns have been executed in the past.
RG: Over time, our intel system became very effective and popular. Its usage increased from a handful of analysts to several hundred analysts spread across the globe. We became a victim of our own success as we started running into performance limitations.
RG:
Based on our objectives, we started evaluating graph database technologies such as OrientDB, Synapse, AWS Neptune and JanusGraph.
We had various evaluation criteria:
Functional – Traversal speed, full-text search, concurrent users
Non-functional – Pluggable storage backend, High Availability and Disaster Recovery
Supportability – Strong and active user community, already deployed in production, documentation
RG:
Indexing capabilities - We can define the indexes per use case.
Free / Full Text Search - A capability that lets users search for records that include one or more words within a free-text field.
Embedded - We can embed JanusGraph (JG) within the application code layer.
Schema Management - Allows us to define and change the schema. It also validates incoming data (schema validation).
Triggers - The system generates events when certain specific actions are performed on the underlying database store.
OLAP - Online Analytical Processing, using distributed graph processing.
RG :
RG
RG:
When we set up or scale the cluster, we just need to run scylla_setup.sh, which sets up the configuration automatically.
During data migration from the existing system to the new one we got an 80% compression rate.
RG
RG
RG:
Here is an example of how those questions are asked.
We are showing a Gremlin query used to select a node with a specific property.
It then traverses the graph and finds all the other nodes that node is connected to via edges.
As shown in the red highlighted box, the query traversed 15,000 nodes and returned results in 322 ms, about 10 times faster than in our current system.
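The select-then-traverse pattern described above can be illustrated in plain Python. This is not the Gremlin query from the slide and does not talk to JanusGraph; it is a minimal in-memory sketch. The node types follow the email example earlier in these notes, while the values, the edge list and the function names are assumptions made for illustration.

```python
from collections import deque

# Hypothetical stand-in values; node types follow the email example in these notes.
nodes = {
    1: {"type": "email", "value": "sender@example.com"},
    2: {"type": "filemd5", "value": "example-md5"},
    3: {"type": "email", "value": "receiver@example.com"},
    4: {"type": "ipv4addr", "value": "192.0.2.10"},
}
# Assumed edges: sender -> file, file -> receiver, file -> IP address.
edges = [(1, 2), (2, 3), (2, 4)]


def find_nodes(key, value):
    """Step 1: select nodes whose property `key` equals `value`."""
    return [node_id for node_id, props in nodes.items() if props.get(key) == value]


def connected(start):
    """Step 2: breadth-first walk over edges (in either direction) from `start`."""
    adjacency = {node_id: set() for node_id in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in sorted(adjacency[current]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    seen.discard(start)          # report only the other nodes
    return sorted(seen)


start = find_nodes("value", "sender@example.com")[0]
reachable = connected(start)
print(start, reachable)
```

In a real deployment the graph database performs both steps server-side, which is where index support for the property lookup matters.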
KP-
KP
This is a high-level overview of what we built in the cloud. It is an N-tier architecture.
Tiers: App UI and App API, JanusGraph, ScyllaDB (primary) & Elasticsearch (search)
The system is designed with redundancy for each of these components for scalability and HA. They are built across multiple Availability Zones so we are protected against AZ failures.
Everything is in a private VPC with restricted access. Access comes in via Nginx.
Authentication/authorization is handled via an Nginx/OpenResty combination to our internal IDAM server.
All the business logic is abstracted in the Application Tier.
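The talk does not show the actual JanusGraph configuration. As a rough sketch of how the storage and search tiers are typically wired together, the Python snippet below renders a properties file using JanusGraph's CQL storage backend (the backend used to reach Scylla) and its Elasticsearch index backend. The host names and keyspace are hypothetical, and a real deployment would need more options than these.

```python
# Hypothetical host names and keyspace; the keys are JanusGraph configuration options.
settings = {
    "storage.backend": "cql",                       # Scylla is reached over CQL
    "storage.hostname": "scylla-1.internal,scylla-2.internal,scylla-3.internal",
    "storage.cql.keyspace": "janusgraph",
    "index.search.backend": "elasticsearch",        # "search" is the index name
    "index.search.hostname": "es-1.internal,es-2.internal",
}


def render_properties(options):
    """Render the options as key=value lines, one per line."""
    return "\n".join(f"{key}={value}" for key, value in options.items())


properties_text = render_properties(settings)
print(properties_text)
```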
RG
RG
As Krishna mentioned, we have set up all system components in the AWS cloud.
We went through several iterations to arrive at the optimal cluster size and resources to accomplish our goals, such as functionality and data migration from the current system.
RG
RG
RG :
Using these automation tools we can build the whole stack shown in the architecture diagram within minutes to an hour.
RG: We ran a set of queries on the existing and new systems and found that the new system based on Scylla is 10 times faster than the existing system.
KP
KP: Our experience with ScyllaDB has been very good. It's cost effective and performant. We are looking at opportunities to use ScyllaDB in other projects within FireEye.
KP: Finally, a big thanks to our internal FireEye team of Architects, Developers, QA & Devops. Architects and Devs worked closely with Devops to iterate and improve this solution. Our teams are spread across Reston, VA, Amsterdam & Pune, India - and we work very closely to deliver world class solutions.
I would also like to extend our gratitude to JanusGraph and ScyllaDB for the excellent Scylla University resources, the community and the organizers of this Summit.