Uploaded by ScyllaDB

Using ScyllaDB with JanusGraph for Cyber Security

The document discusses lessons learned from using Scylla with JanusGraph for cybersecurity applications by QOMPLX, Inc. It highlights the importance of real-time data analytics and the challenges faced with batch versus streaming data processes. The findings emphasize dynamic graph creation and the need for effective monitoring and observability in operational environments.

Editor's Notes

#11 So what did we learn during this project?
#12 First of all, as if we needed another reminder, graph analytics jobs are tough and computationally expensive. Not all applications require sub-second responses, but in this case our SOC analysts did. We couldn't afford to wait several minutes or hours to do some of the path analysis you saw in Angad’s demo.
#13 OLAP workloads in Janus are commonly offloaded to Spark. As a baseline, we tried the well-worn path of Janus + Spark. While we still offer this facility for some more generalized workloads, for the SOC analysts a 30-minute response time for some of the queries you saw was just not going to cut it. Shorter path queries would return in several minutes or hours; longer queries were wholly impractical, taking days or simply not completing at all.
#14 Therefore, for a small number of key computations, we opted to use something called boostgraph. Boostgraph holds a minimal subset of the graph in memory. While you can’t use it as a proper graph database, you can use it for very specific computations. It can be spun up and down quickly as needed to keep an eye on hosting costs, and you can manage your instance size so that you allocate only the amount of memory that you need.
#15 And it was worth it. In boostgraph, we were able to get the most common path queries to return in under a second. This is what the SOC analyst use cases required, and it was a game changer for the kind of user experience we were able to provide.
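
To make the idea of answering a path query from a small in-memory subset concrete, here is a minimal sketch in plain Python. It is not the boostgraph implementation described in the talk: the adjacency map, the vertex names, and the `shortest_path` helper are all hypothetical, and a breadth-first search stands in for whatever path algorithms were actually used.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Breadth-first search over an in-memory adjacency map.

    Returns the list of vertices from source to target, or None if
    target cannot be reached.
    """
    if source == target:
        return [source]
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = current
            if neighbor == target:
                # Walk the parent links back to the source.
                path = [neighbor]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical subset of an AD-derived graph, held entirely in memory.
ad_subgraph = {
    "user:alice": ["group:helpdesk"],
    "group:helpdesk": ["host:ws01"],
    "host:ws01": ["user:admin"],
    "user:admin": ["group:domain-admins"],
}

path = shortest_path(ad_subgraph, "user:alice", "group:domain-admins")
```

Because only the vertices and edges needed for the query are loaded, memory use tracks the size of the subset rather than the full graph, which is the trade-off the note describes.
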
#16 Another key takeaway, as is common in most data pipelines, concerned whether updates from clients’ Active Directory instances would arrive as batches or as streamed messages. Streaming presents problems with the sequencing of data: what happens when edges come in before the vertices they connect? How do you send over deletions, such as users removed from AD? And in general, detecting changes in AD is not that easy to begin with. We also learned that wholesale batches pose challenges of their own. Rather than doing upserts into fully populated graphs, we opted to keep multiple revisions of a client’s AD representation (today, yesterday, etc.). And due to the nuances of how AD is organized at the client, a single client may have many AD instances and consequently many revisions of that graph.
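
The sequencing problem with streamed updates (an edge arriving before the vertices it connects) can be sketched as follows. This is one possible approach, not the pipeline from the talk: the `StreamingGraphBuilder` class and its method names are hypothetical, and it simply parks an edge until both endpoints have been seen.

```python
from collections import defaultdict


class StreamingGraphBuilder:
    """Hypothetical builder that tolerates edges arriving before their vertices."""

    def __init__(self):
        self.vertices = set()
        self.edges = set()
        # Maps a not-yet-seen vertex to the edges waiting on it.
        self._pending = defaultdict(set)

    def add_vertex(self, vertex):
        self.vertices.add(vertex)
        # Retry any edges that were blocked on this vertex.
        for edge in self._pending.pop(vertex, set()):
            self._try_commit(edge)

    def add_edge(self, src, dst):
        self._try_commit((src, dst))

    def _try_commit(self, edge):
        missing = [v for v in edge if v not in self.vertices]
        if missing:
            # Park the edge under the first missing endpoint.
            self._pending[missing[0]].add(edge)
        else:
            self.edges.add(edge)

    def pending_count(self):
        return sum(len(edges) for edges in self._pending.values())


builder = StreamingGraphBuilder()
builder.add_edge("user:bob", "group:finance")  # edge arrives first
builder.add_vertex("group:finance")
builder.add_vertex("user:bob")                 # edge is committed here
```

A buffer like this handles ordering only; deletions and change detection, which the note also raises, would need separate treatment.
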
#17 So a big new learning, one that required patches submitted back to the Janus community, was how to handle a high number of average-sized graphs. Many graph discussions focus on how to handle BIG graphs. While some of our graphs can be of decent size, our challenge was not abnormally large graphs but rather how to deal with A LOT of distinct graphs. So we found ourselves traipsing through a part of the Janus codebase that was especially immature and needed some surgery, namely the ability to add and remove graphs on the fly, which we addressed by submitting updates to the ConfiguredGraphFactory. In doing so, we also learned that having a high number of graphs puts a lot of memory pressure on the heap. So we also had to work out how to prune older revisions and clean up the previous instances.
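
The bookkeeping behind keeping a bounded number of revisions per AD instance might look like the sketch below. Everything in it is an assumption made for illustration: the `RevisionRegistry` class, the graph naming scheme, and the retention count are hypothetical, and the actual creation and removal of graphs (which the talk handled through the ConfiguredGraphFactory) is deliberately left out.

```python
from collections import defaultdict, deque


class RevisionRegistry:
    """Hypothetical tracker for graph revisions per (client, AD instance)."""

    def __init__(self, max_revisions=2):
        self.max_revisions = max_revisions
        # (client, ad_instance) -> graph names, oldest first
        self._revisions = defaultdict(deque)

    def register(self, client, ad_instance, revision):
        """Record a new revision; return the graph names that should be dropped."""
        name = f"{client}_{ad_instance}_{revision}"
        history = self._revisions[(client, ad_instance)]
        history.append(name)
        dropped = []
        while len(history) > self.max_revisions:
            dropped.append(history.popleft())
        return dropped

    def active(self, client, ad_instance):
        """Return the graph names currently retained, oldest first."""
        return list(self._revisions.get((client, ad_instance), ()))


registry = RevisionRegistry(max_revisions=2)
for day in ("2020-01-01", "2020-01-02", "2020-01-03"):
    to_drop = registry.register("acme", "corp", day)
    # The names in to_drop would be handed to whatever removes the graphs.
```

Pruning at registration time keeps the number of open graphs per AD instance fixed, which is one way to limit the heap pressure the note mentions.
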
#18 What can we say about monitoring and DevOps deployment? When you are keeping multiple representations of graph data updated in a hosted service, don’t cut corners. QOMPLX’s advanced ATO process provided a good rubric for getting ready and keeping eyes on the system.
#19 And lastly, why we’re all here. Having spoken at the last Scylla Summit about how much we like Scylla underneath Janus as the storage layer, our perspective has only strengthened. What can I say, it just works. In a challenging, multi-technology environment, there was no drama.