Graph Databases at Netflix

The document discusses the use of JanusGraph at Netflix, detailing its capabilities as a scalable graph database for managing vast amounts of data with high concurrency. It highlights integrations with various tools and metrics frameworks, usage statistics including over 200 million nodes, and applications in areas such as authorization and infrastructure mapping. JanusGraph efficiently handles complex queries and real-time graph traversals within Netflix's cloud environment.

1.
Graph Databases @ Netflix
Ioannis Papapanagiotou, Cloud Database Engineering
2.
Data Model
3.
AWS Neptune
4.
● JanusGraph is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster.
● Can support thousands of concurrent users executing complex graph traversals in real time.
● Native integration with open graph APIs
  ○ TinkerPop: Gremlin graph query language
5.
Netflix integration
● Discovery (Eureka)
● Integration with our Dynamic Fast Property framework (Archaius)
● Metrics (Spectator)
● Request Tracing (Salp)
● Failure Injection Testing (FIT)
● Credential Management (Metatron)
6.
Digital Asset Management
[Diagram: graph of asset entities. Node labels: Artwork, Display Set A, Display Set B, Character, Movie, Person, Video Track, Audio Track, Text Track, Video Seg, Montage, Trailer, PNG, WebP, JPG, Es Sub, Fr Sub, Fr Dub]
7.
DAM JanusGraph usage
● All entities (Assets, Movie, Display Sets, etc.) are vertices
● Collections are also vertices, with links to all entities within the collection
● All relations are edges
● Indexes: composite, for property-key-based lookups
● Entities are indexed in Elasticsearch for property-based search outside of the JanusGraph context
8.
Current Stats (2017)
● 200 M nodes in PROD cluster
● Hundred queries and updates per minute
● 70 asset types (schema definitions)
● >10 different client applications
● Test cluster with over 200 M nodes
9.
Other use cases
● Authorization, user and partner management
● Distributed tracing
● Identify network dependencies
● Mapping Netflix infra and the relationships
  ○ How code gets committed to Stash, built on Jenkins, deployed by Spinnaker

Editor's Notes

#3 The primary difference is that in a graph database the relationships are stored at the individual record level, while in a relational database the structure is defined at a higher level (the table definitions). Graph databases can be faster than relational databases for deeply connected data, a strength of the underlying model. Graph databases take less memory for traversing deep relationships that would require multiple relational joins. A graph database schema obviates the need to manually manage indexes and reduces the write-time cost of multiple indexes to cover every necessary join. If you use many-to-many relationships in a relational database, you have to introduce a JOIN table (or junction table) that holds foreign keys of both participating tables, which further increases join operation costs. Those costly join operations are usually addressed by denormalizing data to reduce the number of joins necessary. Graph databases make modeling and querying more intuitive.
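To make the junction-table comparison in this note concrete, here is a minimal Python sketch. The table names, the `acted_in` relationship, the ids, and the sample rows are hypothetical and not from the talk; `sqlite3` stands in for a relational database and a plain dictionary stands in for a graph store that keeps relationships on each record.

```python
import sqlite3

# Relational side: a many-to-many relationship needs a junction table
# that holds foreign keys of both participating tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE movie  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE acted_in (
        person_id INTEGER REFERENCES person(id),
        movie_id  INTEGER REFERENCES movie(id)
    );
    INSERT INTO person VALUES (1, 'Krysten Ritter');
    INSERT INTO movie  VALUES (10, 'Jessica Jones');
    INSERT INTO acted_in VALUES (1, 10);
""")


def movies_relational(name):
    # Two joins: person -> junction table -> movie.
    rows = conn.execute(
        """
        SELECT m.title
        FROM person p
        JOIN acted_in a ON a.person_id = p.id
        JOIN movie m    ON m.id = a.movie_id
        WHERE p.name = ?
        ORDER BY m.title
        """,
        (name,),
    ).fetchall()
    return [title for (title,) in rows]


# Graph side: each record carries its own relationships (hypothetical layout).
vertices = {
    "p1": {"label": "Person", "name": "Krysten Ritter",
           "out": {"acted_in": ["m10"]}},
    "m10": {"label": "Movie", "title": "Jessica Jones", "out": {}},
}


def movies_graph(person_id):
    # Follow the edges stored on the person record; no junction table involved.
    targets = vertices[person_id]["out"].get("acted_in", [])
    return sorted(vertices[t]["title"] for t in targets)
```

Both functions answer the same question. The relational version pays for two joins per hop, while the graph version follows references already attached to the record.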
#4 JanusGraph is OSS. It is highly concurrent and supports ES and Cassandra. For Cassandra, we have great plugins developed for the DataStax Java Driver. JanusGraph can index with ES or Solr. ES is also heavily used at Netflix.
#6 https://medium.com/netflix-techblog/fit-failure-injection-testing-35d8e2a9bb2
#7 DAM is part of content platform engineering (CPE). CPE deals with a lot of digital entities and assets. These entities need to be shared across many microservices. We felt that the data can be modeled very well with a graph database. DAM stores the metadata for all the assets. The metadata can be of different kinds and may or may not be connected. The assets have to be searchable and the system has to be highly available. Here is a simplified example of how the assets are stored in the database. The artwork is related to a movie, a person, and a character. For example, the movie can be Jessica Jones, the person Krysten Ritter, and the character Jessica Jones. DAM uses the TinkerPop API and Gremlin to query the data. The data is stored in Cassandra. The movie could have different display sets based on the language. Each display set may have different entities, such as subs/dubs.
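The Jessica Jones example in this note can be sketched as a tiny in-memory property graph. This is only an illustration in Python, not JanusGraph or the DAM schema: the `TinyGraph` class, the vertex ids, the `related_to` edge label, and the `format` property are all hypothetical.

```python
from collections import defaultdict


class TinyGraph:
    """A toy in-memory property graph; not JanusGraph."""

    def __init__(self):
        self.vertices = {}                  # vertex id -> properties
        self.out_edges = defaultdict(list)  # vertex id -> [(edge label, target id)]

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = {"label": label, **props}

    def add_edge(self, src, edge_label, dst):
        self.out_edges[src].append((edge_label, dst))

    def out(self, vid, edge_label):
        """Return properties of vertices reached over edges with this label."""
        return [self.vertices[dst]
                for lbl, dst in self.out_edges[vid] if lbl == edge_label]


g = TinyGraph()
g.add_vertex("movie-1", "Movie", name="Jessica Jones")
g.add_vertex("person-1", "Person", name="Krysten Ritter")
g.add_vertex("char-1", "Character", name="Jessica Jones")
g.add_vertex("art-1", "Artwork", format="JPG")
for target in ("movie-1", "person-1", "char-1"):
    g.add_edge("art-1", "related_to", target)

# Everything the artwork is related to, in insertion order.
related = [(v["label"], v["name"]) for v in g.out("art-1", "related_to")]

# With the same hypothetical ids and edge label, a roughly equivalent
# Gremlin traversal might look like: g.V('art-1').out('related_to').valueMap()
```

Entities are vertices and every relation is an edge, so adding a new asset type or a new kind of relation does not require a schema change in this toy model.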
#8 All queries are based on the vertex ID. All entities are vertices.
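The slides mention composite indexes for property-key-based lookups alongside lookups by vertex ID. The following Python sketch shows the general idea of an exact-match index over a fixed set of property keys. It is a toy model under assumed property names (`type`, `name`) and ids, not JanusGraph's index implementation.

```python
from collections import defaultdict

# Hypothetical vertices keyed by vertex id.
vertices = {
    1: {"type": "Movie", "name": "Jessica Jones"},
    2: {"type": "Character", "name": "Jessica Jones"},
    3: {"type": "Person", "name": "Krysten Ritter"},
}


def build_composite_index(vertices, keys):
    """Map each tuple of values for `keys` to the set of matching vertex ids."""
    index = defaultdict(set)
    for vid, props in vertices.items():
        # Only vertices that carry every indexed key are included.
        if all(k in props for k in keys):
            index[tuple(props[k] for k in keys)].add(vid)
    return index


by_type_and_name = build_composite_index(vertices, ("type", "name"))


def lookup(index, *values):
    # Exact match on every indexed key; no scan over all vertices.
    return sorted(index.get(values, set()))
```

A lookup by vertex ID is a direct `vertices[1]`, while `lookup(by_type_and_name, "Movie", "Jessica Jones")` resolves a property combination to ids first. Free-text or range search does not fit this structure, which is consistent with the slides delegating property-based search to Elasticsearch.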
#9 As Netflix Studios grows, these numbers will grow exponentially.
Storage: 12 i2.4xlarge Cassandra + 18 r3.4xlarge ES.
Issues:
● Index management. Thinking of partitioning the data.
● Multi-level traversals may not be very performant. Goes one step at a time. When you are doing
● JG stores the data in C* in binary format. It is hard to perform analytics.
● Lack of visualization tools.
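The remark that multi-level traversals go one step at a time can be illustrated with a hop-by-hop expansion. This Python sketch is a toy cost model, not a description of how JanusGraph executes traversals internally. The adjacency data is hypothetical, and `lookups` simply counts one adjacency fetch per frontier vertex per hop.

```python
def traverse(adjacency, start, hops):
    """Expand the frontier one hop at a time; return (reached, lookups)."""
    frontier = {start}
    seen = {start}
    lookups = 0  # one adjacency fetch per frontier vertex per hop
    for _ in range(hops):
        next_frontier = set()
        for v in frontier:
            lookups += 1
            for n in adjacency.get(v, ()):
                if n not in seen:
                    seen.add(n)
                    next_frontier.add(n)
        frontier = next_frontier
    return seen - {start}, lookups


# Hypothetical collection -> movies -> artwork -> file hierarchy.
adjacency = {
    "collection": ["movie-a", "movie-b"],
    "movie-a": ["art-1", "art-2"],
    "movie-b": ["art-3"],
    "art-1": ["jpg-1"],
}
```

In this model each extra hop has to wait for the previous frontier to be resolved, and the number of fetches grows with the size of each frontier, which is one way to see why deep traversals over a remote storage backend can become slow.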
#10 Protego provides authorization as a service, along with a User and Partner Management service. Currently, for some of the API requirements, indexes and inverted indexes are managed in the C* DB. Based on future API requirements (support for TTL and entities) and the relationships between some of the concepts, a graph DB can prove to be a good underlying storage. By moving to a graph DB, managing relationships between objects will become easy, as will the queries and index management needed to support existing and future API requirements.
Salp, our distributed tracing solution, was initially developed by the Runtime platform team. The primary goal was data collection from various services and providing a proof-of-concept visualization. Later on, the services (Crawler, Aggregator, Backend API) were added to enrich the data and allow querying.
Dredge wants to identify network dependencies. This can assist when network events happen and can significantly reduce our mean time to detection. The #core team would probably be the consumer of this service, which is named Kevin.