SlideShare a Scribd company logo
1 of 28
Download to read offline
How Netflix Uses Druid in
Real-time to Ensure a High
Quality Streaming
Experience
September 2020
Ben Sykes, Sr Software Engineer, Netflix
1
Druid @
Netflix
How Netflix Uses Druid in Real-time to
Ensure a High Quality Streaming
Experience
Druid Summit II - Sep 2020
Druid Summit II
Sep 2020
Talking Points
Background
Quality of Experience
Metrics Pipeline
Using Druid
Druid Cluster
Data Ingestion
Managing Performance
Trade-offs
Tools
Druid Summit II
Sep 2020
Glossary
Measure
A value relating to an event. E.g. The presence of an error, size of a
buffer, or a duration.
Metric
Meaning and value derived from one or more measures. Counts,
Rates or Percentiles of measured values.
Dimension, Tag
An attribute of a metric that can be used to group or summarize
metrics by populations with shared properties.
Cardinality
The count of unique values of a given dimension.
Segment
An index file used by Druid to contain data for a given time block. A
time block may be formed of multiple segments.
Druid Summit II
Sep 2020
Quality of Experience
Druid Summit II
Sep 2020
Dashboards &
Monitoring
Druid Summit II
Sep 2020
Automated Alerting
Druid Summit II
Sep 2020
Automated Canary Analysis
Druid Summit II
Sep 2020
Metrics Pipeline
Druid Summit II
Sep 2020
Measure Extraction
Using Druid
Druid Summit II
Sep 2020
Druid Summit II
Sep 2020
Druid Cluster
Druid Summit II
Sep 2020
Rollup
Dimensions Metrics
Druid Summit II
Sep 2020
Cardinality
Dimensions
Device = { iPhone, Chrome, TV, iPhone, iPhone, TV, iPhone } |Device| = 3
Country = { US, CA, US, CA, CA, US, US } |Country| = 2
Druid Summit II
Sep 2020
Segments
Druid Summit II
Sep 2020
Realtime Indexing
Per Kafka Topic Many Partitions
Many Indexers per
MiddleManager,
Many Partitions per
Indexer
Many Segments per
Indexer
Druid Summit II
Sep 2020
Realtime Indexing
Per Kafka Topic Many Partitions
Many Indexers per
MiddleManager,
Many Partitions per
Indexer
Many Segments per
Indexer
Druid Summit II
Sep 2020
Realtime Indexing
Per Kafka Topic Many Partitions
Many Indexers per
MiddleManager,
Many Partitions per
Indexer
Many Segments per
Indexer
Druid Summit II
Sep 2020
Realtime Indexing
Per Kafka Topic Many Partitions
Many Indexers per
MiddleManager,
Many Partitions per
Indexer
Many Segments per
Indexer
Druid Summit II
Sep 2020
Realtime Indexing
Per Kafka Topic Many Partitions
Many Indexers per
MiddleManager,
Many Partitions per
Indexer
Many Segments per
Indexer
2
Druid Summit II
Sep 2020
Managing Performance
Druid Summit II
Sep 2020
Compaction
2
4 35 3 4
Druid Summit II
Sep 2020
Trade-offs
● Restrict dimensionality
● Limit cardinality (TopN, include list)
● Increase query granularity
● Reduce concurrent query limit
● Over-provision cluster
● Education and context on effects
● Monitor cardinality increases
● Sufficient query granularity
● Fail-fast on overload
● Monitor resources, plan scale
Better Query Performance More Flexibility
Druid Summit II
Sep 2020
Measure and Monitor
Cluster Performance
Druid Summit II
Sep 2020
Frequent Benchmarks
Druid Summit II
Sep 2020
Atlas Lumen
Built on OSS
Time for questions
27
Thank you!
Apache Druid is an independent project of The Apache Software Foundation. More information can be found at https://druid.apache.org.
Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
November 2-4, 2020
San Francisco, CA
druidsummit.org
28
Register Now for
Druid Summit

More Related Content

What's hot

Data Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowData Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowDatabricks
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best PracticesCloudera, Inc.
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
 
Event Sourcing & CQRS, Kafka, Rabbit MQ
Event Sourcing & CQRS, Kafka, Rabbit MQEvent Sourcing & CQRS, Kafka, Rabbit MQ
Event Sourcing & CQRS, Kafka, Rabbit MQAraf Karsh Hamid
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®confluent
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Architecting for the Cloud using NetflixOSS - Codemash Workshop
Architecting for the Cloud using NetflixOSS - Codemash WorkshopArchitecting for the Cloud using NetflixOSS - Codemash Workshop
Architecting for the Cloud using NetflixOSS - Codemash WorkshopSudhir Tonse
 
How Apache Kafka® Works
How Apache Kafka® WorksHow Apache Kafka® Works
How Apache Kafka® Worksconfluent
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta LakeDatabricks
 
Microservices Architecture - Bangkok 2018
Microservices Architecture - Bangkok 2018Microservices Architecture - Bangkok 2018
Microservices Architecture - Bangkok 2018Araf Karsh Hamid
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiDatabricks
 
3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and DeltaDatabricks
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentationIlias Okacha
 
Building modern data lakes
Building modern data lakes Building modern data lakes
Building modern data lakes Minio
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELKGeert Pante
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Araf Karsh Hamid
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 

What's hot (20)

Data Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowData Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache Arrow
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
PySpark Best Practices
PySpark Best PracticesPySpark Best Practices
PySpark Best Practices
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 
Event Sourcing & CQRS, Kafka, Rabbit MQ
Event Sourcing & CQRS, Kafka, Rabbit MQEvent Sourcing & CQRS, Kafka, Rabbit MQ
Event Sourcing & CQRS, Kafka, Rabbit MQ
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Architecting for the Cloud using NetflixOSS - Codemash Workshop
Architecting for the Cloud using NetflixOSS - Codemash WorkshopArchitecting for the Cloud using NetflixOSS - Codemash Workshop
Architecting for the Cloud using NetflixOSS - Codemash Workshop
 
How Apache Kafka® Works
How Apache Kafka® WorksHow Apache Kafka® Works
How Apache Kafka® Works
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
Microservices Architecture - Bangkok 2018
Microservices Architecture - Bangkok 2018Microservices Architecture - Bangkok 2018
Microservices Architecture - Bangkok 2018
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta
 
AWS glue technical enablement training
AWS glue technical enablement trainingAWS glue technical enablement training
AWS glue technical enablement training
 
Airflow presentation
Airflow presentationAirflow presentation
Airflow presentation
 
Building modern data lakes
Building modern data lakes Building modern data lakes
Building modern data lakes
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELK
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 

Similar to How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experience

How TrafficGuard uses Druid to Fight Ad Fraud and Bots
How TrafficGuard uses Druid to Fight Ad Fraud and BotsHow TrafficGuard uses Druid to Fight Ad Fraud and Bots
How TrafficGuard uses Druid to Fight Ad Fraud and BotsImply
 
My past-3 yeas-developer-journey-at-linkedin-by-iantsai
My past-3 yeas-developer-journey-at-linkedin-by-iantsaiMy past-3 yeas-developer-journey-at-linkedin-by-iantsai
My past-3 yeas-developer-journey-at-linkedin-by-iantsaiKim Kao
 
Druid at Strata Conf NY 2016.pdf
Druid at Strata Conf NY 2016.pdfDruid at Strata Conf NY 2016.pdf
Druid at Strata Conf NY 2016.pdfHimanshuGupta936
 
Game Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupGame Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupJelena Zanko
 
Real Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and KafkaReal Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and KafkaDaria Litvinov
 
Pivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
Pivoting Spring XD to Spring Cloud Data Flow with Sabby AnandanPivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
Pivoting Spring XD to Spring Cloud Data Flow with Sabby AnandanPivotalOpenSourceHub
 
GoGrid 3.0 Webinar: Complex Infrastructure Made Easy - Learn About the GoGrid...
GoGrid 3.0 Webinar: Complex Infrastructure Made Easy - Learn About the GoGrid...GoGrid 3.0 Webinar: Complex Infrastructure Made Easy - Learn About the GoGrid...
GoGrid 3.0 Webinar: Complex Infrastructure Made Easy - Learn About the GoGrid...GoGrid Cloud Hosting
 
The DID Report 1: The First Official W3C DID Working Group Meeting (Japan)- D...
The DID Report 1: The First Official W3C DID Working Group Meeting (Japan)- D...The DID Report 1: The First Official W3C DID Working Group Meeting (Japan)- D...
The DID Report 1: The First Official W3C DID Working Group Meeting (Japan)- D...SSIMeetup
 
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...Alexandr Savchenko
 
"Different software evolutions from Start till Release in PHP product" Oleksa...
"Different software evolutions from Start till Release in PHP product" Oleksa..."Different software evolutions from Start till Release in PHP product" Oleksa...
"Different software evolutions from Start till Release in PHP product" Oleksa...Fwdays
 
Paypal teradata gimel_thrift_server
Paypal teradata gimel_thrift_serverPaypal teradata gimel_thrift_server
Paypal teradata gimel_thrift_serverAnisha Nainani
 
Ataas2016 - Big data hadoop and map reduce - new age tools for aid to test...
Ataas2016 - Big data   hadoop and map reduce  - new age tools for aid to test...Ataas2016 - Big data   hadoop and map reduce  - new age tools for aid to test...
Ataas2016 - Big data hadoop and map reduce - new age tools for aid to test...Agile Testing Alliance
 
Deploy multi-environment application with Azure DevOps
Deploy multi-environment application with Azure DevOpsDeploy multi-environment application with Azure DevOps
Deploy multi-environment application with Azure DevOpsAndrea Tosato
 
Docker data science pipeline
Docker data science pipelineDocker data science pipeline
Docker data science pipelineDataWorks Summit
 
How to Gain Visibility into Containers, VM’s and Multi-Cloud Environments Usi...
How to Gain Visibility into Containers, VM’s and Multi-Cloud Environments Usi...How to Gain Visibility into Containers, VM’s and Multi-Cloud Environments Usi...
How to Gain Visibility into Containers, VM’s and Multi-Cloud Environments Usi...InfluxData
 
OSCON 2018 Getting Started with Hyperledger Indy
OSCON 2018 Getting Started with Hyperledger IndyOSCON 2018 Getting Started with Hyperledger Indy
OSCON 2018 Getting Started with Hyperledger IndyTracy Kuhrt
 
Delivering Quality at Speed with GitOps
Delivering Quality at Speed with GitOpsDelivering Quality at Speed with GitOps
Delivering Quality at Speed with GitOpsWeaveworks
 
The Right Tools for IoT Developers – Dan Gross @ Eclipse IoT Day ThingMonk 2016
The Right Tools for IoT Developers – Dan Gross @ Eclipse IoT Day ThingMonk 2016The Right Tools for IoT Developers – Dan Gross @ Eclipse IoT Day ThingMonk 2016
The Right Tools for IoT Developers – Dan Gross @ Eclipse IoT Day ThingMonk 2016Benjamin Cabé
 
Critical Breakthroughs and Challenges in Big Data and Analytics
Critical Breakthroughs and Challenges in Big Data and AnalyticsCritical Breakthroughs and Challenges in Big Data and Analytics
Critical Breakthroughs and Challenges in Big Data and AnalyticsData Driven Innovation
 

Similar to How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experience (20)

How TrafficGuard uses Druid to Fight Ad Fraud and Bots
How TrafficGuard uses Druid to Fight Ad Fraud and BotsHow TrafficGuard uses Druid to Fight Ad Fraud and Bots
How TrafficGuard uses Druid to Fight Ad Fraud and Bots
 
My past-3 yeas-developer-journey-at-linkedin-by-iantsai
My past-3 yeas-developer-journey-at-linkedin-by-iantsaiMy past-3 yeas-developer-journey-at-linkedin-by-iantsai
My past-3 yeas-developer-journey-at-linkedin-by-iantsai
 
Druid at Strata Conf NY 2016.pdf
Druid at Strata Conf NY 2016.pdfDruid at Strata Conf NY 2016.pdf
Druid at Strata Conf NY 2016.pdf
 
Game Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid MeetupGame Analytics at London Apache Druid Meetup
Game Analytics at London Apache Druid Meetup
 
Real Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and KafkaReal Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and Kafka
 
Pivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
Pivoting Spring XD to Spring Cloud Data Flow with Sabby AnandanPivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
Pivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan
 
Saroj_Profile
Saroj_ProfileSaroj_Profile
Saroj_Profile
 
GoGrid 3.0 Webinar: Complex Infrastructure Made Easy - Learn About the GoGrid...
GoGrid 3.0 Webinar: Complex Infrastructure Made Easy - Learn About the GoGrid...GoGrid 3.0 Webinar: Complex Infrastructure Made Easy - Learn About the GoGrid...
GoGrid 3.0 Webinar: Complex Infrastructure Made Easy - Learn About the GoGrid...
 
The DID Report 1: The First Official W3C DID Working Group Meeting (Japan)- D...
The DID Report 1: The First Official W3C DID Working Group Meeting (Japan)- D...The DID Report 1: The First Official W3C DID Working Group Meeting (Japan)- D...
The DID Report 1: The First Official W3C DID Working Group Meeting (Japan)- D...
 
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...
PHPFrameworkDay 2020 - Different software evolutions from Start till Release ...
 
"Different software evolutions from Start till Release in PHP product" Oleksa...
"Different software evolutions from Start till Release in PHP product" Oleksa..."Different software evolutions from Start till Release in PHP product" Oleksa...
"Different software evolutions from Start till Release in PHP product" Oleksa...
 
Paypal teradata gimel_thrift_server
Paypal teradata gimel_thrift_serverPaypal teradata gimel_thrift_server
Paypal teradata gimel_thrift_server
 
Ataas2016 - Big data hadoop and map reduce - new age tools for aid to test...
Ataas2016 - Big data   hadoop and map reduce  - new age tools for aid to test...Ataas2016 - Big data   hadoop and map reduce  - new age tools for aid to test...
Ataas2016 - Big data hadoop and map reduce - new age tools for aid to test...
 
Deploy multi-environment application with Azure DevOps
Deploy multi-environment application with Azure DevOpsDeploy multi-environment application with Azure DevOps
Deploy multi-environment application with Azure DevOps
 
Docker data science pipeline
Docker data science pipelineDocker data science pipeline
Docker data science pipeline
 
How to Gain Visibility into Containers, VM’s and Multi-Cloud Environments Usi...
How to Gain Visibility into Containers, VM’s and Multi-Cloud Environments Usi...How to Gain Visibility into Containers, VM’s and Multi-Cloud Environments Usi...
How to Gain Visibility into Containers, VM’s and Multi-Cloud Environments Usi...
 
OSCON 2018 Getting Started with Hyperledger Indy
OSCON 2018 Getting Started with Hyperledger IndyOSCON 2018 Getting Started with Hyperledger Indy
OSCON 2018 Getting Started with Hyperledger Indy
 
Delivering Quality at Speed with GitOps
Delivering Quality at Speed with GitOpsDelivering Quality at Speed with GitOps
Delivering Quality at Speed with GitOps
 
The Right Tools for IoT Developers – Dan Gross @ Eclipse IoT Day ThingMonk 2016
The Right Tools for IoT Developers – Dan Gross @ Eclipse IoT Day ThingMonk 2016The Right Tools for IoT Developers – Dan Gross @ Eclipse IoT Day ThingMonk 2016
The Right Tools for IoT Developers – Dan Gross @ Eclipse IoT Day ThingMonk 2016
 
Critical Breakthroughs and Challenges in Big Data and Analytics
Critical Breakthroughs and Challenges in Big Data and AnalyticsCritical Breakthroughs and Challenges in Big Data and Analytics
Critical Breakthroughs and Challenges in Big Data and Analytics
 

More from Imply

Pivot 2.0 - The next generation visualization tool for your streaming data
Pivot 2.0 - The next generation visualization tool for your streaming dataPivot 2.0 - The next generation visualization tool for your streaming data
Pivot 2.0 - The next generation visualization tool for your streaming dataImply
 
Druid in Spot Instances
Druid in Spot InstancesDruid in Spot Instances
Druid in Spot InstancesImply
 
Apache Druid®: A Dance of Distributed Processes
 Apache Druid®: A Dance of Distributed Processes Apache Druid®: A Dance of Distributed Processes
Apache Druid®: A Dance of Distributed ProcessesImply
 
Zeotap: Data Modeling in Druid for Non temporal and Nested Data
Zeotap: Data Modeling in Druid for Non temporal and Nested DataZeotap: Data Modeling in Druid for Non temporal and Nested Data
Zeotap: Data Modeling in Druid for Non temporal and Nested DataImply
 
Nielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeNielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeImply
 
Archmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on DruidArchmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on DruidImply
 
Building Data Applications with Apache Druid
Building Data Applications with Apache DruidBuilding Data Applications with Apache Druid
Building Data Applications with Apache DruidImply
 
Maximizing Apache Druid performance: Beyond the basics
Maximizing Apache Druid performance: Beyond the basicsMaximizing Apache Druid performance: Beyond the basics
Maximizing Apache Druid performance: Beyond the basicsImply
 
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...Imply
 
Self Service Analytics at Twitch
Self Service Analytics at TwitchSelf Service Analytics at Twitch
Self Service Analytics at TwitchImply
 
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...Imply
 
August meetup - All about Apache Druid
August meetup - All about Apache Druid August meetup - All about Apache Druid
August meetup - All about Apache Druid Imply
 
Benchmarking Apache Druid
Benchmarking Apache DruidBenchmarking Apache Druid
Benchmarking Apache DruidImply
 
Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Imply
 
Why data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsWhy data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsImply
 
What’s New in Imply 3.3 & Apache Druid 0.18
What’s New in Imply 3.3 & Apache Druid 0.18What’s New in Imply 3.3 & Apache Druid 0.18
What’s New in Imply 3.3 & Apache Druid 0.18Imply
 
Apache Druid Vision and Roadmap
Apache Druid Vision and RoadmapApache Druid Vision and Roadmap
Apache Druid Vision and RoadmapImply
 
Analytics over Terabytes of Data at Twitter
Analytics over Terabytes of Data at TwitterAnalytics over Terabytes of Data at Twitter
Analytics over Terabytes of Data at TwitterImply
 

More from Imply (18)

Pivot 2.0 - The next generation visualization tool for your streaming data
Pivot 2.0 - The next generation visualization tool for your streaming dataPivot 2.0 - The next generation visualization tool for your streaming data
Pivot 2.0 - The next generation visualization tool for your streaming data
 
Druid in Spot Instances
Druid in Spot InstancesDruid in Spot Instances
Druid in Spot Instances
 
Apache Druid®: A Dance of Distributed Processes
 Apache Druid®: A Dance of Distributed Processes Apache Druid®: A Dance of Distributed Processes
Apache Druid®: A Dance of Distributed Processes
 
Zeotap: Data Modeling in Druid for Non temporal and Nested Data
Zeotap: Data Modeling in Druid for Non temporal and Nested DataZeotap: Data Modeling in Druid for Non temporal and Nested Data
Zeotap: Data Modeling in Druid for Non temporal and Nested Data
 
Nielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in PracticeNielsen: Casting the Spell - Druid in Practice
Nielsen: Casting the Spell - Druid in Practice
 
Archmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on DruidArchmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on Druid
 
Building Data Applications with Apache Druid
Building Data Applications with Apache DruidBuilding Data Applications with Apache Druid
Building Data Applications with Apache Druid
 
Maximizing Apache Druid performance: Beyond the basics
Maximizing Apache Druid performance: Beyond the basicsMaximizing Apache Druid performance: Beyond the basics
Maximizing Apache Druid performance: Beyond the basics
 
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C...
 
Self Service Analytics at Twitch
Self Service Analytics at TwitchSelf Service Analytics at Twitch
Self Service Analytics at Twitch
 
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
Apache Druid: Lightning Fast Analytics on Real-time and Historical Data (Atla...
 
August meetup - All about Apache Druid
August meetup - All about Apache Druid August meetup - All about Apache Druid
August meetup - All about Apache Druid
 
Benchmarking Apache Druid
Benchmarking Apache DruidBenchmarking Apache Druid
Benchmarking Apache Druid
 
Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)
 
Why data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsWhy data warehouses cannot support hot analytics
Why data warehouses cannot support hot analytics
 
What’s New in Imply 3.3 & Apache Druid 0.18
What’s New in Imply 3.3 & Apache Druid 0.18What’s New in Imply 3.3 & Apache Druid 0.18
What’s New in Imply 3.3 & Apache Druid 0.18
 
Apache Druid Vision and Roadmap
Apache Druid Vision and RoadmapApache Druid Vision and Roadmap
Apache Druid Vision and Roadmap
 
Analytics over Terabytes of Data at Twitter
Analytics over Terabytes of Data at TwitterAnalytics over Terabytes of Data at Twitter
Analytics over Terabytes of Data at Twitter
 

Recently uploaded

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 

Recently uploaded (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 

How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experience

  • 1. How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experience September 2020 Ben Sykes, Sr Software Engineer, Netflix 1
  • 2. Druid @ Netflix How Netflix Uses Druid in Real-time to Ensure a High Quality Streaming Experience Druid Summit II - Sep 2020
  • 3. Druid Summit II Sep 2020 Talking Points Background Quality of Experience Metrics Pipeline Using Druid Druid Cluster Data Ingestion Managing Performance Trade-offs Tools
  • 4. Druid Summit II Sep 2020 Glossary Measure A value relating to an event. E.g. The presence of an error, size of a buffer, or a duration. Metric Meaning and value derived from one or more measures. Counts, Rates or Percentiles of measured values. Dimension, Tag An attribute of a metric that can be used to group or summarize metrics by populations with shared properties. Cardinality The count of unique values of a given dimension. Segment An index file used by Druid to contain data for a given time block. A time block may be formed of multiple segments.
  • 5. Druid Summit II Sep 2020 Quality of Experience
  • 6. Druid Summit II Sep 2020 Dashboards & Monitoring
  • 7. Druid Summit II Sep 2020 Automated Alerting
  • 8. Druid Summit II Sep 2020 Automated Canary Analysis
  • 9. Druid Summit II Sep 2020 Metrics Pipeline
  • 10. Druid Summit II Sep 2020 Measure Extraction
  • 12. Druid Summit II Sep 2020 Druid Cluster
  • 13. Druid Summit II Sep 2020 Rollup Dimensions Metrics
  • 14. Druid Summit II Sep 2020 Cardinality Dimensions Device = { iPhone, Chrome, TV, iPhone, iPhone, TV, iPhone } |Device| = 3 Country = { US, CA, US, CA, CA, US, US } |Country| = 2
  • 15. Druid Summit II Sep 2020 Segments
  • 16. Druid Summit II Sep 2020 Realtime Indexing Per Kafka Topic Many Partitions Many Indexers per MiddleManager, Many Partitions per Indexer Many Segments per Indexer
  • 17. Druid Summit II Sep 2020 Realtime Indexing Per Kafka Topic Many Partitions Many Indexers per MiddleManager, Many Partitions per Indexer Many Segments per Indexer
  • 18. Druid Summit II Sep 2020 Realtime Indexing Per Kafka Topic Many Partitions Many Indexers per MiddleManager, Many Partitions per Indexer Many Segments per Indexer
  • 19. Druid Summit II Sep 2020 Realtime Indexing Per Kafka Topic Many Partitions Many Indexers per MiddleManager, Many Partitions per Indexer Many Segments per Indexer
  • 20. Druid Summit II Sep 2020 Realtime Indexing Per Kafka Topic Many Partitions Many Indexers per MiddleManager, Many Partitions per Indexer Many Segments per Indexer 2
  • 21. Druid Summit II Sep 2020 Managing Performance
  • 22. Druid Summit II Sep 2020 Compaction 2 4 35 3 4
  • 23. Druid Summit II Sep 2020 Trade-offs ● Restrict dimensionality ● Limit cardinality (TopN, include list) ● Increase query granularity ● Reduce concurrent query limit ● Over-provision cluster ● Education and context on effects ● Monitor cardinality increases ● Sufficient query granularity ● Fail-fast on overload ● Monitor resources, plan scale Better Query Performance More Flexibility
  • 24. Druid Summit II Sep 2020 Measure and Monitor Cluster Performance
  • 25. Druid Summit II Sep 2020 Frequent Benchmarks
  • 26. Druid Summit II Sep 2020 Atlas Lumen Built on OSS
  • 27. Time for questions 27 Thank you! Apache Druid is an independent project of The Apache Software Foundation. More information can be found at https://druid.apache.org. Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
  • 28. November 2-4, 2020 San Francisco, CA druidsummit.org 28 Register Now for Druid Summit