SlideShare a Scribd company logo
1 of 36
Case Study: Real-time Analytics
With Druid
Salil Kalia, Tech Lead, TO THE NEW Digital
About Presenter
• Over 10 years in software industry
• Working with TO THE NEW Digital since 2009
• Using mainly Java/Groovy/Grails eco-systems
for the development purpose
• Working on Digital marketing domain for the
last few years
• Cassandra certified trainer
• Loves traveling and exploring new places
Agenda
Understanding the use-case
• Ad workflow
• Our use case
Experiments with technologies
• Redis
• Cassandra
Introduction to Druid
• Architecture
• Druid in production
• Demo
Understanding the
use-case
Understanding The Ad Workflow
AD
AGENCY-2
AD
AGENCY-2
AD
AGENCY-3
AD
AGENCY-3
AD
AGENCY-1
AD
AGENCY-1
USER
Web Page
Request
Ad
Request
Ad-Content
PUBLISHER
SERVER
AD
EXCHANGE
Examples From Our Use Case
•How many times a video has been viewed ?
•How many times a video has been viewed in a particular
time-span ?
•How many times a video has been viewed in a particular
time-span at a particular site ?
•How many times a video has been viewed in a particular
time-span at a particular site in a particular country ?
•How many times a video has been viewed in a particular
time-span at a particular site in a particular country on a
particular device ?
Video Events For The Analysis
• LOAD
• START
• PLAYING
• VIEW
• STOP / PAUSE
• FINISH
Event Data (Sample)
TIMESTAMP Ad Site Advertiser Event Action
2011-01-01T01:01:27 Z 123 abc.com Brand X Player Load
2011-01-01T01:01:33 Z 234 abcd.com Brand Y Player Load
2011-01-01T01:01:40 Z 123 abc.com Brand X Player Start
2011-01-01T01:01:45 Z 123 abc.com Brand X Player Playing
2011-01-01T01:01:50 Z 123 abc.com Brand Y Player Playing
2011-01-01T01:01:51 Z 123 abc.com Brand X Player Stop
What Is Analytics ?
Processing the HISTORICAL data to:
•Understand potential trends
•Analyze the effects of certain decisions or events
•Evaluate the performance of a system
•Make better business decisions
What Is Real-time Analytics ?
Why (We Need) Real-time Analytics ?
• Understand the real-time performance
• Control the velocity
• Avoid over serving
• Avoid under serving
• Control the targeting
Recap – Things We Understood
• How the ad-tech works (in general)
• Our use-case
• Different video player events
• We are expecting a huge amount of data coming
at a very high velocity.
Experiments with
technologies
Why We Picked Redis
• Great buzz in the market
• Highly scalable
• Easy to setup, configure and use
• We were not very clear with our use-case
Realizations From Redis
• Not a good fit to deal with time-series (big) data
• Persistence is another issue – we can’t afford
loosing data
• There was a huge variety of keys all over the
place
• Complexity in the (application side) code started
increasing
Working With Cassandra
• Very good support for the time-series data
• Extremely good for writing the data at a very
high speed
• Very easy to scale horizontally
• Supports aggregations through Counters
Writing into Cassandra
ANALYTICS
SERVER
CASSANDRA
AD PLAYER
Reading from Cassandra
ANALYTICS
SERVER CASSANDRA
CAMPAIGN
MANAGER
What didn’t work with Cassandra
• Inconsistent results
• Unreliable counters
• No ad-hoc queries support
• Nodes were crashing out very frequently
Crossroads – What next ?
• Third party tools on the top of Cassandra for
better consistency
• DataStax Enterprise edition
• Taking a deeper dive into Cassandra to
reconfigure the whole architecture and setup
• Switching to different technology
Understanding druid
About Druid (http://druid.io)
• An open-source analytics data store
• Supports streaming - data ingestion
• Flexible filters for ad-hoc queries
• Fast aggregations – sub second queries
• Distributed, shared-nothing architecture
• Easily scalable
Setting Up Druid In Production
KAFKA
(CLUSTER)
ANALYTICS
SERVER
DRUID
CLUSTER
CASSANDRA
AD PLAYER
Druid’s Reliability Check
KAFKA
(CLUSTER)
ANALYTICS
SERVER
DRUID
CLUSTER
RAW FILE
CONSUMER
RAW
FILES
RAW
FILES
RAW
FILES
Job To Test
Druid’s
Integrity
AD PLAYER
A Quick Demo
Druid Architecture
DEEP
STORAGE
DEEP
STORAGE
ZOOKEEPERZOOKEEPER
Druid Nodes
External
Dependencies
Queries
MetaData
Data/Segments
Client
Queries
Streaming
Data
REAL
TIME
NODES
COORDINATOR
NODES
HISTORICAL
NODES
BROKER
NODES
MY SQLMY SQL
Druid Data Ingestion
DEEP
STORAGE
DEEP
STORAGE
ZOOKEEPERZOOKEEPER
Druid Nodes
External
Dependencies
Queries
MetaData
Data/Segments
Client
Queries
Streaming
Data
REAL
TIME
NODES
COORDINATOR
NODES
HISTORICAL
NODES
BROKER
NODES
MY SQLMY SQL
Druid Data Ingestion (Our System)
KAFKA
(CLUSTER)
DRUID
Real-
time
Node
ANALYTICS
SERVER
AD PLAYER
Druid Data Retrieval
DEEP
STORAGE
DEEP
STORAGE
ZOOKEEPERZOOKEEPER
Druid Nodes
External
Dependencies
Queries
MetaData
Data/Segments
Client
Queries
Streaming
Data
REAL
TIME
NODES
COORDINATOR
NODES
HISTORICAL
NODES
BROKER
NODES
MY SQLMY SQL
Coordinator Nodes
DEEP
STORAGE
DEEP
STORAGE
ZOOKEEPERZOOKEEPER
Druid Nodes
External
Dependencies
Queries
MetaData
Data/Segments
Client
Queries
Streaming
Data
REAL
TIME
NODES
COORDINATOR
NODES
HISTORICAL
NODES
BROKER
NODES
MY SQLMY SQL
Druid Data Segment Propagation
DEEP
STORAGE
DEEP
STORAGE
ZOOKEEPERZOOKEEPER
Druid Nodes
External
Dependencies
Queries
MetaData
Data/Segments
Streaming
Data
REAL
TIME
NODES
COORDINATOR
NODES
HISTORICAL
NODES
MY SQLMY SQL
Our Production Stats
•Over 200 million events per day – ingested into
Druid cluster
•4 boxes with 8 cores, 64GB RAM, 1TB SSD
•2 coordinator nodes (only one master)
•2 real-time nodes
•4 historical nodes (on each box)
Companies Using Druid
Questions ?
Case Study: Realtime Analytics with Druid

More Related Content

What's hot

Zero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightZero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightDataWorks Summit
 
Lifting the hood on spark streaming - StampedeCon 2015
Lifting the hood on spark streaming - StampedeCon 2015Lifting the hood on spark streaming - StampedeCon 2015
Lifting the hood on spark streaming - StampedeCon 2015StampedeCon
 
Monitoring @ scale over diverse data sources @ PayPal - Druid, TSDB, Hadoop
Monitoring @ scale over diverse data sources @ PayPal  - Druid, TSDB, HadoopMonitoring @ scale over diverse data sources @ PayPal  - Druid, TSDB, Hadoop
Monitoring @ scale over diverse data sources @ PayPal - Druid, TSDB, HadoopSenthil Pandurangan
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoopgregchanan
 
Visual Mapping of Clickstream Data
Visual Mapping of Clickstream DataVisual Mapping of Clickstream Data
Visual Mapping of Clickstream DataDataWorks Summit
 
Apache Druid Design and Future prospect
Apache Druid Design and Future prospectApache Druid Design and Future prospect
Apache Druid Design and Future prospectc-bslim
 
Resilience: the key requirement of a [big] [data] architecture - StampedeCon...
Resilience: the key requirement of a [big] [data] architecture  - StampedeCon...Resilience: the key requirement of a [big] [data] architecture  - StampedeCon...
Resilience: the key requirement of a [big] [data] architecture - StampedeCon...StampedeCon
 
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ NetflixWhoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ NetflixDataWorks Summit
 
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Data Con LA
 
Hadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an exampleHadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an examplehadooparchbook
 
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesApache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesDataWorks Summit
 
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!Tugdual Grall
 
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...MapR Technologies
 
Real Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and KafkaReal Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and KafkaDaria Litvinov
 
Real-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkReal-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkGuido Schmutz
 
A real-time architecture using Hadoop and Storm @ JAX London
A real-time architecture using Hadoop and Storm @ JAX LondonA real-time architecture using Hadoop and Storm @ JAX London
A real-time architecture using Hadoop and Storm @ JAX LondonNathan Bijnens
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaDataWorks Summit
 
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...DataWorks Summit
 

What's hot (20)

Zero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightZero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsight
 
Lifting the hood on spark streaming - StampedeCon 2015
Lifting the hood on spark streaming - StampedeCon 2015Lifting the hood on spark streaming - StampedeCon 2015
Lifting the hood on spark streaming - StampedeCon 2015
 
Monitoring @ scale over diverse data sources @ PayPal - Druid, TSDB, Hadoop
Monitoring @ scale over diverse data sources @ PayPal  - Druid, TSDB, HadoopMonitoring @ scale over diverse data sources @ PayPal  - Druid, TSDB, Hadoop
Monitoring @ scale over diverse data sources @ PayPal - Druid, TSDB, Hadoop
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoop
 
Visual Mapping of Clickstream Data
Visual Mapping of Clickstream DataVisual Mapping of Clickstream Data
Visual Mapping of Clickstream Data
 
Apache Druid Design and Future prospect
Apache Druid Design and Future prospectApache Druid Design and Future prospect
Apache Druid Design and Future prospect
 
Scalable Real-time analytics using Druid
Scalable Real-time analytics using DruidScalable Real-time analytics using Druid
Scalable Real-time analytics using Druid
 
Resilience: the key requirement of a [big] [data] architecture - StampedeCon...
Resilience: the key requirement of a [big] [data] architecture  - StampedeCon...Resilience: the key requirement of a [big] [data] architecture  - StampedeCon...
Resilience: the key requirement of a [big] [data] architecture - StampedeCon...
 
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ NetflixWhoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
 
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
Big Data Day LA 2016/ Use Case Driven track - Hydrator: Open Source, Code-Fre...
 
Hadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an exampleHadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an example
 
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on KubernetesApache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
Apache Druid Auto Scale-out/in for Streaming Data Ingestion on Kubernetes
 
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
 
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
Xactly: How to Build a Successful Converged Data Platform with Hadoop, Spark,...
 
Druid deep dive
Druid deep diveDruid deep dive
Druid deep dive
 
Real Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and KafkaReal Time analytics with Druid, Apache Spark and Kafka
Real Time analytics with Druid, Apache Spark and Kafka
 
Real-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkReal-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache Spark
 
A real-time architecture using Hadoop and Storm @ JAX London
A real-time architecture using Hadoop and Storm @ JAX LondonA real-time architecture using Hadoop and Storm @ JAX London
A real-time architecture using Hadoop and Storm @ JAX London
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
 
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
Observing Intraday Indicators Using Real-Time Tick Data on Apache Superset an...
 

Viewers also liked

Aggregated queries with Druid on terrabytes and petabytes of data
Aggregated queries with Druid on terrabytes and petabytes of dataAggregated queries with Druid on terrabytes and petabytes of data
Aggregated queries with Druid on terrabytes and petabytes of dataRostislav Pashuto
 
Real-time analytics with Druid at Appsflyer
Real-time analytics with Druid at AppsflyerReal-time analytics with Druid at Appsflyer
Real-time analytics with Druid at AppsflyerMichael Spector
 
Data Analytics with Druid
Data Analytics with DruidData Analytics with Druid
Data Analytics with DruidYousun Jeong
 
Interactive analytics at scale with druid
Interactive analytics at scale with druidInteractive analytics at scale with druid
Interactive analytics at scale with druidJulien Lavigne du Cadet
 
Real-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and DruidReal-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and DruidJan Graßegger
 
Pulsar: Real-time Analytics at Scale with Kafka, Kylin and Druid
Pulsar: Real-time Analytics at Scale with Kafka, Kylin and DruidPulsar: Real-time Analytics at Scale with Kafka, Kylin and Druid
Pulsar: Real-time Analytics at Scale with Kafka, Kylin and DruidTony Ng
 
Druid realtime indexing
Druid realtime indexingDruid realtime indexing
Druid realtime indexingSeoeun Park
 
Druid at Hadoop Ecosystem
Druid at Hadoop EcosystemDruid at Hadoop Ecosystem
Druid at Hadoop EcosystemSlim Bouguerra
 
eBay Experimentation Platform on Hadoop
eBay Experimentation Platform on HadoopeBay Experimentation Platform on Hadoop
eBay Experimentation Platform on HadoopTony Ng
 
July 2014 HUG : Pushing the limits of Realtime Analytics using Druid
July 2014 HUG : Pushing the limits of Realtime Analytics using DruidJuly 2014 HUG : Pushing the limits of Realtime Analytics using Druid
July 2014 HUG : Pushing the limits of Realtime Analytics using DruidYahoo Developer Network
 
Click-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingClick-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingRobert Metzger
 
Apache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep DiveApache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep DiveXu Jiang
 
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)SANG WON PARK
 
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015NoSQLmatters
 
Case Study: Analytics at CMC Markets: from measuring clicks to driving business
Case Study: Analytics at CMC Markets: from measuring clicks to driving businessCase Study: Analytics at CMC Markets: from measuring clicks to driving business
Case Study: Analytics at CMC Markets: from measuring clicks to driving businessJohn Sinke
 
Case Studies Utilizing Real Time Data Analytics
Case Studies Utilizing Real Time Data AnalyticsCase Studies Utilizing Real Time Data Analytics
Case Studies Utilizing Real Time Data AnalyticsHarrison Hayes, LLC
 
Business Intelligence for kids (example project)
Business Intelligence for kids (example project)Business Intelligence for kids (example project)
Business Intelligence for kids (example project)Enrique Benito
 
netflix-real-time-data-strata-talk
netflix-real-time-data-strata-talknetflix-real-time-data-strata-talk
netflix-real-time-data-strata-talkDanny Yuan
 
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search EngineElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search EngineDaniel N
 

Viewers also liked (19)

Aggregated queries with Druid on terrabytes and petabytes of data
Aggregated queries with Druid on terrabytes and petabytes of dataAggregated queries with Druid on terrabytes and petabytes of data
Aggregated queries with Druid on terrabytes and petabytes of data
 
Real-time analytics with Druid at Appsflyer
Real-time analytics with Druid at AppsflyerReal-time analytics with Druid at Appsflyer
Real-time analytics with Druid at Appsflyer
 
Data Analytics with Druid
Data Analytics with DruidData Analytics with Druid
Data Analytics with Druid
 
Interactive analytics at scale with druid
Interactive analytics at scale with druidInteractive analytics at scale with druid
Interactive analytics at scale with druid
 
Real-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and DruidReal-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and Druid
 
Pulsar: Real-time Analytics at Scale with Kafka, Kylin and Druid
Pulsar: Real-time Analytics at Scale with Kafka, Kylin and DruidPulsar: Real-time Analytics at Scale with Kafka, Kylin and Druid
Pulsar: Real-time Analytics at Scale with Kafka, Kylin and Druid
 
Druid realtime indexing
Druid realtime indexingDruid realtime indexing
Druid realtime indexing
 
Druid at Hadoop Ecosystem
Druid at Hadoop EcosystemDruid at Hadoop Ecosystem
Druid at Hadoop Ecosystem
 
eBay Experimentation Platform on Hadoop
eBay Experimentation Platform on HadoopeBay Experimentation Platform on Hadoop
eBay Experimentation Platform on Hadoop
 
July 2014 HUG : Pushing the limits of Realtime Analytics using Druid
July 2014 HUG : Pushing the limits of Realtime Analytics using DruidJuly 2014 HUG : Pushing the limits of Realtime Analytics using Druid
July 2014 HUG : Pushing the limits of Realtime Analytics using Druid
 
Click-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingClick-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer Checkpointing
 
Apache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep DiveApache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
 
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
 
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
 
Case Study: Analytics at CMC Markets: from measuring clicks to driving business
Case Study: Analytics at CMC Markets: from measuring clicks to driving businessCase Study: Analytics at CMC Markets: from measuring clicks to driving business
Case Study: Analytics at CMC Markets: from measuring clicks to driving business
 
Case Studies Utilizing Real Time Data Analytics
Case Studies Utilizing Real Time Data AnalyticsCase Studies Utilizing Real Time Data Analytics
Case Studies Utilizing Real Time Data Analytics
 
Business Intelligence for kids (example project)
Business Intelligence for kids (example project)Business Intelligence for kids (example project)
Business Intelligence for kids (example project)
 
netflix-real-time-data-strata-talk
netflix-real-time-data-strata-talknetflix-real-time-data-strata-talk
netflix-real-time-data-strata-talk
 
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search EngineElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
 

Similar to Case Study: Realtime Analytics with Druid

Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic IntelAPAC
 
DevOps, what should you decide, when, why & how - Vinita Rathi
DevOps, what should you decide, when, why & how - Vinita RathiDevOps, what should you decide, when, why & how - Vinita Rathi
DevOps, what should you decide, when, why & how - Vinita RathiJAXLondon_Conference
 
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?TechWell
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Lucidworks
 
AWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAmazon Web Services
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsCloudera, Inc.
 
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
(SPOT205) 5 Lessons for Managing Massive IT Transformation ProjectsAmazon Web Services
 
How Cloud Providers are Playing with Traditional Data Center
How Cloud Providers are Playing with Traditional Data CenterHow Cloud Providers are Playing with Traditional Data Center
How Cloud Providers are Playing with Traditional Data CenterHostway|HOSTING
 
The Straight Skinny on Cloud Platforms
The Straight Skinny on Cloud PlatformsThe Straight Skinny on Cloud Platforms
The Straight Skinny on Cloud PlatformsHostway|HOSTING
 
How Celtra Optimizes its Advertising Platform with Databricks
How Celtra Optimizes its Advertising Platformwith DatabricksHow Celtra Optimizes its Advertising Platformwith Databricks
How Celtra Optimizes its Advertising Platform with DatabricksGrega Kespret
 
Sage Summit 2012: Cloud Computing for Accountants
Sage Summit 2012: Cloud Computing for AccountantsSage Summit 2012: Cloud Computing for Accountants
Sage Summit 2012: Cloud Computing for AccountantsGrant M Howe
 
Canary Analyze All The Things: How We Learned to Keep Calm and Release Often
Canary Analyze All The Things: How We Learned to Keep Calm and Release OftenCanary Analyze All The Things: How We Learned to Keep Calm and Release Often
Canary Analyze All The Things: How We Learned to Keep Calm and Release OftenC4Media
 
Barbri: Barbri's journey from on-prem to cloud, featuring auto-remediation w...
Barbri: Barbri's  journey from on-prem to cloud, featuring auto-remediation w...Barbri: Barbri's  journey from on-prem to cloud, featuring auto-remediation w...
Barbri: Barbri's journey from on-prem to cloud, featuring auto-remediation w...Dynatrace
 
Barbri barbri's journey from on-prem to cloud, featuring auto-remediation wi...
Barbri  barbri's journey from on-prem to cloud, featuring auto-remediation wi...Barbri  barbri's journey from on-prem to cloud, featuring auto-remediation wi...
Barbri barbri's journey from on-prem to cloud, featuring auto-remediation wi...Laura Stack
 
Le big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseLe big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseRubedo, a WebTales solution
 
Trending with Purpose
Trending with PurposeTrending with Purpose
Trending with PurposeJason Dixon
 

Similar to Case Study: Realtime Analytics with Druid (20)

Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic
 
DevOps, what should you decide, when, why & how - Vinita Rathi
DevOps, what should you decide, when, why & how - Vinita RathiDevOps, what should you decide, when, why & how - Vinita Rathi
DevOps, what should you decide, when, why & how - Vinita Rathi
 
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
 
(R)evolutionize APM
(R)evolutionize APM(R)evolutionize APM
(R)evolutionize APM
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
 
AWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data Analytics
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
 
Innovations @ Neev
Innovations @ NeevInnovations @ Neev
Innovations @ Neev
 
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
 
How Cloud Providers are Playing with Traditional Data Center
How Cloud Providers are Playing with Traditional Data CenterHow Cloud Providers are Playing with Traditional Data Center
How Cloud Providers are Playing with Traditional Data Center
 
The Straight Skinny on Cloud Platforms
The Straight Skinny on Cloud PlatformsThe Straight Skinny on Cloud Platforms
The Straight Skinny on Cloud Platforms
 
Enterprise Journey to the Cloud
Enterprise Journey to the CloudEnterprise Journey to the Cloud
Enterprise Journey to the Cloud
 
Euro IT Group
Euro IT GroupEuro IT Group
Euro IT Group
 
How Celtra Optimizes its Advertising Platform with Databricks
How Celtra Optimizes its Advertising Platformwith DatabricksHow Celtra Optimizes its Advertising Platformwith Databricks
How Celtra Optimizes its Advertising Platform with Databricks
 
Sage Summit 2012: Cloud Computing for Accountants
Sage Summit 2012: Cloud Computing for AccountantsSage Summit 2012: Cloud Computing for Accountants
Sage Summit 2012: Cloud Computing for Accountants
 
Canary Analyze All The Things: How We Learned to Keep Calm and Release Often
Canary Analyze All The Things: How We Learned to Keep Calm and Release OftenCanary Analyze All The Things: How We Learned to Keep Calm and Release Often
Canary Analyze All The Things: How We Learned to Keep Calm and Release Often
 
Barbri: Barbri's journey from on-prem to cloud, featuring auto-remediation w...
Barbri: Barbri's  journey from on-prem to cloud, featuring auto-remediation w...Barbri: Barbri's  journey from on-prem to cloud, featuring auto-remediation w...
Barbri: Barbri's journey from on-prem to cloud, featuring auto-remediation w...
 
Barbri barbri's journey from on-prem to cloud, featuring auto-remediation wi...
Barbri  barbri's journey from on-prem to cloud, featuring auto-remediation wi...Barbri  barbri's journey from on-prem to cloud, featuring auto-remediation wi...
Barbri barbri's journey from on-prem to cloud, featuring auto-remediation wi...
 
Le big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseLe big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entreprise
 
Trending with Purpose
Trending with PurposeTrending with Purpose
Trending with Purpose
 

Recently uploaded

(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一F La
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...ThinkInnovation
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 

Recently uploaded (20)

(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 

Case Study: Realtime Analytics with Druid

Editor's Notes

  1. 5m
  2. So the case study is about one of our potential customer (based out of US) – ViralGains ViralGains is a Viral Video Marketing platform. Basically runs campaigns for different ad-agencies or brands. Let’s see an overview of ad world.
  3. 5m Our role here Requests at high velocity Usually, 2 million requests per minute
  4. 10m In order to do so, What do we need to do now? Orange highlights are filters
  5. 12m Now we have data – it’s time to analyze it.
  6. Everybody knows, but just to revive the fundamentals
  7. 14 Before going ahead on the real-time analytics part – let’s see how an ad-tech works.
  8. 16m
  9. 17m
  10. Redis Cassandra Druid
  11. 18m Not talking about “What is Redis and all”. Assuming that you already know about this!! Using Redis was like a Too early decision
  12. 20m
  13. Explain (one liners): Why inconsistent results – delays in the sync process distributed counters Very much possible – that we couldn’t configure it very well
  14. 24m
  15. 25m
  16. 27m What are filters Not explaining – why fast aggregations
  17. 29m What is Kafka (in between) ? Coloring strategy
  18. 35m
  19. 35m
  20. 45m
  21. 50m External dependencies Realtime node vs Historical node (data lies here only) What is Deep Storage – S3 MySQL for meta-data What are segments
  22. 53m
  23. 55m # Events hits a processing node called realtime, it indexes it (means compacts it) does the aggregation and then data is ready for querying within no time # Once it indexed it can offload the older data indexed data to deep storage, so that it can process new data and we don't overwhelm it. # It pushes to deep storage generally (s3) and writes to meta storage that I want to offload. # Coordinator sees that and ask historical to read from s3 and broad case once done with loading into memory. # Once historical says ready for query, it broadcasts that i am ready, then realtime offloads
  24. 55m