SlideShare a Scribd company logo
1 of 30
Download to read offline
DATA ANALYTICS WITH DRUID
YOU SUN JEONG
DATA ANALYTICS WITH DRUID
WHO AM I ?
Senior Software Engineer of SK Telecom
Commercial Products
Big Data Discovery Solution (~’16)
Hadoop DW (~’15)
PaaS(CloudFoundry) (~’13)
Iaas (OpenStack) (~’13)
Mail to : jerryjung@apache.org
2
DATA ANALYTICS WITH DRUID
FOOTPRINTS
2014
2015 

- Hadoop DW 

- Realtime NW Analytics
2016 

- Big Data Discovery

- Streaming Processing
3
DATA ANALYTICS WITH DRUID
AGENDA
‣ History
‣ What is Druid?
‣ Druid Architecture
‣ Real-Time Ingestion Demo (15m)
‣ Cohort Analysis (15m)
4
DATA ANALYTICS WITH DRUID
HISTORY
▸ Development started at Meta markets in 2011
▸ Apache V2 in early 2015
▸ 150+ contributors today
▸ https://github.com/druid-io
5
DATA ANALYTICS WITH DRUID
DATA LAKE
6
https://www.linkedin.com/pulse/more-analytics-than-just-fishing-data-lake-john-poppelaars
DATA ANALYTICS WITH DRUID
DW VS DATA LAKE
http://www.kdnuggets.com/2015/09/data-lake-vs-data-warehouse-key-differences.html
7
DATA ANALYTICS WITH DRUID
WHAT IS DRUID
Distributed, 

In-memory Multi-dimensional
OLAP store
8
DATA ANALYTICS WITH DRUID
PROBLEMS
timestamp domain user gender clicked
2011-01-01T00:01:35Z bieber.com 4312345532 Female 1
2011-01-01T00:03:03Z bieber.com 3484920241 Female 0
2011-01-01T00:04:51Z ultra.com 9530174728 Male 1
2011-01-01T00:05:33Z ultra.com 4098310573 Male 1
2011-01-01T00:05:53Z ultra.com 5832057930 Female 0
2011-01-01T00:06:17Z ultra.com 5789283478 Female 1
2011-01-01T00:23:15Z bieber.com 4730093842 Female 0
2011-01-01T00:38:51Z ultra.com 3909846810 Male 1
2011-01-01T00:49:33Z bieber.com 4930097162 Female 1
2011-01-01T00:49:53Z ultra.com 0381837193 Female 0
timestamp impressions clicks
2011-01-01T00:00:00Z 10 6
timestamp domain user gender clicked
2011-01-01T00:01:35Z bieber.com 4312345532 Female 1
2011-01-01T00:03:03Z bieber.com 3484920241 Female 0
2011-01-01T00:04:51Z ultra.com 9530174728 Male 1
2011-01-01T00:05:33Z ultra.com 4098310573 Male 1
2011-01-01T00:05:53Z ultra.com 5832057930 Female 0
2011-01-01T00:06:17Z ultra.com 5789283478 Female 1
2011-01-01T00:23:15Z bieber.com 4730093842 Female 0
2011-01-01T00:38:51Z ultra.com 9530174728 Male 1
2011-01-01T00:49:33Z bieber.com 4930097162 Female 1
2011-01-01T00:49:53Z ultra.com 0381837193 Female 0
timestamp domain gender impressions clicks
2011-01-01T00:00:00Z bieber.com Female 4 2
2011-01-01T00:00:00Z ultra.com Female 3 1
2011-01-01T00:00:00Z ultra.com Male 3 2
9
DATA ANALYTICS WITH DRUID
BIG DATA DISCOVERY
▸ Roll-up
▸ Summarizing over a dimension
▸ Drill-down
▸ Focusing (zooming in)
▸ Slicing and dicing
▸ Reducing dimensions (slice)
▸ Picking values of specific dimensions (dice)
▸ Pivoting
▸ Rotating multi-dimensional cube
10
DATA ANALYTICS WITH DRUID
OLAP CUBE
▸ Slice and Dice
11
DATA ANALYTICS WITH DRUID
IN-MEMORY
12
DATA ANALYTICS WITH DRUID
COLUMNAR STORAGE
13
DATA ANALYTICS WITH DRUID
DRUID TERMS
▸ Data
▸ Timestamp
▸ Dimension
▸ Metric
▸ Datasource
▸ Segment
▸ Granularity
14
DATA ANALYTICS WITH DRUID
DRUID ARCHITECTURE
REALTIME
BROKER HISTORICAL
15
DATA ANALYTICS WITH DRUID
ARCHITECTURE - BATCH INGESTION
HDFS
HISTORICAL
NODE
HISTORICAL
NODE
HISTORICAL
NODE
BROKER
NODE
Segments
Queries
16
DATA ANALYTICS WITH DRUID
ARCHITECTURE - STREAMING INGESTION
REALTIME
NODE
HISTORICAL
NODE
HISTORICAL
NODE
HISTORICAL
NODE
BROKER
NODE
Segments
Queries
Streaming
17
DATA ANALYTICS WITH DRUID
ARCHITECTURE - LAMBDA
REALTIME
NODE
HISTORICAL
NODE
HISTORICAL
NODE
HISTORICAL
NODE
BROKER
NODE
Segments
Queries
Streaming
HDFS
18
DATA ANALYTICS WITH DRUID
GLUE ARCHITECTURE
REAL TIME
TASK
HISTORICAL
NODE
HISTORICAL
NODE
HISTORICAL
NODE
BROKER
NODE
Segments
Queries
Streaming
STREAM
PROCESSOR

(TRANQUILITY)
Kafka Indexing Service
19
DATA ANALYTICS WITH DRUID
REAL WORLD ARCHITECTURE
DATA 

NODE #1
DATA 

NODE #N
OVERLORD
MIDDLE
MANAGE

#1
COORDI

NATOR
MYSQL
HA 

PROXY
MEMCACHED

#2
BROKER
NODE

#1
BROKER
NODE

#1
MEMCACHED

#3
MEMCACHED

#1
HISTORICAL
NODE #1
HISTORICAL
NODE #N
MIDDLE
MANAGE

#N
ZK1
ZK2
ZK3
20
DATA ANALYTICS WITH DRUID
DRUID MONITORING
21
http://www.slideshare.net/CharlesAllen9/programmatic-bidding-data-streams-druid
DATA ANALYTICS WITH DRUID
DRUID DATASOURCE
22
RDRUID
DATA ANALYTICS WITH DRUID
https://github.com/druid-io/RDruid
23
DATA ANALYTICS WITH DRUID
PYDROID
24
https://github.com/druid-io/pydruid
DATA ANALYTICS WITH DRUID
DEMO
▸ Jupyter Notebook(PyDruid)
▸ Mobile App User Events for 1 week 

: 2 billion events
▸ Scenario 

: Unique users

Cohort Analysis
25
DEMO
DATA ANALYTICS WITH DRUID
MAY THE FORCE BE WITH YOU
27
DATA ANALYTICS WITH DRUID
REFERENCES
▸ Druid

: http://www.popit.kr/tag/druid/ 

(https://www.facebook.com/popitkr/)

: http://druid.io/
▸ Cohort Analysis

: http://www.gregreda.com/2015/08/23/cohort-analysis-
with-python/
▸ Druid Meetup@Seoul

: http://www.meetup.com/Druid-Seoul/
28
DATA ANALYTICS WITH DRUID
POPIT
29
https://www.facebook.com/popitkr/
Q&A
THANK YOU
DATA ANALYTICS WITH DRUID 30

More Related Content

What's hot

How to Reduce Your Database Total Cost of Ownership with TimescaleDB
How to Reduce Your Database Total Cost of Ownership with TimescaleDBHow to Reduce Your Database Total Cost of Ownership with TimescaleDB
How to Reduce Your Database Total Cost of Ownership with TimescaleDBTimescale
 
Lightning Talk: What You Need to Know Before You Shard in 20 Minutes
Lightning Talk: What You Need to Know Before You Shard in 20 MinutesLightning Talk: What You Need to Know Before You Shard in 20 Minutes
Lightning Talk: What You Need to Know Before You Shard in 20 MinutesMongoDB
 
I have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingI have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingDavid Murphy
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Markus Höfer
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDBMongoDB
 
Time Series Data in a Time Series World
Time Series Data in a Time Series WorldTime Series Data in a Time Series World
Time Series Data in a Time Series WorldMapR Technologies
 
Berlin buzzwords 2013 - Faceting analyzed fields with some sprinkles of proba...
Berlin buzzwords 2013 - Faceting analyzed fields with some sprinkles of proba...Berlin buzzwords 2013 - Faceting analyzed fields with some sprinkles of proba...
Berlin buzzwords 2013 - Faceting analyzed fields with some sprinkles of proba...Boaz Leskes
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesAltinity Ltd
 
Druid realtime indexing
Druid realtime indexingDruid realtime indexing
Druid realtime indexingSeoeun Park
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...Altinity Ltd
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series DataMongoDB
 
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary Database
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary DatabaseRedis Day TLV 2018 - 10 Reasons why Redis should be your Primary Database
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary DatabaseRedis Labs
 
Webinar 2017. Supercharge your analytics with ClickHouse. Vadim Tkachenko
Webinar 2017. Supercharge your analytics with ClickHouse. Vadim TkachenkoWebinar 2017. Supercharge your analytics with ClickHouse. Vadim Tkachenko
Webinar 2017. Supercharge your analytics with ClickHouse. Vadim TkachenkoAltinity Ltd
 
Using MongoDB As a Tick Database
Using MongoDB As a Tick DatabaseUsing MongoDB As a Tick Database
Using MongoDB As a Tick DatabaseMongoDB
 
Five Great Ways to Lose Data on Kubernetes - KubeCon EU 2020
Five Great Ways to Lose Data on Kubernetes - KubeCon EU 2020Five Great Ways to Lose Data on Kubernetes - KubeCon EU 2020
Five Great Ways to Lose Data on Kubernetes - KubeCon EU 2020Altinity Ltd
 
Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL DatabasesReal-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL DatabasesEugene Dvorkin
 
Research Automation for Data-Driven Discovery
Research Automationfor Data-Driven DiscoveryResearch Automationfor Data-Driven Discovery
Research Automation for Data-Driven DiscoveryGlobus
 

What's hot (20)

How to Reduce Your Database Total Cost of Ownership with TimescaleDB
How to Reduce Your Database Total Cost of Ownership with TimescaleDBHow to Reduce Your Database Total Cost of Ownership with TimescaleDB
How to Reduce Your Database Total Cost of Ownership with TimescaleDB
 
Lightning Talk: What You Need to Know Before You Shard in 20 Minutes
Lightning Talk: What You Need to Know Before You Shard in 20 MinutesLightning Talk: What You Need to Know Before You Shard in 20 Minutes
Lightning Talk: What You Need to Know Before You Shard in 20 Minutes
 
I have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced ShardingI have a good shard key now what - Advanced Sharding
I have a good shard key now what - Advanced Sharding
 
Bigdata : Big picture
Bigdata : Big pictureBigdata : Big picture
Bigdata : Big picture
 
Aggregation - What's it to The HDF Group
Aggregation - What's it to The HDF GroupAggregation - What's it to The HDF Group
Aggregation - What's it to The HDF Group
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
Time Series Data in a Time Series World
Time Series Data in a Time Series WorldTime Series Data in a Time Series World
Time Series Data in a Time Series World
 
Berlin buzzwords 2013 - Faceting analyzed fields with some sprinkles of proba...
Berlin buzzwords 2013 - Faceting analyzed fields with some sprinkles of proba...Berlin buzzwords 2013 - Faceting analyzed fields with some sprinkles of proba...
Berlin buzzwords 2013 - Faceting analyzed fields with some sprinkles of proba...
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
 
Druid realtime indexing
Druid realtime indexingDruid realtime indexing
Druid realtime indexing
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: Sharding
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series Data
 
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary Database
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary DatabaseRedis Day TLV 2018 - 10 Reasons why Redis should be your Primary Database
Redis Day TLV 2018 - 10 Reasons why Redis should be your Primary Database
 
Webinar 2017. Supercharge your analytics with ClickHouse. Vadim Tkachenko
Webinar 2017. Supercharge your analytics with ClickHouse. Vadim TkachenkoWebinar 2017. Supercharge your analytics with ClickHouse. Vadim Tkachenko
Webinar 2017. Supercharge your analytics with ClickHouse. Vadim Tkachenko
 
Using MongoDB As a Tick Database
Using MongoDB As a Tick DatabaseUsing MongoDB As a Tick Database
Using MongoDB As a Tick Database
 
Five Great Ways to Lose Data on Kubernetes - KubeCon EU 2020
Five Great Ways to Lose Data on Kubernetes - KubeCon EU 2020Five Great Ways to Lose Data on Kubernetes - KubeCon EU 2020
Five Great Ways to Lose Data on Kubernetes - KubeCon EU 2020
 
Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL DatabasesReal-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL Databases
 
Research Automation for Data-Driven Discovery
Research Automationfor Data-Driven DiscoveryResearch Automationfor Data-Driven Discovery
Research Automation for Data-Driven Discovery
 

Similar to Data Analytics with Druid

Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidDataWorks Summit
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupBlake Irvine
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesKrishna Sankar
 
WALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics StackWALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics StackFlorian Wilhelm
 
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...Facultad de Informática UCM
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLMapR Technologies
 
Afterwork big data et data viz - du lac à votre écran
Afterwork big data et data viz - du lac à votre écranAfterwork big data et data viz - du lac à votre écran
Afterwork big data et data viz - du lac à votre écranJoseph Glorieux
 
Talavant Data Lake Analytics
Talavant Data Lake Analytics Talavant Data Lake Analytics
Talavant Data Lake Analytics Sean Forgatch
 
20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...
20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...
20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...Athens Big Data
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidDataWorks Summit/Hadoop Summit
 
Managing data analytics in a hybrid cloud
Managing data analytics in a hybrid cloudManaging data analytics in a hybrid cloud
Managing data analytics in a hybrid cloudKaran Singh
 
The Central Hub: Defining the Data Lake
The Central Hub: Defining the Data LakeThe Central Hub: Defining the Data Lake
The Central Hub: Defining the Data LakeEric Kavanagh
 
Indexing Big Data on Amazon AWS
Indexing Big Data on Amazon AWSIndexing Big Data on Amazon AWS
Indexing Big Data on Amazon AWSlucenerevolution
 
Data Culture Series - Keynote & Panel - Reading - 12th May 2015
Data Culture Series  - Keynote & Panel - Reading - 12th May 2015Data Culture Series  - Keynote & Panel - Reading - 12th May 2015
Data Culture Series - Keynote & Panel - Reading - 12th May 2015Jonathan Woodward
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Demi Ben-Ari
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Codemotion
 
Foundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureFoundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureInside Analysis
 

Similar to Data Analytics with Druid (20)

Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
 
WALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics StackWALD: A Modern & Sustainable Analytics Stack
WALD: A Modern & Sustainable Analytics Stack
 
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQL
 
Afterwork big data et data viz - du lac à votre écran
Afterwork big data et data viz - du lac à votre écranAfterwork big data et data viz - du lac à votre écran
Afterwork big data et data viz - du lac à votre écran
 
Talavant Data Lake Analytics
Talavant Data Lake Analytics Talavant Data Lake Analytics
Talavant Data Lake Analytics
 
20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...
20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...
20th Athens Big Data Meetup - 1st Talk - Druid: the open source, performant, ...
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using Druid
 
Managing data analytics in a hybrid cloud
Managing data analytics in a hybrid cloudManaging data analytics in a hybrid cloud
Managing data analytics in a hybrid cloud
 
The Central Hub: Defining the Data Lake
The Central Hub: Defining the Data LakeThe Central Hub: Defining the Data Lake
The Central Hub: Defining the Data Lake
 
Indexing big data in the cloud
Indexing big data in the cloudIndexing big data in the cloud
Indexing big data in the cloud
 
Indexing Big Data on Amazon AWS
Indexing Big Data on Amazon AWSIndexing Big Data on Amazon AWS
Indexing Big Data on Amazon AWS
 
Big Data and OSS at IBM
Big Data and OSS at IBMBig Data and OSS at IBM
Big Data and OSS at IBM
 
Data Culture Series - Keynote & Panel - Reading - 12th May 2015
Data Culture Series  - Keynote & Panel - Reading - 12th May 2015Data Culture Series  - Keynote & Panel - Reading - 12th May 2015
Data Culture Series - Keynote & Panel - Reading - 12th May 2015
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
 
Foundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureFoundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information Architecture
 
Drill 1.0
Drill 1.0Drill 1.0
Drill 1.0
 

More from Dataya Nolja

How to Study Mathematics for ML
How to Study Mathematics for MLHow to Study Mathematics for ML
How to Study Mathematics for MLDataya Nolja
 
Music Data Start to End
Music Data Start to EndMusic Data Start to End
Music Data Start to EndDataya Nolja
 
Find a Leak Time in the Schedule
Find a Leak Time in the ScheduleFind a Leak Time in the Schedule
Find a Leak Time in the ScheduleDataya Nolja
 
A Financial Company Story of Bringing Open Source and ML in
A Financial Company Story of Bringing Open Source and ML inA Financial Company Story of Bringing Open Source and ML in
A Financial Company Story of Bringing Open Source and ML inDataya Nolja
 
Practice, Practice, Practice and do the Dirty Work
Practice, Practice, Practice and do the Dirty WorkPractice, Practice, Practice and do the Dirty Work
Practice, Practice, Practice and do the Dirty WorkDataya Nolja
 
Predicting People Who May Get off at the Next Station
Predicting People Who May Get off at the Next StationPredicting People Who May Get off at the Next Station
Predicting People Who May Get off at the Next StationDataya Nolja
 
Endless Trial-and-Errors for Data Collecting
Endless Trial-and-Errors for Data CollectingEndless Trial-and-Errors for Data Collecting
Endless Trial-and-Errors for Data CollectingDataya Nolja
 
Log Design Case Study
Log Design Case StudyLog Design Case Study
Log Design Case StudyDataya Nolja
 
Let's Play with Data Safely
Let's Play with Data SafelyLet's Play with Data Safely
Let's Play with Data SafelyDataya Nolja
 
Things Data Scientists Should Keep in Mind
Things Data Scientists Should Keep in MindThings Data Scientists Should Keep in Mind
Things Data Scientists Should Keep in MindDataya Nolja
 
Things Happend between JDBC and MySQL
Things Happend between JDBC and MySQLThings Happend between JDBC and MySQL
Things Happend between JDBC and MySQLDataya Nolja
 
Human-Machine Interaction and AI
Human-Machine Interaction and AIHuman-Machine Interaction and AI
Human-Machine Interaction and AIDataya Nolja
 
Julia 0.5 and TensorFlow
Julia 0.5 and TensorFlowJulia 0.5 and TensorFlow
Julia 0.5 and TensorFlowDataya Nolja
 
Zeppelin and Open Source Ecosystem and Silicon Valley
Zeppelin and Open Source Ecosystem and Silicon ValleyZeppelin and Open Source Ecosystem and Silicon Valley
Zeppelin and Open Source Ecosystem and Silicon ValleyDataya Nolja
 
Hadoop 10th Birthday and Hadoop 3 Alpha
Hadoop 10th Birthday and Hadoop 3 AlphaHadoop 10th Birthday and Hadoop 3 Alpha
Hadoop 10th Birthday and Hadoop 3 AlphaDataya Nolja
 
Kakao Bank Powered by Open Sources
Kakao Bank Powered by Open SourcesKakao Bank Powered by Open Sources
Kakao Bank Powered by Open SourcesDataya Nolja
 
Open Source is My Job
Open Source is My JobOpen Source is My Job
Open Source is My JobDataya Nolja
 
Creating Value through Data Analysis
Creating Value through Data AnalysisCreating Value through Data Analysis
Creating Value through Data AnalysisDataya Nolja
 
How to Make Money from Data - Global Cases
How to Make Money from Data - Global CasesHow to Make Money from Data - Global Cases
How to Make Money from Data - Global CasesDataya Nolja
 
Structured Streaming with Apache Spark
Structured Streaming with Apache SparkStructured Streaming with Apache Spark
Structured Streaming with Apache SparkDataya Nolja
 

More from Dataya Nolja (20)

How to Study Mathematics for ML
How to Study Mathematics for MLHow to Study Mathematics for ML
How to Study Mathematics for ML
 
Music Data Start to End
Music Data Start to EndMusic Data Start to End
Music Data Start to End
 
Find a Leak Time in the Schedule
Find a Leak Time in the ScheduleFind a Leak Time in the Schedule
Find a Leak Time in the Schedule
 
A Financial Company Story of Bringing Open Source and ML in
A Financial Company Story of Bringing Open Source and ML inA Financial Company Story of Bringing Open Source and ML in
A Financial Company Story of Bringing Open Source and ML in
 
Practice, Practice, Practice and do the Dirty Work
Practice, Practice, Practice and do the Dirty WorkPractice, Practice, Practice and do the Dirty Work
Practice, Practice, Practice and do the Dirty Work
 
Predicting People Who May Get off at the Next Station
Predicting People Who May Get off at the Next StationPredicting People Who May Get off at the Next Station
Predicting People Who May Get off at the Next Station
 
Endless Trial-and-Errors for Data Collecting
Endless Trial-and-Errors for Data CollectingEndless Trial-and-Errors for Data Collecting
Endless Trial-and-Errors for Data Collecting
 
Log Design Case Study
Log Design Case StudyLog Design Case Study
Log Design Case Study
 
Let's Play with Data Safely
Let's Play with Data SafelyLet's Play with Data Safely
Let's Play with Data Safely
 
Things Data Scientists Should Keep in Mind
Things Data Scientists Should Keep in MindThings Data Scientists Should Keep in Mind
Things Data Scientists Should Keep in Mind
 
Things Happend between JDBC and MySQL
Things Happend between JDBC and MySQLThings Happend between JDBC and MySQL
Things Happend between JDBC and MySQL
 
Human-Machine Interaction and AI
Human-Machine Interaction and AIHuman-Machine Interaction and AI
Human-Machine Interaction and AI
 
Julia 0.5 and TensorFlow
Julia 0.5 and TensorFlowJulia 0.5 and TensorFlow
Julia 0.5 and TensorFlow
 
Zeppelin and Open Source Ecosystem and Silicon Valley
Zeppelin and Open Source Ecosystem and Silicon ValleyZeppelin and Open Source Ecosystem and Silicon Valley
Zeppelin and Open Source Ecosystem and Silicon Valley
 
Hadoop 10th Birthday and Hadoop 3 Alpha
Hadoop 10th Birthday and Hadoop 3 AlphaHadoop 10th Birthday and Hadoop 3 Alpha
Hadoop 10th Birthday and Hadoop 3 Alpha
 
Kakao Bank Powered by Open Sources
Kakao Bank Powered by Open SourcesKakao Bank Powered by Open Sources
Kakao Bank Powered by Open Sources
 
Open Source is My Job
Open Source is My JobOpen Source is My Job
Open Source is My Job
 
Creating Value through Data Analysis
Creating Value through Data AnalysisCreating Value through Data Analysis
Creating Value through Data Analysis
 
How to Make Money from Data - Global Cases
How to Make Money from Data - Global CasesHow to Make Money from Data - Global Cases
How to Make Money from Data - Global Cases
 
Structured Streaming with Apache Spark
Structured Streaming with Apache SparkStructured Streaming with Apache Spark
Structured Streaming with Apache Spark
 

Recently uploaded

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 

Recently uploaded (20)

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 

Data Analytics with Druid

  • 1. DATA ANALYTICS WITH DRUID YOU SUN JEONG
  • 2. DATA ANALYTICS WITH DRUID WHO AM I ? Senior Software Engineer of SK Telecom Commercial Products Big Data Discovery Solution (~’16) Hadoop DW (~’15) PaaS(CloudFoundry) (~’13) Iaas (OpenStack) (~’13) Mail to : jerryjung@apache.org 2
  • 3. DATA ANALYTICS WITH DRUID FOOTPRINTS 2014 2015 
 - Hadoop DW 
 - Realtime NW Analytics 2016 
 - Big Data Discovery
 - Streaming Processing 3
  • 4. DATA ANALYTICS WITH DRUID AGENDA ‣ History ‣ What is Druid? ‣ Druid Architecture ‣ Real-Time Ingestion Demo (15m) ‣ Cohort Analysis (15m) 4
  • 5. DATA ANALYTICS WITH DRUID HISTORY ▸ Development started at Meta markets in 2011 ▸ Apache V2 in early 2015 ▸ 150+ contributors today ▸ https://github.com/druid-io 5
  • 6. DATA ANALYTICS WITH DRUID DATA LAKE 6 https://www.linkedin.com/pulse/more-analytics-than-just-fishing-data-lake-john-poppelaars
  • 7. DATA ANALYTICS WITH DRUID DW VS DATA LAKE http://www.kdnuggets.com/2015/09/data-lake-vs-data-warehouse-key-differences.html 7
  • 8. DATA ANALYTICS WITH DRUID WHAT IS DRUID Distributed, 
 In-memory Multi-dimensional OLAP store 8
  • 9. DATA ANALYTICS WITH DRUID PROBLEMS timestamp domain user gender clicked 2011-01-01T00:01:35Z bieber.com 4312345532 Female 1 2011-01-01T00:03:03Z bieber.com 3484920241 Female 0 2011-01-01T00:04:51Z ultra.com 9530174728 Male 1 2011-01-01T00:05:33Z ultra.com 4098310573 Male 1 2011-01-01T00:05:53Z ultra.com 5832057930 Female 0 2011-01-01T00:06:17Z ultra.com 5789283478 Female 1 2011-01-01T00:23:15Z bieber.com 4730093842 Female 0 2011-01-01T00:38:51Z ultra.com 3909846810 Male 1 2011-01-01T00:49:33Z bieber.com 4930097162 Female 1 2011-01-01T00:49:53Z ultra.com 0381837193 Female 0 timestamp impressions clicks 2011-01-01T00:00:00Z 10 6 timestamp domain user gender clicked 2011-01-01T00:01:35Z bieber.com 4312345532 Female 1 2011-01-01T00:03:03Z bieber.com 3484920241 Female 0 2011-01-01T00:04:51Z ultra.com 9530174728 Male 1 2011-01-01T00:05:33Z ultra.com 4098310573 Male 1 2011-01-01T00:05:53Z ultra.com 5832057930 Female 0 2011-01-01T00:06:17Z ultra.com 5789283478 Female 1 2011-01-01T00:23:15Z bieber.com 4730093842 Female 0 2011-01-01T00:38:51Z ultra.com 9530174728 Male 1 2011-01-01T00:49:33Z bieber.com 4930097162 Female 1 2011-01-01T00:49:53Z ultra.com 0381837193 Female 0 timestamp domain gender impressions clicks 2011-01-01T00:00:00Z bieber.com Female 4 2 2011-01-01T00:00:00Z ultra.com Female 3 1 2011-01-01T00:00:00Z ultra.com Male 3 2 9
  • 10. DATA ANALYTICS WITH DRUID BIG DATA DISCOVERY ▸ Roll-up ▸ Summarizing over a dimension ▸ Drill-down ▸ Focusing (zooming in) ▸ Slicing and dicing ▸ Reducing dimensions (slice) ▸ Picking values of specific dimensions (dice) ▸ Pivoting ▸ Rotating multi-dimensional cube 10
  • 11. DATA ANALYTICS WITH DRUID OLAP CUBE ▸ Slice and Dice 11
  • 12. DATA ANALYTICS WITH DRUID IN-MEMORY 12
  • 13. DATA ANALYTICS WITH DRUID COLUMNAR STORAGE 13
  • 14. DATA ANALYTICS WITH DRUID DRUID TERMS ▸ Data ▸ Timestamp ▸ Dimension ▸ Metric ▸ Datasource ▸ Segment ▸ Granularity 14
  • 15. DATA ANALYTICS WITH DRUID DRUID ARCHITECTURE REALTIME BROKER HISTORICAL 15
  • 16. DATA ANALYTICS WITH DRUID ARCHITECTURE - BATCH INGESTION HDFS HISTORICAL NODE HISTORICAL NODE HISTORICAL NODE BROKER NODE Segments Queries 16
  • 17. DATA ANALYTICS WITH DRUID ARCHITECTURE - STREAMING INGESTION REALTIME NODE HISTORICAL NODE HISTORICAL NODE HISTORICAL NODE BROKER NODE Segments Queries Streaming 17
  • 18. DATA ANALYTICS WITH DRUID ARCHITECTURE - LAMBDA REALTIME NODE HISTORICAL NODE HISTORICAL NODE HISTORICAL NODE BROKER NODE Segments Queries Streaming HDFS 18
  • 19. DATA ANALYTICS WITH DRUID GLUE ARCHITECTURE REAL TIME TASK HISTORICAL NODE HISTORICAL NODE HISTORICAL NODE BROKER NODE Segments Queries Streaming STREAM PROCESSOR
 (TRANQUILITY) Kafka Indexing Service 19
  • 20. DATA ANALYTICS WITH DRUID REAL WORLD ARCHITECTURE DATA 
 NODE #1 DATA 
 NODE #N OVERLORD MIDDLE MANAGE
 #1 COORDI
 NATOR MYSQL HA 
 PROXY MEMCACHED
 #2 BROKER NODE
 #1 BROKER NODE
 #1 MEMCACHED
 #3 MEMCACHED
 #1 HISTORICAL NODE #1 HISTORICAL NODE #N MIDDLE MANAGE
 #N ZK1 ZK2 ZK3 20
  • 21. DATA ANALYTICS WITH DRUID DRUID MONITORING 21 http://www.slideshare.net/CharlesAllen9/programmatic-bidding-data-streams-druid
  • 22. DATA ANALYTICS WITH DRUID DRUID DATASOURCE 22
  • 23. RDRUID DATA ANALYTICS WITH DRUID https://github.com/druid-io/RDruid 23
  • 24. DATA ANALYTICS WITH DRUID PYDROID 24 https://github.com/druid-io/pydruid
  • 25. DATA ANALYTICS WITH DRUID DEMO ▸ Jupyter Notebook(PyDruid) ▸ Mobile App User Events for 1 week 
 : 2 billion events ▸ Scenario 
 : Unique users
 Cohort Analysis 25
  • 26. DEMO
  • 27. DATA ANALYTICS WITH DRUID MAY THE FORCE BE WITH YOU 27
  • 28. DATA ANALYTICS WITH DRUID REFERENCES ▸ Druid
 : http://www.popit.kr/tag/druid/ 
 (https://www.facebook.com/popitkr/)
 : http://druid.io/ ▸ Cohort Analysis
 : http://www.gregreda.com/2015/08/23/cohort-analysis- with-python/ ▸ Druid Meetup@Seoul
 : http://www.meetup.com/Druid-Seoul/ 28
  • 29. DATA ANALYTICS WITH DRUID POPIT 29 https://www.facebook.com/popitkr/