Apache Druid®: A Dance of Distributed Processes

Apache®, Apache Druid®, Druid®, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.

peter.marshall@imply.io
20 years in Enterprise Architecture
CRM, EDRM, ERP, EIP, Digital Services, Security, BI, RI, and MDM
BA Theology (!) and Computer Studies
TOGAF certified
Book collector & A/V buyer
Prime Timeline = proper timeline
#werk
petermarshall.io

What the collaborations do
Some principles
Each collaboration
Wrap-up
Questions!

Query
Distributed execution of SQL / Druid Native queries on the cluster
Ingestion
Ingestion tasks that bring data into Druid from storage and delivery services
Distribution
Replication and distribution of the ingested data according to rules

● A job to do
● Compute to do the ingestion
● A place to store optimised data
● A question to answer
● Data to process!
● Compute to answer queries
● Somewhere to put the data that’s near to the query process
● Some rules to follow
Query
Ingestion
Distribution

https://www.archimatetool.com/

Diagram: Zookeeper, Overlord, Middle Manager (Indexer)
Ingest spec, task, log
Storage and delivery services: S3, GCS, ABS, HDFS, HTTP, Local, MySQL, Postgres, Druid, Kafka, Kinesis
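
As a rough illustration of the ingest spec and task shown above, the sketch below builds a minimal native batch (`index_parallel`) spec and wraps it in a request for the Overlord's task endpoint (`/druid/indexer/v1/task`). The datasource name, column names, source URI, and Overlord address are hypothetical placeholders, and the request is only constructed, never sent.

```python
import json
import urllib.request

# Hypothetical values for illustration only; adjust for a real cluster.
OVERLORD_URL = "http://localhost:8090"
SOURCE_URI = "https://example.com/events.json"


def build_ingestion_spec(datasource, uri):
    """Return a minimal native batch (index_parallel) ingestion spec."""
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Column names below are assumptions about the input data.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": ["channel", "user"]},
                "granularitySpec": {
                    "segmentGranularity": "day",
                    "queryGranularity": "none",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {"type": "http", "uris": [uri]},
                "inputFormat": {"type": "json"},
            },
            "tuningConfig": {"type": "index_parallel"},
        },
    }


def build_task_request(spec, overlord_url=OVERLORD_URL):
    """Wrap the spec in a POST request for the Overlord task endpoint."""
    return urllib.request.Request(
        url=overlord_url + "/druid/indexer/v1/task",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


spec = build_ingestion_spec("example_events", SOURCE_URI)
request = build_task_request(spec)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would submit the task; the Overlord would then hand it to a Middle Manager (or Indexer) to run.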

Diagram: Zookeeper, Overlord, Middle Manager (Indexer)

Diagram: Deep Store, Zookeeper, Overlord, Middle Manager (Indexer)
Middle Manager (Indexer): columnarise, index & encode, time-shard
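
To make "columnarise, index & encode, time-shard" a little more concrete, here is a toy sketch of those steps in plain Python. It is not how Druid implements segments: the day-sized chunks, the sorted string dictionary, and the list-based inverted index are simplifying assumptions, and the sample rows and column names are invented.

```python
from collections import defaultdict
from datetime import datetime


def time_shard(rows, ts_field="timestamp"):
    """Group rows into day-sized time chunks keyed by ISO date."""
    chunks = defaultdict(list)
    for row in rows:
        day = datetime.fromisoformat(row[ts_field]).date().isoformat()
        chunks[day].append(row)
    return dict(chunks)


def columnarise(rows):
    """Turn row dicts into one list per column (assumes uniform keys)."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def dictionary_encode(values):
    """Replace each value with an integer id into a sorted dictionary."""
    dictionary = sorted(set(values))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[value] for value in values]


def invert_index(encoded):
    """Map each id to the row numbers holding it (a toy stand-in for a bitmap)."""
    index = defaultdict(list)
    for row_number, value_id in enumerate(encoded):
        index[value_id].append(row_number)
    return dict(index)


# Invented sample rows for illustration.
rows = [
    {"timestamp": "2024-01-01T10:00:00", "channel": "#en", "added": 12},
    {"timestamp": "2024-01-01T11:30:00", "channel": "#fr", "added": 5},
    {"timestamp": "2024-01-02T09:15:00", "channel": "#en", "added": 7},
]

segments = {}
for day, chunk in time_shard(rows).items():
    columns = columnarise(chunk)
    dictionary, encoded = dictionary_encode(columns["channel"])
    segments[day] = {
        "columns": columns,
        "channel_dictionary": dictionary,
        "channel_encoded": encoded,
        "channel_index": invert_index(encoded),
    }

print(sorted(segments))
```

Each entry in `segments` stands in for one time-sharded segment that would be pushed to Deep Store.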

Diagram: Deep Store, Metadata Store, Zookeeper, Overlord, Middle Manager (Indexer)

Diagram: Metadata Store, Broker, Middle Manager (Indexer), Overlord, Deep Store, Zookeeper
SQL and Native
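
The two query styles named above can be sketched as request bodies for the Broker: Druid SQL is posted to `/druid/v2/sql` and native JSON queries to `/druid/v2/`. The Broker address, datasource, and column names below are assumptions for illustration, and the requests are built but not sent.

```python
import json
import urllib.request

# Assumed Broker address; adjust for a real cluster.
BROKER_URL = "http://localhost:8082"


def build_request(path, body, broker_url=BROKER_URL):
    """Build a JSON POST request for the given Broker path."""
    return urllib.request.Request(
        url=broker_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


# Druid SQL: the statement travels in the "query" field.
sql_body = {
    "query": (
        'SELECT channel, COUNT(*) AS edits FROM "example_events" '
        "WHERE __time >= TIMESTAMP '2024-01-01 00:00:00' "
        "GROUP BY channel"
    )
}
sql_request = build_request("/druid/v2/sql", sql_body)

# Native query: a JSON object describing a timeseries row count.
native_body = {
    "queryType": "timeseries",
    "dataSource": "example_events",
    "granularity": "day",
    "intervals": ["2024-01-01/2024-01-03"],
    "aggregations": [{"type": "count", "name": "rows"}],
}
native_request = build_request("/druid/v2/", native_body)

for request in (sql_request, native_request):
    print(request.get_method(), request.full_url)
```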

Diagram: Metadata Store, Zookeeper, Broker, Middle Manager (Indexer), Overlord, Deep Store

Diagram: Metadata Store, Zookeeper, Broker, Middle Manager (Indexer), Overlord, Deep Store, Historical

Diagram: Middle Manager (Indexer), Metadata Store, Zookeeper, Historical, Broker, Overlord, Deep Store

Diagram: Coordinator, Metadata Store, Historical, Zookeeper, Broker, Overlord, Middle Manager (Indexer), Deep Store

Diagram: Zookeeper, Metadata Store, Broker, Overlord, Middle Manager (Indexer), Deep Store, Historical (x3)

Diagram: Zookeeper, Historical (x3), Metadata Store, Broker, Overlord, Middle Manager (Indexer), Deep Store
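
Distribution "according to rules" can be pictured with a toy evaluator. The rule types (`loadByInterval`, `dropForever`) and the `tieredReplicants` field follow Druid's retention rule JSON, but the matching logic here is a deliberate simplification (first rule whose interval overlaps the segment wins) and the intervals and replica counts are invented.

```python
from datetime import datetime


def parse_interval(text):
    """Parse 'start/end' ISO dates into a (start, end) datetime pair."""
    start, end = text.split("/")
    return datetime.fromisoformat(start), datetime.fromisoformat(end)


def overlaps(a, b):
    """True when two half-open (start, end) intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


def first_matching_rule(rules, segment_interval):
    """Return the first rule that applies to the segment (toy matching)."""
    segment = parse_interval(segment_interval)
    for rule in rules:
        if rule["type"] in ("loadForever", "dropForever"):
            return rule
        if overlaps(parse_interval(rule["interval"]), segment):
            return rule
    return None


def replicas_for(rules, segment_interval):
    """Replica count per tier for a segment, or {} if it should be dropped."""
    rule = first_matching_rule(rules, segment_interval)
    if rule is None or rule["type"].startswith("drop"):
        return {}
    return rule["tieredReplicants"]


# Invented rule chain: keep 2024 data with two replicas, drop everything else.
rules = [
    {
        "type": "loadByInterval",
        "interval": "2024-01-01/2025-01-01",
        "tieredReplicants": {"_default_tier": 2},
    },
    {"type": "dropForever"},
]

print(replicas_for(rules, "2024-03-01/2024-03-02"))
print(replicas_for(rules, "2023-03-01/2023-03-02"))
```

In a real cluster the Coordinator evaluates rules like these and instructs Historicals to load or drop segments from Deep Store.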

Zookeeper
Coordinator
Overlord
Broker
Query
Ingestion
Distribution

★ A job to do
★ Compute to do the ingestion
★ A place to store optimised data
★ A question to answer
★ Data to process!
★ Compute to answer queries
★ Somewhere to put the data that’s near to the query process
★ Some rules to follow
Query
Ingestion
Distribution

http://druid.apache.org
Imply Distribution
https://imply.io/get-started
@druidio
Add Apache Druid as a skill
Apache Distribution
https://github.com/apache/druid
ASF Slack
#druid
Druid Community
https://druid.apache.org/community/
Meetup Groups
https://www.meetup.com/pro/apache-druid/
Google Groups Druid User Forum
https://groups.google.com/
