A Day in the Life of a Druid Architect
Benjamin Hopp
Senior Solutions Architect @ Imply
ben@imply.io
San Francisco Airport Marriott Waterfront
Real-Time Analytics at Scale
https://www.druidsummit.org/
What do I do?
● Productionization
● Implementation
● Recommendation
● Education
Ask a lot of Questions
● What is the use-case?
○ Is it a good fit for Druid?
● Who are the stakeholders?
○ End users - running queries
○ Data Engineers - ingesting data
○ Cluster Administrators - managing services
● How are they using the cluster?
● Where is the data coming from?
● What are the issues or concerns?
● Where does Druid fit in the technology stack?
When to use Druid
Druid combines the strengths of three kinds of systems:
● Search platform
  ○ Real-time ingestion
  ○ Flexible schema
  ○ Full text search
● OLAP
  ○ Batch ingestion
  ○ Efficient storage
  ○ Fast analytic queries
● Timeseries database
  ○ Optimized for time-based datasets
  ○ Time-based functions
When NOT to use Druid
● OLTP workloads
● Individual record updates/deletes
● Big join operations
Where Druid fits in
Raw data (data lakes, message buses) → Storage → Analyze → Application
Druid covers the storage and analyze stages, between the raw data sources and the application.
Cluster Evaluation
Druid Architecture
Pick your servers
Data Nodes
● Large-ish
● Scales with size of data and query volume
● Lots of cores, lots of memory, fast NVMe disk
Query Nodes
● Medium-ish
● Scales with concurrency and # of Data nodes
● Typically CPU bound
Master Nodes
● Small-ish nodes
● Coordinator scales with # of segments
● Overlord scales with # of supervisors and tasks
Configure for MAXIMUM PERFORMANCE
Data Nodes
● Enable cache
● Heap / maxDirectMemory size
● druid.processing.buffer.sizeBytes
● druid.processing.numMergeBuffers
● druid.processing.numThreads
Query Nodes
● Disable caching
● Heap / maxDirectMemory size
● druid.broker.http.numConnections
● druid.processing.numMergeBuffers
● druid.processing.numThreads
Master Nodes
● Heap size
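A minimal sketch of what this can look like on a data node, assuming a hypothetical 16-core / 64 GB Historical; the property names are real Druid configs, but every value here is illustrative, not a recommendation. Heap and direct memory are set in jvm.config, e.g. -Xms8g/-Xmx8g and an -XX:MaxDirectMemorySize of at least (numThreads + numMergeBuffers + 1) * buffer.sizeBytes, which works out to 20 * 500 MB = 10 GB below:

  # runtime.properties for a Historical (illustrative values only)
  druid.processing.numThreads=15
  druid.processing.numMergeBuffers=4
  druid.processing.buffer.sizeBytes=500000000
  # cache on the data nodes; the matching druid.broker.cache.* flags stay off on the brokers
  druid.historical.cache.useCache=true
  druid.historical.cache.populateCache=true
  druid.cache.type=caffeine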
Data Evaluation
Unified Console
Optimize segment size
Ideally 300-700 MB (~5 million rows).
To control segment size:
● Alter segment granularity
● Specify a partition spec
● Use automatic compaction
Controlling Segment Size
● Number of tasks - keep to the lowest number that supports the max ingestion rate
● Segment granularity - increase if there is only 1 file per segment and it is < 200 MB
  "segmentGranularity": "HOUR"
● Max rows per segment - increase if a single segment is < 200 MB
  "maxRowsPerSegment": 5000000
Compaction
● Combines small segments into larger segments
● Useful for late-arriving data
● Task submitted to the Overlord
{
  "type": "compact",
  "dataSource": "wikipedia",
  "interval": "2017-01-01/2018-01-01"
}
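To illustrate submission (the Overlord host, port, and file name are placeholders), a task spec like the one above can be POSTed to the Overlord's task endpoint:

  curl -X POST -H 'Content-Type: application/json' \
       -d @compact-wikipedia.json \
       http://OVERLORD_HOST:8090/druid/indexer/v1/task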
Rollup
● Pre-aggregation at ingestion time
● Saves space, better compression
● Query performance boost
Rollup
Raw input rows:
timestamp            | page          | city | added | deleted
2011-01-01T00:01:35Z | Justin Bieber | SF   | 10    | 5
2011-01-01T00:03:45Z | Justin Bieber | SF   | 25    | 37
2011-01-01T00:05:62Z | Justin Bieber | SF   | 15    | 19
2011-01-01T00:06:33Z | Ke$ha         | LA   | 30    | 45
2011-01-01T00:08:51Z | Ke$ha         | LA   | 16    | 8
2011-01-01T00:09:17Z | Miley Cyrus   | DC   | 75    | 10
2011-01-01T00:11:25Z | Miley Cyrus   | DC   | 11    | 25
2011-01-01T00:23:30Z | Miley Cyrus   | DC   | 22    | 12
2011-01-01T00:49:33Z | Miley Cyrus   | DC   | 90    | 41
After rollup (timestamps truncated, rows aggregated):
timestamp            | page          | city | count | sum_added | sum_deleted
2011-01-01T00:00:00Z | Justin Bieber | SF   | 3     | 50        | 61
2011-01-01T00:00:00Z | Ke$ha         | LA   | 2     | 46        | 53
2011-01-01T00:00:00Z | Miley Cyrus   | DC   | 4     | 198       | 88
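As a hedged sketch, the rollup in the tables above corresponds to ingestion-spec fragments along these lines; HOUR is an assumption for queryGranularity, chosen because every raw timestamp truncates to 2011-01-01T00:00:00Z:

  "granularitySpec": {
    "queryGranularity": "HOUR",
    "rollup": true
  },
  "metricsSpec": [
    { "type": "count",   "name": "count" },
    { "type": "longSum", "name": "sum_added",   "fieldName": "added" },
    { "type": "longSum", "name": "sum_deleted", "fieldName": "deleted" }
  ]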
Summarize with data sketches
Raw input rows:
timestamp            | page          | userid | city | added | deleted
2011-01-01T00:01:35Z | Justin Bieber | user11 | SF   | 10    | 5
2011-01-01T00:03:45Z | Justin Bieber | user22 | SF   | 25    | 37
2011-01-01T00:05:62Z | Justin Bieber | user11 | SF   | 15    | 19
2011-01-01T00:06:33Z | Ke$ha         | user33 | LA   | 30    | 45
2011-01-01T00:08:51Z | Ke$ha         | user33 | LA   | 16    | 8
2011-01-01T00:09:17Z | Miley Cyrus   | user11 | DC   | 75    | 10
2011-01-01T00:11:25Z | Miley Cyrus   | user44 | DC   | 11    | 25
2011-01-01T00:23:30Z | Miley Cyrus   | user44 | DC   | 22    | 12
2011-01-01T00:49:33Z | Miley Cyrus   | user55 | DC   | 90    | 41
After rollup, with a sketch column preserving approximate distinct userids:
timestamp            | page          | city | count | sum_added | sum_deleted | userid_sketch
2011-01-01T00:00:00Z | Justin Bieber | SF   | 3     | 50        | 61          | sketch_obj
2011-01-01T00:00:00Z | Ke$ha         | LA   | 2     | 46        | 53          | sketch_obj
2011-01-01T00:00:00Z | Miley Cyrus   | DC   | 4     | 198       | 88          | sketch_obj
Choose column types carefully
String columns
● Indexed (fast filtering)
● Slower aggregation
● Slower grouping
Numeric columns
● Not indexed (slower filtering)
● Fast aggregation
● Fast grouping
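For illustration, column types are declared in the dimensionsSpec; strings are the default, and numerics must be declared explicitly. The field names here are borrowed from the other examples in this deck:

  "dimensionsSpec": {
    "dimensions": [
      "page",
      "city",
      { "type": "long", "name": "commentLength" }
    ]
  }

Here page and city become bitmap-indexed string columns, while commentLength is stored as a long: fast to aggregate and group, but with no bitmap index for filtering.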
Partitioning beyond time
● Druid always partitions by time
● Next, decide which dimension to partition on
● Partition by a dimension you often filter on
● Improves locality, compression, storage size, and query performance
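As a sketch, single-dimension range partitioning is requested through the partitionsSpec in the tuningConfig; the partition dimension here is an assumption borrowed from the rollup example, and single_dim partitioning requires forceGuaranteedRollup:

  "tuningConfig": {
    "type": "index_parallel",
    "forceGuaranteedRollup": true,
    "partitionsSpec": {
      "type": "single_dim",
      "partitionDimension": "page",
      "targetRowsPerSegment": 5000000
    }
  }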
Query Evaluation
Decisions based on data!
Use Druid SQL
● Easier to learn / more familiar
● Attempts to make intelligent query type choices (timeseries vs. topN vs. groupBy)
● Some limitations - e.g. multi-value dimensions and certain aggregations are not fully supported
Explain Plan
EXPLAIN PLAN FOR
SELECT channel, SUM(added)
FROM wikipedia
WHERE commentLength >= 50
GROUP BY channel
ORDER BY SUM(added) DESC
LIMIT 3
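For reference (the broker host is a placeholder; 8082 is the default broker port), statements like this can be POSTed to the broker's SQL endpoint:

  curl -X POST -H 'Content-Type: application/json' \
       http://BROKER_HOST:8082/druid/v2/sql \
       -d '{"query": "EXPLAIN PLAN FOR SELECT channel, SUM(added) FROM wikipedia WHERE commentLength >= 50 GROUP BY channel ORDER BY SUM(added) DESC LIMIT 3"}'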
Pick your query carefully
● TimeBoundary - returns the min/max timestamp for a given interval
● Timeseries - when you don't want to group by a dimension
● TopN - when you want to group by a single dimension
  ○ Approximate if > 1000 dimension values
● GroupBy - least performant / most flexible
● Scan - for returning streaming raw data
  ○ Perfect ordering not preserved
● Select - for returning paginated raw data
● Search - returns dimensions that match a text search
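As an illustrative sketch of a native topN query (the datasource, dimension, and interval are reused from earlier examples), grouping by channel and returning the top 3 values by summed added:

  {
    "queryType": "topN",
    "dataSource": "wikipedia",
    "intervals": ["2017-01-01/2018-01-01"],
    "granularity": "all",
    "dimension": "channel",
    "metric": "sum_added",
    "threshold": 3,
    "aggregations": [
      { "type": "longSum", "name": "sum_added", "fieldName": "added" }
    ]
  }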
Using Lookups
● Use lookups for dimensions whose values change over time, to avoid re-indexing data
● Lookups are key/value pairs stored on every node
● Loaded via file or a JDBC connection to an external database
● Lookups are loaded into the Java heap, so large lookups need larger heaps
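A minimal sketch of a statically defined map lookup; the tier ("__default"), lookup name, and mappings are made up for illustration. The payload is POSTed to the Coordinator at /druid/coordinator/v1/lookups/config:

  {
    "__default": {
      "country_code_to_name": {
        "version": "v1",
        "lookupExtractorFactory": {
          "type": "map",
          "map": { "US": "United States", "CA": "Canada" }
        }
      }
    }
  }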
Stay in touch
@druidio
https://imply.io
https://druid.apache.org/
Ben Hopp
Benjamin.hopp@imply.io
LinkedIn: benhopp
@implydata
Roadmap and Community Update
Ben Hopp
ben@imply.io
Apache Druid 0.16.0
Over 350 new features from 50 contributors!
Released September 2019.
Druid 0.16.0 Highlights
● Native parallel batch shuffle
  ○ Two-phase shuffle system allows for ‘perfect rollup’ and partitioning on dimensions
● Query vectorization phase one
  ○ Allows queries to be sped up by reducing the number of method calls
● Indexer process
  ○ An alternative to the MiddleManager + Peon task execution system which is easier to configure and deploy
● Improved web console
  ○ Kafka & Kinesis support!
  ○ Point-and-click reindexing
Apache Druid 0.17.0
Our first release as a top-level Apache project!
Coming soon (really soon).
Druid 0.17.0 Highlights
● Native batch - binary inputs & more
  ○ Supports binary formats such as ORC, Parquet, and Avro
  ○ Native batch tasks can now read from HDFS
  ○ Single-dimension range partitioning for parallel native batch
● Compaction improvements
  ○ Parallel index task split hints and parallel auto-compaction
  ○ Stateful auto-compaction
● Parallel query merge on brokers
  ○ The broker can now optionally merge query results in parallel using multiple threads
Druid 0.17.0 Highlights (continued)
● ...and more!
  ○ Improved SQL-compatible null handling
  ○ New Dropwizard emitter which supports counter, gauge, meter, timer, and histogram metric types
  ○ Task supervisors (e.g. Kafka or Kinesis supervisors) are now recorded in a new sys.supervisors system table
  ○ Fast historical start with deferred loading of segments until query time
  ○ New readiness and self-discovery resources
  ○ Task assignment based on MiddleManager categories
  ○ Security updates
…and beyond!!
A selection of items planned for future 2020 Druid releases:
● SQL joins
  ○ A multi-phase project to add full SQL join support to Druid. Coming up first: sub-queries and lookups
● Windowed aggregations
  ○ For example, moving average and cumulative sum aggregations
● Dynamic query prioritization & laning
  ○ Mix ‘heavy’ and ‘light’ workloads in the same cluster without heavy workloads blocking light ones
● Extended query vectorization support
  ○ Richer support for query vectorization against more query types
Download
Druid community site (new): https://druid.apache.org/
Imply distribution: https://imply.io/get-started
Contribute
https://github.com/apache/druid
Stay in touch
Follow the Druid project on Twitter! @druidio
Join the community! http://druid.io/community
Free training hosted by Imply! https://imply.io/druid-days