Automating CI/CD for Druid Clusters at Athena Health

April 2020
Shyam Mudambi, Sr. Architect, Athena intelligence, Athenahealth
Karthik Urs, Lead MTS, Athena intelligence, Athenahealth
Ramesh Kempanna, Principal MTS, Athena intelligence, Athenahealth

Overview
● Goals
● Druid architecture at Athena
● Why Terraform?
● Athena’s CI/CD processes
● Creating a Druid cluster
● Deployment demo
● Scale up and down example
● Conclusions and next steps
● Questions

Druid at Athena
● Druid will power a new self-service analytics environment
● Key features that led us to Druid
• Low latency – sub-second response on large datasets
• Horizontally scalable – supports hundreds of sessions in parallel
• Standard OLAP support – rollups on ingestion
• Time series support is built in – pros & cons
● Snowflake – Low latency/high concurrency is not its sweet spot
● Cassandra – High dimensionality with many different query patterns

Druid Environment
• Master
1. Coordinator
2. Overlord
• Query
1. Broker
2. Router
• Data - Historical
• Data - Middle Manager
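
The grouping of Druid services onto node types listed on this slide can be written down as a small lookup. This is only a sketch: the role names are borrowed from the instance tables in this deck, and the helper `services_for` is a hypothetical name, not part of the team's tooling.

```python
# Node roles and the Druid services each one runs, as listed on the slide.
SERVICES_BY_ROLE = {
    "master": ["coordinator", "overlord"],
    "query": ["broker", "router"],
    "historical": ["historical"],
    "middlemanager": ["middleManager"],
}


def services_for(role: str) -> list[str]:
    """Return the Druid services to start on a node with the given role."""
    try:
        return list(SERVICES_BY_ROLE[role.lower()])
    except KeyError:
        raise ValueError(f"unknown node role: {role!r}") from None


if __name__ == "__main__":
    for role in SERVICES_BY_ROLE:
        print(role, "->", ", ".join(services_for(role)))
```

A bootstrap step could use a lookup like this to decide which services to start on a given instance.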

Druid Environment
Environment  Service        Instance type  # of nodes
DEMO/DEV     zookeeper      m5.large       3
DEMO/DEV     master         m5.large       1
DEMO/DEV     query          m5.large       1
DEMO/DEV     historical     i3.large       1
DEMO/DEV     middlemanager  m5.large       1 (5)
STAGING      master         m5.large       1
STAGING      query          m5.large       1
STAGING      historical     i3.2xlarge     1
STAGING      middlemanager  m5.2xlarge     1 (20)
PROD         master         m5.2xlarge     1
PROD         query          m5.2xlarge     1

Motivation to automate
● A volatile environment, as we are still in development
• Druid clusters are built and destroyed frequently
● Scaling clusters up/down involved a lot of (semi-)manual work
• Tuning the JVM for each type of machine
• Setting up and managing file systems for data & logs
● Governance around configuration changes
• Security groups
• Machine instance changes
● Monitoring/alerting capabilities

Why Terraform – Pros and Cons
● Terraform
• Declarative - separates specification from execution
• Support - Large community support
• Multi-provider support in a single stack
• Composition – Easy to incorporate existing stacks
• Modularity - Robust module system for reusable code
• No lag between AWS rollout and Terraform parity
● State Management - Utility
• Buddy

Athena’s CI/CD processes
Dashboard: https://[NEWENV]-druid.us-east-1.staging.ai.athena.io/unified-console.html
buddy

Creating a Druid cluster
● Jenkins "Create Env" job
● Config uploader
• tar.gz of all Druid service config
• S3 bucket
● PostgreSQL (RDS): ~7-8 mins
● Lambda: create user and DB in PostgreSQL, 2 mins
● Zookeeper cluster
● Master cluster
● Query cluster
● Historical cluster
● MiddleManager cluster
● ALB
• Router service
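
The config uploader step bundles all Druid service config into a single tar.gz. A minimal sketch of that bundling using only the Python standard library might look like the following. The file paths and contents are hypothetical placeholders, and the upload to the S3 bucket is deliberately left out.

```python
import io
import tarfile


def build_config_bundle(configs: dict[str, str]) -> bytes:
    """Pack {relative_path: file_text} into an in-memory tar.gz archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # Sort paths so the archive layout is deterministic.
        for path, text in sorted(configs.items()):
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_bundle(bundle: bytes) -> list[str]:
    """Return the member names stored in a tar.gz bundle."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    # Hypothetical layout: one directory per Druid service.
    bundle = build_config_bundle({
        "historical/jvm.config": "-server\n",
        "broker/runtime.properties": "druid.service=druid/broker\n",
    })
    print(list_bundle(bundle))
```

Each instance can then download and unpack the same bundle during bootstrap, which keeps one copy of the configuration as the source for every service.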

Dissection of the Druid service instance creation
● CloudInit
• Download OS dependency packages
• Based on instance type + service
● Disk setup
• Log volume
• Data volume
● Format partition
• Log partition
• Data partition
● Mount partition
• Log directory
• Data directory
● Bootstrap bash script
• Download Druid binaries
• Download config files from S3 bucket
• Replace config files + supervise scripts
• Config file update based on resource limits: CPU cores, RAM size, data volume
• Install + configure Filebeat log forwarder & Prometheus
• Based on service name, start multiple services via supervise scripts
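
The "config file update based on resource limits" step can be pictured as a function from instance resources to settings. The sketch below is an assumption-heavy illustration, not the team's actual script: the sizing rules (one processing thread fewer than the core count, a quarter of RAM for heap, 80% of the data volume for segments) and the example instance size are placeholders. `druid.processing.numThreads` and `druid.server.maxSize` are real Druid properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeResources:
    cpu_cores: int
    ram_gib: int
    data_volume_gib: int


def historical_settings(res: NodeResources) -> tuple[list[str], dict[str, str]]:
    """Derive JVM flags and runtime properties for a Historical node.

    All ratios below are assumed for illustration only.
    """
    threads = max(res.cpu_cores - 1, 1)            # leave one core free
    heap_gib = max(res.ram_gib // 4, 1)            # assumed: 25% of RAM as heap
    segment_gib = res.data_volume_gib * 80 // 100  # assumed: 80% of disk for segments
    jvm_flags = [f"-Xms{heap_gib}g", f"-Xmx{heap_gib}g"]
    properties = {
        "druid.processing.numThreads": str(threads),
        "druid.server.maxSize": str(segment_gib * 1024**3),
    }
    return jvm_flags, properties


if __name__ == "__main__":
    # Hypothetical instance: 8 cores, 64 GiB RAM, 1000 GiB data volume.
    flags, props = historical_settings(NodeResources(8, 64, 1000))
    print(" ".join(flags))
    for key, value in props.items():
        print(f"{key}={value}")
```

Computing these values at boot time is what lets the same bootstrap script serve every instance type without hand-tuned per-machine config files.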

Scale up and down
Environment  Service        Instance type  # of nodes
DEMO/DEV     master         m5.large       1
DEMO/DEV     query          m5.large       1
DEMO/DEV     historical     i3.large       1
DEMO/DEV     middlemanager  m5.large       1 (5)
STAGING      master         m5.large       1
STAGING      query          m5.large       1
PROD         master         m5.2xlarge     1
PROD         query          m5.2xlarge     1
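
With a declarative specification, scaling up or down amounts to editing the desired spec and letting the tool work out the difference. The toy sketch below illustrates that idea in Python. It is not Terraform and does not use its API; `ServiceSpec` and `plan` are hypothetical names, the current values come from the DEMO/DEV rows above, and the target size for the Historical tier is made up for the example.

```python
from typing import NamedTuple


class ServiceSpec(NamedTuple):
    instance_type: str
    nodes: int


def plan(current: dict[str, ServiceSpec], desired: dict[str, ServiceSpec]) -> list[str]:
    """List the changes needed to move from the current spec to the desired one."""
    changes = []
    for name in sorted(current.keys() | desired.keys()):
        old, new = current.get(name), desired.get(name)
        if old == new:
            continue
        if old is None:
            changes.append(f"+ {name}: create {new.nodes} x {new.instance_type}")
        elif new is None:
            changes.append(f"- {name}: destroy {old.nodes} x {old.instance_type}")
        else:
            changes.append(
                f"~ {name}: {old.nodes} x {old.instance_type} "
                f"-> {new.nodes} x {new.instance_type}"
            )
    return changes


if __name__ == "__main__":
    current = {
        "master": ServiceSpec("m5.large", 1),
        "query": ServiceSpec("m5.large", 1),
        "historical": ServiceSpec("i3.large", 1),
        "middlemanager": ServiceSpec("m5.large", 1),
    }
    # Hypothetical scale-up of the Historical tier only.
    desired = {**current, "historical": ServiceSpec("i3.2xlarge", 2)}
    for line in plan(current, desired):
        print(line)
```

Only the Historical tier shows up in the output, which mirrors how a declarative tool leaves unchanged resources alone.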

Time for questions
Apache Druid is an independent project of The Apache Software Foundation. More information can be found at https://druid.apache.org.
Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
