INCREMENTAL RESHARDING FOR ELASTICSEARCH
RADU CALIN, JUNE 2020
AGENDA
▪ Brief Context Presentation
▪ Intro on Elasticsearch
▪ The Resharding Challenge
▪ Our Solution
▪ Performance Overview
▪ Q&A
CROWDSTRIKE
RADU CALIN
Engineering Manager @ CrowdStrike
▪ Delivering large-scale cloud-based security products @ CrowdStrike
▪ A founder of the Golang Bucharest Community
▪ Built several large-scale architectures in the past, some processing >300M events per day
▪ linkedin.com/in/rcalin
CROWDSTRIKE FACTS
▪ Protecting 44 of Fortune 100 Companies and 37 of 100 Top Global Companies
▪ Company valued at $20B (CRWD on NASDAQ stock market)
▪ Growing Revenue 90% Year over Year
▪ Team globally - fully remote company
CrowdStrike Cloud
▪ Ingesting 100s of billions of events daily
▪ Leveraging 100s of Microservices, 10s of PBs of data storage, 1000s of servers
▪ On multiple cloud environments and multiple service providers
LEVERAGING SCALE TO STOP BREACHES
“A distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured.” - elastic.co
THE BASIC ES STRUCTURE
CLUSTER (2 replicas, evenly balanced)
[Diagram: 4 nodes; each node holds one primary shard (Primary Shard 1 to 4) and two replica shards belonging to other primaries]
ELASTIC ☺ FACTS
▪ Fixed number of shards once index created
▪ 50 GB maximum recommended shard size
▪ Data growth and use-cases can vary dramatically
USE CASE – PROCESSING
▪ X (M) Hosts
▪ 25 data points (DP) per host
▪ 40 Indicators per DP
▪ 1000 * X (M) Indicators
▪ Mutable indicators metadata
USE CASE – QUERYING
▪ Falcon UI
▪ Heavy aggregations
▪ Gateway
▪ Data dump
▪ { } REST Client
▪ Queries scoped by customer
▪ Max Y (B) Indicators per customer
OVERALL CHALLENGE
▪ Ensuring boundless horizontal scalability
▪ While maintaining shards at max 40 GB
▪ With maximum uptime & consistency
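A rough sizing sketch for the shard-size constraint: the minimum primary shard count follows from the expected primary data size and the per-shard ceiling. The function name and the sample sizes are illustrative assumptions; only the 40 GB ceiling comes from the slide, and the calculation assumes documents spread evenly across shards.

```python
import math

MAX_SHARD_GB = 40  # per-shard ceiling stated on the slide


def min_primary_shards(primary_data_gb, max_shard_gb=MAX_SHARD_GB):
    """Smallest shard count that keeps every shard at or below max_shard_gb.

    Assumes an even spread of data across shards (hypothetical helper,
    not part of any Elasticsearch API).
    """
    if primary_data_gb <= 0:
        return 1
    return max(1, math.ceil(primary_data_gb / max_shard_gb))


if __name__ == "__main__":
    for size_gb in (90, 640, 2000):
        print(size_gb, "GB ->", min_primary_shards(size_gb), "primary shards")
```

Because the shard count is fixed at index creation, outgrowing this number means moving data to a new index with more shards.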
SO
Why not employ aliases and time-based rolled-over indices?
BECAUSE
We’re processing mutable data. Plus aliases that point to more than one index don’t support GET operations. Plus refresh interval ☹
SO WE’LL NEED TO MIGRATE DATA ONE WAY OR ANOTHER
Index 1 (N Shards) -> full or partial dataset -> Index 2 (M Shards)
THEN
Why not stop the processing, create a bigger index, use REINDEX and restart processing?
BECAUSE
We can’t stop the processing. Consistency and availability are a must. ☹
YET
How will you migrate the data from one index to another, while keeping processing going and not using aliases?
WELL…
LET’S BUILD OUR OWN INDEX (POD) MANAGEMENT SYSTEM
▪ SQL DB: Customer -> Pod Memberships
▪ S3 Bucket: JSON Data
▪ Microservice: data fetched by dedicated client lib, memory cache included
▪ Data dump
▪ Autoprovision to default index, if needed
▪ Events for a particular customer
▪ Manual management
▪ ES Cluster: ES Docs
DATA MODEL? SIMPLE!
pods
id (PK)
name
description
cluster_url
cluster_version
namespace
index_name
data_model_name
data_model_version
is_default
updated_timestamp
created_timestamp
pod_memberships
customer_id (CPK)
pod_id (CPK)
operation [read/write] (CPK)
updated_timestamp
created_timestamp
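The two tables above could be declared roughly as follows. This is a minimal sketch using SQLite from the Python standard library; the column types, the NOT NULL and CHECK constraints, and the `open_pod_db` helper are assumptions, since the slide only lists column names and keys (CPK is read here as composite primary key).

```python
import sqlite3

# Column names follow the slide; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE pods (
    id                 INTEGER PRIMARY KEY,
    name               TEXT NOT NULL,
    description        TEXT,
    cluster_url        TEXT NOT NULL,
    cluster_version    TEXT,
    namespace          TEXT,
    index_name         TEXT NOT NULL,
    data_model_name    TEXT,
    data_model_version TEXT,
    is_default         INTEGER NOT NULL DEFAULT 0,
    updated_timestamp  TEXT,
    created_timestamp  TEXT
);
CREATE TABLE pod_memberships (
    customer_id        TEXT NOT NULL,
    pod_id             INTEGER NOT NULL REFERENCES pods(id),
    operation          TEXT NOT NULL CHECK (operation IN ('read', 'write')),
    updated_timestamp  TEXT,
    created_timestamp  TEXT,
    PRIMARY KEY (customer_id, pod_id, operation)
);
"""


def open_pod_db(path=":memory:"):
    """Create the two tables in a fresh SQLite database (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

With `operation` in the composite key, a customer can hold a read and a write membership on the same pod, or write memberships on two pods at once, which is what dual writing relies on.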
HOW IT WORKS – INITIAL STATE
Customer events
Pod Management: get read, write indices
ES Index X: Read*, Write
* Do not use ES as a source of truth! Unless… you do
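The "get read, write indices" lookup, together with the autoprovisioning to a default index mentioned earlier, might look like the sketch below. The `PodDirectory` class, its in-memory dictionary (standing in for the client library's memory cache), and the index names are all hypothetical; the real client library is not shown in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class PodDirectory:
    """In-memory stand-in for the pod management lookup (hypothetical)."""

    default_index: str
    # customer_id -> {"read": set of index names, "write": set of index names}
    memberships: dict = field(default_factory=dict)

    def indices_for(self, customer_id):
        """Return (read_indices, write_indices) for a customer.

        Unknown customers are autoprovisioned to the default index.
        """
        entry = self.memberships.get(customer_id)
        if entry is None:
            entry = {"read": {self.default_index}, "write": {self.default_index}}
            self.memberships[customer_id] = entry
        return sorted(entry["read"]), sorted(entry["write"])


if __name__ == "__main__":
    directory = PodDirectory(default_index="index-x")
    print(directory.indices_for("customer-z"))
```

In the initial state every customer reads from and writes to a single index, so both lists hold one entry.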
HOW IT WORKS – NEW INDEX
Customer events
Pod Management: get read, write indices
ES Index X: Read, Write
ES Index Y
HOW IT WORKS – DUAL WRITE
Customer events
Pod Management: get read, write indices
Set dual write for customer Z to Index X and Y
ES Index X: Read, Write
ES Index Y
HOW IT WORKS – DUAL WRITE
Customer events
Pod Management: get read, write indices
ES Index X: Read, Write
ES Index Y: Write
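One way to picture dual writing is a single Elasticsearch bulk request that indexes the same document into every write index returned by pod management. The sketch below only builds the newline-delimited bulk body; the function name and index names are hypothetical, and it assumes both indices live in the same cluster (pods on different clusters would need one request per cluster).

```python
import json


def dual_write_bulk_body(doc_id, doc, write_indices):
    """Build an NDJSON _bulk body that indexes doc into each write index.

    The "index" action overwrites an existing document with the same _id.
    """
    lines = []
    for index_name in write_indices:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = dual_write_bulk_body("host-1", {"indicator": "abc"}, ["index-x", "index-y"])
    print(body, end="")
```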
HOW IT WORKS – SYNC INDICES
Customer events
Pod Management: get read, write indices
POST _reindex*
* conflicts: proceed & op_type: create
ES Index X: Read, Write
ES Index Y: Write
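A sketch of what the `_reindex` call with `conflicts: proceed` and `op_type: create` could look like. The function only assembles the request; the `customer_id` field used to scope the copy to one customer, the index names, and the default slice and batch values are assumptions for illustration.

```python
def build_reindex_request(source_index, dest_index, customer_id,
                          slices=48, batch_size=4000):
    """Assemble method, path, URL params and JSON body for POST _reindex.

    `customer_id` as a document field is hypothetical; the slides do not
    show the document mapping.
    """
    params = {"slices": slices, "wait_for_completion": "false"}
    body = {
        # Keep going when a document already exists in the destination.
        "conflicts": "proceed",
        "source": {
            "index": source_index,
            "size": batch_size,  # scroll batch size
            "query": {"term": {"customer_id": customer_id}},
        },
        # create: never overwrite docs already written by the dual-write path.
        "dest": {"index": dest_index, "op_type": "create"},
    }
    return "POST", "/_reindex", params, body


if __name__ == "__main__":
    print(build_reindex_request("index-x", "index-y", "customer-z"))
```

With `wait_for_completion=false` Elasticsearch returns a task that can be polled while dual writes continue.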
HOW IT WORKS – SYNC INDICES
Customer events
Pod Management: get read, write indices
Reindex in progress²
ES Index X: Read, Write¹
ES Index Y: Write¹
¹ Transaction, overwrites existing docs
² Skips existing docs
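Why the two indices converge can be seen in a toy simulation: live writes overwrite in both indices, while the reindex only creates documents that are missing from the destination. Plain dictionaries stand in for the indices here, and all names are hypothetical; this models the overwrite/skip semantics only, not Elasticsearch itself.

```python
def live_write(index, doc_id, doc):
    """Dual-write path: overwrites any existing document."""
    index[doc_id] = doc


def reindex_create(source, dest):
    """Reindex path (op_type=create, conflicts=proceed).

    Copies only documents missing from dest; returns how many were skipped.
    """
    skipped = 0
    for doc_id, doc in list(source.items()):
        if doc_id in dest:
            skipped += 1
        else:
            dest[doc_id] = doc
    return skipped


def simulate():
    index_x = {"a": "v1", "b": "v1", "c": "v1"}
    index_y = {}
    # An update to "b" arrives through dual write before the reindex runs.
    for index in (index_x, index_y):
        live_write(index, "b", "v2")
    skipped = reindex_create(index_x, index_y)
    # An update to "a" arrives after the reindex copied the old version.
    for index in (index_x, index_y):
        live_write(index, "a", "v2")
    return index_x, index_y, skipped


if __name__ == "__main__":
    x, y, skipped = simulate()
    print(x == y, skipped)
```

Either ordering ends with the newest version in both indices: a write that lands first makes the reindex skip the document, and a write that lands later overwrites the copied one.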
HOW IT WORKS – SYNC INDICES
Customer events
Pod Management: get read, write indices
Reindex finished (indices in sync)
ES Index X: Read, Write
ES Index Y: Write
HOW IT WORKS – SWITCH API READS
Customer events
Pod Management: get read, write indices
Set reads to Index Y
ES Index X: Read, Write
ES Index Y: Write
HOW IT WORKS – SWITCH API READS
Customer events
Pod Management: get read, write indices
ES Index X: Write
ES Index Y: Read, Write
HOW IT WORKS – CUT-OFF WRITES
Customer events
Pod Management: get read, write indices
Remove writes from Index X
ES Index X: Write
ES Index Y: Read, Write
HOW IT WORKS – CUT-OFF WRITES
Customer events
Pod Management: get read, write indices
ES Index X
ES Index Y: Read, Write
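The whole sequence can be read as a series of small changes to one customer's pod memberships. The sketch below models memberships as a set of (pod, operation) pairs; the function names and pod labels are hypothetical, and the reindex between dual write and the read switch is left out.

```python
def start_dual_write(memberships, new_pod):
    """Add a write membership on the new pod."""
    return memberships | {(new_pod, "write")}


def switch_reads(memberships, old_pod, new_pod):
    """Move the read membership from the old pod to the new pod."""
    return (memberships - {(old_pod, "read")}) | {(new_pod, "read")}


def cut_off_writes(memberships, old_pod):
    """Drop the write membership on the old pod."""
    return memberships - {(old_pod, "write")}


if __name__ == "__main__":
    state = frozenset({("x", "read"), ("x", "write")})  # initial state
    state = start_dual_write(state, "y")                # dual write
    # ... reindex from x to y runs here until the indices are in sync ...
    state = switch_reads(state, "x", "y")               # switch API reads
    state = cut_off_writes(state, "x")                  # cut-off writes
    print(sorted(state))
```

Each step touches a single membership row, so a customer is never left without a read index or a write index.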
HOW IT WORKS – CLEANUP
Customer events
Pod Management: get read, write indices
POST _delete_by_query
ES Index X
ES Index Y: Read, Write
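A sketch of the cleanup request, assuming it targets the old index and removes only the migrated customer's documents. As with the reindex sketch, the `customer_id` field and the index name are hypothetical.

```python
def build_cleanup_request(old_index, customer_id):
    """Assemble method, path, URL params and JSON body for _delete_by_query.

    Assumes documents carry a `customer_id` field (not shown in the slides).
    """
    params = {"conflicts": "proceed", "wait_for_completion": "false"}
    body = {"query": {"term": {"customer_id": customer_id}}}
    return "POST", f"/{old_index}/_delete_by_query", params, body


if __name__ == "__main__":
    print(build_cleanup_request("index-x", "customer-z"))
```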
STATS FROM AN ACTUAL MIGRATION
▪ 16 AWS m4.4xlarge data nodes
▪ 16 primary shards per index
▪ 1 replica per index
▪ 1m refresh interval per index
▪ 230M primary documents
▪ 90 GB of primary data
▪ 48 slices (best config)
▪ 4000 bulk size (best config)
▪ 40 minutes @ 70% CPU
▪ 95K* index/s for migration
▪ 130K* index/s for cluster
* primary indices
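A quick arithmetic check ties these figures together: 230M documents in 40 minutes works out to roughly 95.8K documents per second, in line with the reported 95K index/s, and 90 GB over 16 primary shards is well under the 40 GB per-shard ceiling. The helper names are illustrative, and the per-shard figure assumes an even spread.

```python
def docs_per_second(total_docs, minutes):
    """Average indexing rate over the whole migration."""
    return total_docs / (minutes * 60)


def gb_per_shard(primary_data_gb, primary_shards):
    """Average primary shard size, assuming an even spread."""
    return primary_data_gb / primary_shards


if __name__ == "__main__":
    print(round(docs_per_second(230_000_000, 40)))  # 95833
    print(gb_per_shard(90, 16))                     # 5.625
```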
PRODUCTION CLUSTER SNAPSHOT
COME JOIN US
crowdstrike.com/careers
Q&A
