SlideShare a Scribd company logo
1 of 29
Download to read offline
Sizing The Elastic Stack for
Security Use Cases
James Spiteri (special thanks to Dave Moore!)
17th March 2021
What we’ll be covering today
- Elasticsearch Internals and Computing Resources - Quick Overview
- Preparation: How much data can I expect?
- Performance: How much can I get out of my hardware?
- Speed: How can I get optimal search performance?
- Using Cross Cluster Search and Data tiers effectively
- Transforms
- Kibana Considerations - The Detection Engine
- Example sizing exercises
Endpoint SIEM
Elastic Security
Computing Resources
Elasticsearch Internals
What’s happening behind the
scenes?
7
Cluster A group of nodes that work together to operate Elasticsearch.
Node A Java process that runs the Elasticsearch software.
Index A group of shards that form a logical data store.
Shard A Lucene index that stores and processes a portion of an Elasticsearch index.
Segment A Lucene segment that immutably stores a portion of a Lucene index.
Document A record that is submitted to and retrieved from an Elasticsearch index.
8
9
Nodes
Role Description Resources
Storage Memory Compute Network
Data Indexes, stores, and searches data Extreme High High Medium
Master Manages cluster state Low Low Low Low
Ingest Transforms inbound data Low Medium High Medium
Machine Learning Processes machine learning models Low Extreme Extreme Medium
Coordinator Delegates requests and merges search results Low Medium Medium Medium
10
Preparation
12
Calculating data storage
requirements
- Ingest a sample
- Monitor size + Ingest Rates
- Calculate going forward
https://www.elastic.co/guide/en/elas
ticsearch/plugins/current/mapper-si
ze.html
13
Summarizing Considerations:
● How much raw data (GB will we index per day?
● How many days will we retain the data for?
● How many days in the hot zone?
● How many days in the warm zone?
● How many replica shards will you enforce?
In general we add 5% or 10% for margin of error and 15% to stay under the disk watermarks.
Performance
15
What is my hardware capable of?
- Run performance benchmarks using Rally
- Understand what throughput you’ll achieve
https://www.elastic.co/blog/rally-1-0
-0-released-benchmark-elasticsear
ch-like-we-do
Search Speed
17
18
It’s all about balance.
Speed
Cluster Size/Cost
19
Keeping in mind:
● Searches run on a single thread per shard
● Shards have overhead
● Shards are balanced by elasticsearch
● Use datastreams and life cycle policies
● Aim for shard sizes between 10GB and 50GB
● Aim for 20 shards or fewer per GB head of memory
Optimise With CCS
21
Optimize using CCS  Cross Cluster Search
It makes sense to have smaller clusters for different users/customers/datasets. CCS makes this
easy.
Transforms
Streamline your logs and events, save time and money.
Kibana and The Detection Engine
25
The Detection Engine
● Detections should be treated like a search
● Detection performance should be monitored regularly
● The Kibana alerting engine can be scaled vertically
and/or horizontally
Kibana task manager workers can be increased in number
to take advantage of vertical scaling, or can be replicated
across separate Kibana instances and scaled horizontally.
When multiple Kibana instances are running, the task
managers will coordinate across the wire to balance the
tasks across the instances. By updating the number of
max_workers inside of the kibana.yml file from it’s default
of 10, you can vertically scale up or down to appropriately
allocate resources more efficiently per Kibana node.
Examples
27
● Total Data (GB  Raw data (GB per day * Number of days
retained * Number of replicas + 1
● Total Storage (GB  Total data (GB * 1  0.15 disk Watermark
threshold + 0.1 Margin of error)
● Total Data Nodes  ROUNDUPTotal storage (GB / Memory per
data node / Memory:Data ratio)
In case of large deployment it's safer to add a node for failover
capacity.
Formulas and Examples:
28
Sizing a small cluster:
You might be pulling logs and metrics from some applications, databases, web
servers, the network, and other supporting services . Let's assume this pulls in
1GB per day and you need to keep the data 9 months.
You can use 8GB memory per node for this small deployment. Let’s do the math:
● Total Data (GB  1GB x (9  30 days) x 2 
540GB
● Total Storage (GB 540GB x (10.150.1 
675GB
● Total Data Nodes  675GB disk / 8GB RAM
/30 ratio = 3 nodes
Sizing a large(r) deployment
Let’s do the math with the following inputs:
● You receive 100GB per day and we need to keep this data for 30
days in the hot zone and 12 months in the warm zone.
● We have 64GB of memory per node with 30GB allocated for heap
and the remaining for OS cache.
● The typical memory:data ratio for the hot zone used is 130 and for
the warm zone is 1160.
If we receive 100GB per day and we have to keep this data for 30 days, this
gives us:
● Total Data (GB in the hot zone = 100GB x 30 days * 2  6000GB
● Total Storage (GB in the hot zone = 6000GB x (10.150.1 
7500GB
● Total Data Nodes in the hot zone = ROUNDUP7500 / 64 / 30  1 =
5 nodes
● Total Data (GB in the warm zone = 100GB x 365 days * 2 
73000GB
● Total Storage (GB in the warm zone = 73000GB x (10.150.1 
91250GB
● Total Data Nodes in the warm zone = ROUNDUP91250 / 64 / 160
 1  10 nodes
Formulas and Examples:
Try free on Cloud:
elastic.co/cloud
Take a quick spin:
demo.elastic.co
Connect on Slack:
ela.st/slack
1 2 3
Join the Elastic community

More Related Content

What's hot

Elasticsearch on Azure
Elasticsearch on AzureElasticsearch on Azure
Elasticsearch on AzureElasticsearch
 
Logging, indicateurs et APM : le trio gagnant pour des opérations réussies
Logging, indicateurs et APM : le trio gagnant pour des opérations réussiesLogging, indicateurs et APM : le trio gagnant pour des opérations réussies
Logging, indicateurs et APM : le trio gagnant pour des opérations réussiesElasticsearch
 
_Search? Made Simple: Elastic + App Search
_Search? Made Simple: Elastic + App Search_Search? Made Simple: Elastic + App Search
_Search? Made Simple: Elastic + App SearchElasticsearch
 
The Fermilab HEPCloud Facility
The Fermilab HEPCloud FacilityThe Fermilab HEPCloud Facility
The Fermilab HEPCloud FacilityClaudio Pontili
 
How KeyBank Used Elastic to Build an Enterprise Monitoring Solution
How KeyBank Used Elastic to Build an Enterprise Monitoring SolutionHow KeyBank Used Elastic to Build an Enterprise Monitoring Solution
How KeyBank Used Elastic to Build an Enterprise Monitoring SolutionElasticsearch
 
Industrial production process visualization with the Elastic Stack in real-ti...
Industrial production process visualization with the Elastic Stack in real-ti...Industrial production process visualization with the Elastic Stack in real-ti...
Industrial production process visualization with the Elastic Stack in real-ti...Elasticsearch
 
Big problems Big Data, simple solutions
Big problems Big Data, simple solutionsBig problems Big Data, simple solutions
Big problems Big Data, simple solutionsClaudio Pontili
 
Infrastructure monitoring made easy, from ingest to insight
Infrastructure monitoring made easy, from ingest to insightInfrastructure monitoring made easy, from ingest to insight
Infrastructure monitoring made easy, from ingest to insightElasticsearch
 
Migrating a legacy logging system: Etsy’s journey to Elastic Cloud
Migrating a legacy logging system: Etsy’s journey to Elastic CloudMigrating a legacy logging system: Etsy’s journey to Elastic Cloud
Migrating a legacy logging system: Etsy’s journey to Elastic CloudElasticsearch
 
Architectural Best Practices to Master + Pitfalls to Avoid (P)
Architectural Best Practices to Master + Pitfalls to Avoid (P) Architectural Best Practices to Master + Pitfalls to Avoid (P)
Architectural Best Practices to Master + Pitfalls to Avoid (P) Elasticsearch
 
Logging, Metrics, and APM: The Operations Trifecta (P)
Logging, Metrics, and APM: The Operations Trifecta (P)Logging, Metrics, and APM: The Operations Trifecta (P)
Logging, Metrics, and APM: The Operations Trifecta (P)Elasticsearch
 
Better Search and Business Analytics at Southern Glazer’s Wine & Spirits
Better Search and Business Analytics at Southern Glazer’s Wine & SpiritsBetter Search and Business Analytics at Southern Glazer’s Wine & Spirits
Better Search and Business Analytics at Southern Glazer’s Wine & SpiritsElasticsearch
 
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...Imam Raza
 
MongoDB @ Pango
MongoDB @ PangoMongoDB @ Pango
MongoDB @ PangoMongoDB
 
Data saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overviewData saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overviewRiccardo Zamana
 
Big Data, HPC and Streaming
Big Data, HPC and StreamingBig Data, HPC and Streaming
Big Data, HPC and StreamingAnjani Phuyal
 
Reblaze Case Study on GCP
Reblaze Case Study on GCPReblaze Case Study on GCP
Reblaze Case Study on GCPIdan Tohami
 
Making Sense of Time Series Data in MongoDB
Making Sense of Time Series Data in MongoDBMaking Sense of Time Series Data in MongoDB
Making Sense of Time Series Data in MongoDBMongoDB
 
Google Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline PatternsGoogle Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline PatternsLynn Langit
 
Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0Sridevi Murugayen
 

What's hot (20)

Elasticsearch on Azure
Elasticsearch on AzureElasticsearch on Azure
Elasticsearch on Azure
 
Logging, indicateurs et APM : le trio gagnant pour des opérations réussies
Logging, indicateurs et APM : le trio gagnant pour des opérations réussiesLogging, indicateurs et APM : le trio gagnant pour des opérations réussies
Logging, indicateurs et APM : le trio gagnant pour des opérations réussies
 
_Search? Made Simple: Elastic + App Search
_Search? Made Simple: Elastic + App Search_Search? Made Simple: Elastic + App Search
_Search? Made Simple: Elastic + App Search
 
The Fermilab HEPCloud Facility
The Fermilab HEPCloud FacilityThe Fermilab HEPCloud Facility
The Fermilab HEPCloud Facility
 
How KeyBank Used Elastic to Build an Enterprise Monitoring Solution
How KeyBank Used Elastic to Build an Enterprise Monitoring SolutionHow KeyBank Used Elastic to Build an Enterprise Monitoring Solution
How KeyBank Used Elastic to Build an Enterprise Monitoring Solution
 
Industrial production process visualization with the Elastic Stack in real-ti...
Industrial production process visualization with the Elastic Stack in real-ti...Industrial production process visualization with the Elastic Stack in real-ti...
Industrial production process visualization with the Elastic Stack in real-ti...
 
Big problems Big Data, simple solutions
Big problems Big Data, simple solutionsBig problems Big Data, simple solutions
Big problems Big Data, simple solutions
 
Infrastructure monitoring made easy, from ingest to insight
Infrastructure monitoring made easy, from ingest to insightInfrastructure monitoring made easy, from ingest to insight
Infrastructure monitoring made easy, from ingest to insight
 
Migrating a legacy logging system: Etsy’s journey to Elastic Cloud
Migrating a legacy logging system: Etsy’s journey to Elastic CloudMigrating a legacy logging system: Etsy’s journey to Elastic Cloud
Migrating a legacy logging system: Etsy’s journey to Elastic Cloud
 
Architectural Best Practices to Master + Pitfalls to Avoid (P)
Architectural Best Practices to Master + Pitfalls to Avoid (P) Architectural Best Practices to Master + Pitfalls to Avoid (P)
Architectural Best Practices to Master + Pitfalls to Avoid (P)
 
Logging, Metrics, and APM: The Operations Trifecta (P)
Logging, Metrics, and APM: The Operations Trifecta (P)Logging, Metrics, and APM: The Operations Trifecta (P)
Logging, Metrics, and APM: The Operations Trifecta (P)
 
Better Search and Business Analytics at Southern Glazer’s Wine & Spirits
Better Search and Business Analytics at Southern Glazer’s Wine & SpiritsBetter Search and Business Analytics at Southern Glazer’s Wine & Spirits
Better Search and Business Analytics at Southern Glazer’s Wine & Spirits
 
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar...
 
MongoDB @ Pango
MongoDB @ PangoMongoDB @ Pango
MongoDB @ Pango
 
Data saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overviewData saturday malta - ADX Azure Data Explorer overview
Data saturday malta - ADX Azure Data Explorer overview
 
Big Data, HPC and Streaming
Big Data, HPC and StreamingBig Data, HPC and Streaming
Big Data, HPC and Streaming
 
Reblaze Case Study on GCP
Reblaze Case Study on GCPReblaze Case Study on GCP
Reblaze Case Study on GCP
 
Making Sense of Time Series Data in MongoDB
Making Sense of Time Series Data in MongoDBMaking Sense of Time Series Data in MongoDB
Making Sense of Time Series Data in MongoDB
 
Google Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline PatternsGoogle Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline Patterns
 
Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0Community day ppt_kinesisv1.0
Community day ppt_kinesisv1.0
 

Similar to Security sizing meetup

From the trenches: scaling a large log management deployment
From the trenches: scaling a large log management deploymentFrom the trenches: scaling a large log management deployment
From the trenches: scaling a large log management deploymentFaithWestdorp
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB
 
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...Redis Labs
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Shardinguzzal basak
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...NETWAYS
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Hernan Costante
 
Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveSematext Group, Inc.
 
Optimizing elastic search on google compute engine
Optimizing elastic search on google compute engineOptimizing elastic search on google compute engine
Optimizing elastic search on google compute engineBhuvaneshwaran R
 
Running ElasticSearch on Google Compute Engine in Production
Running ElasticSearch on Google Compute Engine in ProductionRunning ElasticSearch on Google Compute Engine in Production
Running ElasticSearch on Google Compute Engine in ProductionSearce Inc
 
The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)Nicolas Poggi
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned Omid Vahdaty
 
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...Fred de Villamil
 
Writing Applications for Scylla
Writing Applications for ScyllaWriting Applications for Scylla
Writing Applications for ScyllaScyllaDB
 
Big data should be simple
Big data should be simpleBig data should be simple
Big data should be simpleDori Waldman
 
Scale search powered apps with Elastisearch, k8s and go - Maxime Boisvert
Scale search powered apps with Elastisearch, k8s and go - Maxime BoisvertScale search powered apps with Elastisearch, k8s and go - Maxime Boisvert
Scale search powered apps with Elastisearch, k8s and go - Maxime BoisvertWeb à Québec
 
MySQL Cluster (NDB) - Best Practices Percona Live 2017
MySQL Cluster (NDB) - Best Practices Percona Live 2017MySQL Cluster (NDB) - Best Practices Percona Live 2017
MySQL Cluster (NDB) - Best Practices Percona Live 2017Severalnines
 
Security Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budgetSecurity Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budgetJuan Berner
 

Similar to Security sizing meetup (20)

From the trenches: scaling a large log management deployment
From the trenches: scaling a large log management deploymentFrom the trenches: scaling a large log management deployment
From the trenches: scaling a large log management deployment
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
Dynomite @ Redis Conference 2016
Dynomite @ Redis Conference 2016Dynomite @ Redis Conference 2016
Dynomite @ Redis Conference 2016
 
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
Dynomite: A Highly Available, Distributed and Scalable Dynamo Layer--Ioannis ...
 
MongoDB Sharding
MongoDB ShardingMongoDB Sharding
MongoDB Sharding
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
 
Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep dive
 
Big data nyu
Big data nyuBig data nyu
Big data nyu
 
Optimizing elastic search on google compute engine
Optimizing elastic search on google compute engineOptimizing elastic search on google compute engine
Optimizing elastic search on google compute engine
 
Running ElasticSearch on Google Compute Engine in Production
Running ElasticSearch on Google Compute Engine in ProductionRunning ElasticSearch on Google Compute Engine in Production
Running ElasticSearch on Google Compute Engine in Production
 
The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
 
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
SUE 2018 - Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours Wi...
 
Writing Applications for Scylla
Writing Applications for ScyllaWriting Applications for Scylla
Writing Applications for Scylla
 
Log analytics with ELK stack
Log analytics with ELK stackLog analytics with ELK stack
Log analytics with ELK stack
 
Big data should be simple
Big data should be simpleBig data should be simple
Big data should be simple
 
Scale search powered apps with Elastisearch, k8s and go - Maxime Boisvert
Scale search powered apps with Elastisearch, k8s and go - Maxime BoisvertScale search powered apps with Elastisearch, k8s and go - Maxime Boisvert
Scale search powered apps with Elastisearch, k8s and go - Maxime Boisvert
 
MySQL Cluster (NDB) - Best Practices Percona Live 2017
MySQL Cluster (NDB) - Best Practices Percona Live 2017MySQL Cluster (NDB) - Best Practices Percona Live 2017
MySQL Cluster (NDB) - Best Practices Percona Live 2017
 
Security Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budgetSecurity Monitoring for big Infrastructures without a Million Dollar budget
Security Monitoring for big Infrastructures without a Million Dollar budget
 

More from Daliya Spasova

S2 e elastic observability per i servizi core banking - mar 23, 2021
S2 e   elastic observability per i servizi core banking - mar 23, 2021S2 e   elastic observability per i servizi core banking - mar 23, 2021
S2 e elastic observability per i servizi core banking - mar 23, 2021Daliya Spasova
 
Geo network 4 elasticsearch (1)
Geo network 4   elasticsearch (1)Geo network 4   elasticsearch (1)
Geo network 4 elasticsearch (1)Daliya Spasova
 
Food safety risks the elastic stack to the rescue
Food safety risks  the elastic stack to the rescueFood safety risks  the elastic stack to the rescue
Food safety risks the elastic stack to the rescueDaliya Spasova
 
Q&a on running the elastic stack on kubernetes
Q&a on running the elastic stack on kubernetesQ&a on running the elastic stack on kubernetes
Q&a on running the elastic stack on kubernetesDaliya Spasova
 
Elastic maps application_21_10_20
Elastic maps application_21_10_20Elastic maps application_21_10_20
Elastic maps application_21_10_20Daliya Spasova
 
Covid19 map presentation
Covid19 map presentationCovid19 map presentation
Covid19 map presentationDaliya Spasova
 
Data exploration using elastic stack for beginners
Data exploration using elastic stack for beginnersData exploration using elastic stack for beginners
Data exploration using elastic stack for beginnersDaliya Spasova
 
Run the elastic stack on kubernetes with eck
Run the elastic stack on kubernetes with eck   Run the elastic stack on kubernetes with eck
Run the elastic stack on kubernetes with eck Daliya Spasova
 
Dynamic presentations with_canvas
Dynamic presentations with_canvasDynamic presentations with_canvas
Dynamic presentations with_canvasDaliya Spasova
 
2020 07-30 elastic agent + ingest management
2020 07-30 elastic agent + ingest management2020 07-30 elastic agent + ingest management
2020 07-30 elastic agent + ingest managementDaliya Spasova
 
Spring meetup elasticsearch
Spring meetup elasticsearchSpring meetup elasticsearch
Spring meetup elasticsearchDaliya Spasova
 

More from Daliya Spasova (16)

Limitless xdr meetup
Limitless xdr meetupLimitless xdr meetup
Limitless xdr meetup
 
S2 e elastic observability per i servizi core banking - mar 23, 2021
S2 e   elastic observability per i servizi core banking - mar 23, 2021S2 e   elastic observability per i servizi core banking - mar 23, 2021
S2 e elastic observability per i servizi core banking - mar 23, 2021
 
Verba @ elastic
Verba @ elasticVerba @ elastic
Verba @ elastic
 
Geo network 4 elasticsearch (1)
Geo network 4   elasticsearch (1)Geo network 4   elasticsearch (1)
Geo network 4 elasticsearch (1)
 
Food safety risks the elastic stack to the rescue
Food safety risks  the elastic stack to the rescueFood safety risks  the elastic stack to the rescue
Food safety risks the elastic stack to the rescue
 
Q&a on running the elastic stack on kubernetes
Q&a on running the elastic stack on kubernetesQ&a on running the elastic stack on kubernetes
Q&a on running the elastic stack on kubernetes
 
October 2020 meetup
October 2020 meetupOctober 2020 meetup
October 2020 meetup
 
Elastic maps application_21_10_20
Elastic maps application_21_10_20Elastic maps application_21_10_20
Elastic maps application_21_10_20
 
Covid19 map presentation
Covid19 map presentationCovid19 map presentation
Covid19 map presentation
 
Data exploration using elastic stack for beginners
Data exploration using elastic stack for beginnersData exploration using elastic stack for beginners
Data exploration using elastic stack for beginners
 
Run the elastic stack on kubernetes with eck
Run the elastic stack on kubernetes with eck   Run the elastic stack on kubernetes with eck
Run the elastic stack on kubernetes with eck
 
Dynamic presentations with_canvas
Dynamic presentations with_canvasDynamic presentations with_canvas
Dynamic presentations with_canvas
 
Kibana webinar (1)
Kibana webinar (1)Kibana webinar (1)
Kibana webinar (1)
 
2020 07-30 elastic agent + ingest management
2020 07-30 elastic agent + ingest management2020 07-30 elastic agent + ingest management
2020 07-30 elastic agent + ingest management
 
Spring meetup elasticsearch
Spring meetup elasticsearchSpring meetup elasticsearch
Spring meetup elasticsearch
 
Meetup 13 08 2020
Meetup 13 08 2020Meetup 13 08 2020
Meetup 13 08 2020
 

Recently uploaded

microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docxPoojaSen20
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptxPoojaSen20
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 

Recently uploaded (20)

microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docx
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptx
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 

Security sizing meetup

  • 1. Sizing The Elastic Stack for Security Use Cases James Spiteri (special thanks to Dave Moore!) 17th March 2021
  • 2. What we’ll be covering today - Elasticsearch Internals and Computing Resources - Quick Overview - Preparation: How much data can I expect? - Performance: How much can I get out of my hardware? - Speed: How can I get optimal search performance? - Using Cross Cluster Search and Data tiers effectively - Transforms - Kibana Considerations - The Detection Engine - Example sizing exercises
  • 5.
  • 7. 7 Cluster A group of nodes that work together to operate Elasticsearch. Node A Java process that runs the Elasticsearch software. Index A group of shards that form a logical data store. Shard A Lucene index that stores and processes a portion of an Elasticsearch index. Segment A Lucene segment that immutably stores a portion of a Lucene index. Document A record that is submitted to and retrieved from an Elasticsearch index.
  • 8. 8
  • 9. 9 Nodes Role Description Resources Storage Memory Compute Network Data Indexes, stores, and searches data Extreme High High Medium Master Manages cluster state Low Low Low Low Ingest Transforms inbound data Low Medium High Medium Machine Learning Processes machine learning models Low Extreme Extreme Medium Coordinator Delegates requests and merges search results Low Medium Medium Medium
  • 10. 10
  • 12. 12 Calculating data storage requirements - Ingest a sample - Monitor size + Ingest Rates - Calculate going forward https://www.elastic.co/guide/en/elas ticsearch/plugins/current/mapper-si ze.html
  • 13. 13 Summarizing Considerations: ● How much raw data (GB will we index per day? ● How many days will we retain the data for? ● How many days in the hot zone? ● How many days in the warm zone? ● How many replica shards will you enforce? In general we add 5% or 10% for margin of error and 15% to stay under the disk watermarks.
  • 15. 15 What is my hardware capable of? - Run performance benchmarks using Rally - Understand what throughput you’ll achieve https://www.elastic.co/blog/rally-1-0 -0-released-benchmark-elasticsear ch-like-we-do
  • 17. 17
  • 18. 18 It’s all about balance. Speed Cluster Size/Cost
  • 19. 19 Keeping in mind: ● Searches run on a single thread per shard ● Shards have overhead ● Shards are balanced by elasticsearch ● Use datastreams and life cycle policies ● Aim for shard sizes between 10GB and 50GB ● Aim for 20 shards or fewer per GB head of memory
  • 21. 21 Optimize using CCS  Cross Cluster Search It makes sense to have smaller clusters for different users/customers/datasets. CCS makes this easy.
  • 22. Transforms Streamline your logs and events, save time and money.
  • 23.
  • 24. Kibana and The Detection Engine
  • 25. 25 The Detection Engine ● Detections should be treated like a search ● Detection performance should be monitored regularly ● The Kibana alerting engine can be scaled vertically and/or horizontally Kibana task manager workers can be increased in number to take advantage of vertical scaling, or can be replicated across separate Kibana instances and scaled horizontally. When multiple Kibana instances are running, the task managers will coordinate across the wire to balance the tasks across the instances. By updating the number of max_workers inside of the kibana.yml file from it’s default of 10, you can vertically scale up or down to appropriately allocate resources more efficiently per Kibana node.
  • 27. 27 ● Total Data (GB  Raw data (GB per day * Number of days retained * Number of replicas + 1 ● Total Storage (GB  Total data (GB * 1  0.15 disk Watermark threshold + 0.1 Margin of error) ● Total Data Nodes  ROUNDUPTotal storage (GB / Memory per data node / Memory:Data ratio) In case of large deployment it's safer to add a node for failover capacity. Formulas and Examples:
  • 28. 28 Sizing a small cluster: You might be pulling logs and metrics from some applications, databases, web servers, the network, and other supporting services . Let's assume this pulls in 1GB per day and you need to keep the data 9 months. You can use 8GB memory per node for this small deployment. Let’s do the math: ● Total Data (GB  1GB x (9  30 days) x 2  540GB ● Total Storage (GB 540GB x (10.150.1  675GB ● Total Data Nodes  675GB disk / 8GB RAM /30 ratio = 3 nodes Sizing a large(r) deployment Let’s do the math with the following inputs: ● You receive 100GB per day and we need to keep this data for 30 days in the hot zone and 12 months in the warm zone. ● We have 64GB of memory per node with 30GB allocated for heap and the remaining for OS cache. ● The typical memory:data ratio for the hot zone used is 130 and for the warm zone is 1160. If we receive 100GB per day and we have to keep this data for 30 days, this gives us: ● Total Data (GB in the hot zone = 100GB x 30 days * 2  6000GB ● Total Storage (GB in the hot zone = 6000GB x (10.150.1  7500GB ● Total Data Nodes in the hot zone = ROUNDUP7500 / 64 / 30  1 = 5 nodes ● Total Data (GB in the warm zone = 100GB x 365 days * 2  73000GB ● Total Storage (GB in the warm zone = 73000GB x (10.150.1  91250GB ● Total Data Nodes in the warm zone = ROUNDUP91250 / 64 / 160  1  10 nodes Formulas and Examples:
  • 29. Try free on Cloud: elastic.co/cloud Take a quick spin: demo.elastic.co Connect on Slack: ela.st/slack 1 2 3 Join the Elastic community