Taking Elasticsearch From 0 to 88mph

By: Molly Struve
500 million documents
1 million processed daily
Over 3 billion documents
200 million processed daily
The average company has...
● 60 thousand assets
● 24 million vulnerabilities?
MySQL Elasticsearch
Refresh Interval
[Diagram: in-memory buffer]
Toggle the Refresh Interval
curl -XPUT 'localhost:9200/my_index/_settings' -H 'Content-Type: application/json' -d '{
  "index" : {
    "refresh_interval" : "30s"
  }
}'
In addition...
● ?refresh=wait_for option
● Manual refresh
○ curl -XPOST 'localhost:9200/my_index/_refresh'
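The toggle and manual refresh above can be combined around a large load. The following is a minimal sketch, not the speaker's code: `send` is a hypothetical callable that would issue the HTTP request, and the "30s" restore value is simply the one shown on the slide. Setting `refresh_interval` to "-1" disables automatic refresh while the load runs.

```python
import json
from contextlib import contextmanager


@contextmanager
def paused_refresh(send, index, restore_to="30s"):
    """Disable refresh during a load, then restore it and refresh once.

    `send(method, path, body)` is a hypothetical callable supplied by the
    caller; it is an assumption of this sketch, not a library API.
    """
    path = f"/{index}/_settings"
    # "-1" turns automatic refresh off for the index.
    send("PUT", path, json.dumps({"index": {"refresh_interval": "-1"}}))
    try:
        yield
    finally:
        # Restore the interval even if the load raised an exception.
        send("PUT", path, json.dumps({"index": {"refresh_interval": restore_to}}))
        # One manual refresh makes the loaded documents searchable right away.
        send("POST", f"/{index}/_refresh", None)


# Usage: record the requests instead of sending them.
recorded = []
with paused_refresh(lambda m, p, b: recorded.append((m, p, b)), "my_index"):
    pass  # the bulk load would run here
```

Restoring in a `finally` block is a deliberate choice: an index left at "-1" would stop showing new documents in search results.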
Speed up Indexing
1. Toggle the refresh interval
● 200 thousand assets
● 100 million vulnerabilities
Bulk Processing
POST _bulk
When bulk processing your data...
● Start with batches of 100 documents and double the size until indexing speed plateaus
● Requests that are too large put memory pressure on Elasticsearch, so keep each request under a few tens of megabytes
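The batching advice above can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, the 20 MB cap is an assumed reading of "a few tens of megabytes", and the action line uses the typeless form (older Elasticsearch versions also expected a `_type`).

```python
import json


def bulk_body(index, docs):
    """Serialize (doc_id, source) pairs as a newline-delimited _bulk payload."""
    lines = []
    for doc_id, source in docs:
        # Each document is an action line followed by its source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the bulk API requires a trailing newline


def batches(docs, batch_size, max_bytes=20 * 1024 * 1024):
    """Yield lists of docs capped by count and by approximate source size.

    max_bytes is an assumed ceiling for this sketch, not an official limit.
    """
    batch, size = [], 0
    for doc in docs:
        doc_bytes = len(json.dumps(doc[1]).encode("utf-8"))
        if batch and (len(batch) >= batch_size or size + doc_bytes > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_bytes
    if batch:
        yield batch


# Usage: 250 small documents in batches of 100.
docs = [(i, {"n": i}) for i in range(250)]
payloads = [bulk_body("my_index", b) for b in batches(docs, 100)]
```

To tune, one would time a run at `batch_size=100`, then 200, 400, and so on, and stop doubling once throughput no longer improves.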
Speed up Indexing
1. Toggle the refresh interval
2. Bulk process data
[Diagram: MySQL and Elasticsearch]
429 Too Many Requests
Route Your Documents
[Diagrams: request threads (8, 4, and 2 at a time) spread across Shard 1 to Shard 4]
Routing
shard = hash(_routing) % number_of_primary_shards
PUT my_index/_doc/1?routing=custom
{
  "title": "This is a document"
}
Default _routing value: the document _id. Custom value: whatever is passed with ?routing=
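The routing formula above can be tried out with a small sketch. Note the loud assumption: Elasticsearch uses its own Murmur3-based hash internally, and `zlib.crc32` is only a stand-in here, so the shard numbers this prints will not match a real cluster. The point is the behavior: the same routing value always lands on the same shard.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """shard = hash(_routing) % number_of_primary_shards

    zlib.crc32 is a stand-in hash for illustration only; real shard
    assignments come from Elasticsearch's internal hash and will differ.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


# By default the routing value is the document _id, so documents scatter.
default_shards = [pick_shard(doc_id, 4) for doc_id in ("1", "2", "3")]

# With a shared custom routing value, every document maps to one shard.
custom_shards = {pick_shard("custom", 4) for _ in ("1", "2", "3")}
```

Because the shard count is part of the formula, changing the number of primary shards changes where a routing value maps.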
[Diagram: threads (4, 2, 2) mapped to Shard 1 through Shard 4, with Route 1 through Route 4 aligned to the shards]
Parent -> Child
Asset -> Vulnerabilities
PUT my_index/_vulnerability/1?parent=2
{
  "title": "This is a document"
}
Speed up Indexing
1. Toggle the refresh interval
2. Bulk process data
3. Route your documents
Speed up Searching
Typical Logging Cluster
logstash_2018.09.01
logstash_2018.09.02
logstash_2018.09.03
logstash_2018.09.04
Search
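With daily indices like the ones listed above, a search can name only the days it needs instead of hitting every index. The sketch below is an assumption about how that might look, not the speaker's code: the helper names are hypothetical, and the `logstash_` prefix and date pattern are copied from the slide.

```python
from datetime import date, timedelta


def daily_indices(start, end, prefix="logstash_"):
    """List the daily index names covering start..end inclusive."""
    days = (end - start).days
    return [
        f"{prefix}{(start + timedelta(days=n)).strftime('%Y.%m.%d')}"
        for n in range(days + 1)
    ]


def search_path(start, end):
    """Build a _search path limited to the indices for the given date range."""
    # Elasticsearch accepts a comma-separated list of index names in the path.
    return "/" + ",".join(daily_indices(start, end)) + "/_search"


# Usage: search only September 2 and 3 rather than the whole cluster.
path = search_path(date(2018, 9, 2), date(2018, 9, 3))
```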
Group Your Data
Client 1 Client 2 Client 3 Client 4
Speed up Indexing
1. Toggle the refresh interval
2. Bulk process data
3. Route your documents
Speed Up Searching
4. Group your data
Queries
Score documents based on how well they match the search criteria.
Filters
Do NOT score documents; they only care whether the document matches the search criteria or not.
[Chart: Easy/Fast to Hard/Slow, Elasticsearch 2.x vs 5.x]
Use Filters Whenever Possible
Filters Are Friends!
Filters
GET /_search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "user" : "kimchy" }
      },
      "filter": {
        "term" : { "tag" : "tech" }
      }
    }
  }
}
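The request above mixes one scored clause (`must`) with one unscored clause (`filter`). A small helper can make the split explicit when building queries in application code. This is a sketch with a hypothetical function name; it only assembles the request body and does not talk to a cluster.

```python
def filtered_query(scored=None, filters=()):
    """Build a bool query, keeping yes/no conditions in the filter clause.

    `scored` is an optional clause that should affect relevance;
    `filters` are clauses that only include or exclude documents.
    """
    bool_clause = {}
    if scored:
        bool_clause["must"] = scored
    if filters:
        bool_clause["filter"] = list(filters)
    return {"query": {"bool": bool_clause}}


# Usage: the same search as above, built programmatically.
body = filtered_query(
    scored={"term": {"user": "kimchy"}},
    filters=[{"term": {"tag": "tech"}}],
)

# When relevance does not matter at all, everything can go in `filter`.
filter_only = filtered_query(filters=[{"term": {"user": "kimchy"}}, {"term": {"tag": "tech"}}])
```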
Speed up Indexing
1. Toggle the refresh interval
2. Bulk process data
3. Route your documents
Speed Up Searching
4. Group your data
5. Use filters whenever possible
Store IDs as keywords
Why?
● Numeric mapping types are optimized for RANGE queries
● Keyword mapping types are optimized for TERM queries
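As a sketch of the point above, an ID field can be mapped as `keyword` and then looked up with a `term` filter. The field name `asset_id` and both helper names are hypothetical, and the mapping uses the typeless shape of Elasticsearch 7 and later; older versions nest `properties` under a type name.

```python
def id_mapping(field):
    """Index-creation body that maps an ID field as keyword instead of a number."""
    return {"mappings": {"properties": {field: {"type": "keyword"}}}}


def id_term_filter(field, value):
    """Exact-match lookup on a keyword ID, placed in filter so it is not scored."""
    # Keyword fields hold strings, so the ID is sent as a string.
    return {"query": {"bool": {"filter": [{"term": {field: str(value)}}]}}}


# Usage: "asset_id" is an illustrative field name, not one from the talk.
mapping = id_mapping("asset_id")
lookup = id_term_filter("asset_id", 42)
```

This trade-off suits IDs that are only ever matched exactly; a field that needs range queries would still be better served by a numeric type.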
30% increase in search speed
Speed up Indexing
1. Toggle the refresh interval
2. Bulk process data
3. Route your documents
Speed Up Searching
4. Group your data
5. Use filters whenever possible
6. Store IDs as keywords
Don’t let your users slow you down
Define keywords for searching
Speed up Indexing
1. Toggle the refresh interval
2. Bulk process data
3. Route your documents
Speed Up Searching
4. Group your data
5. Use filters whenever possible
6. Store IDs as keywords
7. Don’t let your users slow you down
Questions?
Contact
https://www.linkedin.com/in/mollystruve/
https://github.com/mstruve
@molly_struve
molly.struve@gmail.com