@talevy
All About Aggregations
Tal Levy, Software Engineer
- http://localhost:9200
{ }
{ "tagline": "You Know, for Search" }
More than search
• Originally built on Lucene for text-based searching
• Lucene and Elasticsearch work together to provide new storage formats and data types specific to numeric and keyword metrics
• Aggregations alongside searching
Query
Query + Aggs
And Analytics
Searching & Aggregating
price  color  make    sold
10000  red    honda   10/28/2016
20000  red    honda   11/05/2016
30000  green  ford    05/08/2016
15000  blue   toyota  07/02/2016
12000  green  toyota  08/19/2016
20000  red    honda   11/05/2016
80000  red    bmw     01/01/2016
25000  blue   ford    02/12/2016
Data Structures For Field Values on Shards
color
red
red
green
blue
green
red
red
blue
• Two considerations for our data
  • Fast querying by values
  • Fast aggregating by values
Inverted Index: terms-to-documents
color doc1 doc2 doc3
red ◉ ◉ ◉
blue ◉ ◉ ◉
green ◉ ◉ ◉
purple ◉ ◉ ◉
orange ◉ ◉ ◉
white ◉ ◉ ◉
black ◉ ◉ ◉
brown ◉ ◉ ◉
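A minimal sketch of the terms-to-documents idea, assuming the eight `color` values from the sample table and doc IDs 0 through 7. The function name `build_inverted_index` is hypothetical and this is not how Lucene stores postings internally; it only illustrates why querying by value is fast.

```python
from collections import defaultdict

# Values of the "color" field, one per document (doc IDs 0..7),
# copied from the sample table in this deck.
colors = ["red", "red", "green", "blue", "green", "red", "red", "blue"]


def build_inverted_index(values):
    """Map each term to the ascending list of doc IDs that contain it."""
    index = defaultdict(list)
    for doc_id, term in enumerate(values):
        index[term].append(doc_id)
    return dict(index)


inverted = build_inverted_index(colors)

# Querying by value is a single lookup: which documents are red?
red_docs = inverted["red"]
print(red_docs)
```

The lookup answers "which documents contain this term?" directly, but answering "which terms does this document contain?" would require scanning every entry, which is the gap doc values fill.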
Doc Values: documents-to-terms
1 value per document, 1 column per field
price  color  make    sold
10000  red    honda   10/28/2016
20000  red    honda   11/05/2016
30000  green  ford    05/08/2016
15000  blue   toyota  07/02/2016
12000  green  toyota  08/19/2016
20000  red    honda   11/05/2016
80000  red    bmw     01/01/2016
25000  blue   ford    02/12/2016
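The documents-to-terms layout can be sketched as one Python list per field, indexed by doc ID. This is an illustrative model only, assuming doc IDs 0 through 7 for the rows above; the helper names `terms_counts` and `average` are hypothetical.

```python
from collections import Counter

# One column per field, one value per document (doc IDs 0..7),
# copied from the sample table in this deck.
doc_values = {
    "price": [10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000],
    "color": ["red", "red", "green", "blue", "green", "red", "red", "blue"],
    "make": ["honda", "honda", "ford", "toyota", "toyota", "honda", "bmw", "ford"],
}


def terms_counts(field, matching_doc_ids):
    """Count the values of `field` over the matching documents."""
    column = doc_values[field]
    return Counter(column[doc_id] for doc_id in matching_doc_ids)


def average(field, matching_doc_ids):
    """Average a numeric field over the matching documents."""
    column = doc_values[field]
    values = [column[doc_id] for doc_id in matching_doc_ids]
    return sum(values) / len(values)


# Doc IDs that match a query for color == "red".
red_docs = [0, 1, 5, 6]
makes_of_red = terms_counts("make", red_docs)
avg_red_price = average("price", red_docs)
print(makes_of_red, avg_red_price)
```

Given the doc IDs that match a query, aggregating is a direct read of each document's value from the column, with no need to walk the inverted index.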
How Do Distributed Aggregations Work?
[Diagram: a coordinating node and data nodes]
• Run inline with the search query
• Executed in isolation on each shard
• 4 phases
  • Parse
  • Collect
  • Combine
  • Reduce
Phase 1: Parse
• Coordinating node splits the request into shard requests
• Shards parse aggregations and initialize data structures
Phase 2,3: Collect, Combine
• Shards process all matching documents
• Once done, they combine the aggregated data into an aggregation
Phase 4: Reduce
• Shards send their aggregations to the coordinating node
• The coordinating node reduces them into a single aggregation
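The four phases can be sketched for an average-price aggregation as follows. This is a simplified model under stated assumptions, not Elasticsearch's implementation: the names `AvgPartial`, `run_on_shard`, and `reduce_partials` are hypothetical, and the eight sample documents are split arbitrarily across two shards.

```python
from dataclasses import dataclass


@dataclass
class AvgPartial:
    """Per-shard state for an average: ship sum and count, not the average."""
    total: int = 0
    count: int = 0


def run_on_shard(docs, matches):
    partial = AvgPartial()               # Parse: initialize the data structure
    for doc in docs:                     # Collect: one pass over matching docs
        if matches(doc):
            partial.total += doc["price"]
            partial.count += 1
    return partial                       # Combine: the shard's single result


def reduce_partials(partials):
    """Reduce: the coordinating node merges the per-shard results."""
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    return total / count if count else None


rows = [(10000, "red"), (20000, "red"), (30000, "green"), (15000, "blue"),
        (12000, "green"), (20000, "red"), (80000, "red"), (25000, "blue")]
docs = [{"price": price, "color": color} for price, color in rows]
shards = [docs[:6], docs[6:]]            # assumed split: 6 docs and 2 docs


def is_red(doc):
    return doc["color"] == "red"


partials = [run_on_shard(shard, is_red) for shard in shards]
avg_price = reduce_partials(partials)
print(partials, avg_price)
```

Each shard returns a mergeable partial result rather than a finished number. Averaging the two per-shard averages here would give a different, wrong answer because the shards hold different numbers of matching documents.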
Designed for speed
Single network round-trip
Single pass through data on shards
Aggregates are computed in memory
Trades accuracy for speed
Only pay for documents that match query
Can be composed (e.g. average response time, broken down by day)
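As a sketch of composition, the request body for "average response time, broken down by day" could be built like this. The field names `@timestamp` and `response_time`, the aggregation names, and the helper `avg_by_day` are assumptions for illustration. `calendar_interval` is the parameter name in Elasticsearch 7.2 and later; earlier versions, including 6.0, used `interval`.

```python
import json


def avg_by_day(date_field, value_field):
    """Build a search body: a date_histogram bucket with an avg metric inside."""
    return {
        "size": 0,  # return aggregation results only, no hits
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": date_field,
                    "calendar_interval": "day",  # "interval" before 7.2
                },
                # The nested "aggs" runs once per daily bucket.
                "aggs": {
                    "avg_value": {"avg": {"field": value_field}},
                },
            }
        },
    }


# Hypothetical field names; substitute the ones in your own mapping.
body = avg_by_day("@timestamp", "response_time")
print(json.dumps(body, indent=2))
```

The metric aggregation is nested under the bucket aggregation's own `aggs` key, which is how a bucket's documents are handed to a sub-aggregation.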
Types of Aggregations
• Bucket
  • Terms
  • (Date) Histograms
  • Filter
  • Range
  • …
• Metric
  • Stats
  • Percentiles
  • Cardinality (unique counts)
  • Top Hits
  • Scripted
  • …
Example Terms Aggregation Query
GET products/_search
{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "my_product_ids": {
      "terms": {
        "field": "pid",
        "size": 3
      }
    }
  }
}
Example Terms Aggregation Response
{
  "hits": {…},
  "aggregations": {
    "my_product_ids": {
      "doc_count_error_upper_bound": 3302,
      "sum_other_doc_count": 8879020,
      "buckets": [
        { "key": "030758836X", "doc_count": 7440 },
        { "key": "0439023483", "doc_count": 6717 },
        { "key": "0375831002", "doc_count": 4864 }
      ]
    }
  }
}
Things To Consider
{
  "hits": {…},
  "aggregations": {
    "my_product_ids": {
      "doc_count_error_upper_bound": 3302,
      "sum_other_doc_count": 8879020,
      "buckets": [
        { "key": "030758836X", "doc_count": 7440 },
        { "key": "0439023483", "doc_count": 6717 },
        { "key": "0375831002", "doc_count": 4864 }
      ]
    }
  }
}
doc_count_error_upper_bound: upper bound on the error in the counts for each term
sum_other_doc_count: number of docs not included in the returned buckets
Locality Bias: Top N(1)
       Node A's Counts  Node B's Counts  Global Counts
RED    5                2                7
GREEN  4                4                8
BLUE   2                1                3
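The bias can be reproduced with the counts from the two nodes above. This is a minimal sketch, assuming each node returns only its own top term and the coordinating node merges what it receives; the function name `top_n_via_shards` is hypothetical.

```python
from collections import Counter

# Per-node term counts from the slide.
node_a = Counter({"RED": 5, "GREEN": 4, "BLUE": 2})
node_b = Counter({"RED": 2, "GREEN": 4, "BLUE": 1})


def top_n_via_shards(shard_counts, n):
    """Each shard sends only its local top n; the coordinator merges those."""
    merged = Counter()
    for counts in shard_counts:
        for term, count in counts.most_common(n):
            merged[term] += count
    return merged.most_common(n)


approx = top_n_via_shards([node_a, node_b], 1)
exact = (node_a + node_b).most_common(1)
print(approx, exact)
```

Node A reports RED (5) and node B reports GREEN (4), so the merged top term is RED with a count of 5, while the true global top term is GREEN with 8. The count is wrong and so is the winner.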
Shard Size: Top 3
• How many buckets to return per shard?
• "shard_size"
[Diagram: four data nodes each return 15 buckets to the coordinating node, which returns the top 3]
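The effect of asking each shard for more buckets can be sketched with the RED/GREEN/BLUE counts from the locality-bias slide. This is a simplified model, not Elasticsearch's code: the function `terms_agg` is hypothetical, and the error bound here only loosely follows the documented idea of summing the count of the last term each shard returned.

```python
from collections import Counter

node_a = Counter({"RED": 5, "GREEN": 4, "BLUE": 2})
node_b = Counter({"RED": 2, "GREEN": 4, "BLUE": 1})


def terms_agg(shard_counts, size, shard_size):
    """Return (top buckets, error upper bound, docs outside the buckets)."""
    merged = Counter()
    error_bound = 0
    for counts in shard_counts:
        top = counts.most_common(shard_size)
        for term, count in top:
            merged[term] += count
        # A term the shard left out can have at most the count of the
        # last term it did return (simplifying assumption of this sketch).
        if len(counts) > len(top):
            error_bound += top[-1][1]
    buckets = merged.most_common(size)
    total_docs = sum(sum(counts.values()) for counts in shard_counts)
    other_docs = total_docs - sum(count for _, count in buckets)
    return buckets, error_bound, other_docs


shards = [node_a, node_b]
narrow = terms_agg(shards, size=1, shard_size=1)
wider = terms_agg(shards, size=1, shard_size=2)
full = terms_agg(shards, size=1, shard_size=3)
print(narrow, wider, full)
```

With `shard_size=1` the result is RED with a large error bound; with `shard_size=2` the correct top term GREEN appears, and the bound only reaches zero once every shard returns all of its terms. The cost is more data sent to, and merged on, the coordinating node.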
Example Terms Aggregation Query
GET products/_search
{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "my_product_ids": {
      "terms": {
        "field": "pid",
        "size": 3,
        "shard_size": 999999
      }
    }
  }
}
Summary
Aggregations are powerful & fast
Need to trade accuracy for speed/memory in some cases
Use `shard_size` to help manage accuracy with terms aggregation
Leverage Kibana to help write aggregations!
Profile your aggregations using the Query Profiler
What We Missed
Pipeline Aggregations: Aggregations of Aggregations
Using `requests.cache` to cache complex static aggregations
Matrix Aggregations: covariance and correlation
New aggregation types introduced all the time
What’s New In 6.0
What to expect?
Efficient sparse doc-value reading and writing
index-time sorting
Removal of types
Cross-cluster search
Upgrading to 6.0 with rolling restarts!
and so much more!
Resources
• Elastic Discussion Forums: https://discuss.elastic.co/
• Aggregation Documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html
• Terms Aggregation Approximation: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-approximate-counts
• Similar deck from my colleagues Adrien and Colin: https://www.elastic.co/elasticon/2015/sf/all-about-aggregations
Q & A
