
Elasticsearch From the Bottom Up



The talk covers how Elasticsearch, Lucene and, to some extent, search engines in general actually work under the hood. We'll start at the "bottom" (or close enough!) of the many abstraction levels and gradually move upwards towards the user-visible layers, studying the various internal data structures and behaviors as we ascend. Elasticsearch provides APIs that are very easy to use, and they will get you started and take you far without much effort. However, to get the most out of it, it helps to have some knowledge of the underlying algorithms and data structures. This understanding enables you to make full use of its substantial feature set, so that you can improve your users' search experiences while keeping your systems performant, reliable and updated in (near) real time.

Published in: Data & Analytics


  1. Elasticsearch from the Bottom Up • Njal Karevoll • @nkvoll
  2. Elasticsearch from the Bottom Up • Njal Karevoll • @nkvoll
  3. Who? • Co-founder of Found AS - Hosted Elasticsearch • 8+ years search, 3+ Elasticsearch • Herding hundreds of Elasticsearch clusters
  4. Motivation • Why isn't my search for *foo-bar* matching "foo-bar"? • Why can adding more documents shrink the index? • Why is Elasticsearch using so much memory?
  5. Segments are immutable
  6. Deletes?
  7. Compress all the things!
  8. Cache all the things!
  9. Search by index terms • Text analysis gives us terms
  10. Search by segment • Uses several data structures
  11. Immutable segments
  12. Shard == Lucene Index
  13. Elasticsearch Index abstracts Lucene Indexes
  14. … across nodes in a cluster
  15. Learn More! • Follow @foundsays
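
The slide titles are terse, so here is a minimal Python sketch of the ideas behind "Search by index terms", "Segments are immutable" and "Deletes?". It is a toy model, not Lucene's or Elasticsearch's actual implementation: the `analyze` function, the `Segment` class and the `merge` helper are hypothetical names made up for illustration, and the analyzer simply lowercases and splits on non-alphanumeric characters, which only loosely resembles a standard analyzer. It hints at why a wildcard search for *foo-bar* may find nothing (the index holds the terms `foo` and `bar`, never `foo-bar`) and why adding documents can shrink an index (new segments can trigger a merge that drops deleted documents).

```python
import fnmatch
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


class Segment:
    """Toy segment: an inverted index that is never edited after it is built."""

    def __init__(self, docs):
        self.docs = dict(docs)  # doc_id -> original text
        postings = defaultdict(set)
        for doc_id, text in self.docs.items():
            for term in analyze(text):
                postings[term].add(doc_id)
        # Freeze the postings lists: term -> sorted tuple of doc ids.
        self.postings = {t: tuple(sorted(ids)) for t, ids in postings.items()}
        # Deletes are recorded as tombstones next to the frozen postings.
        self.deleted = set()

    def delete(self, doc_id):
        self.deleted.add(doc_id)

    def search(self, term):
        """Exact term lookup, filtering out tombstoned documents."""
        return [d for d in self.postings.get(term, ()) if d not in self.deleted]

    def wildcard(self, pattern):
        """Match a glob pattern against index terms, not against raw text."""
        hits = set()
        for term in self.postings:
            if fnmatch.fnmatchcase(term, pattern):
                hits.update(self.search(term))
        return sorted(hits)


def merge(segments):
    """Build one new segment from the live documents of several old ones."""
    live = {}
    for seg in segments:
        for doc_id, text in seg.docs.items():
            if doc_id not in seg.deleted:
                live[doc_id] = text
    return Segment(live)


seg = Segment({1: "foo-bar baz", 2: "Foo fighters"})
terms = analyze("foo-bar")                 # the hyphen splits the text into two terms
wildcard_hits = seg.wildcard("*foo-bar*")  # no single index term contains "foo-bar"

seg.delete(1)                              # tombstone only; postings are untouched
after_delete = seg.search("foo")

merged = merge([seg, Segment({3: "bar"})])  # a merge drops the deleted document
```

In this sketch, deleting document 1 leaves `seg.postings` unchanged and only hides the document at search time; the space is reclaimed when `merge` writes a fresh segment. Real segments use far more compact, compressed structures, which the sketch does not attempt to model.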