Elasticsearch in production

Video available at http://www.youtube.com/watch?v=gkdfNl0WL-A

Original slides at http://presentations.found.no/berlin-buzzwords-2013/


This talk covers some of the lessons we've learned from securing and herding hundreds of Elasticsearch clusters. It applies whether you operate Elasticsearch in your own infrastructure or in the cloud, or are a developer who wants a better understanding of Elasticsearch's various failure modes.

Elasticsearch makes it easy to develop amazing things, and it has gone to great lengths to make Lucene's features readily available in a distributed setting. However, when it comes to running Elasticsearch in production, you still have a fairly complicated system on your hands: one with high expectations of network stability, a huge appetite for memory, and an assumption that all users are trustworthy.

Instead of delving deeply into a few specifics, we give a brief overview of problems you are likely to run into and suggested solutions to these problems. We cover topics that are applicable to both developers and users with Elasticsearch clusters of every shape and size – with an emphasis on resiliency and security.

Basic familiarity with Elasticsearch is assumed.

  1. Elasticsearch in production
     • Alex Brasetvik, @alexbrasetvik
  2. How marketing thinks our users feel
  3. How we developers sometimes feel
  4. Who?
     • Co-founder of Found AS
     • 7+ years of search, 2+ Elasticsearch
     • We manage hundreds of Elasticsearch clusters… on Amazon's cloud
  5. Agenda
     • Memory (and stability)
     • Security (and multi-tenancy)
     • Networking (and reliability)
     • Client (and resiliency)
  6. Memory
     • Search engines crave memory
     • Caches, caches, caches
     • Field- and filter caches
     • Page cache
     • Index building
  7. PostgreSQL
     • Verifies resource usage
     • Safe >>> fast
     • Uses disk if necessary
  8. Elasticsearch trusts you
     • Built for speed
     • It'll jump if you ask it to
     • What could possibly go wrong?
  9. OutOfMemoryError
     • Woah there
     • I ate all the memories
     • Your cluster may or may not work any more
  10. May or may not work?
      • What else was happening at the time?
      • Corrupt cluster state, crashed Netty, …
      • In short: Don't end up there
  11. Warning signs?
      • Monitor cache sizes and heap space
      • Outgrowing page cache: gradual slowdown
      • Outgrowing heap space: sudden crash
  12. Understand the memory profile
      • Test realistically
      • Bound cache sizes and flush thresholds
      • v0.90+ takes you longer with field filters, etc.
  13. Large heaps are expensive to garbage collect
      • Keep heap < 32GiB (But test!)
      • Lots of page cache is good, though!
  14. Security
      • Elasticsearch trusts everyone
      • Not its job to do auth(z)
      • You're the gatekeeper
  15. _search
      • Read only?
      • Limit indexes / wrap with filters?
      • Protect the field caches
  16. Arbitrary code execution
      • Elasticsearch has powerful scripting
      • Not sandboxed
      • On by default
  17. Any website can reach your machine
      • http://127.0.0.1:9200/_search?callback=capture&source=…
      • Run in a virtual machine
  18. Networking
      • Elasticsearch is distributed
      • Easy (for a distributed system)
      • Supports many usage patterns.
  19. Quite common topology
      • High availability, right?
  20. Obey or risk split brains …
      • … and irrecoverable data-loss
  21. +1 is a "tie breaker"
  22. Stormy clouds
      • Zone vs instance failure
      • Thundering herds
      • Optimizing MTTR is not HA
  23. Client considerations
      • Idempotent/retry-able requests
      • Use a connection pool.
      • _bulk / _msearch
  24. Have enough memory; Have a majority of nodes; Don't allow arbitrary search requests; Use retryable requests
  25. Alex over Trondheim, Tore Helgedagsrud; Elephant, Roy Costello; Wingsuit, Richard Schneider; Lightning Storm and Stars, Justin Ennis; Wingsuit flock, Richard Schneider; Oh salad, you so funny, Eatliver
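
Slides 19 to 21 and the "Have a majority of nodes" point on slide 24 come down to simple arithmetic, which a short sketch may make concrete. The Python below is only an illustration: the function names `quorum` and `can_elect_master` are hypothetical, and the assumption is that a master may only be elected by a strict majority of master-eligible nodes. In Elasticsearch releases of that era this threshold was configured through the `discovery.zen.minimum_master_nodes` setting, which the slides do not name.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def can_elect_master(visible_nodes: int, master_eligible_nodes: int) -> bool:
    """True if one side of a network partition sees a majority of the cluster."""
    return visible_nodes >= quorum(master_eligible_nodes)


# Two nodes: a majority is both of them, so losing either node (or the
# link between them) leaves no side able to elect a master safely.
two_node_survivor = can_elect_master(visible_nodes=1, master_eligible_nodes=2)

# Three nodes split 2/1: only the larger side keeps a majority, so at
# most one master exists. The third node acts as the tie breaker.
three_node_major_side = can_elect_master(visible_nodes=2, master_eligible_nodes=3)
three_node_minor_side = can_elect_master(visible_nodes=1, master_eligible_nodes=3)

print(two_node_survivor, three_node_major_side, three_node_minor_side)
```

Under these assumptions a two-node cluster tolerates no failures if split brain is to be ruled out, while adding one more node lets the cluster lose a node and still elect a single master.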
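
The "Idempotent/retry-able requests" point on slide 23 and the "Thundering herds" warning on slide 22 can be sketched together. The example below is a minimal illustration that talks to no real cluster: `retry` and `flaky_index` are hypothetical helpers, and an in-memory dict stands in for an index. It assumes that a write carrying an explicit document ID is safe to repeat, because the retry overwrites the same document instead of creating a duplicate, and that randomized backoff keeps many clients from retrying at the same instant.

```python
import random
import time


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with jittered backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff with random jitter, so that clients which
            # failed together do not all come back at the same moment.
            sleep(base_delay * 2 ** attempt * random.uniform(0.5, 1.5))


# Stand-in for an index: maps document ID to document body.
store = {}
attempts_seen = []


def flaky_index():
    """Write a document under a fixed ID; the first response is 'lost'."""
    attempts_seen.append(1)
    store["doc-1"] = {"title": "Elasticsearch in production"}
    if len(attempts_seen) == 1:
        # The write happened, but the client never saw the response.
        raise ConnectionError("response lost")
    return "ok"


# Skip real sleeping in this demonstration.
result = retry(flaky_index, sleep=lambda seconds: None)
print(result, len(store), len(attempts_seen))
```

Because the document ID is fixed, the second attempt overwrites the first write, and the store ends up with a single document even though the request was sent twice.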
