
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Sematext engineer Rafal Kuc (@kucrafal) walks through the details of running high-performance, fault-tolerant Elasticsearch clusters on Docker. Topics include: containers vs. virtual machines, running the official Elasticsearch container, container constraints, good network practices, dealing with storage, data-only Docker volumes, scaling, time-based data, multiple tiers and tenants, indexing with and without routing, querying with and without routing, routing vs. no routing, and monitoring. The talk was delivered at DevOps Days Warsaw 2015.

Published in: Data & Analytics

Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

  1. 1. Elasticsearch & Docker Rafał Kuć – Sematext Group, Inc. @kucrafal @sematext sematext.com Running High Performance Fault Tolerant Elasticsearch Clusters On Docker
  2. 2. About me… Sematext consultant & engineer Solr.pl co-founder Father and husband :)
  3. 3. Next 30 minutes
  4. 4. You Are Probably Familiar With This Development
  5. 5. You Are Probably Familiar With This Development Test
  6. 6. You Are Probably Familiar With This Development Test QA
  7. 7. You Are Probably Familiar With This Development Test QA Production environment
  8. 8. And The Problems That Come With It Resources not utilized
  9. 9. And The Problems That Come With It Resources not utilized Overprovisioned Servers
  10. 10. And The Problems That Come With It Resources not utilized Overprovisioned Servers ≠ ≠
  11. 11. The solution Development Test QA Production
  12. 12. Container Technologies
  13. 13. What is Docker? Lightweight Based on Open Standards Secure
  14. 14. Containers vs Virtual Machines Hardware Traditional Virtual Machine
  15. 15. Containers vs Virtual Machines Hardware Host Operating System Traditional Virtual Machine
  16. 16. Containers vs Virtual Machines Hardware Host Operating System Hypervisor Traditional Virtual Machine
  17. 17. Containers vs Virtual Machines Hardware Host Operating System Hypervisor Guest OS Guest OS Traditional Virtual Machine
  18. 18. Containers vs Virtual Machines Hardware Host Operating System Hypervisor Guest OS Guest OS Libraries Libraries Traditional Virtual Machine
  19. 19. Containers vs Virtual Machines Hardware Host Operating System Hypervisor Guest OS Guest OS Libraries Libraries Application 1 Application 2 Traditional Virtual Machine
  20. 20. Containers vs Virtual Machines Hardware Host Operating System Hypervisor Guest OS Guest OS Libraries Libraries Application 1 Application 2 Hardware Host Operating System Traditional Virtual Machine Container
  21. 21. Containers vs Virtual Machines Hardware Host Operating System Hypervisor Guest OS Guest OS Libraries Libraries Application 1 Application 2 Hardware Host Operating System Docker Engine Traditional Virtual Machine Container
  22. 22. Containers vs Virtual Machines Hardware Host Operating System Hypervisor Guest OS Guest OS Libraries Libraries Application 1 Application 2 Hardware Host Operating System Docker Engine Libraries Libraries Application 1 Application 2 Traditional Virtual Machine Container
  23. 23. What is Elasticsearch? Reasonable defaults { JSON } Distributed by design http://www.dailypets.co.uk/2007/06/17/kittens-rest-at-half-time/
  24. 24. Running Official Elasticsearch Container $ docker run -d elasticsearch
  25. 25. Running Official Elasticsearch Container $ docker run -d elasticsearch == docker run -d elasticsearch:latest
  26. 26. Running Official Elasticsearch Container $ docker run -d elasticsearch:1.7 $ docker run -d elasticsearch == docker run -d elasticsearch:latest
  27. 27. Running Official Elasticsearch Container $ docker run -d elasticsearch == docker run -d elasticsearch:latest $ docker run --name es_1 -h es_master_1 elasticsearch $ docker run -d elasticsearch:1.7
  28. 28. Running Official Elasticsearch Container $ docker run -d elasticsearch == docker run -d elasticsearch:latest $ docker run --name es_1 -h es_master_1 elasticsearch $ docker run -d elasticsearch:1.7
  29. 29. Container Constraints $ docker run -d -m 2G elasticsearch http://docs.docker.com/engine/reference/run/
  30. 30. Container Constraints $ docker run -d -m 2G elasticsearch $ docker run -d -m 2G --memory-swappiness=0 elasticsearch http://docs.docker.com/engine/reference/run/
  31. 31. Container Constraints $ docker run -d -m 2G elasticsearch $ docker run -d -m 2G --memory-swappiness=0 elasticsearch $ docker run -d --cpuset-cpus="1,3" elasticsearch http://docs.docker.com/engine/reference/run/
  32. 32. Container Constraints $ docker run -d -m 2G elasticsearch $ docker run -d -m 2G --memory-swappiness=0 elasticsearch $ docker run -d --cpuset-cpus="1,3" elasticsearch http://docs.docker.com/engine/reference/run/ $ docker run -d --cpu-period=50000 --cpu-quota=25000 elasticsearch
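The memory and CPU flags from the Container Constraints slides can be combined into one command. The sketch below only prints the command (a dry run) and does not start anything. Two things in it are assumptions that do not come from the slides: the `ES_HEAP_SIZE` environment variable, which the Elasticsearch 1.x/2.x start scripts read, and the common rule of thumb of giving the JVM heap half of the container memory limit. The helper name `es_constrained_cmd` is hypothetical.

```shell
# Dry run: prints a docker command instead of executing it.
# Assumption (not from the slides): JVM heap = half of the container memory
# limit, passed through ES_HEAP_SIZE (Elasticsearch 1.x/2.x start scripts).
es_constrained_cmd() (
  mem_mb=$1                  # container memory limit in MB
  cpus=$2                    # CPU cores to pin the container to, e.g. "1,3"
  heap_mb=$(( mem_mb / 2 ))  # leave the other half for the OS file cache
  printf 'docker run -d -m %sm --memory-swappiness=0 --cpuset-cpus="%s" -e ES_HEAP_SIZE=%sm elasticsearch\n' \
    "$mem_mb" "$cpus" "$heap_mb"
)

es_constrained_cmd 2048 "1,3"
```

Printing first makes it easy to review the limits before piping the output to `sh`.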
  33. 33. Creating Optimized Image Dockerfile: FROM elasticsearch ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
  34. 34. Creating Optimized Image Dockerfile: FROM elasticsearch ADD ./elasticsearch.yml /usr/share/elasticsearch/config/ $ docker build -t devops/example .
  35. 35. Creating Optimized Image Dockerfile: FROM elasticsearch ADD ./elasticsearch.yml /usr/share/elasticsearch/config/ $ docker build -t devops/example . Sending build context to Docker daemon 3.072 kB Step 1 : FROM elasticsearch ---> 8112755253f1 Step 2 : ADD ./elasticsearch.yml /usr/share/elasticsearch/config/ ---> Using cache ---> c9ca48a22e58 Successfully built c9ca48a22e58
  36. 36. Dealing With Network $ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch
  37. 37. Dealing With Network $ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch $ docker run -d elasticsearch -Dnetwork.publish_host=192.168.1.1
  38. 38. Dealing With Network $ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch $ docker run -d elasticsearch -Dnetwork.publish_host=192.168.1.1 $ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=192.168.1.1
  39. 39. Dealing With Network $ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch $ docker run -d elasticsearch -Dnetwork.publish_host=192.168.1.1 $ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=192.168.1.1 $ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=0.0.0.0
  40. 40. Network - Good Practices Separate network for Elasticsearch cluster
  41. 41. Network - Good Practices Separate network for Elasticsearch cluster Common host names for containers $ docker run -d -h es_node_1 elasticsearch
  42. 42. Network - Good Practices Separate network for Elasticsearch cluster Common host names for containers $ docker run -d -h es_node_1 elasticsearch Expose 9200 & 9300 ports only for client nodes
  43. 43. Network - Good Practices Separate network for Elasticsearch cluster Common host names for containers $ docker run -d -h es_node_1 elasticsearch Expose 9200 & 9300 ports only for client nodes Elasticsearch data & client nodes point to masters only
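The last two practices (no published ports on non-client nodes, and data nodes pointing at masters only) might look like the following for a data node. This is a sketch under assumptions: the master host names `es_master_1` to `es_master_3` are hypothetical, the containers are assumed to resolve each other by host name on the dedicated cluster network, and `discovery.zen.ping.unicast.hosts` is the unicast discovery setting of Elasticsearch 1.x/2.x. The function prints the command rather than running it.

```shell
# Dry run: prints the command for a data node that discovers the cluster
# only through the dedicated master nodes and publishes no ports.
# Assumption: master host names are hypothetical and resolvable between
# containers on the cluster network.
es_data_node_cmd() (
  hostname=$1
  masters=$2     # comma-separated master host names
  printf 'docker run -d -h %s elasticsearch -Dnode.master=false -Dnode.data=true -Ddiscovery.zen.ping.unicast.hosts=%s\n' \
    "$hostname" "$masters"
)

es_data_node_cmd es_node_1 es_master_1,es_master_2,es_master_3
```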
  44. 44. Dealing With Storage By default in /usr/share/elasticsearch/data
  45. 45. Dealing With Storage By default in /usr/share/elasticsearch/data By default not persisted
  46. 46. Dealing With Storage By default in /usr/share/elasticsearch/data By default not persisted $ docker run -d -v /opt/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch
  47. 47. Dealing With Storage $ docker run -d -v /opt/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch By default in /usr/share/elasticsearch/data By default not persisted Use data only containers Permissions
  48. 48. Data-Only Docker Volumes Bypasses Union File System
  49. 49. Data-Only Docker Volumes Bypasses Union File System Can be shared between containers
  50. 50. Data-Only Docker Volumes Bypasses Union File System Can be shared between containers Data volumes persist if the container itself is deleted
  51. 51. Data-Only Docker Volumes Bypasses Union File System Can be shared between containers Data volumes persist if the container itself is deleted $ docker create -v /mnt/es/data:/usr/share/elasticsearch/data --name esdata elasticsearch Permissions
  52. 52. Data-Only Docker Volumes Bypasses Union File System Can be shared between containers Data volumes persist if the container itself is deleted $ docker create -v /mnt/es/data:/usr/share/elasticsearch/data --name esdata elasticsearch $ docker run --volumes-from esdata elasticsearch
  53. 53. Highly Available Cluster Master only Master only Master only Data only Data only Data only Data only Data only Data only Client only Client only
  54. 54. Highly Available Cluster Master only Master only Master only Data only Data only Data only Data only Data only Data only Client only Client only minimum_master_nodes = N/2 + 1
  55. 55. Highly Available Cluster Master only Master only Master only Data only Data only Data only Data only Data only Data only Client only Client only minimum_master_nodes = N/2 + 1 gateway.recover_after_nodes gateway.expected_nodes cluster.routing.allocation.node_concurrent_recoveries index.unassigned.node_left.delayed_timeout index.priority
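The quorum formula on the slide uses integer division. A small sketch of the arithmetic, where N is the number of master-eligible nodes and the result is the value one would set for `discovery.zen.minimum_master_nodes` in Elasticsearch 1.x/2.x (the function name is hypothetical):

```shell
# Quorum of master-eligible nodes: N/2 + 1, with integer division.
min_master_nodes() (
  n=$1   # number of master-eligible nodes
  echo $(( n / 2 + 1 ))
)

for n in 3 4 5; do
  echo "masters=$n minimum_master_nodes=$(min_master_nodes "$n")"
done
```

For the three master-only nodes in the diagram this gives 2, so the cluster keeps a master when one of them is lost and two partitions cannot both elect one.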
  56. 56. Master Nodes & Docker $ docker run -d elasticsearch -Dnode.master=true -Dnode.data=false -Dnode.client=false
  57. 57. Client Nodes & Docker $ docker run -d elasticsearch -Dnode.master=false -Dnode.data=false -Dnode.client=true
  58. 58. Data Nodes & Docker $ docker run -d elasticsearch -Dnode.master=false -Dnode.data=true -Dnode.client=false
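The three role-specific commands differ only in their `-D` flags, so they can be generated from a role name. A minimal sketch that prints the commands without running them (the helper name `es_role_flags` is hypothetical; the flags are the ones from the slides):

```shell
# Maps a role name to the -D flags shown on the slides (prints only).
es_role_flags() (
  case $1 in
    master) echo "-Dnode.master=true -Dnode.data=false -Dnode.client=false" ;;
    data)   echo "-Dnode.master=false -Dnode.data=true -Dnode.client=false" ;;
    client) echo "-Dnode.master=false -Dnode.data=false -Dnode.client=true" ;;
    *)      echo "unknown role: $1" >&2; exit 1 ;;
  esac
)

for role in master data client; do
  echo "docker run -d elasticsearch $(es_role_flags "$role")"
done
```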
  59. 59. Scaling Elasticsearch Node Elasticsearch Node Elasticsearch Node Elasticsearch Node
  60. 60. Scaling curl -XPUT 'http://localhost:9200/devops/' -d '{ "settings" : { "index" : { "number_of_shards" : 4, "number_of_replicas" : 0 } } }'
  61. 61. Scaling P P P P
  62. 62. Scaling curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 1 }'
  63. 63. Scaling P P P P R R R R
  64. 64. Scaling curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 2 }'
  65. 65. Scaling P P P P R R R R R R R R
  66. 66. Scaling curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 1 }'
  67. 67. Scaling P P P P R R R R
  68. 68. Scaling P P P P R R R R
  69. 69. Scaling P P P P Unassigned R R R R
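The shard counts in the Scaling diagrams follow from simple arithmetic: every primary (P) gets `number_of_replicas` copies (R). Because Elasticsearch does not place two copies of the same shard on one node, at least replicas + 1 data nodes are needed before every copy can be assigned. A short sketch of both calculations (function names are hypothetical):

```shell
# Total shard copies in an index: each primary plus its replicas.
total_shards() (
  primaries=$1
  replicas=$2
  echo $(( primaries * (1 + replicas) ))
)

# Fewest data nodes on which every copy of a shard can be assigned,
# since two copies of the same shard never share a node.
min_nodes_for_full_assignment() (
  replicas=$1
  echo $(( replicas + 1 ))
)

echo "4 primaries, 1 replica  -> $(total_shards 4 1) shards"
echo "4 primaries, 2 replicas -> $(total_shards 4 2) shards, needs >= $(min_nodes_for_full_assignment 2) nodes"
```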
  70. 70. RAM Buffer indices.memory.index_buffer_size: 10% indices.memory.min_index_buffer_size: 48mb indices.memory.max_index_buffer_size (unbounded) indices.memory.min_shard_index_buffer_size: 4mb
  71. 71. RAM Buffer indices.memory.index_buffer_size: 10% indices.memory.min_index_buffer_size: 48mb indices.memory.max_index_buffer_size (unbounded) indices.memory.min_shard_index_buffer_size: 4mb Higher Indexing Throughput Lower Indexing Throughput defaults ><
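To get a feel for the defaults on the RAM Buffer slides, the node-wide indexing buffer can be estimated from the heap size: 10% of the heap, but never less than 48 MB, with no upper bound by default. The sketch below is only that arithmetic, with a hypothetical function name and heap sizes chosen for illustration; it does not read anything from a running node.

```shell
# Node-wide indexing buffer in MB: pct% of the heap, never below min_mb.
# The defaults mirror the slide: 10% and 48 MB, with no upper bound.
index_buffer_mb() (
  heap_mb=$1
  pct=${2:-10}
  min_mb=${3:-48}
  buf=$(( heap_mb * pct / 100 ))
  if [ "$buf" -lt "$min_mb" ]; then
    buf=$min_mb
  fi
  echo "$buf"
)

index_buffer_mb 4096       # default share of a 4 GB heap
index_buffer_mb 256        # small heap, the 48 MB minimum applies
index_buffer_mb 4096 30    # a larger share for an indexing-heavy node
```

As the slide suggests, raising the share above the default trades heap for indexing throughput, and lowering it does the opposite.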
  72. 72. Time-Based Data? 2015-11-23 TODAY WEEK
  73. 73. Time-Based Data? curl -XPOST 'http://localhost:9200/_aliases' -d '{ "actions" : [ { "add" : {"index":"2015-11-23","alias":"today"} }, { "add" : {"index":"2015-11-23","alias":"week"} } ]}'
  74. 74. Time-Based Data? 2015-11-23 2015-11-24 TODAY WEEK
  75. 75. Time-Based Data? 2015-11-23 2015-11-24 2015-11-25 TODAY WEEK
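When a new daily index appears, the `today` alias has to move to it while `week` simply gains one more index. One way this could be done is a single `_aliases` request with a `remove` and two `add` actions, so the switch is atomic. The sketch below only builds the request body; the function name is hypothetical and nothing is sent to a cluster.

```shell
# Builds the _aliases request body that moves "today" from the old daily
# index to the new one and adds the new index to "week". Nothing is sent.
alias_rollover_body() (
  old=$1
  new=$2
  printf '{"actions":[{"remove":{"index":"%s","alias":"today"}},{"add":{"index":"%s","alias":"today"}},{"add":{"index":"%s","alias":"week"}}]}\n' \
    "$old" "$new" "$new"
)

alias_rollover_body 2015-11-23 2015-11-24

# To apply it (assumes a node listening on localhost:9200):
# curl -XPOST 'http://localhost:9200/_aliases' -d "$(alias_rollover_body 2015-11-23 2015-11-24)"
```

Dropping the index that has aged out of `week` would be one more `remove` action in the same request.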
  76. 76. Multiple Tiers node.tag=hot node.tag=cold node.tag=cold
  77. 77. Multiple Tiers curl -XPUT 'localhost:9200/data_2015-11-23' -d '{ "settings": { "index.routing.allocation.include.tag" : "hot" } }'
  78. 78. Multiple Tiers node.tag=hot node.tag=cold node.tag=cold data_2015-11-23 data_2015-11-23
  79. 79. Multiple Tiers curl -XPUT 'localhost:9200/data_2015-11-23/_settings' -d '{ "settings": { "index.routing.allocation.exclude.tag" : "hot", "index.routing.allocation.include.tag" : "cold" } }'
  80. 80. Multiple Tiers node.tag=hot node.tag=cold node.tag=cold data_2015-11-23 data_2015-11-23
  81. 81. Multiple Tiers node.tag=hot node.tag=cold node.tag=cold data_2015-11-23 data_2015-11-23 data_2015-11-24 data_2015-11-24
  82. 82. Multiple Tiers node.tag=hot node.tag=cold node.tag=cold data_2015-11-23 data_2015-11-23 data_2015-11-25 data_2015-11-25 data_2015-11-24 data_2015-11-24
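Moving indices from hot to cold nodes is usually a scheduled job. As a sketch of the selection step, the function below lists the daily indices older than a cutoff date, assuming the `data_YYYY-MM-DD` naming from the slides; the function name and the cutoff are hypothetical, and the actual move is left as the commented `curl` call from the earlier slide.

```shell
# Prints the daily indices (named data_YYYY-MM-DD, as on the slides) that are
# older than the cutoff date and are therefore candidates for the cold tier.
indices_to_cool() (
  cutoff=$(echo "$1" | sed 's/-//g')   # 2015-11-25 -> 20151125
  shift
  for idx in "$@"; do
    day=$(echo "${idx#data_}" | sed 's/-//g')
    if [ "$day" -lt "$cutoff" ]; then
      echo "$idx"
    fi
  done
)

indices_to_cool 2015-11-25 data_2015-11-23 data_2015-11-24 data_2015-11-25

# For each printed index, the settings update from the slide would follow, e.g.:
# curl -XPUT "localhost:9200/$idx/_settings" -d '{ "index.routing.allocation.exclude.tag": "hot", "index.routing.allocation.include.tag": "cold" }'
```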
  83. 83. Multiple Tenants
  84. 84. Multiple Tenants Hot Hot Cold Cold Cold Cold
  85. 85. Multiple Tenants Hot Hot Cold Cold Cold Cold R O U T I N G
  86. 86. Indexing Without Routing Shard 1 Shard 2 Shard 3 Shard 4 Shard 5 Shard 6 Shard 7 Shard 8 Elasticsearch Application userA userA userA userA userA userA userA userA
  87. 87. Indexing With Routing Shard 1 Shard 2 Shard 3 Shard 4 Shard 5 Shard 6 Shard 7 Shard 8 Elasticsearch Application
  88. 88. Querying Without Routing Shard 1 Shard 2 Shard 3 Shard 4 Shard 5 Shard 6 Shard 7 Shard 8 Elasticsearch Application
  89. 89. Querying With Routing Shard 1 Shard 2 Shard 3 Shard 4 Shard 5 Shard 6 Shard 7 Shard 8 Elasticsearch Application
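Routing works because Elasticsearch picks the target shard as hash(routing value) modulo the number of primary shards, so all documents and queries for one routing value land on a single shard instead of all eight. The sketch below illustrates the idea only: `cksum` is a stand-in for the hash Elasticsearch really uses, so the shard numbers it prints will not match a real cluster, and `userB` and `userC` are hypothetical tenants added next to the slide's `userA`.

```shell
# Illustration of the routing formula: shard = hash(routing) % primaries.
# Assumption: cksum (a CRC) stands in for Elasticsearch's real hash function,
# so these shard numbers will not match a real cluster.
shard_for() (
  routing=$1
  primaries=$2
  hash=$(printf '%s' "$routing" | cksum | cut -d ' ' -f 1)
  echo $(( hash % primaries ))
)

for user in userA userB userC; do
  echo "$user -> shard $(shard_for "$user" 8)"
done
```

The important property is determinism: the same routing value always maps to the same shard, which is why a routed query can skip the other shards.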
  90. 90. Routing vs No Routing
      Queries without routing (200 shards, 1 replica)
      #threads  Avg response time  Throughput  90% line  Median   CPU utilization
      1         3169 ms            19.0/min    5214 ms   2692 ms  95–99%
  91. 91. Routing vs No Routing
      Queries without routing (200 shards, 1 replica)
      #threads  Avg response time  Throughput  90% line  Median   CPU utilization
      1         3169 ms            19.0/min    5214 ms   2692 ms  95–99%
      Queries with routing (200 shards, 1 replica)
      #threads  Avg response time  Throughput  90% line  Median   CPU utilization
      10        196 ms             50.6/sec    642 ms    29 ms    25–40%
      20        218 ms             91.2/sec    718 ms    11 ms    10–15%
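The two result sets report throughput in different units (per minute without routing, per second with routing). Converting to a common unit makes the gap easier to see. The sketch below does only that arithmetic on the slide's figures; the function name is hypothetical.

```shell
# Ratio of routed throughput (queries/sec) to unrouted throughput
# (queries/min), after converting the latter to queries/sec.
speedup() (
  routed_per_sec=$1
  unrouted_per_min=$2
  awk -v r="$routed_per_sec" -v u="$unrouted_per_min" \
    'BEGIN { printf "%.0f\n", r / (u / 60) }'
)

speedup 50.6 19.0   # 10 threads with routing vs 1 thread without
speedup 91.2 19.0   # 20 threads with routing vs 1 thread without
```

This prints 160 and 288. The runs used different thread counts, so the ratios mix the effect of routing with the effect of concurrency and should be read as rough orders of magnitude rather than a like-for-like comparison.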
  92. 92. Monitoring https://sematext.com/spm/integrations/docker-monitoring.html https://github.com/sematext/spm-agent-docker
  93. 93. Short summary http://www.soothetube.com/2013/12/29/thats-all-folks/
  94. 94. We Are Hiring! Dig Search? Dig Analytics? Dig Big Data? Dig Performance? Dig Logging? Dig working with, and in, open–source? We’re hiring worldwide! http://sematext.com/about/jobs.html
  95. 95. Rafał Kuć @kucrafal rafal.kuc@sematext.com Sematext @sematext http://sematext.com http://blog.sematext.com Thank You !
