
Elastic meetup june16

Slides from Elastic Barcelona Meetup June16



  1. 1. 1 Miguel Bosin Support Engineer, @miguelbosin Hot/Warm Architecture + Sizing
  2. 2. 2 Intro • Miguel Bosin – Support engineer – Joined in 2015 – Interested in technology – Passionate about support • Elastic – Founded in 2012 – Distributed company – Elasticsearch: what is it? – Open source: Elasticsearch (ES), Logstash (LS), Kibana and Beats – Commercial: X-Pack
  4. 4. 4 What is it? • Open source • Distributed and scalable • Highly available • Document-oriented (JSON) • RESTful • Full-text search engine with real-time search and analytics capabilities
  5. 5. 5 Agenda: 1. Elastic overview 2. Elasticsearch basic architecture 3. Sizing introduction 4. Hot/Warm architecture
  6. 6. 6 Overview of Elastic's current products
  7. 7. 7 Agenda: 1. Elastic overview 2. Elasticsearch basic architecture 3. Sizing introduction 4. Hot/Warm architecture
  8. 8. 8 Elasticsearch terminology • A node is a single Elasticsearch instance, a single JVM • Multiple nodes can form a cluster • A cluster or a node can manage multiple indices • An index is a container for data • A shard is a single piece of an Elasticsearch index • A shard is either a primary or a replica
  9. 9. 9 Elasticsearch terminology II
  10. 10. 10 Elasticsearch terminology III
  11. 11. 11 Elasticsearch Architecture: Node roles Master node: • coordinates the cluster • is the only node able to apply changes to the cluster state • publishes the updated cluster state to all nodes Data node: • performs indexing • can allocate shards locally • knows the cluster state
  12. 12. 12 Elasticsearch Architecture: Node roles II Client node: • does NOT perform indexing or allocate shards locally • does NOT perform cluster management operations • knows the cluster state • acts as a smart load balancer (e.g. load balancing Kibana searches) • redirects operations to the nodes that hold the relevant data • calculates aggregation results
  13. 13. 13 Elasticsearch Architecture: Node roles III Node roles are set in elasticsearch.yml
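
The slide's configuration screenshot is not part of this transcript. As a rough sketch only, the three roles described above can be expressed with the `node.master` and `node.data` settings used by Elasticsearch versions of that period. The role labels and the `render_yml` helper are illustrative names, not something taken from the slides.

```python
# Illustrative mapping from the roles named in the slides to the
# node.master / node.data settings (assumed ES 2.x-era setting names).
NODE_ROLES = {
    "dedicated master": {"node.master": True, "node.data": False},
    "data": {"node.master": False, "node.data": True},
    "client": {"node.master": False, "node.data": False},
}


def render_yml(role: str) -> str:
    """Return the elasticsearch.yml lines for one role (hypothetical helper)."""
    settings = NODE_ROLES[role]
    # YAML booleans are lowercase, so convert Python's True/False.
    return "\n".join(f"{key}: {str(value).lower()}" for key, value in settings.items())


if __name__ == "__main__":
    for role in NODE_ROLES:
        print(f"# {role} node")
        print(render_yml(role))
        print()
```

A client node is simply a node with both settings switched off, which is why it can route requests and reduce aggregation results without holding shards.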
  14. 14. 14 Architecture: node roles
  15. 15. 15 Architecture: node roles
  16. 16. 16 Architecture special case: dedicated master nodes
  17. 17. 17 Dedicated master nodes: why / minimum_master_nodes • Indexing and searching data is CPU-, memory-, and I/O-intensive work, which can put pressure on a node's resources • Avoiding split brain: two concurrent master nodes in the same cluster can cause DATA LOSS • Set discovery.zen.minimum_master_nodes to the quorum: (master_eligible_nodes / 2) + 1, rounding the division down
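
To make the quorum formula concrete, here is a minimal sketch that evaluates (master_eligible_nodes / 2) + 1 with integer division. The function name is an illustrative choice, not an Elasticsearch API.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum of master-eligible nodes: (n / 2) + 1, using integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} master-eligible node(s) -> minimum_master_nodes = {minimum_master_nodes(n)}")
```

With two master-eligible nodes the quorum is also two, so losing either node leaves the cluster without a master. That is one reason the hot/warm slides recommend three dedicated master nodes.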
  18. 18. 18 Agenda: 1. Elastic overview 2. Elasticsearch basic architecture 3. Sizing introduction 4. Hot/Warm architecture
  19. 19. 19 Sizing: general factors (server capacity) • Disks (SSD vs. HDD) • RAM – half of total RAM for ES – maximum ES heap size: 30.5 GB • Number of CPU cores – ES thread pools concept: 1 shard → 1 thread → 1 Java process → 1 core
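
The RAM guideline on this slide can be sketched as a small calculation: give Elasticsearch half of the machine's RAM, capped at the 30.5 GB heap maximum quoted above. The function and constant names are illustrative.

```python
MAX_HEAP_GB = 30.5  # heap ceiling quoted on the slide


def recommended_heap_gb(total_ram_gb: float) -> float:
    """Half of total RAM for the ES heap, never above MAX_HEAP_GB."""
    if total_ram_gb <= 0:
        raise ValueError("total RAM must be positive")
    return min(total_ram_gb / 2, MAX_HEAP_GB)


if __name__ == "__main__":
    for ram in (16, 64, 128):
        print(f"{ram} GB RAM -> {recommended_heap_gb(ram)} GB heap")
```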
  20. 20. 20 Sizing: Elasticsearch factors (logging case) • Size of shards • Number of shards on each node • Retention period of data • Mapping configuration – which fields are searchable, whether _source is enabled, etc. • Average size of the documents
  21. 21. 21 Sizing: Capacity planning test I • FIRST: test on a single node with a single index, one shard and no replicas • THEN: insert as many documents as you can and run some typical queries • At some point, queries will slow down to the point where they no longer meet your requirements • This is the ideal number of documents a single shard is able to hold • NEXT: find the ideal number of primary shards (by dividing your dataset size by the ideal shard size) • FINALLY: add replicas for HA and to improve read throughput
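
The NEXT step above is a single division. A minimal sketch, assuming both sizes are measured in the same unit and rounding up so that no shard exceeds the ideal size found in the test (the function name and the sample figures are illustrative):

```python
import math


def primary_shard_count(dataset_size_gb: float, ideal_shard_size_gb: float) -> int:
    """Dataset size divided by the ideal shard size, rounded up, minimum 1."""
    if dataset_size_gb < 0 or ideal_shard_size_gb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(dataset_size_gb / ideal_shard_size_gb))


if __name__ == "__main__":
    # Hypothetical numbers: a 500 GB dataset and a measured ideal shard size of 30 GB.
    print(primary_shard_count(500, 30))
```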
  22. 22. 22 Sizing: Capacity planning test II Each experiment tries to accomplish a discrete goal and builds upon the previous one: 1. Determine various disk utilization 2. Determine breaking point of a shard 3. Determine saturation point of a node 4. Test desired configuration on a two-node cluster
  23. 23. 23 Agenda: 1. Elastic overview 2. Elasticsearch basic architecture 3. Sizing introduction 4. Hot/Warm architecture
  24. 24. 24 Hot / Warm architecture: when to use it? • Elasticsearch for larger time-based data analytics use cases • Using time-based indices • Able to run an architecture with 3 different types of nodes
  25. 25. 25 Hot / Warm architecture: Types of nodes Master, Hot and Warm nodes: • Master nodes: 3 dedicated master nodes • Hot data nodes: perform all indexing and also hold the most recent daily indices (the data queried most frequently). Powerful machines with SSD storage • Warm data nodes: handle a large number of read-only indices that are not queried frequently. Very large attached spinning disks
  26. 26. 26 Hot / Warm architecture: tagging Which node is doing what? • ES needs to know which servers contain the hot nodes and which servers contain the warm nodes • This can be achieved by assigning arbitrary tags to each server (Hot/Warm) • Tag the node with node.box_type: xxx in elasticsearch.yml • OR start a node using ./bin/elasticsearch --node.box_type xxx
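
Tagging nodes is only half of the setup: an index also has to be told which tag it requires. The slides do not show that step, so the following is a sketch under assumptions. It relies on Elasticsearch's shard allocation filtering setting `index.routing.allocation.require.<attribute>`, with `box_type` as the attribute from the slide. The index name, the base URL and the helper name are hypothetical, and the request is only built here, not sent.

```python
import json
from urllib.request import Request


def build_move_request(index_name: str, box_type: str,
                       base_url: str = "http://localhost:9200") -> Request:
    """Build (but do not send) a settings update that pins an index to nodes
    tagged with the given box_type, e.g. "hot" or "warm"."""
    body = json.dumps({"index.routing.allocation.require.box_type": box_type})
    return Request(
        f"{base_url}/{index_name}/_settings",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical daily index that has aged out of the hot tier.
    request = build_move_request("logs-2016.06.01", "warm")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(request)
```

Once such a setting is applied, Elasticsearch relocates the index's shards to nodes whose `node.box_type` matches, which is how a daily index would move from SSD-backed hot nodes to the larger warm nodes.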
  27. 27. 27 Hot / Warm architecture: Force Merge API Optimizing your indices on the warm nodes • The force merge API forces a merge of one or more indices, optimizing them for faster search operations • The merge relates to the number of segments a Lucene index holds within each shard • The force merge operation reduces the number of segments by merging them: $ curl -XPOST 'http://localhost:9200/my_index/_forcemerge'
  28. 28. 28 Hot / Warm architecture: Demo time!! DEMO
