An index can potentially store a large amount of data that can exceed the hardware limits of a single node. For example, a single index of a billion documents taking up 1TB of disk space may not fit on the disk of a single node or may be too slow to serve search requests from a single node alone.
TechTalk #13 Grokking: Scaling and supercharging your online business models with Elasticsearch
• Shard? What is it?
• Primary shard vs. Replica shard
• Shard Overallocation
Elasticsearch provides the ability to subdivide your index into multiple pieces called shards. Each shard is in itself a fully functional and independent "index" that can be hosted on any node in the cluster.
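As a rough illustration of how documents end up spread across shards, the sketch below assigns each document to a shard by hashing a routing key and taking the result modulo the number of primary shards. The helper names `shard_for` and `distribute` are hypothetical, and CRC32 is only a stand-in: Elasticsearch applies its own Murmur3-based hash to the routing value, so the shard numbers here will not match a real cluster.

```python
import zlib


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a shard number (hypothetical helper)."""
    # Stand-in hash: CRC32 is used only because it is deterministic and
    # ships with Python. Elasticsearch uses a different (Murmur3) hash.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards):
    """Group document ids by the shard each one is routed to."""
    shards = {n: [] for n in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_primary_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    docs = [f"doc-{i}" for i in range(10)]
    for shard, ids in distribute(docs, 3).items():
        print(shard, ids)
```

Because the shard number depends on the number of primary shards, changing that number would send existing documents to different shards. This is one reason the primary shard count is chosen up front when the index is created.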
[Diagram: primary shard P1 with replica shards R1 and R2]
• Can be stored on any node in a cluster
• Independent Index
• Parallelize operations
• A copy of a primary shard
• Provides failover in case a node fails
• Never resides on the same node as its primary shard
• Scales search volume
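The placement rule for replicas can be sketched as a small round-robin allocator. This is a simplified model, not Elasticsearch's actual allocation algorithm: the function `place_shards` and the node names are hypothetical, and the sketch raises an error when there are too few nodes, whereas a real cluster would leave the extra replicas unassigned.

```python
def place_shards(nodes, num_primaries, num_replicas):
    """Return {node: [(shard_number, role), ...]} so that no node holds
    two copies of the same shard (hypothetical helper)."""
    copies_per_shard = num_replicas + 1  # one primary plus its replicas
    if copies_per_shard > len(nodes):
        # Simplification: a real cluster would keep these replicas unassigned.
        raise ValueError("need at least num_replicas + 1 nodes")

    placement = {node: [] for node in nodes}
    for shard in range(num_primaries):
        for copy in range(copies_per_shard):
            # Rotate the starting node per shard so primaries are spread out;
            # consecutive copies land on consecutive, distinct nodes.
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            placement[node].append((shard, role))
    return placement


if __name__ == "__main__":
    layout = place_shards(["node-a", "node-b", "node-c"], 2, 1)
    for node, copies in layout.items():
        print(node, copies)
```

With at least one replica per shard, losing any single node in this model still leaves a copy of every shard on the remaining nodes, which is the failover property the bullets describe.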
By Scott Davis
• Avoiding expensive SQL queries against a relational database with several joins and conditions
• Or using Elasticsearch as the primary data store