What is elasticsearch?
• "Distributed, (Near) Real Time, Search Engine"
• Open Source (Apache 2.0)
• RESTful
• Schema Free (Dynamic)
• Multi-Tenant
• Scalable
• High Availability
• Rich Search Features
• Good Extensibility
• ……
Distributed Lucene Directory
• Each index is fully sharded, with a configurable number of shards.
• Each shard can have zero or more replicas.
• Read / search operations can be performed on any copy of a shard (the primary or one of its replicas).
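The shard and replica counts above map onto index settings. As a minimal sketch, the following builds the JSON body one might send when creating an index. The helper names `index_settings` and `total_shard_copies` and the example counts are illustrative assumptions; `number_of_shards` and `number_of_replicas` are the setting names elasticsearch uses. Nothing is sent over the network here.

```python
import json


def index_settings(shards: int, replicas: int) -> dict:
    """Build a create-index request body (hypothetical helper)."""
    if shards < 1:
        raise ValueError("an index needs at least one shard")
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(shards: int, replicas: int) -> int:
    """Count every copy: each primary shard plus each of its replicas."""
    return shards * (1 + replicas)


body = index_settings(shards=5, replicas=1)
print(json.dumps(body, indent=2))
print(total_shard_copies(5, 1))  # 5 primaries + 5 replicas
```

With zero replicas the index still works, but each shard then exists in only one place, so there is nothing to fail over to.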
Scalability
• There are nodes that hold data, and nodes that do not.
• There is no need for a load balancer in elasticsearch: each node can receive a request, and if it can't handle it, it automatically delegates it to the appropriate node(s).
• If you want to scale out search, you can simply add more replicas per shard.
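The delegation idea above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how elasticsearch is implemented: the `Cluster` and `Node` classes are hypothetical, each data node stands in for one shard, and `zlib.crc32` is a stand-in hash chosen for determinism.

```python
import zlib


class Cluster:
    """Toy cluster: data nodes own documents, client nodes only forward."""

    def __init__(self, data_node_names, client_node_names=()):
        self.data_nodes = [Node(n, self, True) for n in data_node_names]
        self.client_nodes = [Node(n, self, False) for n in client_node_names]

    def owner_of(self, doc_id):
        # Stand-in hash (assumption): pick the owning data node by crc32.
        idx = zlib.crc32(doc_id.encode("utf-8")) % len(self.data_nodes)
        return self.data_nodes[idx]


class Node:
    def __init__(self, name, cluster, holds_data):
        self.name = name
        self.cluster = cluster
        self.holds_data = holds_data
        self.docs = {}  # stays empty on nodes that do not hold data

    def index(self, doc_id, doc):
        owner = self.cluster.owner_of(doc_id)
        if owner is self:
            self.docs[doc_id] = doc
            return self.name
        # This node cannot handle the request itself: delegate it.
        return owner.index(doc_id, doc)

    def get(self, doc_id):
        owner = self.cluster.owner_of(doc_id)
        if owner is self:
            return self.docs.get(doc_id)
        return owner.get(doc_id)


cluster = Cluster(["data-1", "data-2"], ["client-1"])
entry = cluster.client_nodes[0]  # a node that holds no data
stored_on = entry.index("42", {"title": "hello"})
print(stored_on)
# Any node can be asked; the answer is the same.
print([n.get("42") for n in cluster.data_nodes + cluster.client_nodes])
```

Because every node knows how to find the owner, a client can talk to whichever node it likes, which is why no separate load balancer is needed in this model.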
Transaction log
• Every indexed / deleted doc is fully persistent
• No need for a Lucene IndexWriter#commit
• Managed using a transaction log / WAL
• Full single node durability (survives kill -9)
• Utilized when doing hot relocation of shards
• Periodically "flushed" (by calling IndexWriter#commit)
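The write-ahead-log pattern behind these bullets can be sketched in a few lines. This is a toy model, not elasticsearch's or Lucene's code: the `TinyShard` class, the file names, and the JSON-lines format are all assumptions made for illustration. Each operation is appended and fsynced to a log before it is applied; `flush` plays the role of the periodic commit, after which the log can be truncated.

```python
import json
import os
import tempfile


class TinyShard:
    """Toy shard with a write-ahead log (hypothetical, for illustration)."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "translog.jsonl")
        self.commit_path = os.path.join(directory, "commit.json")
        self.docs = {}
        self._recover()
        self.log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # Load the last committed state, then replay logged operations.
        if os.path.exists(self.commit_path):
            with open(self.commit_path, encoding="utf-8") as f:
                self.docs = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, op):
        if op["op"] == "index":
            self.docs[op["id"]] = op["doc"]
        else:
            self.docs.pop(op["id"], None)

    def _append(self, op):
        # Log first and force it to disk, then apply in memory.
        self.log.write(json.dumps(op) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self._apply(op)

    def index(self, doc_id, doc):
        self._append({"op": "index", "id": doc_id, "doc": doc})

    def delete(self, doc_id):
        self._append({"op": "delete", "id": doc_id})

    def flush(self):
        # Stand-in for the commit: persist state, then truncate the log.
        tmp = self.commit_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.docs, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.commit_path)
        self.log.close()
        self.log = open(self.log_path, "w", encoding="utf-8")


workdir = tempfile.mkdtemp()
shard = TinyShard(workdir)
shard.index("1", {"user": "alice"})
shard.index("2", {"user": "bob"})
shard.flush()
shard.delete("2")
shard.index("3", {"user": "carol"})
# Simulate a crash: never flush again, just reopen from disk.
recovered = TinyShard(workdir)
print(sorted(recovered.docs))  # ['1', '3']
```

The operations made after the last flush survive the simulated crash because they are replayed from the log, which is the property the "no need for a commit" bullet relies on.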
BASE
• Each document you index is there once the index operation is done.
• No need for a commit or any similar step to get everything persisted.
• A shard can have 1 or more replicas for HA.
• Gateway persistence is done in the background, asynchronously.
Not Mentioned Here…
• Versioning
• Template
• River
• Percolator
• PartialUpdate
• Routing
• Parent-Child Type
• Scripting
• ……
That's too much; discover it yourself.