ElasticSearch vs. Apache Solr
Last update: Dec 2011
An experiment to compare ElasticSearch with Apache Solr
From Peter.Karich@pannous.info
Comments and suggestions should go to my blog
This document is based on my personal opinion and on my experience with Jetslide, which I moved away from Solr, and with several other projects based on ElasticSearch
This document should not be used to show that one of the projects is 'bad'. Keep in mind that both projects are rapidly evolving, and you should always run your own tests to see whether a (software) product fits your use case and company.
Similarities of ElasticSearch (ES) and Solr
Both software systems are:
Powerful search servers based on Lucene 3.5:
Apache Solr 3.5
Both are free software, licensed under the Apache License 2.0
Both systems can send JSON over HTTP for indexing and querying. Both support advanced Lucene queries and more.
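As a rough illustration of the JSON-over-HTTP point, the sketch below builds (but does not send) the requests one might use for indexing and querying on each server. The host names, ports, the index/type names and the exact endpoint paths are assumptions based on the default setups of that time, not something taken from these slides; check them against the version you run.

```python
import json
from urllib.parse import urlencode

# Assumptions: default local ports for both servers.
ES_BASE = "http://localhost:9200"
SOLR_BASE = "http://localhost:8983/solr"


def es_index_request(index, doc_type, doc_id, doc):
    """Return (method, url, body) for indexing one JSON document into ES."""
    return "PUT", f"{ES_BASE}/{index}/{doc_type}/{doc_id}", json.dumps(doc)


def es_search_request(index, lucene_query):
    """Return (method, url, body) for a Lucene-syntax query against ES."""
    body = {"query": {"query_string": {"query": lucene_query}}}
    return "POST", f"{ES_BASE}/{index}/_search", json.dumps(body)


def solr_index_request(doc):
    """Return (method, url, body) for the Solr JSON update handler."""
    # Assumption: the JSON update handler is mounted at /update/json.
    return "POST", f"{SOLR_BASE}/update/json?commit=true", json.dumps([doc])


def solr_search_request(lucene_query):
    """Return (method, url, body) for a Solr query with a JSON response."""
    params = urlencode({"q": lucene_query, "wt": "json"})
    return "GET", f"{SOLR_BASE}/select?{params}", None


# Hypothetical document and names, for illustration only.
doc = {"id": "1", "title": "ES vs. Solr"}
print(es_index_request("articles", "article", 1, doc))
print(solr_search_request("title:solr"))
```

Each tuple can be passed to curl or to any HTTP client; nothing here talks to a server.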
What ES offers and Solr does not (or not so well)
ES is distributed
easy sharding ('splits' one index into smaller)
easy replication ('copies' one index to multiple nodes)
adding / removing a node is simple; indices will move accordingly
easy cloud support for Amazon (S3, discovery via API)
support for GigaSpaces, Coherence and Terracotta
Some Solr features are not supported in 'distributed mode', and there are several ways to set up distributed Solr (none of them really easy)
ES is realtime and distributed: just specify the latency via API
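The sharding, replication and latency settings above can be sketched as two request bodies. This is a minimal sketch, assuming a local node on port 9200, a hypothetical index name, and that the 'latency' mentioned above maps to the index refresh interval; the setting names are given as I remember them from the ES documentation, so verify them for your version. The requests are only built, not sent.

```python
import json

ES_BASE = "http://localhost:9200"  # assumption: default local node


def create_index_request(index, shards, replicas):
    """Return (method, url, body) creating an index with sharding settings."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,      # 'splits' of the index
                "number_of_replicas": replicas,  # extra 'copies' per shard
            }
        }
    }
    return "PUT", f"{ES_BASE}/{index}", json.dumps(body)


def refresh_interval_request(index, interval):
    """Return (method, url, body) changing how quickly new docs become searchable."""
    body = {"index": {"refresh_interval": interval}}
    return "PUT", f"{ES_BASE}/{index}/_settings", json.dumps(body)


def total_shard_copies(shards, replicas):
    """Each primary shard plus its replicas must be placed on some node."""
    return shards * (1 + replicas)


print(create_index_request("articles", 3, 2))
print(refresh_interval_request("articles", "1s"))
print(total_shard_copies(3, 2))
```

With 3 shards and 2 replicas the cluster has to place 9 shard copies, which ES distributes over the available nodes.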
... what ElasticSearch offers (2)
The full JSON doc can be nested and is stored within the index in one field called _source (even compressed). This makes re-indexing simple
Query via JSON (or URL). Use curl or ElasticSearch-Head
When indexing you specify:
the type - this makes:
multi-tenancy easy. A simple API call to create/delete an index. Solr multi core is more complicated ...
ElasticSearch-Head (similar to the Solr Admin page)
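To make the multi-tenancy point concrete, here is a small sketch of a one-index-per-tenant layout. The naming scheme (tenant_<name>), the base URL and the helper names are hypothetical and only illustrate that creating or deleting a tenant is a single HTTP call on the index URL; the requests are built but not sent.

```python
ES_BASE = "http://localhost:9200"  # assumption: default local node


def tenant_index(tenant):
    """Hypothetical naming scheme: one index per tenant."""
    return f"tenant_{tenant.lower()}"


def create_tenant_request(tenant):
    """One call creates the tenant's index."""
    return "PUT", f"{ES_BASE}/{tenant_index(tenant)}"


def delete_tenant_request(tenant):
    """One call removes the tenant's index and all its documents."""
    return "DELETE", f"{ES_BASE}/{tenant_index(tenant)}"


def document_url(tenant, doc_type, doc_id):
    """Documents are addressed as /index/type/id."""
    return f"{ES_BASE}/{tenant_index(tenant)}/{doc_type}/{doc_id}"


print(create_tenant_request("Acme"))
print(document_url("Acme", "article", 42))
print(delete_tenant_request("Acme"))
```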
... what ElasticSearch offers (3)
ES introduces the concept of a 'gateway' for long-term persistence
define index storage (in-memory, on filesystem ...)
when ES crashes it can recover the index storage from this gateway (the 'availability' system), even if the index storage was in-memory
For Amazon you can use S3 as gateway
ES supports complete scripting while querying to do stats, facets etc.
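As a sketch of what scripting while querying can look like, the snippet below builds a search body with a statistical facet computed from a script. The field names (price, quantity) and the query are hypothetical, and the facet syntax is written as I recall it from the ES facets API of that time, so treat it as an assumption and check it against your version.

```python
import json


def scripted_stats_query(lucene_query, script):
    """Build a search body with a statistical facet over a scripted value."""
    return {
        "query": {"query_string": {"query": lucene_query}},
        "facets": {
            # 'stats' is just a label chosen for this facet.
            "stats": {"statistical": {"script": script}}
        },
    }


# Hypothetical fields: the script multiplies two numeric document fields.
body = scripted_stats_query(
    "category:books",
    "doc['price'].value * doc['quantity'].value",
)
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a search request, for example with curl.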
ES lacks ...
100% documentation and always working query examples
Workaround: ask on Freenode!
Only one main contributor. No commercial support
But Shay is working full time on ES, and ES has a strong community (again: Freenode + Google Groups), e.g. community-developed clients for several different languages
Shay has a strong background in 'search'. He wrote Compass
... what ES lacks (2)
No Autowarming Queries (see issue 1006)
No XML support
ES has no facet pagination. See issue 1044
ES has no field collapsing. See issue 256
ES has no Date Math
ES has no separate smaller client jar for Java projects
ES has no spell checking plugin. See issue 911
When to use ElasticSearch
You should use it when
Your index is big or realtime or both
You have several indices, or
You have a multi-tenancy requirement
You want to save administration effort and cost
And when shouldn't you use ElasticSearch?
If your company already uses Solr and no massive indexing is required, your indices are small, or no realtime updates are required.
Your company does not allow it due to risks
Search directly on elasticsearch.org
Freenode channel #elasticsearch ('group chat')
Code and Issues: github.com/elasticsearch/elasticsearch
Logstash - a log indexer built with ES and JRuby: code.google.com/p/logstash
A slide which points out how hard distributed Solr/Lucene is