SlideShare a Scribd company logo
Battle of the Giants
Apache Solr 4.0 vs ElasticSearch 0.20
Rafał Kuć – Sematext International
@kucrafal @sematext sematext.com
Who Am I
• „Solr 3.1 Cookbook” author (4.0 inc)
• Sematext consultant & engineer
• Solr.pl co-founder
• Father and husband 
Copyright 2012 Sematext Int’l. All rights reserved
What Will I Talk About ?
Copyright 2012 Sematext Int’l. All rights reserved
Under the Hood
• ElasticSearch 0.20
– Apache Lucene 3.6.1
• Apache Solr 4.0
– Apache Lucene 4.0
Copyright 2012 Sematext Int’l. All rights reserved
Architecture
• What we expect
– Scalability
– Fault toleranance
– High availablity
– Features
• What we are also looking for
– Manageability
– Installation ease
– Tools
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch Cluster Architecture
• Distributed
• Fault tolerant
• Only ElasticSearch nodes
• Single leader
• Automatic leader election
Copyright 2012 Sematext Int’l. All rights reserved
SolrCloud Cluster Architecture
• Distributed
• Fault tolerant
• Apache Solr + ZooKeeper ensemble
• Leader per shard
• Automatic leader election
Copyright 2012 Sematext Int’l. All rights reserved
Collection vs Index
• Collection – Solr main logical index
• Index – ElasticSearch main logic structure
• Collections and Indices can be spread among
different nodes in the cluster
Copyright 2012 Sematext Int’l. All rights reserved
Multiple Document Types in Index
• ElasticSearch - multiple document types in a
single index
• Apache Solr - multiple document types in a
single collection – shared schema.xml
Copyright 2012 Sematext Int’l. All rights reserved
Shards and Replicas
• Index / Collection can have many shards
• Each shard can have 0 or more replicas
• Replicas are automatically updated
• Replicas can be promoted to leaders when a
leader shard goes off-line
Copyright 2012 Sematext Int’l. All rights reserved
Index and Query Routing
• Control where documents are going
• Control where queries are going
• Manual data distribution
Copyright 2012 Sematext Int’l. All rights reserved
Querying Without Routing
Copyright 2012 Sematext Int’l. All rights reserved
Shard 1 Shard 2 Shard 3 Shard 4
Shard 5 Shard 6 Shard 7 Shard 8
Collection / Index
Application
Query With Routing
Copyright 2012 Sematext Int’l. All rights reserved
Shard 1 Shard 2 Shard 3 Shard 4
Shard 5 Shard 6 Shard 7 Shard 8
Collection / Index
Application
Routing Docs and Queries in Solr
• Requires some effort
• Defaults to hash based on document
identifiers
• Can be turned off using
solr.NoOpDistributingUpdateProcessorFactory
Copyright 2012 Sematext Int’l. All rights reserved
<updateRequestProcessorChain>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
<processor class="solr.NoOpDistributingUpdateProcessorFactory" />
</updateRequestProcessorChain>
Routing Docs and Queries - ElasticSearch
• routing parameter controls target shard
which document/query will be forwarded to
• defaults to document identifiers
• can be changed to any value
Copyright 2012 Sematext Int’l. All rights reserved
curl -XPUT localhost:9200/sematext/test/1?routing=1234 -d '{
"title" : "Test routing document"
}'
curl –XGET localhost:9200/sematext/test/_search/?q=*&routing=1234
Apache Solr Index Structure
• Field types defined in schema.xml file
• Fields defined in schema.xml file
• Allows automatic value copying
• Allows dynamic fields
• Allows custom similarity definition
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch Index Structure
• Schema - less
• Analyzers and filters defined with HTTP API
• Fields defined with an HTTP request
• Multi – field support
• Allows nested documents
• Allows parent – child relationship
• Allows structured data
Copyright 2012 Sematext Int’l. All rights reserved
Index Structure Manipulation
• Possible to some extent in Solr as well as
ElasticSearch
• ElasticSearch allows dynamic mappings
update (not always)
Copyright 2012 Sematext Int’l. All rights reserved
Aliasing
• Solr
– Allows core aliasing
• ElasticSearch
– Allows index aliasing
– We can add filter to alias
– We can add index routing
– We can add search routing
Copyright 2012 Sematext Int’l. All rights reserved
Server Configuration
• Solr
– Static in solrconfig.xml
– Can be reloaded
during runtime with
collection/core reload
Copyright 2012 Sematext Int’l. All rights reserved
• ElasticSearch
– Static in elasticsearch.yml
– Properties can be
changed during runtime
(although not all) without
reloading
ElasticSearch Gateway Module
• Your data time machine
• Stores indices and meta data
• Currently available:
– Local
– Shared FS
– Hadoop
– S3
Copyright 2012 Sematext Int’l. All rights reserved
Discovery
• Apache Solr uses ZooKeeper
• ElasticSearch uses Zen Discovery
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch Zen Discovery
• Allows automatic node discovery
• Provides multicast and unicast discovery
methods
• Automatic master detection
• Two - way failure detection
Copyright 2012 Sematext Int’l. All rights reserved
Apache Solr & Apache ZooKeeper
• Requires additional software
• ZooKeeper ensemble with 1+ ZooKeeper
instances
• Prevents split – brain situations
• Holds collections configurations
• Solr needs to know address of one of the
ZooKeeper instances
Copyright 2012 Sematext Int’l. All rights reserved
API
• HTTP REST API in ElasticSearch or Query String
for simple queries
• HTTP with Query String in Apache Solr
• Both provide specialized Java API
– SolrJ for Apache Solr and CloudSolrServer
– ElasticSearch with TransportClient for remote
connections
Copyright 2012 Sematext Int’l. All rights reserved
Apache Solr and Query String
• Queries are built of request parameters
• Some degree of structuring allowed (local
params)
Copyright 2012 Sematext Int’l. All rights reserved
curl 'http://localhost:8983/solr/select?q=text:weird&sort=date+desc'
ElasticSearch REST End-Points
• Simple queries built of request parameters
• Stuctured queries built as JSON objects
Copyright 2012 Sematext Int’l. All rights reserved
curl –XGET
'localhost:9200/sematext/test/_search/?q=_all:weird&sort=date:desc'
curl -XGET 'localhost:9200/sematext/test_search' -d '{
"query" : {
"term" : {
"_all" : "weird"
},
"sort" : {
"date" : {
"order" : "desc"
}
}
}'
Data Handling
• Solr
– Multiple formats allowed as input
– Can return results in multiple formats
• ElasticSearch
– JSON in / JSON out
Copyright 2012 Sematext Int’l. All rights reserved
Single or Batch
• Solr
– Single or multiple
documents per
request
Copyright 2012 Sematext Int’l. All rights reserved
• ElasticSearch
– Single document with a
standard indexing call
– _bulk end – point exposed
for batch indexing
– _bulk UDP end – point can
be exposed for low
latency batch indexing
Partial Document Updates
• Not based on LUCENE-3837 proposed by
Andrzej Białecki
• Document reindexing on the side of search
server
• Both servers use versioning to prevent
changes being overwritten
• Can lead to decreased network traffic in some
cases
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch Partial Doc Update
• Special end – point exposed - _update
• Supports parameters like routing, parent,
replication, percolate, etc (similar to Index
API)
• Uses scripts to perform document updates
Copyright 2012 Sematext Int’l. All rights reserved
curl -XPOST 'localhost:9200/sematext/test/12345/_update' -d '{
"script" : "ctx._source.enabled = enabled",
"params" : {
"enabled" : true
}
}'
Apache Solr Partial Doc Update
• Sent to the standard update handler
• Requires _version_ field to be present
Copyright 2012 Sematext Int’l. All rights reserved
curl 'localhost:8983/solr/update?commit=true' -H 'Content-
type:application/json' -d '[
{
"id" : "12345",
"enabled" : {
"set" : true
}
}
]'
Solr Collections API
• Built on top of Core Admin
• Allows:
– Collection creation
– Collection reload
– Collection deletion
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch Indices REST API
• Allows:
– Index creation
– Index deletion
– Index closing and opening
– Index refreshing
– Existence checking
Copyright 2012 Sematext Int’l. All rights reserved
Analysis Chain Definition
• Solr
– Static in schema.xml
– Can be reloaded
during runtime with
collection/core reload
Copyright 2012 Sematext Int’l. All rights reserved
• ElasticSearch
– Static in elasticsearch.yml
– Defined during index/type
creation with REST call
– Possible to change with
update mapping call (not
all changes allowed)
Multilingual Data Handling
• Both ElasticSearch and Apache Solr built on
top of Apache Lucene
• Solr – analyzers defined per field in schema.xml
file
• ElasticSearch – analyzer defined in
mappings, but can be set during query or
specified on the basis of field values
Copyright 2012 Sematext Int’l. All rights reserved
Results Grouping
• Available in Apache Solr only
• Allows for results grouping based on:
– Field value
– Query
– Function query (not available during distributed
searching)
Copyright 2012 Sematext Int’l. All rights reserved
Prospective Search
• Allows for checking if a document matches a
stored query
• Not available in Apache Solr
• Available in ElasticSearch under the name of
Percolator
Copyright 2012 Sematext Int’l. All rights reserved
Spellchecker
• Allows to check and correct spelling mistakes
• Not available in ElasticSearch currently
• Multiple implementations available in Apache
Solr
– IndexBasedSpellChecker
– WordBreakSolrSpellChecker
– DirectSolrSpellChecker
Copyright 2012 Sematext Int’l. All rights reserved
Full Text Search Capabilities
• Variety of queries
• Ability to control score calculation
• Different query parsers available
• Advanced Lucene queries (like SpanQueries)
exposed
Copyright 2012 Sematext Int’l. All rights reserved
Score Calculation
• Leverage Lucene scoring capabilities
• Control over document importance
• Control over query importance
• Control over term and phrase importance
Copyright 2012 Sematext Int’l. All rights reserved
Apache Solr and Score Influence
• Index time
– Document boosts
– Field boosts
• Query time
– Term boosts
– Field boosts
– Phrases boost
– Function queries
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch and Score Influence
• Index time
– Document and field boosts
• Query time
– Different queries provide different boost controls
– Can calculate distributed term frequencies
– Negative and Positive boosting queries
– Custom score filters
• Scripts
– Control scoring with scripts
Copyright 2012 Sematext Int’l. All rights reserved
Nested Objects
• Possible only in ElasticSearch
• Indexed as separate documents
• Stored in the same part of the index as the
root document
• Hidden from standard queries and filters
• Need appropriate queries and filters (nested)
Copyright 2012 Sematext Int’l. All rights reserved
More Like This
• Lets us find similar documents
• Solr
– More Like This Component
• ElasticSearch
– More Like This Query
– More Like This Field Query
– _mlt REST end – point
Copyright 2012 Sematext Int’l. All rights reserved
Solr Parent – Child Relationship
• Used at query time
• Multi core joins possible
Copyright 2012 Sematext Int’l. All rights reserved
http://localhost:8983/solr/select?q={!join from=parent to=id}color:Yellow
ElasticSearch Parent – Child Handling
• Proper indexing required
• Indexed as separate documents
• Standard queries don’t return child
documents
• In order to retrieve parent docs one should
use appropriate queries and filters (has_child,
has_parent, top_children)
Copyright 2012 Sematext Int’l. All rights reserved
Filters
• Used to narrown down query results
• Good candidates for caching and reuse
• Supported by ElasticSearch and Apache Solr
• Should be used for repeatable query elements
Copyright 2012 Sematext Int’l. All rights reserved
Apache Solr Filter Queries
• Multiple filters per query
• Filters are addictive
• Different query parsers can be used
• Local params can be used
• Narrow down faceting results
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch Filtered Queries
• Can be defined using queries exposed by the
Query DSL
• Can be used for custom score calculation
(i.e., custom filters score query)
• Doesn’t narrow down faceting results by
default (facets have their own filters)
Copyright 2012 Sematext Int’l. All rights reserved
Filter Cache Control
• Both Solr and ElasticSearch let us control
cache for filters
• Solr
– Using local params and cache property
• ElasticSearch
– _cache property
– _cache_key property
Copyright 2012 Sematext Int’l. All rights reserved
Faceting
• Both provide common facets
– Terms
– Range & query
– Terms statistics
– Spatial distance
• Solr
– Pivot faceting
• ElasticSearch
– Histograms
Copyright 2012 Sematext Int’l. All rights reserved
Real Time Or Not ?
• Allow getting document not yet indexed
• Don’t need searcher reopening
• ElasticSearch
– Separate Get and Multi Get API’s
• Apache Solr
– Separate Realtime Get Handler
– Can be used as a search component
Copyright 2012 Sematext Int’l. All rights reserved
Caches and Warming
• ElasticSearch and Solr allow caching
• Both allow running warming queries
• ElasticSearch by default doesn’t limit cache
sizes
Copyright 2012 Sematext Int’l. All rights reserved
Solr Caches
• Types
– Filter Cache
– Query Result Cache
– Document Cache
• Implementation choices
– LRUCache
– FastLRUCache
– LFUCache
• Other configuration options:
– Size
– Maximum size
– Autowarming count
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch Caches
• Types
– Filter Cache
– Field Data Cache
• Implementation choices
– Resident
– Soft
– Weak
• Other configuration options:
– Max size (entries per segment)
– Expiration time
Copyright 2012 Sematext Int’l. All rights reserved
Cluster State Monitoring
• Apache Solr – multiple mbeans exposed by
JMX
• ElasticSearch – multiple REST end – points
exposed to get different statistics
Copyright 2012 Sematext Int’l. All rights reserved
ElasticSearch Statistics API
• Health and State Check
• Nodes Information and Statistics
• Cache Statistics
• Index Segments Information
• Index Information and Statistics
• Mappings Information
Copyright 2012 Sematext Int’l. All rights reserved
Cluster Monitoring
Copyright 2012 Sematext Int’l. All rights reserved
Cluster Monitoring with SPM
Copyright 2012 Sematext Int’l. All rights reserved
Cluster Settings Update
• ElasticSearch lets us:
– Control rebalancing
– Control recovery
– Control allocation
– Change the above on the live cluster
Copyright 2012 Sematext Int’l. All rights reserved
Custom Shard Allocation
• Possible in ElasticSearch
• Cluster level:
• Index level:
Copyright 2012 Sematext Int’l. All rights reserved
curl -XPUT localhost:9200/_cluster/settings -d '{
"persistent" : {
"cluster.routing.allocation.exclude._ip" : "192.168.2.1"
}
}'
curl -XPUT localhost:9200/sematext/ -d '{
"index.routing.allocation.include.tag" : "nodeOne,nodeTwo"
}'
Moving Shards and Replicas
• Possible in ElasticSearch, not available in Solr
• Allows to move shards and replicas to any
node in the cluster on demand
• Available in ElasticSearch:
Copyright 2012 Sematext Int’l. All rights reserved
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [
{"move" : {"index" : "sematext", "shard" : 0, "from_node" : "node1", "to_node" : "node2"}},
{"allocate" : {"index" : "sematext", "shard" : 1, "node" : "node3"}}
]
}'
And The Winner Is ?
Copyright 2012 Sematext Int’l. All rights reserved
How to Reach Us
• Rafał Kuć
– Twitter: @kucrafal
– E-mail: rafal.kuc@sematext.com
• Sematext
– Twitter: @sematext
– Website: http://sematext.com
• Solr vs ElasticSearch series:
• http://blog.sematext.com/2012/08/23/solr-vs-
elasticsearch-part-1-overview/
Copyright 2012 Sematext Int’l. All rights reserved
We Are Hiring !
• Dig Search ?
• Dig Analytics ?
• Dig Big Data ?
• Dig Performance ?
• Dig working with and in open – source ?
• We’re hiring world – wide !
http://sematext.com/about/jobs.html
Copyright 2012 Sematext Int’l. All rights reserved

More Related Content

What's hot

Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch Basics
Shifa Khan
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
Rahul Jain
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
Rahul Jain
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solrmacrochen
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to Elasticsearch
Clifford James
 
Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2
Sematext Group, Inc.
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
BeyondTrees
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearch
clintongormley
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
Ruslan Zavacky
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
Eric Rodriguez (Hiring in Lex)
 
Elastic search overview
Elastic search overviewElastic search overview
Elastic search overview
ABC Talks
 
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Karel Minarik
 
Apache Solr/Lucene Internals by Anatoliy Sokolenko
Apache Solr/Lucene Internals  by Anatoliy SokolenkoApache Solr/Lucene Internals  by Anatoliy Sokolenko
Apache Solr/Lucene Internals by Anatoliy Sokolenko
Provectus
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
Jurriaan Persyn
 
Elasticsearch speed is key
Elasticsearch speed is keyElasticsearch speed is key
Elasticsearch speed is key
Enterprise Search Warsaw Meetup
 
The ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch pluginsThe ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch plugins
Itamar
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
pmanvi
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
Jason Austin
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
Federico Panini
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search
medcl
 

What's hot (20)

Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch Basics
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solr
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to Elasticsearch
 
Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearch
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
Elastic search overview
Elastic search overviewElastic search overview
Elastic search overview
 
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)
 
Apache Solr/Lucene Internals by Anatoliy Sokolenko
Apache Solr/Lucene Internals  by Anatoliy SokolenkoApache Solr/Lucene Internals  by Anatoliy Sokolenko
Apache Solr/Lucene Internals by Anatoliy Sokolenko
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
 
Elasticsearch speed is key
Elasticsearch speed is keyElasticsearch speed is key
Elasticsearch speed is key
 
The ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch pluginsThe ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch plugins
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search
 

Similar to Battle of the Giants - Apache Solr vs. Elasticsearch (ApacheCon)

BigData Faceted Search Comparison between Apache Solr vs. ElasticSearch
BigData Faceted Search Comparison between Apache Solr vs. ElasticSearchBigData Faceted Search Comparison between Apache Solr vs. ElasticSearch
BigData Faceted Search Comparison between Apache Solr vs. ElasticSearch
NetConstructor, Inc.
 
Scaling Massive Elasticsearch Clusters
Scaling Massive Elasticsearch ClustersScaling Massive Elasticsearch Clusters
Scaling Massive Elasticsearch ClustersSematext Group, Inc.
 
From Lucene to Solr 4 Trunk
From Lucene to Solr 4 TrunkFrom Lucene to Solr 4 Trunk
From Lucene to Solr 4 Trunk
tdthomassld
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Rafał Kuć
 
New Persistence Features in Spring Roo 1.1
New Persistence Features in Spring Roo 1.1New Persistence Features in Spring Roo 1.1
New Persistence Features in Spring Roo 1.1
Stefan Schmidt
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered Lucene
Erik Hatcher
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes WorkshopErik Hatcher
 
What's new in Lucene and Solr 4.x
What's new in Lucene and Solr 4.xWhat's new in Lucene and Solr 4.x
What's new in Lucene and Solr 4.x
Grant Ingersoll
 
Apache Solr - Enterprise search platform
Apache Solr - Enterprise search platformApache Solr - Enterprise search platform
Apache Solr - Enterprise search platform
Tommaso Teofili
 
hibernateormfeatures-140223193044-phpapp02.pdf
hibernateormfeatures-140223193044-phpapp02.pdfhibernateormfeatures-140223193044-phpapp02.pdf
hibernateormfeatures-140223193044-phpapp02.pdf
Patiento Del Mar
 
20130310 solr tuorial
20130310 solr tuorial20130310 solr tuorial
20130310 solr tuorialChris Huang
 
Not Just ORM: Powerful Hibernate ORM Features and Capabilities
Not Just ORM: Powerful Hibernate ORM Features and CapabilitiesNot Just ORM: Powerful Hibernate ORM Features and Capabilities
Not Just ORM: Powerful Hibernate ORM Features and Capabilities
Brett Meyer
 
Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search
Hortonworks
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
Erik Hatcher
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Solr at zvents 6 years later & still going strong
Solr at zvents   6 years later & still going strongSolr at zvents   6 years later & still going strong
Solr at zvents 6 years later & still going strong
lucenerevolution
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Solr search engine with multiple table relation
Solr search engine with multiple table relationSolr search engine with multiple table relation
Solr search engine with multiple table relation
Jay Bharat
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCLucidworks (Archived)
 

Similar to Battle of the Giants - Apache Solr vs. Elasticsearch (ApacheCon) (20)

BigData Faceted Search Comparison between Apache Solr vs. ElasticSearch
BigData Faceted Search Comparison between Apache Solr vs. ElasticSearchBigData Faceted Search Comparison between Apache Solr vs. ElasticSearch
BigData Faceted Search Comparison between Apache Solr vs. ElasticSearch
 
Scaling Massive Elasticsearch Clusters
Scaling Massive Elasticsearch ClustersScaling Massive Elasticsearch Clusters
Scaling Massive Elasticsearch Clusters
 
From Lucene to Solr 4 Trunk
From Lucene to Solr 4 TrunkFrom Lucene to Solr 4 Trunk
From Lucene to Solr 4 Trunk
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
 
Solr
SolrSolr
Solr
 
New Persistence Features in Spring Roo 1.1
New Persistence Features in Spring Roo 1.1New Persistence Features in Spring Roo 1.1
New Persistence Features in Spring Roo 1.1
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered Lucene
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes Workshop
 
What's new in Lucene and Solr 4.x
What's new in Lucene and Solr 4.xWhat's new in Lucene and Solr 4.x
What's new in Lucene and Solr 4.x
 
Apache Solr - Enterprise search platform
Apache Solr - Enterprise search platformApache Solr - Enterprise search platform
Apache Solr - Enterprise search platform
 
hibernateormfeatures-140223193044-phpapp02.pdf
hibernateormfeatures-140223193044-phpapp02.pdfhibernateormfeatures-140223193044-phpapp02.pdf
hibernateormfeatures-140223193044-phpapp02.pdf
 
20130310 solr tuorial
20130310 solr tuorial20130310 solr tuorial
20130310 solr tuorial
 
Not Just ORM: Powerful Hibernate ORM Features and Capabilities
Not Just ORM: Powerful Hibernate ORM Features and CapabilitiesNot Just ORM: Powerful Hibernate ORM Features and Capabilities
Not Just ORM: Powerful Hibernate ORM Features and Capabilities
 
Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash course
 
Solr at zvents 6 years later & still going strong
Solr at zvents   6 years later & still going strongSolr at zvents   6 years later & still going strong
Solr at zvents 6 years later & still going strong
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Solr search engine with multiple table relation
Solr search engine with multiple table relationSolr search engine with multiple table relation
Solr search engine with multiple table relation
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
 

More from Sematext Group, Inc.

Tweaking the Base Score: Lucene/Solr Similarities Explained
Tweaking the Base Score: Lucene/Solr Similarities ExplainedTweaking the Base Score: Lucene/Solr Similarities Explained
Tweaking the Base Score: Lucene/Solr Similarities Explained
Sematext Group, Inc.
 
OOPs, OOMs, oh my! Containerizing JVM apps
OOPs, OOMs, oh my! Containerizing JVM appsOOPs, OOMs, oh my! Containerizing JVM apps
OOPs, OOMs, oh my! Containerizing JVM apps
Sematext Group, Inc.
 
Is observability good for your brain?
Is observability good for your brain?Is observability good for your brain?
Is observability good for your brain?
Sematext Group, Inc.
 
Introducing log analysis to your organization
Introducing log analysis to your organization Introducing log analysis to your organization
Introducing log analysis to your organization
Sematext Group, Inc.
 
Solr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSolr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for You
Sematext Group, Inc.
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the Ugly
Sematext Group, Inc.
 
Monitoring and Log Management for
Monitoring and Log Management forMonitoring and Log Management for
Monitoring and Log Management for
Sematext Group, Inc.
 
Introduction to solr
Introduction to solrIntroduction to solr
Introduction to solr
Sematext Group, Inc.
 
Building Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Building Resilient Log Aggregation Pipeline with Elasticsearch & KafkaBuilding Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Building Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Sematext Group, Inc.
 
Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep dive
Sematext Group, Inc.
 
How to Run Solr on Docker and Why
How to Run Solr on Docker and WhyHow to Run Solr on Docker and Why
How to Run Solr on Docker and Why
Sematext Group, Inc.
 
Tuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for LogsTuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for Logs
Sematext Group, Inc.
 
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on DockerRunning High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Sematext Group, Inc.
 
Top Node.js Metrics to Watch
Top Node.js Metrics to WatchTop Node.js Metrics to Watch
Top Node.js Metrics to Watch
Sematext Group, Inc.
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Sematext Group, Inc.
 
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Sematext Group, Inc.
 
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
Sematext Group, Inc.
 
Docker Logging Webinar
Docker Logging  WebinarDocker Logging  Webinar
Docker Logging Webinar
Sematext Group, Inc.
 
Docker Monitoring Webinar
Docker Monitoring  WebinarDocker Monitoring  Webinar
Docker Monitoring Webinar
Sematext Group, Inc.
 
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleMetrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleSematext Group, Inc.
 

More from Sematext Group, Inc. (20)

Tweaking the Base Score: Lucene/Solr Similarities Explained
Tweaking the Base Score: Lucene/Solr Similarities ExplainedTweaking the Base Score: Lucene/Solr Similarities Explained
Tweaking the Base Score: Lucene/Solr Similarities Explained
 
OOPs, OOMs, oh my! Containerizing JVM apps
OOPs, OOMs, oh my! Containerizing JVM appsOOPs, OOMs, oh my! Containerizing JVM apps
OOPs, OOMs, oh my! Containerizing JVM apps
 
Is observability good for your brain?
Is observability good for your brain?Is observability good for your brain?
Is observability good for your brain?
 
Introducing log analysis to your organization
Introducing log analysis to your organization Introducing log analysis to your organization
Introducing log analysis to your organization
 
Solr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for YouSolr Search Engine: Optimize Is (Not) Bad for You
Solr Search Engine: Optimize Is (Not) Bad for You
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the Ugly
 
Monitoring and Log Management for
Monitoring and Log Management forMonitoring and Log Management for
Monitoring and Log Management for
 
Introduction to solr
Introduction to solrIntroduction to solr
Introduction to solr
 
Building Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Building Resilient Log Aggregation Pipeline with Elasticsearch & KafkaBuilding Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
Building Resilient Log Aggregation Pipeline with Elasticsearch & Kafka
 
Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep dive
 
How to Run Solr on Docker and Why
How to Run Solr on Docker and WhyHow to Run Solr on Docker and Why
How to Run Solr on Docker and Why
 
Tuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for LogsTuning Solr & Pipeline for Logs
Tuning Solr & Pipeline for Logs
 
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on DockerRunning High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
 
Top Node.js Metrics to Watch
Top Node.js Metrics to WatchTop Node.js Metrics to Watch
Top Node.js Metrics to Watch
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
 
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
Large Scale Log Analytics with Solr (from Lucene Revolution 2015)
 
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
 
Docker Logging Webinar
Docker Logging  WebinarDocker Logging  Webinar
Docker Logging Webinar
 
Docker Monitoring Webinar
Docker Monitoring  WebinarDocker Monitoring  Webinar
Docker Monitoring Webinar
 
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleMetrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
 

Recently uploaded

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
ViralQR
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 

Recently uploaded (20)

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 

Battle of the Giants - Apache Solr vs. Elasticsearch (ApacheCon)

  • 1. Battle of the Giants Apache Solr 4.0 vs ElasticSearch 0.20 Rafał Kuć – Sematext International @kucrafal @sematext sematext.com
  • 2. Who Am I • „Solr 3.1 Cookbook” author (4.0 inc) • Sematext consultant & engineer • Solr.pl co-founder • Father and husband  Copyright 2012 Sematext Int’l. All rights reserved
  • 3. What Will I Talk About ? Copyright 2012 Sematext Int’l. All rights reserved
  • 4. Under the Hood • ElasticSearch 0.20 – Apache Lucene 3.6.1 • Apache Solr 4.0 – Apache Lucene 4.0 Copyright 2012 Sematext Int’l. All rights reserved
  • 5. Architecture • What we expect – Scalability – Fault toleranance – High availablity – Features • What we are also looking for – Manageability – Installation ease – Tools Copyright 2012 Sematext Int’l. All rights reserved
  • 6. ElasticSearch Cluster Architecture • Distributed • Fault tolerant • Only ElasticSearch nodes • Single leader • Automatic leader election Copyright 2012 Sematext Int’l. All rights reserved
  • 7. SolrCloud Cluster Architecture • Distributed • Fault tolerant • Apache Solr + ZooKeeper ensemble • Leader per shard • Automatic leader election Copyright 2012 Sematext Int’l. All rights reserved
  • 8. Collection vs Index • Collection – Solr main logical index • Index – ElasticSearch main logic structure • Collections and Indices can be spread among different nodes in the cluster Copyright 2012 Sematext Int’l. All rights reserved
  • 9. Multiple Document Types in Index • ElasticSearch - multiple document types in a single index • Apache Solr - multiple document types in a single collection – shared schema.xml Copyright 2012 Sematext Int’l. All rights reserved
  • 10. Shards and Replicas • Index / Collection can have many shards • Each shard can have 0 or more replicas • Replicas are automatically updated • Replicas can be promoted to leaders when a leader shard goes off-line Copyright 2012 Sematext Int’l. All rights reserved
  • 11. Index and Query Routing • Control where documents are going • Control where queries are going • Manual data distribution Copyright 2012 Sematext Int’l. All rights reserved
  • 12. Querying Without Routing Copyright 2012 Sematext Int’l. All rights reserved Shard 1 Shard 2 Shard 3 Shard 4 Shard 5 Shard 6 Shard 7 Shard 8 Collection / Index Application
  • 13. Query With Routing Copyright 2012 Sematext Int’l. All rights reserved Shard 1 Shard 2 Shard 3 Shard 4 Shard 5 Shard 6 Shard 7 Shard 8 Collection / Index Application
  • 14. Routing Docs and Queries in Solr • Requires some effort • Defaults to hash based on document identifiers • Can be turned off using solr.NoOpDistributingUpdateProcessorFactory Copyright 2012 Sematext Int’l. All rights reserved <updateRequestProcessorChain> <processor class="solr.LogUpdateProcessorFactory" /> <processor class="solr.RunUpdateProcessorFactory" /> <processor class="solr.NoOpDistributingUpdateProcessorFactory" /> </updateRequestProcessorChain>
  • 15. Routing Docs and Queries - ElasticSearch • routing parameter controls target shard which document/query will be forwarded to • defaults to document identifiers • can be changed to any value Copyright 2012 Sematext Int’l. All rights reserved curl -XPUT localhost:9200/sematext/test/1?routing=1234 -d '{ "title" : "Test routing document" }' curl –XGET localhost:9200/sematext/test/_search/?q=*&routing=1234
  • 16. Apache Solr Index Structure • Field types defined in schema.xml file • Fields defined in schema.xml file • Allows automatic value copying • Allows dynamic fields • Allows custom similarity definition Copyright 2012 Sematext Int’l. All rights reserved
  • 17. ElasticSearch Index Structure • Schema - less • Analyzers and filters defined with HTTP API • Fields defined with an HTTP request • Multi – field support • Allows nested documents • Allows parent – child relationship • Allows structured data Copyright 2012 Sematext Int’l. All rights reserved
  • 18. Index Structure Manipulation • Possible to some extent in Solr as well as ElasticSearch • ElasticSearch allows dynamic mappings update (not always) Copyright 2012 Sematext Int’l. All rights reserved
  • 19. Aliasing • Solr – Allows core aliasing • ElasticSearch – Allows index aliasing – We can add filter to alias – We can add index routing – We can add search routing Copyright 2012 Sematext Int’l. All rights reserved
  • 20. Server Configuration • Solr – Static in solrconfig.xml – Can be reloaded during runtime with collection/core reload Copyright 2012 Sematext Int’l. All rights reserved • ElasticSearch – Static in elasticsearch.yml – Properties can be changed during runtime (although not all) without reloading
  • 21. ElasticSearch Gateway Module • Your data time machine • Stores indices and meta data • Currently available: – Local – Shared FS – Hadoop – S3 Copyright 2012 Sematext Int’l. All rights reserved
  • 22. Discovery • Apache Solr uses ZooKeeper • ElasticSearch uses Zen Discovery Copyright 2012 Sematext Int’l. All rights reserved
  • 23. ElasticSearch Zen Discovery • Allows automatic node discovery • Provides multicast and unicast discovery methods • Automatic master detection • Two - way failure detection Copyright 2012 Sematext Int’l. All rights reserved
  • 24. Apache Solr & Apache ZooKeeper • Requires additional software • ZooKeeper ensemble with 1+ ZooKeeper instances • Prevents split – brain situations • Holds collections configurations • Solr needs to know address of one of the ZooKeeper instances Copyright 2012 Sematext Int’l. All rights reserved
  • 25. API • HTTP REST API in ElasticSearch or Query String for simple queries • HTTP with Query String in Apache Solr • Both provide specialized Java API – SolrJ for Apache Solr and CloudSolrServer – ElasticSearch with TransportClient for remote connections Copyright 2012 Sematext Int’l. All rights reserved
  • 26. Apache Solr and Query String • Queries are built of request parameters • Some degree of structuring allowed (local params) Copyright 2012 Sematext Int’l. All rights reserved curl 'http://localhost:8983/solr/select?q=text:weird&sort=date+desc'
  • 27. ElasticSearch REST End-Points • Simple queries built of request parameters • Stuctured queries built as JSON objects Copyright 2012 Sematext Int’l. All rights reserved curl –XGET 'localhost:9200/sematext/test/_search/?q=_all:weird&sort=date:desc' curl -XGET 'localhost:9200/sematext/test_search' -d '{ "query" : { "term" : { "_all" : "weird" }, "sort" : { "date" : { "order" : "desc" } } }'
  • 28. Data Handling • Solr – Multiple formats allowed as input – Can return results in multiple formats • ElasticSearch – JSON in / JSON out Copyright 2012 Sematext Int’l. All rights reserved
  • 29. Single or Batch • Solr – Single or multiple documents per request Copyright 2012 Sematext Int’l. All rights reserved • ElasticSearch – Single document with a standard indexing call – _bulk end – point exposed for batch indexing – _bulk UDP end – point can be exposed for low latency batch indexing
  • 30. Partial Document Updates • Not based on LUCENE-3837 proposed by Andrzej Białecki • Document reindexing on the side of search server • Both servers use versioning to prevent changes being overwritten • Can lead to decreased network traffic in some cases Copyright 2012 Sematext Int’l. All rights reserved
  • 31. ElasticSearch Partial Doc Update • Special end – point exposed - _update • Supports parameters like routing, parent, replication, percolate, etc (similar to Index API) • Uses scripts to perform document updates Copyright 2012 Sematext Int’l. All rights reserved curl -XPOST 'localhost:9200/sematext/test/12345/_update' -d '{ "script" : "ctx._source.enabled = enabled", "params" : { "enabled" : true } }'
  • 32. Apache Solr Partial Doc Update • Sent to the standard update handler • Requires _version_ field to be present Copyright 2012 Sematext Int’l. All rights reserved curl 'localhost:8983/solr/update?commit=true' -H 'Content- type:application/json' -d '[ { "id" : "12345", "enabled" : { "set" : true } } ]'
  • 33. Solr Collections API • Built on top of Core Admin • Allows: – Collection creation – Collection reload – Collection deletion Copyright 2012 Sematext Int’l. All rights reserved
  • 34. ElasticSearch Indices REST API • Allows: – Index creation – Index deletion – Index closing and opening – Index refreshing – Existence checking Copyright 2012 Sematext Int’l. All rights reserved
  • 35. Analysis Chain Definition • Solr – Static in schema.xml – Can be reloaded during runtime with collection/core reload Copyright 2012 Sematext Int’l. All rights reserved • ElasticSearch – Static in elasticsearch.yml – Defined during index/type creation with REST call – Possible to change with update mapping call (not all changes allowed)
  • 36. Multilingual Data Handling • Both ElasticSearch and Apache Solr built on top of Apache Lucene • Solr – analyzers defined per field in schema.xml file • ElasticSearch – analyzer defined in mappings, but can be set during query or specified on the basis of field values Copyright 2012 Sematext Int’l. All rights reserved
  • 37. Results Grouping • Available in Apache Solr only • Allows for results grouping based on: – Field value – Query – Function query (not available during distributed searching) Copyright 2012 Sematext Int’l. All rights reserved
  • 38. Prospective Search • Allows for checking if a document matches a stored query • Not available in Apache Solr • Available in ElasticSearch under the name of Percolator Copyright 2012 Sematext Int’l. All rights reserved
  • 39. Spellchecker • Allows to check and correct spelling mistakes • Not available in ElasticSearch currently • Multiple implementations available in Apache Solr – IndexBasedSpellChecker – WordBreakSolrSpellChecker – DirectSolrSpellChecker Copyright 2012 Sematext Int’l. All rights reserved
  • 40. Full Text Search Capabilities • Variety of queries • Ability to control score calculation • Different query parsers available • Advanced Lucene queries (like SpanQueries) exposed Copyright 2012 Sematext Int’l. All rights reserved
  • 41. Score Calculation • Leverage Lucene scoring capabilities • Control over document importance • Control over query importance • Control over term and phrase importance Copyright 2012 Sematext Int’l. All rights reserved
  • 42. Apache Solr and Score Influence • Index time – Document boosts – Field boosts • Query time – Term boosts – Field boosts – Phrases boost – Function queries Copyright 2012 Sematext Int’l. All rights reserved
  • 43. ElasticSearch and Score Influence • Index time – Document and field boosts • Query time – Different queries provide different boost controls – Can calculate distributed term frequencies – Negative and Positive boosting queries – Custom score filters • Scripts – Control scoring with scripts Copyright 2012 Sematext Int’l. All rights reserved
  • 44. Nested Objects • Possible only in ElasticSearch • Indexed as separate documents • Stored in the same part of the index as the root document • Hidden from standard queries and filters • Need appropriate queries and filters (nested) Copyright 2012 Sematext Int’l. All rights reserved
  • 45. More Like This • Lets us find similar documents • Solr – More Like This Component • ElasticSearch – More Like This Query – More Like This Field Query – _mlt REST end – point Copyright 2012 Sematext Int’l. All rights reserved
  • 46. Solr Parent – Child Relationship • Used at query time • Multi core joins possible Copyright 2012 Sematext Int’l. All rights reserved http://localhost:8983/solr/select?q={!join from=parent to=id}color:Yellow
  • 47. ElasticSearch Parent – Child Handling • Proper indexing required • Indexed as separate documents • Standard queries don’t return child documents • In order to retrieve parent docs one should use appropriate queries and filters (has_child, has_parent, top_children) Copyright 2012 Sematext Int’l. All rights reserved
  • 48. Filters • Used to narrown down query results • Good candidates for caching and reuse • Supported by ElasticSearch and Apache Solr • Should be used for repeatable query elements Copyright 2012 Sematext Int’l. All rights reserved
  • 49. Apache Solr Filter Queries • Multiple filters per query • Filters are addictive • Different query parsers can be used • Local params can be used • Narrow down faceting results Copyright 2012 Sematext Int’l. All rights reserved
  • 50. ElasticSearch Filtered Queries • Can be defined using queries exposed by the Query DSL • Can be used for custom score calculation (i.e., custom filters score query) • Doesn’t narrow down faceting results by default (facets have their own filters) Copyright 2012 Sematext Int’l. All rights reserved
  • 51. Filter Cache Control • Both Solr and ElasticSearch let us control cache for filters • Solr – Using local params and cache property • ElasticSearch – _cache property – _cache_key property Copyright 2012 Sematext Int’l. All rights reserved
  • 52. Faceting • Both provide common facets – Terms – Range & query – Terms statistics – Spatial distance • Solr – Pivot faceting • ElasticSearch – Histograms Copyright 2012 Sematext Int’l. All rights reserved
  • 53. Real Time Or Not ? • Allow getting document not yet indexed • Don’t need searcher reopening • ElasticSearch – Separate Get and Multi Get API’s • Apache Solr – Separate Realtime Get Handler – Can be used as a search component Copyright 2012 Sematext Int’l. All rights reserved
  • 54. Caches and Warming • ElasticSearch and Solr allow caching • Both allow running warming queries • ElasticSearch by default doesn’t limit cache sizes Copyright 2012 Sematext Int’l. All rights reserved
  • 55. Solr Caches • Types – Filter Cache – Query Result Cache – Document Cache • Implementation choices – LRUCache – FastLRUCache – LFUCache • Other configuration options: – Size – Maximum size – Autowarming count Copyright 2012 Sematext Int’l. All rights reserved
  • 56. ElasticSearch Caches • Types – Filter Cache – Field Data Cache • Implementation choices – Resident – Soft – Weak • Other configuration options: – Max size (entries per segment) – Expiration time Copyright 2012 Sematext Int’l. All rights reserved
  • 57. Cluster State Monitoring • Apache Solr – multiple mbeans exposed by JMX • ElasticSearch – multiple REST end – points exposed to get different statistics Copyright 2012 Sematext Int’l. All rights reserved
  • 58. ElasticSearch Statistics API • Health and State Check • Nodes Information and Statistics • Cache Statistics • Index Segments Information • Index Information and Statistics • Mappings Information Copyright 2012 Sematext Int’l. All rights reserved
  • 59. Cluster Monitoring Copyright 2012 Sematext Int’l. All rights reserved
  • 60. Cluster Monitoring with SPM Copyright 2012 Sematext Int’l. All rights reserved
  • 61. Cluster Settings Update • ElasticSearch lets us: – Control rebalancing – Control recovery – Control allocation – Change the above on the live cluster Copyright 2012 Sematext Int’l. All rights reserved
  • 62. Custom Shard Allocation • Possible in ElasticSearch • Cluster level: • Index level: Copyright 2012 Sematext Int’l. All rights reserved curl -XPUT localhost:9200/_cluster/settings -d '{ "persistent" : { "cluster.routing.allocation.exclude._ip" : "192.168.2.1" } }' curl -XPUT localhost:9200/sematext/ -d '{ "index.routing.allocation.include.tag" : "nodeOne,nodeTwo" }'
  • 63. Moving Shards and Replicas • Possible in ElasticSearch, not available in Solr • Allows to move shards and replicas to any node in the cluster on demand • Available in ElasticSearch: Copyright 2012 Sematext Int’l. All rights reserved curl -XPOST 'localhost:9200/_cluster/reroute' -d '{ "commands" : [ {"move" : {"index" : "sematext", "shard" : 0, "from_node" : "node1", "to_node" : "node2"}}, {"allocate" : {"index" : "sematext", "shard" : 1, "node" : "node3"}} ] }'
  • 64. And The Winner Is ? Copyright 2012 Sematext Int’l. All rights reserved
  • 65. How to Reach Us • Rafał Kuć – Twitter: @kucrafal – E-mail: rafal.kuc@sematext.com • Sematext – Twitter: @sematext – Website: http://sematext.com • Solr vs ElasticSearch series: • http://blog.sematext.com/2012/08/23/solr-vs- elasticsearch-part-1-overview/ Copyright 2012 Sematext Int’l. All rights reserved
  • 66. We Are Hiring ! • Dig Search ? • Dig Analytics ? • Dig Big Data ? • Dig Performance ? • Dig working with and in open – source ? • We’re hiring world – wide ! http://sematext.com/about/jobs.html Copyright 2012 Sematext Int’l. All rights reserved