SlideShare a Scribd company logo
Introduction To ElasticSearch
Real-Time Search and Analytics
Roy Russo, DevNexus, 2015
Who Am I
•Roy Russo
•VP Engineering, Predikto
•Co-Author - Elasticsearch in Action
-Due ~April 2015.
•ElasticHQ.org
•Other (*)
2
* Silverpop, JBoss, AltiSource Labs
Why Am I Here?
•What is Search
•What is Elasticsearch
•Real-World Use
•Scale Out
•Interacting with Elasticsearch
3
Search is about
filtering information
and determining
relevance.
4
How does a Search Engine
Work?
5
Select * FROM make WHERE name LIKE ‘%Tesla%’
Search Engines use Magic
6
Where Magic == Inverted Index
It’s FM!
Inverted Index
•Take some documents
•Tokenize them
•Find unique tokens
•Map tokens to documents
7
apple oranges peach
Document 1
Document 2
Document 3
Document 4
Document 5
Document 6
Inverted Index
8
apples oranges peach
Document 1
Document 2
Document 3
Document 4
Document 5
Document 6
Search for “apple peach”
Relevance
•How many tokens per document?
•How many tokens relative to the number of total
tokens in the document?
•What is the frequency of token across all
documents?
9
Relevance in Elasticsearch
•At Search Time
•At Index Time
•Term Frequency
-Term / Document
•Inverse Document Frequency (IDF)
-Term / All Documents in the collection
•Field-Length Norm
10
What is Elasticsearch?
11
Elasticsearch is…
•Search and Analytics engine
•Document Store
-Every field is indexed/searchable
•Distributed
12
What Elasticsearch is not
•Key-Value Store
-Redis, Riak
•Column Family Store
-C*, HBase
•Graph Database
-Neo4J
13
ElasticSearch in a Nutshell
•Based on Apache Lucene
•Distributed
•Document-Oriented
•Schema free
•HTTP + JSON
•(Near) Real-time search
•Ecosystem
-Hosting, Monitoring Apps, Clients (SDK)
14
Where can I get it?
•Free and Open Source
•https://www.elastic.co/
•https://github.com/elastic/elasticsearch
•Backed by a Company, Elastic
-Training
-Support
-Auth/AuthZ
-Marvel for Monitoring
15
How do I run it?
•Download it
-https://www.elastic.co/downloads
•bin/elasticsearch
•http://localhost:9200
16
{
status: 200,
name: "Tesla",
cluster_name: "elasticsearch_royrusso",
version: {
number: "1.4.2",
build_hash: "927caff6f05403e936c20bf4529f144f0c89fd8c",
build_timestamp: "2014-12-16T14:11:12Z",
build_snapshot: false,
lucene_version: "4.10.2"
},
tagline: "You Know, for Search"
}
Elasticsearch requires Java. *
17
* You have 5 seconds to whine about it and then shutup.
Some Use-Cases
18
ElasticSearch for Centralized
Logs
•Logstash + ElasticSearch + Kibana (ELK)
•Well… and then there’s Loggly
19
“Netflix is a Log generating company that happens to stream movies”
Elasticsearch at Predikto
20
Elasticsearch at Predikto
21
•Write From Spark to Elasticsearch
•Query from Spark to Elasticsearch
•Visualize
Widely Used
22
Based on Apache Lucene
•Free and Open Source
•Started in 1999
•Created by Doug Cutting
•What’s it do?
-Tokenizing
-Locations
-Relevance scoring
-Filtering
-Text search
-Date Parsing
23
Elasticsearch is a Document
Store
24
Document Store
•Like MongoDB and CouchDB
•Document DBs:
-JSON documents
-Collections of key-value collections
-Nesting
-Versioned
25
What is a document?
26
{
"genre": "Crime",
“language": "English",
"country": "USA",
"runtime": 170,
"title": "Scarface",
"year": 1983
}
Modeled in JSON
27
{
"genre": "Crime",
"language": "English",
"country": "USA",
"runtime": 170,
"title": "Scarface",
"year": 1983
}
{
"_index": "imdb",
"_type": "movie",
"_id": "u17o8zy9RcKg6SjQZqQ4Ow",
"_version": 1,
"_source": {
"genre": "Crime",
"language": "English",
"country": "USA",
"runtime": 170,
"title": "Scarface",
"year": 1983
}
}
Schema-Free
•Dynamic Mapping
-Elasticsearch guesses the data-types (string, int,
float…)
28
"imdb": {
"movie": {
"properties": {
"country": {
"type": "string“,
“store”:true,
“index”:false
},
"genre": {
"type": "string“,
"null_value" : "na“,
“store”:false,
“index:true
},
"year": {
"type": "long"
}
}
}
}
Elasticsearch is Distributed
29
Terminology
30
MySQL Elasticsearch
Database Index
Table Type
Row Document
Column Field
Schema Mapping
Index (Everything is indexed)
SQL Query DSL
•Cluster: 1..N Nodes w/ same Cluster Name
•Node: One ElasticSearch instance (1 java proc)
•Shard = One Lucene instance
-0 or more replicas
High Availability
•No need for load balancer
•Different Node Types
•Indices are Sharded
•Replica shards on different Nodes
•Automatic Master election & failover
31
About Indices / Shards
32
$ curl -XPUT 'http://localhost:9200/twitter/' -d '{
"settings" : {
"index" : {
"number_of_shards" : 3,
"number_of_replicas" : 2
}
}
}'
Cluster Topology
33
A1 A2B2 B2 B1
B3
B1 A1 A2
B3
4 Node Cluster
Index A: 2 Shards & 1 Replica
Index B: 3 Shards & 1 Replica
Discovery
•Nodes discover each other using multicast.
-Unicast is an option
•Each cluster has an elected master node
-Beware of split-brain
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3"]
34
Nodes
•Master node handles cluster-wide (Meta-API)
events:
-Node participation
-New indices create/delete
-Re-Allocation of shards
•Data Nodes
-Indexing / Searching operations
•Client Nodes
-REST calls
-Light-weight load balancers
35
node.data | node.master
The Basics - Shards
•Primary Shard:
-First time Indexing
-Index has 1..N primary shards (default: 5)
-# Not changeable once index created
•Replica Shard:
-Copy of the primary shard
-# Can be changed later
-Each primary has 0..N replicas
- HA:
• Promoted to primary if primary fails
• Get/Search handled by primary||replica36
Shard Auto-Allocation
•Shard Phases:
-Unassigned
-Initializing
-Started
-Relocating
37
Node 1
0P
1R
Node 2
1P
0R
Node 2
0R
Add a Node: Shards Relocate
Allocation Awareness
•Shard Allocation Awareness
-cluster.routing.allocation.awareness.attributes: rack
-Shards RELOCATE to even distribution
-Primary & Replica will NOT be on the same rack
value.
38
Cluster State
•Cluster State
-Node Membership
-Indices Settings
-Shard Allocation
Table
-Shard State
39
cURL -XGET http://localhost:9200/_cluster/state?pretty=1'
{
"cluster_name" : "elasticsearch_royrusso",
"version" : 27,
"master_node" : "s3fpXfPKSFeUqo1MYZxSng",
"blocks" : { },
"nodes" : {
"s3fpXfPKSFeUqo1MYZxSng" : {
"name" : "Bulldozer",
"transport_address" : "inet[localhost/127.0.0.1:9300]",
"attributes" : { }
}
},
"metadata" : {
"templates" : {
"logging_index_all" : {
"template" : "logstash-09-*",
"order" : 1,
"settings" : {
"index" : {
"number_of_shards" : "2",
"number_of_replicas" : "1"
}
},
"mappings" : {
"date" : {
"store" : false
}
}
},
"logging_index" : {
"template" : "logstash-*",
"order" : 0,
"settings" : {
"index" : {
"number_of_shards" : "2",
"“number_of_replicas”" : "1"
}
},
"mappings" : {
"“date”" : {
"“store”" : true
Talking to Elasticsearch
40
REST
•HTTP Verbs: GET, POST, PUT, DELETE
•JSON
•_cat API
41
% curl '192.168.56.10:9200/_cat/health?v&ts=0'
cluster status nodeTotal nodeData shards pri relo init unassign
foo green 3 3 3 3 0 0 0
The API
•Document
•Cluster
-Node
•Index
•Search
42
Create a Document
43
curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '
{
"genre": "Crime",
"language": "English",
"country": "USA",
"runtime": 170,
"title": "Scarface",
"year": 1983
}';
{
"_index":"imdb",
"_type":"movie",
"_id":"AUwGeWib1u4mCngDYT7y",
"_version":1,
"created":true
}
Of note…
44
curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d '
{
"genre": "Crime",
"language": "English",
"country": "USA",
"runtime": 170,
"title": "Scarface",
"year": 1983
}';
{
"_index":"imdb",
"_type":"movie",
"_id":"AUwGeWib1u4mCngDYT7y",
"_version":1,
"created":true
}
Auto-creates Index & Type
Auto-Gen ID
Auto-Version
Get a Document
45
curl -XGET 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y';
{
"_index":"imdb",
"_type":"movie",
"_id":"AUwGeWib1u4mCngDYT7y",
"_version":1,
"found":true,
"_source":
{
"genre": "Crime",
"language": "English",
"country": "USA",
"runtime": 170,
"title": "Scarface",
"year": 1983
}
}
Update a Document
46
curl -XPUT 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y' -d '
{
"genre": "Crime",
"language": "English",
"country": "USA",
"runtime": 180,
"title": "Scarface",
"year": 1983
}';
{
"_index":"imdb",
"_type":"movie",
"_id":"AUwGeWib1u4mCngDYT7y",
"_version":2,
"created":false
}
* More like an Upsert.
Delete a Document
47
curl -XDELETE 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'
{
“found":true,
“_index":"imdb",
“_type":"movie",
“_id":"AUwGeWib1u4mCngDYT7y",
“_version":2
}
You can also…
•Partial document updating
•Specify Version
•Specify ID
•Multi-Get API
•Exists API
•Bulk API
48
How Searching Works
•How it works:
-Search request hits a node
-Node broadcasts to every shard in the index
-Each shard performs query
-Each shard returns metadata about results
-Node merges results and scores them
-Node requests documents from shards
-Results merged, sorted, and returned to client.
49
REST API - Search
•Free Text Search
-URL Request
•Complex Query
50
http://localhost:9200/imdb/movie/_search?q=scar*
http://localhost:9200/imdb/movie/_search?q=scarface+OR+star
http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
REST API – Query DSL
curl -XPOST 'localhost:9200/_search?pretty' -d '{
"query" : {
"bool" : {
"must" : [
{
"query_string" : {
"query" : "scarface or star"
}
},
{
"range" : {
"year" : { "gte" : 1931 }
}
}
]
}
}
}'
51
REST API – Query DSL
•Boolean Query
"bool":{
"must":[
{
"match":{
"color":"blue"
}
},
{
"match":{
"title":"shirt"
}
}
],
"must_not":[
{
"match":{
"size":"xxl"
}
}
],
"should":[
{
"match":{
"textile":"cotton"
}
52
REST API – Query DSL
•Range Query
-Numeric / Date Types
•Prefix/Wildcard Query
-Match on partial terms
•RegExp Query
•Geo_bbox
-Bounding box filter
•Geo_distance
-Geo_distance_range
{
"range":{
"founded_year":{
"gte":1990,
"lt":2000
}
}
}
53
Filters
•Filters recommended over Queries
-Better cache support
54
curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d '
{
"from" : 0,
"size" : 0,
"query" : {
"terms" : {
"message" : [ "apples"],
"minimum_should_match" : "1"
}
},
"post_filter" : {
"terms" : {
"userId" : [ "25476c6788ce", "g20d5470d7b4" ],
"execution" : "or"
}
},
"sort": { "eventDate": { "order": "desc" }},
"explain" : false
}';
Analyzers / Tokenizers
55
curl -XPUT ‘http://localhost:9200/my_index/' -d '
{
"settings" :
{
"analysis" : {
"analyzer" : {
"str_search_analyzer" : {
"tokenizer" : "keyword",
"filter": ["lowercase"]
},
"str_index_analyzer" : {
"tokenizer" : "substring",
"filter" : ["lowercase", "stop"]
}
},
"tokenizer" : {
"substring" : {
"type" : "edgeNgram",
"min_gram" : "3",
"max_gram" : "42",
"token_chars" : ["letter", "digit"]
}
}
curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d '
{
"events" : {
"properties" :
{
"eventId" : {"type" : "string", "store" : true, "index" : "not_analyzed" },
"userId" : {"type" : "string", "store" : false, "index" : "not_analyzed" },
"message" : {
"type" : "string", "store" : false,
"search_analyzer" : "str_search_analyzer",
"index_analyzer" : "str_index_analyzer"
}
}
}
}';
Tokenizers
•Whitespace
•NGram
•Edge NGram
•Letter
-@ non-letters
56
Clients
•Client list:
http://www.elasticsearch.org/guide/clients/
-Java (Node) Client, JS, PHP, Perl, Python, Ruby
•Spring Data:
-Uses TransportClient
-Implementation of ElasticsearchRepository aligns
with generic Repository interfaces
57
Monitoring
•BigDesk
•Kopf
•Head
•ElasticHQ
•Marvel
•Sematext SPM
58
Questions?
59

More Related Content

What's hot

ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document database
Robert Lujo
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
Ricardo Peres
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to Elasticsearch
Clifford James
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
Philips Kokoh Prasetyo
 
Elasticsearch first-steps
Elasticsearch first-stepsElasticsearch first-steps
Elasticsearch first-steps
Matteo Moci
 
Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational database
Kristijan Duvnjak
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
Eric Rodriguez (Hiring in Lex)
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in action
Codemotion
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
Jason Austin
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
Federico Panini
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
Roopendra Vishwakarma
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
Rahul Jain
 
Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018
Vinay Kumar
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
Maruf Hassan
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
Navule Rao
 
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
Rahul K Chauhan
 
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Robert Calcavecchia
 
The ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch pluginsThe ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch plugins
Itamar
 
Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with Elasticsearch
Samantha Quiñones
 

What's hot (19)

ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document database
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to Elasticsearch
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
 
Elasticsearch first-steps
Elasticsearch first-stepsElasticsearch first-steps
Elasticsearch first-steps
 
Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational database
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
ElasticSearch in action
ElasticSearch in actionElasticSearch in action
ElasticSearch in action
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
 
Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
 
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya BhamidpatiPhilly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
Philly PHP: April '17 Elastic Search Introduction by Aditya Bhamidpati
 
The ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch pluginsThe ultimate guide for Elasticsearch plugins
The ultimate guide for Elasticsearch plugins
 
Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with Elasticsearch
 

Similar to Elasticsearch - DevNexus 2015

Devnexus 2018
Devnexus 2018Devnexus 2018
Devnexus 2018
Roy Russo
 
Dev nexus 2017
Dev nexus 2017Dev nexus 2017
Dev nexus 2017
Roy Russo
 
ELK stack introduction
ELK stack introduction ELK stack introduction
ELK stack introduction
abenyeung1
 
ElasticSearch: Найдется все... и быстро!
ElasticSearch: Найдется все... и быстро!ElasticSearch: Найдется все... и быстро!
ElasticSearch: Найдется все... и быстро!
Alexander Byndyu
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life琛琳 饶
 
Elk presentation1#3
Elk presentation1#3Elk presentation1#3
Elk presentation1#3
uzzal basak
 
Test driving Azure Search and DocumentDB
Test driving Azure Search and DocumentDBTest driving Azure Search and DocumentDB
Test driving Azure Search and DocumentDB
Andrew Siemer
 
Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012
Jeremy Zawodny
 
The Background Noise of the Internet
The Background Noise of the InternetThe Background Noise of the Internet
The Background Noise of the Internet
Andrew Morris
 
Percona Live London 2014: Serve out any page with an HA Sphinx environment
Percona Live London 2014: Serve out any page with an HA Sphinx environmentPercona Live London 2014: Serve out any page with an HA Sphinx environment
Percona Live London 2014: Serve out any page with an HA Sphinx environment
spil-engineering
 
Solr at zvents 6 years later & still going strong
Solr at zvents   6 years later & still going strongSolr at zvents   6 years later & still going strong
Solr at zvents 6 years later & still going strong
lucenerevolution
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
Andy Jackson
 
Search Engine
Search EngineSearch Engine
Search Engine
Gong Haibing
 
曾勇 Elastic search-intro
曾勇 Elastic search-intro曾勇 Elastic search-intro
曾勇 Elastic search-intro
Shaoning Pan
 
Delhi elasticsearch meetup
Delhi elasticsearch meetupDelhi elasticsearch meetup
Delhi elasticsearch meetup
Bharvi Dixit
 
Workshop: Big Data Visualization for Security
Workshop: Big Data Visualization for SecurityWorkshop: Big Data Visualization for Security
Workshop: Big Data Visualization for Security
Raffael Marty
 
Practical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesPractical Elasticsearch - real world use cases
Practical Elasticsearch - real world use cases
Itamar
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearch
Minsoo Jun
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Codemotion
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsKorea Sdec
 

Similar to Elasticsearch - DevNexus 2015 (20)

Devnexus 2018
Devnexus 2018Devnexus 2018
Devnexus 2018
 
Dev nexus 2017
Dev nexus 2017Dev nexus 2017
Dev nexus 2017
 
ELK stack introduction
ELK stack introduction ELK stack introduction
ELK stack introduction
 
ElasticSearch: Найдется все... и быстро!
ElasticSearch: Найдется все... и быстро!ElasticSearch: Найдется все... и быстро!
ElasticSearch: Найдется все... и быстро!
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life
 
Elk presentation1#3
Elk presentation1#3Elk presentation1#3
Elk presentation1#3
 
Test driving Azure Search and DocumentDB
Test driving Azure Search and DocumentDBTest driving Azure Search and DocumentDB
Test driving Azure Search and DocumentDB
 
Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012
 
The Background Noise of the Internet
The Background Noise of the InternetThe Background Noise of the Internet
The Background Noise of the Internet
 
Percona Live London 2014: Serve out any page with an HA Sphinx environment
Percona Live London 2014: Serve out any page with an HA Sphinx environmentPercona Live London 2014: Serve out any page with an HA Sphinx environment
Percona Live London 2014: Serve out any page with an HA Sphinx environment
 
Solr at zvents 6 years later & still going strong
Solr at zvents   6 years later & still going strongSolr at zvents   6 years later & still going strong
Solr at zvents 6 years later & still going strong
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Search Engine
Search EngineSearch Engine
Search Engine
 
曾勇 Elastic search-intro
曾勇 Elastic search-intro曾勇 Elastic search-intro
曾勇 Elastic search-intro
 
Delhi elasticsearch meetup
Delhi elasticsearch meetupDelhi elasticsearch meetup
Delhi elasticsearch meetup
 
Workshop: Big Data Visualization for Security
Workshop: Big Data Visualization for SecurityWorkshop: Big Data Visualization for Security
Workshop: Big Data Visualization for Security
 
Practical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesPractical Elasticsearch - real world use cases
Practical Elasticsearch - real world use cases
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearch
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and models
 

Recently uploaded

Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Zilliz
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 

Recently uploaded (20)

Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 

Elasticsearch - DevNexus 2015

  • 1. Introduction To ElasticSearch Real-Time Search and Analytics Roy Russo, DevNexus, 2015
  • 2. Who Am I •Roy Russo •VP Engineering, Predikto •Co-Author - Elasticsearch in Action -Due ~April 2015. •ElasticHQ.org •Other (*) 2 * Silverpop, JBoss, AltiSource Labs
  • 3. Why Am I Here? •What is Search •What is Elasticsearch •Real-World Use •Scale Out •Interacting with Elasticsearch 3
  • 4. Search is about filtering information and determining relevance. 4
  • 5. How does a Search Engine Work? 5 Select * FROM make WHERE name LIKE ‘%Tesla%’
  • 6. Search Engines use Magic 6 Where Magic == Inverted Index It’s FM!
  • 7. Inverted Index •Take some documents •Tokenize them •Find unique tokens •Map tokens to documents 7 apple oranges peach Document 1 Document 2 Document 3 Document 4 Document 5 Document 6
  • 8. Inverted Index 8 apples oranges peach Document 1 Document 2 Document 3 Document 4 Document 5 Document 6 Search for “apple peach”
  • 9. Relevance •How many tokens per document? •How many tokens relative to the number of total tokens in the document? •What is the frequency of token across all documents? 9
  • 10. Relevance in Elasticsearch •At Search Time •At Index Time •Term Frequency -Term / Document •Inverse Document Frequency (IDF) -Term / All Documents in the collection •Field-Length Norm 10
  • 12. Elasticsearch is… •Search and Analytics engine •Document Store -Every field is indexed/searchable •Distributed 12
  • 13. What Elasticsearch is not •Key-Value Store -Redis, Riak •Column Family Store -C*, HBase •Graph Database -Neo4J 13
  • 14. ElasticSearch in a Nutshell •Based on Apache Lucene •Distributed •Document-Oriented •Schema free •HTTP + JSON •(Near) Real-time search •Ecosystem -Hosting, Monitoring Apps, Clients (SDK) 14
  • 15. Where can I get it? •Free and Open Source •https://www.elastic.co/ •https://github.com/elastic/elasticsearch •Backed by a Company, Elastic -Training -Support -Auth/AuthZ -Marvel for Monitoring 15
  • 16. How do I run it? •Download it -https://www.elastic.co/downloads •bin/elasticsearch •http://localhost:9200 16 { status: 200, name: "Tesla", cluster_name: "elasticsearch_royrusso", version: { number: "1.4.2", build_hash: "927caff6f05403e936c20bf4529f144f0c89fd8c", build_timestamp: "2014-12-16T14:11:12Z", build_snapshot: false, lucene_version: "4.10.2" }, tagline: "You Know, for Search" }
  • 17. Elasticsearch requires Java. * 17 * You have 5 seconds to whine about it and then shutup.
  • 19. ElasticSearch for Centralized Logs •Logstash + ElasticSearch + Kibana (ELK) •Well… and then there’s Loggly 19 “Netflix is a Log generating company that happens to stream movies”
  • 21. Elasticsearch at Predikto 21 •Write From Spark to Elasticsearch •Query from Spark to Elasticsearch •Visualize
  • 23. Based on Apache Lucene •Free and Open Source •Started in 1999 •Created by Doug Cutting •What’s it do? -Tokenizing -Locations -Relevance scoring -Filtering -Text search -Date Parsing 23
  • 24. Elasticsearch is a Document Store 24
  • 25. Document Store •Like MongoDB and CouchDB •Document DBs: -JSON documents -Collections of key-value collections -Nesting -Versioned 25
  • 26. What is a document? 26 { "genre": "Crime", “language": "English", "country": "USA", "runtime": 170, "title": "Scarface", "year": 1983 }
  • 27. Modeled in JSON 27 { "genre": "Crime", "language": "English", "country": "USA", "runtime": 170, "title": "Scarface", "year": 1983 } { "_index": "imdb", "_type": "movie", "_id": "u17o8zy9RcKg6SjQZqQ4Ow", "_version": 1, "_source": { "genre": "Crime", "language": "English", "country": "USA", "runtime": 170, "title": "Scarface", "year": 1983 } }
  • 28. Schema-Free •Dynamic Mapping -Elasticsearch guesses the data-types (string, int, float…) 28 "imdb": { "movie": { "properties": { "country": { "type": "string“, “store”:true, “index”:false }, "genre": { "type": "string“, "null_value" : "na“, “store”:false, “index:true }, "year": { "type": "long" } } } }
  • 30. Terminology 30 MySQL Elasticsearch Database Index Table Type Row Document Column Field Schema Mapping Index (Everything is indexed) SQL Query DSL •Cluster: 1..N Nodes w/ same Cluster Name •Node: One ElasticSearch instance (1 java proc) •Shard = One Lucene instance -0 or more replicas
  • 31. High Availability •No need for load balancer •Different Node Types •Indices are Sharded •Replica shards on different Nodes •Automatic Master election & failover 31
  • 32. About Indices / Shards 32 $ curl -XPUT 'http://localhost:9200/twitter/' -d '{ "settings" : { "index" : { "number_of_shards" : 3, "number_of_replicas" : 2 } } }'
  • 33. Cluster Topology 33 A1 A2B2 B2 B1 B3 B1 A1 A2 B3 4 Node Cluster Index A: 2 Shards & 1 Replica Index B: 3 Shards & 1 Replica
  • 34. Discovery •Nodes discover each other using multicast. -Unicast is an option •Each cluster has an elected master node -Beware of split-brain discovery.zen.ping.multicast.enabled: false discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3"] 34
  • 35. Nodes •Master node handles cluster-wide (Meta-API) events: -Node participation -New indices create/delete -Re-Allocation of shards •Data Nodes -Indexing / Searching operations •Client Nodes -REST calls -Light-weight load balancers 35 node.data | node.master
  • 36. The Basics - Shards •Primary Shard: -First time Indexing -Index has 1..N primary shards (default: 5) -# Not changeable once index created •Replica Shard: -Copy of the primary shard -# Can be changed later -Each primary has 0..N replicas - HA: • Promoted to primary if primary fails • Get/Search handled by primary||replica36
  • 37. Shard Auto-Allocation •Shard Phases: -Unassigned -Initializing -Started -Relocating 37 Node 1 0P 1R Node 2 1P 0R Node 2 0R Add a Node: Shards Relocate
  • 38. Allocation Awareness •Shard Allocation Awareness -cluster.routing.allocation.awareness.attributes: rack -Shards RELOCATE to even distribution -Primary & Replica will NOT be on the same rack value. 38
  • 39. Cluster State •Cluster State -Node Membership -Indices Settings -Shard Allocation Table -Shard State 39 cURL -XGET http://localhost:9200/_cluster/state?pretty=1' { "cluster_name" : "elasticsearch_royrusso", "version" : 27, "master_node" : "s3fpXfPKSFeUqo1MYZxSng", "blocks" : { }, "nodes" : { "s3fpXfPKSFeUqo1MYZxSng" : { "name" : "Bulldozer", "transport_address" : "inet[localhost/127.0.0.1:9300]", "attributes" : { } } }, "metadata" : { "templates" : { "logging_index_all" : { "template" : "logstash-09-*", "order" : 1, "settings" : { "index" : { "number_of_shards" : "2", "number_of_replicas" : "1" } }, "mappings" : { "date" : { "store" : false } } }, "logging_index" : { "template" : "logstash-*", "order" : 0, "settings" : { "index" : { "number_of_shards" : "2", "“number_of_replicas”" : "1" } }, "mappings" : { "“date”" : { "“store”" : true
  • 41. REST •HTTP Verbs: GET, POST, PUT, DELETE •JSON •_cat API 41 % curl '192.168.56.10:9200/_cat/health?v&ts=0' cluster status nodeTotal nodeData shards pri relo init unassign foo green 3 3 3 3 0 0 0
  • 43. Create a Document 43 curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d ' { "genre": "Crime", "language": "English", "country": "USA", "runtime": 170, "title": "Scarface", "year": 1983 }'; { "_index":"imdb", "_type":"movie", "_id":"AUwGeWib1u4mCngDYT7y", "_version":1, "created":true }
  • 44. Of note… 44 curl -XPOST 'http://127.0.0.1:9200/imdb/movie' -d ' { "genre": "Crime", "language": "English", "country": "USA", "runtime": 170, "title": "Scarface", "year": 1983 }'; { "_index":"imdb", "_type":"movie", "_id":"AUwGeWib1u4mCngDYT7y", "_version":1, "created":true } Auto-creates Index & Type Auto-Gen ID Auto-Version
  • 45. Get a Document 45 curl -XGET 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y'; { "_index":"imdb", "_type":"movie", "_id":"AUwGeWib1u4mCngDYT7y", "_version":1, "found":true, "_source": { "genre": "Crime", "language": "English", "country": "USA", "runtime": 170, "title": "Scarface", "year": 1983 } }
  • 46. Update a Document 46 curl -XPUT 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y' -d ' { "genre": "Crime", "language": "English", "country": "USA", "runtime": 180, "title": "Scarface", "year": 1983 }'; { "_index":"imdb", "_type":"movie", "_id":"AUwGeWib1u4mCngDYT7y", "_version":2, "created":false } * More like an Upsert.
  • 47. Delete a Document 47 curl -XDELETE 'http://127.0.0.1:9200/imdb/movie/AUwGeWib1u4mCngDYT7y' { “found":true, “_index":"imdb", “_type":"movie", “_id":"AUwGeWib1u4mCngDYT7y", “_version":2 }
  • 48. You can also… •Partial document updating •Specify Version •Specify ID •Multi-Get API •Exists API •Bulk API 48
  • 49. How Searching Works •How it works: -Search request hits a node -Node broadcasts to every shard in the index -Each shard performs query -Each shard returns metadata about results -Node merges results and scores them -Node requests documents from shards -Results merged, sorted, and returned to client. 49
  • 50. REST API - Search •Free Text Search -URL Request •Complex Query 50 http://localhost:9200/imdb/movie/_search?q=scar* http://localhost:9200/imdb/movie/_search?q=scarface+OR+star http://localhost:9200/imdb/movie/_search?q=(scarface+OR+star)+AND+year:[1981+TO+1984]
  • 51. REST API – Query DSL curl -XPOST 'localhost:9200/_search?pretty' -d '{ "query" : { "bool" : { "must" : [ { "query_string" : { "query" : "scarface or star" } }, { "range" : { "year" : { "gte" : 1931 } } } ] } } }' 51
  • 52. REST API – Query DSL •Boolean Query "bool":{ "must":[ { "match":{ "color":"blue" } }, { "match":{ "title":"shirt" } } ], "must_not":[ { "match":{ "size":"xxl" } } ], "should":[ { "match":{ "textile":"cotton" } 52
  • 53. REST API – Query DSL •Range Query -Numeric / Date Types •Prefix/Wildcard Query -Match on partial terms •RegExp Query •Geo_bbox -Bounding box filter •Geo_distance -Geo_distance_range { "range":{ "founded_year":{ "gte":1990, "lt":2000 } } } 53
  • 54. Filters •Filters recommended over Queries -Better cache support 54 curl -XGET 'http://localhost:9200/my_index/events/_search?pretty=1' -d ' { "from" : 0, "size" : 0, "query" : { "terms" : { "message" : [ "apples"], "minimum_should_match" : "1" } }, "post_filter" : { "terms" : { "userId" : [ "25476c6788ce", "g20d5470d7b4" ], "execution" : "or" } }, "sort": { "eventDate": { "order": "desc" }}, "explain" : false }';
  • 55. Analyzers / Tokenizers 55 curl -XPUT ‘http://localhost:9200/my_index/' -d ' { "settings" : { "analysis" : { "analyzer" : { "str_search_analyzer" : { "tokenizer" : "keyword", "filter": ["lowercase"] }, "str_index_analyzer" : { "tokenizer" : "substring", "filter" : ["lowercase", "stop"] } }, "tokenizer" : { "substring" : { "type" : "edgeNgram", "min_gram" : "3", "max_gram" : "42", "token_chars" : ["letter", "digit"] } } curl -XPUT 'http://localhost:9200/my_index/events/_mapping' -d ' { "events" : { "properties" : { "eventId" : {"type" : "string", "store" : true, "index" : "not_analyzed" }, "userId" : {"type" : "string", "store" : false, "index" : "not_analyzed" }, "message" : { "type" : "string", "store" : false, "search_analyzer" : "str_search_analyzer", "index_analyzer" : "str_index_analyzer" } } } }';
  • 57. Clients •Client list: http://www.elasticsearch.org/guide/clients/ -Java (Node) Client, JS, PHP, Perl, Python, Ruby •Spring Data: -Uses TransportClient -Implementation of ElasticsearchRepository aligns with generic Repository interfaces 57