Loïc Bertron
Director of Research & Development @Cedrom-SNI
Working on Big Data for Cedrom-SNI: social media, TV & radio aggregation
Introduced Elasticsearch at Cedrom-SNI
Cedrom-SNI
10k+ different sources, 750k+ new docs/day
Our job: ingesting and enriching docs, extracting analytics and intelligence from them
loic.bertron@cedrom-sni.com
linkedin.com/in/loicbertron
@loicbertron
Who am I?
« ElasticSearch easily brings advanced search features to any application or website, and scales to large amounts of data. »
ElasticSearch
Simple: plug & play, schema-free, RESTful API
Elastic: automatically discovers all other instances
Strong: replication & load balancing, scales massively, Lucene-based
Fast: requests executed in parallel, real time
Full-featured: search, analytics, facets, percolator, geo search, suggest, plugins…
What is ElasticSearch?
Document as JSON
• Object representing your data
• Grouped in an index
• One index can have multiple types of documents
{
    "message": "Introducing #ElasticSearch",
    "post_date": "2014-03-12T18:30:00",
    "author": {
        "first_name": "Loïc",
        "email": "loic.bertron@cedrom-sni.com"
    },
    "employee_at_Cedrom": true,
    "Tags": ["Meetup", "Montreal"]
}
• REST API: http://host:port/[index]/[type]/[_action/id]
• HTTP methods: GET, POST, PUT, DELETE
• Documents
• http://node1:9200/twitter/tweet/1 (POST)
• http://node1:9200/twitter/tweet/1 (GET)
• http://node1:9200/twitter/tweet/1 (DELETE)
• Search
• http://node1:9200/twitter/tweet/_search (GET)
• http://node1:9200/twitter/_search (GET)
• http://node1:9200/_search (GET)
• Metadata
• http://node1:9200/twitter/_status (GET)
• http://node1:9200/_shutdown (POST)
API
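The URL scheme above can be sketched as a small helper. This is only an illustration: `es_url` is a hypothetical function name, and the host and port are the ones used on the slides.

```python
def es_url(host, port, index=None, doc_type=None, action_or_id=None):
    """Join the non-empty path parts into an Elasticsearch-style REST URL."""
    parts = [p for p in (index, doc_type, action_or_id) if p]
    return "http://%s:%d/%s" % (host, port, "/".join(parts))

# The three search scopes listed on the slide: one type, one index, whole cluster.
print(es_url("node1", 9200, "twitter", "tweet", "_search"))
print(es_url("node1", 9200, "twitter", action_or_id="_search"))
print(es_url("node1", 9200, action_or_id="_search"))
```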
Index a document
$ curl -X PUT http://node1:9200/twitter/tweet/1 -d '{
    "user": "loicbertron",
    "post_date": "2014-03-12T18:30:00",
    "message": "Introducing #ElasticSearch"
}'
{
    "ok": true,
    "_index": "twitter",
    "_type": "tweet",
    "_id": "1",
    "_version": 1
}
Update a document
$ curl -X PUT http://node1:9200/twitter/tweet/1 -d '{
    "user": "loicbertron",
    "post_date": "2014-03-12T18:40:00",
    "message": "Introducing #ElasticSearch to the #Community"
}'
{
    "ok": true,
    "_index": "twitter",
    "_type": "tweet",
    "_id": "1",
    "_version": 2
}
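The version bump seen when the same id is written twice can be mimicked with a tiny in-memory store. This is a sketch of the observable behaviour only, not of how Elasticsearch stores documents; `TinyIndex` is a hypothetical name.

```python
class TinyIndex:
    """In-memory stand-in for one index; not the Elasticsearch implementation."""

    def __init__(self):
        self._docs = {}  # (doc_type, doc_id) -> (version, source)

    def put(self, doc_type, doc_id, source):
        # Writing to an existing id replaces the source and increments the version.
        version = self._docs.get((doc_type, doc_id), (0, None))[0] + 1
        self._docs[(doc_type, doc_id)] = (version, source)
        return {"_type": doc_type, "_id": doc_id, "_version": version}

    def get(self, doc_type, doc_id):
        return self._docs[(doc_type, doc_id)][1]


index = TinyIndex()
first = index.put("tweet", "1", {"message": "Introducing #ElasticSearch"})
second = index.put("tweet", "1", {"message": "Introducing #ElasticSearch to the #Community"})
print(first["_version"], second["_version"])
```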
$ curl -XGET http://node1:9200/twitter/tweet/_search -d '{
    "query": {
        "term": { "message": "elasticsearch" }
    }
}'
$ curl -XGET 'http://node1:9200/twitter/tweet/_search?q=elasticsearch'
Search for documents
{
    "took": 24,
    "timed_out": false,
    "_shards": { "total": 2, "successful": 2, "failed": 0 },
    "hits": {
        "total": 1,
        "max_score": 0.227,
        "hits": [ {
            "_index": "twitter",
            "_type": "tweet",
            "_id": "1",
            "_score": 0.227,
            "_source": {
                "user": "loicbertron",
                "post_date": "2014-03-12T18:40:00",
                "message": "Introducing #ElasticSearch to the #Community"
            }
        } ]
    }
}
• took: execution time
• hits.total: # of documents matching
• _index, _type, _id: infos
• _score: score
• _source: document
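A client usually reads only a few fields of that response. The sketch below parses the response shown on the slides with the standard `json` module; `summarize` is a hypothetical helper name.

```python
import json

response_text = """
{ "took": 24, "timed_out": false,
  "_shards": {"total": 2, "successful": 2, "failed": 0},
  "hits": {"total": 1, "max_score": 0.227,
    "hits": [{"_index": "twitter", "_type": "tweet", "_id": "1", "_score": 0.227,
      "_source": {"user": "loicbertron", "post_date": "2014-03-12T18:40:00",
                  "message": "Introducing #ElasticSearch to the #Community"}}]}}
"""

def summarize(text):
    """Extract execution time, match count and (id, score, message) per hit."""
    body = json.loads(text)
    hits = body["hits"]
    return {
        "took_ms": body["took"],
        "total": hits["total"],
        "results": [(h["_id"], h["_score"], h["_source"]["message"]) for h in hits["hits"]],
    }

summary = summarize(response_text)
print(summary)
```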
Search operand
Terms       quebec
            quebec ontario
Phrases     "city of montréal"
Proximity   "montreal collusion"~5
Fuzzy       schwarzenegger~0.8
Wildcards   queb*
Boosting    quebec^5 montreal
Range       [2011/03/12 TO 2014/03/12]
            [java TO json]
Boolean     quebec AND NOT montreal
            +quebec -montreal
            (quebec OR ottawa) AND NOT toronto
Fields      title:montreal^10 OR body:montreal
$ curl -XGET 'http://node1:9200/twitter/tweet/_search?q=<Your Query>'
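Queries like the ones above contain spaces, quotes, `+` and `^`, so they need percent-encoding before going into the `q=` parameter. A minimal sketch using the standard library; `search_url` is a hypothetical helper and the base URL is the one from the slides.

```python
from urllib.parse import quote

def search_url(query, base="http://node1:9200/twitter/tweet/_search"):
    # safe="" makes quote() encode every reserved character, including '+' and ':'.
    return base + "?q=" + quote(query, safe="")

print(search_url("quebec AND NOT montreal"))
print(search_url("+quebec -montreal"))
print(search_url("title:montreal^10 OR body:montreal"))
```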
$ curl -XGET http://node1:9200/twitter/tweet/_search -d '{
    "query": {
        "filtered": {
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "author.first_name": {
                                    "query": "loic",
                                    "fuzziness": 0.1
                                }
                            }
                        },
                        {
                            "multi_match": {
                                "query": "elasticsearch",
                                "fields": ["title^10", "body"]
                            }
                        }
                    ]
                }
            },
            "filter": {
                "and": [
                    { "terms": { "tags": ["search", "scale", "store"] } },
                    { "range": { "created_at": { "from": "2013" } } },
                    { "term": { "featured": true } }
                ]
            }
        }
    }
}'
Query DSL
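Request bodies of this size are easier to build as data than to type by hand. The sketch below assembles the same filtered query as a Python dict and serializes it. `filtered_query` is a hypothetical helper. Both `must` clauses go into a list, because a JSON object cannot reliably carry the same key twice.

```python
import json

def filtered_query(first_name, text, tags, since):
    """Build the filtered query from the slide: two scored clauses plus three filters."""
    return {
        "query": {
            "filtered": {
                "query": {"bool": {"must": [
                    {"match": {"author.first_name": {"query": first_name, "fuzziness": 0.1}}},
                    {"multi_match": {"query": text, "fields": ["title^10", "body"]}},
                ]}},
                "filter": {"and": [
                    {"terms": {"tags": tags}},
                    {"range": {"created_at": {"from": since}}},
                    {"term": {"featured": True}},
                ]},
            }
        }
    }

body = json.dumps(filtered_query("loic", "elasticsearch", ["search", "scale", "store"], "2013"))
print(body)
```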
Facets (illustration: term facets and range facets)
$ curl -XPOST http://node1:9200/articles/_search -d '{
    "aggregations": {
        "tag_cloud": { "terms": { "field": "tags" } }
    }
}'
Tag Cloud
"aggregations": {
    "tag_cloud": [
        { "terms": "Quebec", "count": 5 },
        { "terms": "Montréal", "count": 3 },
        ...
    ]
}
$ curl -XPOST 'http://node1:9200/students/_search?search_type=count' -d '{
    "facets": {
        "scores-per-subject": {
            "terms_stats": {
                "key_field": "subject",
                "value_field": "score"
            }
        }
    }
}'
Stats
"facets": {
    "scores-per-subject": {
        "_type": "terms_stats",
        "missing": 0,
        "terms": [ {
            "term": "math",
            "count": 4,
            "total_count": 4,
            "min": 25.0,
            "max": 92.0,
            "total": 267.0,
            "mean": 66.75
        }, […]
    }
}
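What the terms_stats facet computes can be reproduced in a few lines: group documents by the key field, then derive count, min, max, total and mean of the value field. The scores below are hypothetical, chosen so that "math" gives the same numbers as the slide.

```python
from collections import defaultdict

def terms_stats(docs, key_field, value_field):
    """Group docs by key_field and compute basic statistics on value_field."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[key_field]].append(doc[value_field])
    return {
        term: {"count": len(v), "min": min(v), "max": max(v),
               "total": sum(v), "mean": sum(v) / len(v)}
        for term, v in groups.items()
    }

# Hypothetical documents, not taken from the talk.
students = [
    {"subject": "math", "score": 25.0},
    {"subject": "math", "score": 92.0},
    {"subject": "math", "score": 70.0},
    {"subject": "math", "score": 80.0},
    {"subject": "history", "score": 55.0},
]
stats = terms_stats(students, "subject", "score")
print(stats["math"])
```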
Advanced facets : Aggregations
{
    "rank": "21",
    "city": "Boston",
    "state": "MA",
    "population2012": "636479",
    "population2010": "617594",
    "land_area": "48.277",
    "density": "12793",
    "ansi": "619463",
    "location": {
        "lat": "42.332",
        "lon": "71.0202"
    }
}
$ curl -XGET "node1:9200/cities/_search?pretty" -d '{
    "aggs": {
        "mean_density_by_state": {
            "terms": {
                "field": "state"
            },
            "aggs": {
                "mean_density": {
                    "avg": {
                        "field": "density"
                    }
                }
            }
        }
    }
}'
"aggregations": {
    "mean_density_by_state": {
        "terms": [ {
            "term": "CA",
            "doc_count": 69,
            "mean_density": {
                "value": 5558.623188405797
            }
        }, {
            "term": "TX",
            "doc_count": 32,
            "mean_density": {
                "value": 2496.625
            }
        }, {
            "term": "FL",
            "doc_count": 20,
            "mean_density": {
                "value": 4006.6
            }
        }, {
            "term": "CO",
            "doc_count": 11,
            ...
Facets
Terms
Terms Stats
Statistical
Range
Histogram
Date Histogram
Filter
Query
Geo Distance
Architecture
• Node 1 alone, before the index exists: cluster state Green
$ curl -XPUT localhost:9200/twitter -d '{
    "index" : {
        "number_of_shards" : 2,
        "number_of_replicas" : 1
    }
}'
• Node 1 alone, holding Shard 0 and Shard 1: cluster state Yellow (the replicas have no other node to go to)
• Adding a second node: Node 2 receives a copy of Shard 0 and of Shard 1, cluster state Green
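The Yellow-then-Green sequence follows from one rule: two copies of the same shard never live on the same node. The sketch below is a deliberately simplified model of that rule, not Elasticsearch's allocation logic; `cluster_health` is a hypothetical function.

```python
def cluster_health(nodes, number_of_shards, number_of_replicas):
    """Return (state, unassigned_copies) under a one-copy-per-node-per-shard rule."""
    copies_per_shard = number_of_replicas + 1      # the primary plus its replicas
    placeable = min(nodes, copies_per_shard)       # copies of one shard that find a node
    unassigned = number_of_shards * (copies_per_shard - placeable)
    if placeable == 0:
        state = "red"        # not even the primaries can be allocated
    elif unassigned > 0:
        state = "yellow"     # primaries are up, some replicas are missing
    else:
        state = "green"      # every copy is allocated
    return state, unassigned

print(cluster_health(1, 2, 1))   # one node, 2 shards, 1 replica
print(cluster_health(2, 2, 1))   # after adding a second node
```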
• Adding Node 3, then Node 4: shard copies are relocated until each node holds a single copy (Shard 0 on two nodes, Shard 1 on the two others)
$ curl -X PUT http://node1:9200/twitter/tweet/1 -d '{
    "user": "loicbertron",
    "post_date": "2014-03-12T18:30:00",
    "message": "Introducing #ElasticSearch"
}'
• Doc 1 is routed to Shard 0 and written to both copies of that shard
{
    "ok": true,
    "_index": "twitter",
    "_type": "tweet",
    "_id": "1",
    "_version": 1
}
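A document lands on a shard chosen from a hash of its routing value (the document id by default) modulo the number of primary shards. The sketch below shows the idea with `zlib.crc32` as a stand-in hash. That choice is an assumption made for illustration, so the shard numbers it prints need not match a real cluster or the diagrams.

```python
import zlib

def pick_shard(doc_id, number_of_shards):
    # crc32 is a stand-in hash for this sketch, not the function Elasticsearch uses.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards

for doc_id in ("1", "2", "3"):
    print("doc", doc_id, "-> shard", pick_shard(doc_id, 2))
```

Because the shard depends on the number of primary shards, that number is fixed when the index is created, while the number of replicas can change later.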
$ curl -X PUT http://node1:9200/twitter/tweet/2 -d '{
    "user": "loicbertron",
    "post_date": "2014-03-12T18:45:00",
    "message": "The crowd is on fire #ElasticSearch"
}'
• Doc 2 is routed to Shard 1 and written to both copies of that shard
{
    "ok": true,
    "_index": "twitter",
    "_type": "tweet",
    "_id": "2",
    "_version": 1
}
$ curl -XGET http://node1:9200/twitter/tweet/_search -d '{
    "query": {
        "term": { "message": "elasticsearch" }
    }
}'
• The search is executed on one copy of Shard 0 and one copy of Shard 1, and the partial results are merged: Doc 1 and Doc 2 are both found
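The merge step of a distributed search can be sketched as follows: each shard returns its own hits sorted by descending score, and the coordinating node keeps the global top N. The score for Doc 2 is hypothetical, and `merge_hits` is an illustrative name.

```python
import heapq

def merge_hits(per_shard_hits, size=10):
    """Merge per-shard hit lists (each sorted by descending _score) into one top-N list."""
    merged = heapq.merge(*per_shard_hits, key=lambda hit: hit["_score"], reverse=True)
    return [hit for _, hit in zip(range(size), merged)]

shard0 = [{"_id": "1", "_score": 0.227}]
shard1 = [{"_id": "2", "_score": 0.191}]   # hypothetical score
top = merge_hits([shard0, shard1], size=10)
print([hit["_id"] for hit in top])
```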
• A node leaves the cluster (Node 2, Node 3 and Node 4 remain): one copy of Shard 0 is lost, the other copy still holds Doc 1
• A new copy of Shard 0 is then created on one of the remaining nodes, so Doc 1 is stored twice again
$ curl -X PUT http://node1:9200/twitter/tweet/3 -d '{
    "user": "loicbertron",
    "post_date": "2014-03-12T19:00:00",
    "message": "A third message about #ElasticSearch"
}'
• Doc 3 is routed to Shard 0, next to Doc 1, and written to both copies of that shard
{
    "ok": true,
    "_index": "twitter",
    "_type": "tweet",
    "_id": "3",
    "_version": 1
}
$ curl -XGET http://node1:9200/twitter/tweet/_search -d '{
    "query": {
        "term": { "message": "elasticsearch" }
    }
}'
• The search still covers both shards: Doc 1 and Doc 3 from Shard 0, Doc 2 from Shard 1
• Last frames: a copy of Shard 0 disappears again, and only Node 2 and Node 4 remain on the diagram
How do users see search?
User → Query → List of results → Result
How does a search engine work?
1. Fetch the document field
2. Pick the configured analyzer
3. Parse the text into tokens
4. Apply token filters
5. Store into the index
Analyzer
$ curl -XGET "http://localhost:9200/docs/_analyze?analyzer=standard&pretty=1" -d "Édith Piaf vedette du feu d'artifice"
{
"tokens" : [ {
"token" : "édith",
"start_offset" : 0,
"end_offset" : 5,
"type" : "<ALPHANUM>",
"position" : 1
}, {
"token" : "piaf",
"start_offset" : 6,
"end_offset" : 10,
"type" : "<ALPHANUM>",
"position" : 2
}, {
"token" : "vedette",
"start_offset" : 11,
"end_offset" : 18,
"type" : "<ALPHANUM>",
"position" : 3
}, {
"token" : "du",
"start_offset" : 19,
"end_offset" : 21,
"type" : "<ALPHANUM>",
"position" : 4
}, {
"token" : "feu",
"start_offset" : 22,
"end_offset" : 25,
"type" : "<ALPHANUM>",
"position" : 5
}, {
"token" : "d'artifice",
"start_offset" : 26,
"end_offset" : 36,
"type" : "<ALPHANUM>",
"position" : 6
} ]
}
An analyzer is composed of a single tokenizer and zero or more token filters.
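The "one tokenizer, then a chain of filters" structure can be sketched in plain Python. The tokenizer below is a rough regex approximation written for this example, not Lucene's standard tokenizer, and all function names are hypothetical.

```python
import re
import unicodedata

def standard_like_tokenizer(text):
    # Keep runs of word characters, allowing an inner apostrophe (d'artifice).
    return re.findall(r"\w+(?:'\w+)*", text)

def lowercase(tokens):
    return [t.lower() for t in tokens]

def asciifolding(tokens):
    # Decompose accented characters, then drop the combining accent marks.
    folded = []
    for t in tokens:
        decomposed = unicodedata.normalize("NFKD", t)
        folded.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return folded

def analyze(text, tokenizer, filters):
    """Run the single tokenizer, then each token filter in order."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens

tokens = analyze("Édith Piaf vedette du feu d'artifice",
                 standard_like_tokenizer, [lowercase, asciifolding])
print(tokens)
```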
Splits a string into tokens:
Whitespace tokenizer:
«Édith Piaf» -> «Édith», «Piaf»
Standard tokenizer (also drops punctuation):
«Édith Piaf!» -> «Édith», «Piaf»
Tokenizer
Modify, delete or add tokens
Asciifolding filter:
«Édith Piaf» -> «Edith Piaf»
Stemmer filter (English):
«stemming» -> «stem»
«fishing», «fished», «fisher» -> «fish»
«cats», «catlike» -> «cat»
Phonetic:
«quick» -> «Q200»
«quik» -> «Q200»
Edge nGram:
«Montreal» -> [«Mon», «Mont», «Montr»]
Filters
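The edge n-gram example can be reproduced with a short function. The bounds `min_gram=3` and `max_gram=5` are assumptions inferred from the three prefixes shown on the slide.

```python
def edge_ngrams(token, min_gram=3, max_gram=5):
    """Return the prefixes of token whose length is between min_gram and max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

print(edge_ngrams("Montreal"))
```

Indexing these prefixes is what makes search-as-you-type possible: a query for "Mont" matches a stored token exactly.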
Analyzer
{
"tokens" : [ {
"token" : "edith",
"start_offset" : 0,
"end_offset" : 5,
"type" : "<ALPHANUM>",
"position" : 1
}, {
"token" : "piaf",
"start_offset" : 6,
"end_offset" : 10,
"type" : "<ALPHANUM>",
"position" : 2
}, {
"token" : "vedet",
"start_offset" : 11,
"end_offset" : 18,
"type" : "<ALPHANUM>",
"position" : 3
}, {
"token" : "feu",
"start_offset" : 22,
"end_offset" : 25,
"type" : "<ALPHANUM>",
"position" : 5
},
{
"token" : "artific",
"start_offset" : 26,
"end_offset" : 36,
"type" : "<ALPHANUM>",
"position" : 6
} ]
}
1. Documents get indexed
2. I come back often to the search page to run my query
3. I hope my document ranks well enough to appear at the top of the results page
4. If not, I will never see my document
Regular search engine usage
1. Register my query
2. When a document gets indexed, the percolator looks for a match against the registered queries
Percolator
Real-time updates!
curl -XPUT 'http://node1:9200/twitter/.percolator/elasticsearch' -d '{
"query" : {
"match" : {
"message" : "elasticsearch"
}
}
}'
Percolator
$ curl -X GET http://node1:9200/twitter/tweet/_percolate -d '{
"doc" : {
    "user": "loicbertron",
    "post_date": "2014-03-12T19:00:00",
    "message": "A third message about #ElasticSearch"
}
}'
Percolator
{
    "took" : 19,
    "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    },
    "total" : 1,
    "matches" : [
        {
             "_index" : "twitter",
             "_id" : "elasticsearch"
        }
    ]
}
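The percolator reverses the usual flow: queries are stored, and each incoming document is tested against them. A minimal in-memory sketch, supporting only single-term matches on one field; `TinyPercolator` and the "hadoop" query are hypothetical.

```python
import re

class TinyPercolator:
    """Stores simple term queries and reports which ones a document matches."""

    def __init__(self):
        self._queries = {}   # query id -> (field, lowercased term)

    def register(self, query_id, field, term):
        self._queries[query_id] = (field, term.lower())

    def percolate(self, doc):
        matches = []
        for query_id, (field, term) in self._queries.items():
            # Crude analysis: lowercase and split on non-word characters.
            tokens = re.findall(r"\w+", str(doc.get(field, "")).lower())
            if term in tokens:
                matches.append(query_id)
        return matches

percolator = TinyPercolator()
percolator.register("elasticsearch", "message", "elasticsearch")
percolator.register("hadoop", "message", "hadoop")
matches = percolator.percolate({"message": "A third message about #ElasticSearch"})
print(matches)
```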
{
    "name": "Jules Verne",
    "biography": "One of the greatest authors",
    "books": [
        {
            "title": "Vingt mille lieues sous les mers",
            "genre": "Novel",
            "publisher": "Hetzel"
        },
        {
            "title": "Les Châteaux en Californie",
            "genre": "Drama",
            "publisher": "Marc Soriano"
        }
    ]
}
Inner objects
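Inner objects are convenient, but an array of them is indexed as flat, multi-valued fields, so the link between one book's title and its genre is lost at search time. The sketch below imitates that flattening; `flatten` is a hypothetical helper, not an Elasticsearch API.

```python
def flatten(doc, prefix=""):
    """Flatten nested objects into dotted field names holding lists of values."""
    fields = {}
    for key, value in doc.items():
        path = prefix + key
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                for sub_path, sub_values in flatten(item, path + ".").items():
                    fields.setdefault(sub_path, []).extend(sub_values)
            else:
                fields.setdefault(path, []).append(item)
    return fields

author = {
    "name": "Jules Verne",
    "books": [
        {"title": "Vingt mille lieues sous les mers", "genre": "Novel"},
        {"title": "Les Châteaux en Californie", "genre": "Drama"},
    ],
}
flat = flatten(author)
print(flat["books.title"])
print(flat["books.genre"])
```

A query for a book titled "Vingt mille lieues sous les mers" with genre "Drama" would match this author, which is one reason to model books as separate child documents.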
curl -XPUT node1:9200/authors/bare_author/1 -d '{
    "name": "Jules Verne",
    "biography": "One of the greatest authors"
}'
curl -XPOST 'node1:9200/authors/book/1?parent=1' -d '{
    "title": "Les Châteaux en Californie",
    "genre": "Drama",
    "publisher": "Marc Soriano"
}'
curl -XPOST 'node1:9200/authors/book/2?parent=1' -d '{
    "title": "Vingt mille lieues sous les mers",
    "genre": "Novel",
    "publisher": "Hetzel"
}'
Parent / Child
Other features
• Suggest API: "Did you mean?", autocomplete, …
• Results highlighting
• More like this
• Backup Data : Snapshot / Restore
• File System
• Amazon S3
• HDFS
• Google Compute Engine
• Microsoft Azure
• Hadoop connector
Clients
• Perl
• Python
• Ruby
• PHP
• JavaScript
• Java
• .NET
• Scala
• Clojure
• Erlang
• EventMachine
• Bash
• OCaml
• Smalltalk
• ColdFusion
Who’s using it ?
Questions
Thank you
Thanks to David Pilato for his presentation: https://speakerdeck.com/dadoonet/tours-jug-elasticsearch
Thanks to Kevin Kluge for his presentation: https://speakerdeck.com/elasticsearch/elasticsearch-in-20-minutes
Bonus :)
Suggest
curl -s -XPOST 'localhost:9200/_search?search_type=count' -d '{
  "suggest" : {
    "my-title-suggestions-1" : {
      "text" : "devloping",
      "term" : {
        "size" : 3,
        "field" : "title"  
      }
    }
  }
}'
Suggest
"suggest": {
    "my-title-suggestions-1": [
      {
        "text": "devloping",
        "offset": 0,
        "length": 9,
        "options": [
          {
            "text": "developing",
            "freq": 77,
            "score": 0.8888889
          },
          {
            "text": "deloping",
            "freq": 1,
            "score": 0.875
          },
          {
            "text": "deploying",
            "freq": 2,
            "score": 0.7777778
          }
        ]
      }
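"Did you mean?" suggestions rest on edit distance between the typed term and terms present in the index. The sketch below uses a plain Levenshtein distance and a hypothetical vocabulary. It ranks by distance only, whereas the response on the slide also reports a term frequency and a score.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]

def suggest(text, vocabulary, size=3, max_edits=2):
    """Return up to `size` vocabulary terms within max_edits of text, closest first."""
    scored = [(edit_distance(text, word), word) for word in vocabulary if word != text]
    close = sorted((d, w) for d, w in scored if d <= max_edits)
    return [w for _, w in close[:size]]

# Hypothetical indexed terms: the three options from the slide plus an unrelated one.
vocabulary = ["developing", "deloping", "deploying", "montreal"]
suggestions = suggest("devloping", vocabulary)
print(suggestions)
```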
More Like This
curl -XGET 'http://node1:9200/twitter/tweet/1/_mlt?mlt_fields=tag,content&min_doc_freq=1'
{
    "more_like_this" : {
        "fields" : ["name.first", "name.last"],
        "like_text" : "text like this one",
        "min_term_freq" : 1,
        "max_query_terms" : 12,
        "percent_terms_to_match" : 0.95
    }
}
Highlight
{
    "query" : {...},
    "highlight" : {
        "number_of_fragments" : 3,
        "fragment_size" : 150,
        "tag_schema" : "styled",
        "fields" : {
            "_all" : { "pre_tags" : ["<em>"], "post_tags" : ["</em>"] },
            "bio.title" : { "number_of_fragments" : 0 },
            "bio.author" : { "number_of_fragments" : 0 },
            "bio.content" : { "number_of_fragments" : 5, "order" : "score" }
        }
    }
}
Hadoop
• Java library for integrating Elasticsearch and Hadoop
• Pig, Hive, Cascading, MapReduce
• Search and Real Time Analytics with Elasticsearch, Hadoop as Data Lake
• Scales with Hadoop

More Related Content

What's hot

MongoDB .local Munich 2019: Best Practices for Working with IoT and Time-seri...
MongoDB .local Munich 2019: Best Practices for Working with IoT and Time-seri...MongoDB .local Munich 2019: Best Practices for Working with IoT and Time-seri...
MongoDB .local Munich 2019: Best Practices for Working with IoT and Time-seri...
MongoDB
 
Building a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and JavaBuilding a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and Java
antoinegirbal
 
Webinar: Building Your First App with MongoDB and Java
Webinar: Building Your First App with MongoDB and JavaWebinar: Building Your First App with MongoDB and Java
Webinar: Building Your First App with MongoDB and Java
MongoDB
 
ElasticSearch - Introduction to Aggregations
ElasticSearch - Introduction to AggregationsElasticSearch - Introduction to Aggregations
ElasticSearch - Introduction to Aggregations
enterprisesearchmeetup
 
Webinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsWebinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev Teams
MongoDB
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
MongoDB
 
Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Sy...
Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Sy...Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Sy...
Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Sy...
André Ricardo Barreto de Oliveira
 
Back to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkBack to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation Framework
MongoDB
 
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, GermanyHarnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
André Ricardo Barreto de Oliveira
 
Online | MongoDB Atlas on GCP Workshop
Online | MongoDB Atlas on GCP Workshop Online | MongoDB Atlas on GCP Workshop
Online | MongoDB Atlas on GCP Workshop
Natasha Wilson
 
Back to Basics Webinar 3: Schema Design Thinking in Documents
 Back to Basics Webinar 3: Schema Design Thinking in Documents Back to Basics Webinar 3: Schema Design Thinking in Documents
Back to Basics Webinar 3: Schema Design Thinking in Documents
MongoDB
 
elasticsearch - advanced features in practice
elasticsearch - advanced features in practiceelasticsearch - advanced features in practice
elasticsearch - advanced features in practice
Jano Suchal
 
MongoDB .local Chicago 2019: Using Client Side Encryption in MongoDB 4.2
MongoDB .local Chicago 2019: Using Client Side Encryption in MongoDB 4.2MongoDB .local Chicago 2019: Using Client Side Encryption in MongoDB 4.2
MongoDB .local Chicago 2019: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB
 
Webinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkWebinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation Framework
MongoDB
 
Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework
MongoDB
 
MongoDB In Production At Sailthru
MongoDB In Production At SailthruMongoDB In Production At Sailthru
MongoDB In Production At Sailthru
ibwhite
 
Curiosity, outil de recherche open source par PagesJaunes
Curiosity, outil de recherche open source par PagesJaunesCuriosity, outil de recherche open source par PagesJaunes
Curiosity, outil de recherche open source par PagesJaunes
PagesJaunes
 
Webinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based Sharding
Webinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based ShardingWebinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based Sharding
Webinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based Sharding
MongoDB
 

What's hot (20)

MongoDB .local Munich 2019: Best Practices for Working with IoT and Time-seri...
MongoDB .local Munich 2019: Best Practices for Working with IoT and Time-seri...MongoDB .local Munich 2019: Best Practices for Working with IoT and Time-seri...
MongoDB .local Munich 2019: Best Practices for Working with IoT and Time-seri...
 
Mongo db presentation
Mongo db presentationMongo db presentation
Mongo db presentation
 
Building a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and JavaBuilding a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and Java
 
Webinar: Building Your First App with MongoDB and Java
Webinar: Building Your First App with MongoDB and JavaWebinar: Building Your First App with MongoDB and Java
Webinar: Building Your First App with MongoDB and Java
 
ElasticSearch - Introduction to Aggregations
ElasticSearch - Introduction to AggregationsElasticSearch - Introduction to Aggregations
ElasticSearch - Introduction to Aggregations
 
Webinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsWebinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev Teams
 
Managing Social Content with MongoDB
Managing Social Content with MongoDBManaging Social Content with MongoDB
Managing Social Content with MongoDB
 
Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Sy...
Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Sy...Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Sy...
Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Sy...
 
Back to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation FrameworkBack to Basics Webinar 5: Introduction to the Aggregation Framework
Back to Basics Webinar 5: Introduction to the Aggregation Framework
 
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, GermanyHarnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany
 
Online | MongoDB Atlas on GCP Workshop
Online | MongoDB Atlas on GCP Workshop Online | MongoDB Atlas on GCP Workshop
Online | MongoDB Atlas on GCP Workshop
 
Back to Basics Webinar 3: Schema Design Thinking in Documents
 Back to Basics Webinar 3: Schema Design Thinking in Documents Back to Basics Webinar 3: Schema Design Thinking in Documents
Back to Basics Webinar 3: Schema Design Thinking in Documents
 
elasticsearch - advanced features in practice
elasticsearch - advanced features in practiceelasticsearch - advanced features in practice
elasticsearch - advanced features in practice
 
MongoDB .local Chicago 2019: Using Client Side Encryption in MongoDB 4.2
MongoDB .local Chicago 2019: Using Client Side Encryption in MongoDB 4.2MongoDB .local Chicago 2019: Using Client Side Encryption in MongoDB 4.2
MongoDB .local Chicago 2019: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
Webinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation FrameworkWebinar: Exploring the Aggregation Framework
Webinar: Exploring the Aggregation Framework
 
Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework
 
MongoDB In Production At Sailthru
MongoDB In Production At SailthruMongoDB In Production At Sailthru
MongoDB In Production At Sailthru
 
Curiosity, outil de recherche open source par PagesJaunes
Curiosity, outil de recherche open source par PagesJaunesCuriosity, outil de recherche open source par PagesJaunes
Curiosity, outil de recherche open source par PagesJaunes
 
Webinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based Sharding
Webinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based ShardingWebinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based Sharding
Webinar: MongoDB 2.4 Feature Demo and Q&A on Hash-based Sharding
 

Montreal Elasticsearch Meetup

  • 2. Who am I? Loïc Bertron, Director of Research & Development @Cedrom-SNI. Working on Big Data for Cedrom-SNI: social media, TV & radio aggregation. Introduced Elasticsearch at Cedrom-SNI. Cedrom-SNI: 10k+ different sources, 750k+ new docs/day. Our job: ingesting, enriching, extracting analytics and intelligence from docs. loic.bertron@cedrom-sni.com, linkedin.com/in/loicbertron, @loicbertron
  • 3. ElasticSearch: «ElasticSearch easily brings advanced search features to any application or website and scales to large amounts of data.»
  • 4. What is ElasticSearch? Simple: Plug & Play, schema free, RESTful API. Elastic: automatically discovers all other instances. Strong: replication & load balancing, scales massively, Lucene based. Fast: requests executed in parallel, real time. Full featured: search, analytics, facets, percolator, geo search, suggest, plugins …
  • 5. Document as JSON • Object representing your data • Grouped in an index • One index can have multiple types of documents { "message": "Introducing #ElasticSearch", "post_date": "2014-03-12T18:30:00", "author": { "first_name" : "Loïc", "email" : "loic.bertron@cedrom-sni.com" }, "employee_at_Cedrom" : true, "Tags" : ["Meetup","Montreal"] }
  • 6. API
      • REST API: http://host:port/[index]/[type]/[_action/id]
      • HTTP methods: GET, POST, PUT, DELETE
      • Documents
          • http://node1:9200/twitter/tweet/1 (POST)
          • http://node1:9200/twitter/tweet/1 (GET)
          • http://node1:9200/twitter/tweet/1 (DELETE)
      • Search
          • http://node1:9200/twitter/tweet/_search (GET)
          • http://node1:9200/twitter/_search (GET)
          • http://node1:9200/_search (GET)
      • Metadata
          • http://node1:9200/twitter/_status (GET)
          • http://node1:9200/_shutdown (POST)
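The URL scheme on this slide can be sketched as a small helper. This is only an illustration of how the optional path segments compose; the function name `es_url` and its parameters are hypothetical and not part of any Elasticsearch client.

```python
from typing import Optional


def es_url(host: str, port: int = 9200, index: Optional[str] = None,
           doc_type: Optional[str] = None, tail: Optional[str] = None) -> str:
    """Build http://host:port/[index]/[type]/[_action or id].

    Every path segment is optional: leaving out the type searches a whole
    index, leaving out the index as well addresses the whole cluster.
    """
    segments = [s for s in (index, doc_type, tail) if s]
    return f"http://{host}:{port}/" + "/".join(segments)


if __name__ == "__main__":
    print(es_url("node1", index="twitter", doc_type="tweet", tail="1"))
    print(es_url("node1", index="twitter", tail="_search"))
    print(es_url("node1", tail="_search"))
```

The three calls reproduce the document URL and two of the search URLs listed on the slide.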
  • 7. Index a document $ curl -X PUT http://node1:9200/twitter/tweet/1 -d '{     "user": "loicbertron",     "post_date": "2014-03-12T18:30:00",     "message": "Introducing #ElasticSearch" }'
  • 9. Update a document $ curl -X PUT http://node1:9200/twitter/tweet/1 -d '{     "user": "loicbertron",     "post_date": "2014-03-12T18:40:00",     "message": "Introducing #ElasticSearch to the #Community" }'
  • 11. $ curl -XGET http://node1:9200/twitter/tweet/_search -d '{     "query": {     "term": { "message": "ElasticSearch" } } }' Search for documents $ curl -XGET http://node1:9200/twitter/tweet/_search?q=elasticsearch
  • 12. Search for documents { "took" : 24, "timed_out" : false, "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "hits" : { "total" : 1, "max_score" : 0.227, "hits" : [ { "_index" : "twitter", "_type" : "tweet", "_id" : "1", "_score" : 0.227, "_source" : { "user": "loicbertron",     "post_date": "2014-03-12T18:40:00",     "message": "Introducing #ElasticSearch to the #Community" } } ] } }
  • 13-17. Search for documents (the same response, annotated one part at a time): "took" is the execution time in milliseconds; "hits.total" is the number of matching documents; "_index", "_type" and "_id" identify each hit; "_score" is its relevance score; "_source" is the original document.
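As a rough sketch of how a client might read that response, the snippet below parses the JSON from the slides and pulls out the annotated fields. It assumes the response shape shown in this deck (for example `hits.total` as a plain integer); the helper name `summarize` is hypothetical.

```python
import json

# Response body copied from the slides.
RESPONSE = """
{ "took": 24, "timed_out": false,
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "hits": { "total": 1, "max_score": 0.227,
    "hits": [ { "_index": "twitter", "_type": "tweet", "_id": "1",
                "_score": 0.227,
                "_source": { "user": "loicbertron",
                             "post_date": "2014-03-12T18:40:00",
                             "message": "Introducing #ElasticSearch to the #Community" } } ] } }
"""


def summarize(raw: str) -> dict:
    """Extract execution time, match count, and (id, score, message) per hit."""
    body = json.loads(raw)
    hits = body["hits"]
    return {
        "took_ms": body["took"],
        "total": hits["total"],
        "results": [(h["_id"], h["_score"], h["_source"]["message"])
                    for h in hits["hits"]],
    }


if __name__ == "__main__":
    print(summarize(RESPONSE))
```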
  • 18. Search operators
      • Terms: quebec ; quebec ontario
      • Phrases: "city of montréal"
      • Proximity: "montreal collusion"~5
      • Fuzzy: schwarzenegger~0.8
      • Wildcards: queb*
      • Boosting: Quebec^5 montreal
      • Range: [2011/03/12 TO 2014/03/12] ; [java TO json]
      • Boolean: quebec AND NOT montreal ; +quebec -montreal ; (quebec OR ottawa) AND NOT toronto
      • Fields: title:montreal^10 OR body:montreal
    $ curl -XGET http://node1:9200/twitter/tweet/_search?q=<Your Query>
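Most of these operators contain characters (spaces, quotes, `+`, `^`, brackets) that are not safe in a raw URL, so the value of `q=` generally needs percent-encoding before it is sent. A minimal sketch using the standard library; `search_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def search_url(base: str, query: str) -> str:
    """Append a percent-encoded query-string search to an index/type URL."""
    # safe="" encodes everything except unreserved characters, so '+' becomes
    # %2B instead of being read as a space by the server.
    return f"{base}/_search?q={quote(query, safe='')}"


if __name__ == "__main__":
    base = "http://node1:9200/twitter/tweet"
    for q in ('+quebec -montreal', '"city of montréal"', 'title:montreal^10'):
        print(search_url(base, q))
```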
  • 19. Query DSL
    $ curl -XGET http://node1:9200/twitter/tweet/_search -d '{
      "query": {
        "filtered": {
          "query": {
            "bool": {
              "must": [
                { "match": { "author.first_name": { "query": "loic", "fuzziness": 0.1 } } },
                { "multi_match": { "query": "elasticsearch", "fields": ["title^10", "body"] } }
              ]
            }
          },
          "filter": {
            "and": [
              { "terms": { "tags": ["search", "scale", "store"] } },
              { "range": { "created_at": { "from": "2013" } } },
              { "term": { "featured": true } }
            ]
          }
        }
      }
    }'
  • 20-22. Query DSL (the same request, highlighting one clause at a time): the "match" clause on author.first_name with fuzziness 0.1; the "multi_match" clause on title (boosted ^10) and body; and the "filter" block, which combines the terms, range and term filters with "and".
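One way to avoid quoting and duplicate-key mistakes in a request body like this is to build it as a data structure and serialize it. The sketch below mirrors the request from the slides (the `filtered` query and `and` filter as used in this deck); `build_query` and its parameters are hypothetical names.

```python
import json


def build_query(first_name, text, tags, created_from, featured=True):
    """Return the request body as a dict; both 'must' clauses go in one list."""
    return {
        "query": {
            "filtered": {
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"author.first_name": {
                                "query": first_name, "fuzziness": 0.1}}},
                            {"multi_match": {
                                "query": text, "fields": ["title^10", "body"]}},
                        ]
                    }
                },
                "filter": {
                    "and": [
                        {"terms": {"tags": tags}},
                        {"range": {"created_at": {"from": created_from}}},
                        {"term": {"featured": featured}},
                    ]
                },
            }
        }
    }


if __name__ == "__main__":
    body = build_query("loic", "elasticsearch", ["search", "scale", "store"], "2013")
    print(json.dumps(body, indent=2))
```

Putting the clauses in a list matters because a JSON object with the same key twice is ambiguous; Python's own `json` module, for instance, silently keeps only the last one.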
  • 25. $ curl -XPOST http://node1:9200/articles/_search -d '{     "aggregations" : { "tag_cloud" : { "terms" : {"field" : "tags"} } } }' Tag Cloud "aggregations" : { "tag_cloud" :[ {"terms": "Quebec", "count" : 5}, {"terms": "Montréal", "count" : 3}, ... ] }
  • 26. $ curl -XPOST http://node1:9200/students/_search?search_type=count -d '{     "facets": { "scores-per-subject" : { "terms_stats" : { "key_field" : "subject", "value_field" : "score" } } } }' Stats "facets" : { "scores-per-subject" : { "_type" : "terms_stats", "missing" : 0, "terms" : [ { "term" : "math", "count" : 4, "total_count" : 4, "min" : 25.0, "max" : 92.0, "total" : 267.0, "mean" : 66.75 }, […] } }
  • 27. Advanced facets : Aggregations { "rank": "21", "city": "Boston", "state": "MA", "population2012": "636479", "population2010": "617594", "land_area": "48.277", "density": "12793", "ansi": "619463", "location": { "lat": "42.332", "lon": "71.0202" } }
  • 28. curl -XGET "node1:9200/cities/_search?pretty" -d '{ "aggs" : { "mean_density_by_state" : { "terms" : { "field" : "state" }, "aggs": { "mean_density": { "avg" : { "field" : "density" } } } } } }' Advanced facets : Aggregations
  • 29. "aggregations" : { "mean_density_by_state" : { "terms" : [ { "term" : "CA", "doc_count" : 69, "mean_density" : { "value" : 5558.623188405797 } }, { "term" : "TX", "doc_count" : 32, "mean_density" : { "value" : 2496.625 } }, { "term" : "FL", "doc_count" : 20, "mean_density" : { "value" : 4006.6 } }, { "term" : "CO", "doc_count" : 11, Advanced facets : Aggregations
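To make the nested aggregation concrete, here is a local, in-memory sketch of what a `terms` bucket on `state` with an `avg` sub-aggregation on `density` computes. It does not call Elasticsearch; the function name and the sample cities other than Boston are hypothetical.

```python
from collections import defaultdict


def mean_density_by_state(cities):
    """Group documents by state, then average the density inside each bucket."""
    buckets = defaultdict(list)
    for city in cities:
        # The slide stores numbers as strings, so convert before averaging.
        buckets[city["state"]].append(float(city["density"]))
    rows = [
        {"term": state, "doc_count": len(values),
         "mean_density": {"value": sum(values) / len(values)}}
        for state, values in buckets.items()
    ]
    # Largest buckets first, like the response shown on the slide.
    rows.sort(key=lambda r: (-r["doc_count"], r["term"]))
    return rows


if __name__ == "__main__":
    sample = [
        {"city": "Boston", "state": "MA", "density": "12793"},  # from the slide
        {"city": "Town A", "state": "CA", "density": "1000"},   # made-up value
        {"city": "Town B", "state": "CA", "density": "3000"},   # made-up value
    ]
    for row in mean_density_by_state(sample):
        print(row)
```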
  • 32. Architecture: a cluster with a single node (Node 1 holding Shard 0 and Shard 1); cluster state: Yellow. $ curl -XPUT localhost:9200/twitter -d '{ "index" : { "number_of_shards" : 2, "number_of_replicas" : 1 } }'
  • 33. Architecture: adding a second node (Node 1: Shard 0, Shard 1; Node 2: Shard 0, Shard 1); cluster state: Green.
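The change from Yellow to Green follows from one rule: a replica is never allocated on the same node as its primary, so with `number_of_replicas: 1` a single node cannot host the replicas. A deliberately simplified model of that rule (it ignores allocation settings, node roles and disk limits; `cluster_health` is a hypothetical name):

```python
def cluster_health(nodes: int, replicas: int) -> str:
    """Toy health model: every shard copy needs its own node.

    red    - no node, so not even primaries are assigned
    yellow - primaries assigned, but some replicas have nowhere to go
    green  - enough nodes to hold each primary and all of its replicas
    """
    if nodes < 1:
        return "red"
    return "green" if nodes >= replicas + 1 else "yellow"


if __name__ == "__main__":
    print(cluster_health(nodes=1, replicas=1))  # single node
    print(cluster_health(nodes=2, replicas=1))  # after adding a second node
```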
  • 34. Node 1 Cluster Shard 0 Shard 1 Node 2 Shard 1 Shard 0 Architecture
  • 35. Node 1 Cluster Shard 0 Node 3 Shard 1 Node 2 Shard 1 Shard 0 Architecture
  • 37. Node 1 Cluster Shard 0 Node 3 Node 4 Shard 1 Node 2 Shard 1 Shard 0 Architecture
  • 40. Node 1 Cluster Shard 0 Node 3 Node 4 Shard 1 Node 2 Shard 1 Shard 0 $ curl -X PUT http://node1:9200/twitter/tweet/1 -d '{     "user": "loicbertron",     "post_date": "2014-03-12T18:30:00",     "message": "Introducing #ElasticSearch" }' Architecture
  • 41. Node 1 Cluster Shard 0 Node 3 Node 4 Shard 1 Node 2 Shard 1 Shard 0 Doc 1 $ curl -X PUT http://node1:9200/twitter/tweet/1 -d '{     "user": "loicbertron",     "post_date": "2014-03-12T18:30:00",     "message": "Introducing #ElasticSearch" }' Architecture
  • 43. Cluster Shard 0 Shard 1Shard 1 Shard 0 Doc 1 Doc 1 $ curl -X PUT http://node1:9200/twitter/tweet/1 -d '{     "user": "loicbertron",     "post_date": "2014-03-12T18:30:00",     "message": "Introducing #ElasticSearch" }' Architecture Node 1 Node 2 Node 3 Node 4
  • 44. Cluster Shard 0 Shard 1 Shard 1 Shard 0 Doc 1 Doc 1 { "ok":true, "_index":"twitter", "_type":"tweet", "_id":"1", "_version":1 } Architecture Node 1 Node 2 Node 3 Node 4
  • 45. Cluster Shard 0 Shard 1Shard 1 Shard 0 Doc 1 Doc 1 Architecture Node 1 Node 2 Node 3 Node 4 $ curl -X PUT http://node1:9200/twitter/tweet/2 -d '{     "user": "loicbertron",     "post_date": "2014-03-12T18:45:00",     "message": "The crowd is on fire #ElasticSearch" }'
  • 46. Cluster Shard 0 Shard 1Shard 1 Shard 0 Doc 1 Doc 1 Doc 2 Architecture Node 1 Node 2 Node 3 Node 4 $ curl -X PUT http://node1:9200/twitter/tweet/2 -d '{     "user": "loicbertron",     "post_date": "2014-03-12T18:45:00",     "message": "The crowd is on fire #ElasticSearch" }'
  • 48. Cluster Shard 0 Shard 1Shard 1 Shard 0 Doc 1 Doc 1 Doc 2 Doc 2 Architecture Node 1 Node 2 Node 3 Node 4 $ curl -X PUT http://node1:9200/twitter/tweet/2 -d '{     "user": "loicbertron",     "post_date": "2014-03-12T18:45:00",     "message": "The crowd is on fire #ElasticSearch" }'
  • 49. Cluster Shard 0 Shard 1 Shard 1 Shard 0 Doc 1 Doc 1 Doc 2 Doc 2 { "ok":true, "_index":"twitter", "_type":"tweet", "_id":"2", "_version":1 } Architecture Node 1 Node 2 Node 3 Node 4
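In these slides each document lands in exactly one shard (plus that shard's replica). Elasticsearch picks the shard by hashing a routing value, by default the document id, modulo the number of primary shards. The sketch below illustrates the idea only: `zlib.crc32` is a stand-in hash chosen for this example and is not the function Elasticsearch uses, so the shard numbers it prints will not match a real cluster.

```python
import zlib


def pick_shard(doc_id: str, number_of_shards: int) -> int:
    """Map a document id to a primary shard: hash(id) % number_of_shards."""
    # crc32 is an assumption for illustration, NOT Elasticsearch's hash.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


if __name__ == "__main__":
    for doc_id in ("1", "2", "3"):
        print(f"doc {doc_id} -> shard {pick_shard(doc_id, 2)}")
```

Because the shard is a pure function of the id and the shard count, any node can work out where a document lives, which is also why the number of primary shards is fixed when the index is created.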
  • 50. Cluster Shard 0 Shard 1Shard 1 Shard 0 Doc 1 Doc 1 Doc 2 Doc 2 $ curl -XGET http://node1:9200/twitter/tweet/_search -d '{     "query": {     "term": { "message": "ElasticSearch" } } }' Architecture Node 1 Node 2 Node 3 Node 4
  • 55. Cluster Shard 0 Shard 1Shard 1 Shard 0 Doc 1 Doc 1 Doc 2 Doc 2 Architecture Node 1 Node 2 Node 3 Node 4
  • 56. Cluster Shard 1Shard 1 Shard 0 Doc 1 Doc 2 Doc 2 Architecture Node 2 Node 3 Node 4
  • 57. Cluster Shard 1 Node 2 Shard 1 Doc 2 Doc 2 Shard 0 Doc 1 Architecture Node 3 Node 4 Shard 0 Doc 1
  • 59. Architecture [diagram: Node 2 to Node 4 hold two copies each of Shard 0 (Doc 1) and Shard 1 (Doc 2)] $ curl -X PUT http://node1:9200/twitter/tweet/3 -d '{ "user": "loicbertron", "post_date": "2014-03-12T19:00:00", "message": "A third message about #ElasticSearch" }'
  • 60. Architecture [diagram: Doc 3 is written to one copy of Shard 0] $ curl -X PUT http://node1:9200/twitter/tweet/3 -d '{ "user": "loicbertron", "post_date": "2014-03-12T19:00:00", "message": "A third message about #ElasticSearch" }'
  • 61. Architecture [diagram: Doc 3 is now present in both copies of Shard 0] $ curl -X PUT http://node1:9200/twitter/tweet/3 -d '{ "user": "loicbertron", "post_date": "2014-03-12T19:00:00", "message": "A third message about #ElasticSearch" }'
  • 62. Architecture [diagram: Node 2 to Node 4; both copies of Shard 0 hold Doc 1 and Doc 3, both copies of Shard 1 hold Doc 2] { "ok":true, "_index":"twitter", "_type":"tweet", "_id":"3", "_version":"1" }
  • 63. Architecture [diagram: Node 2 to Node 4; both copies of Shard 0 hold Doc 1 and Doc 3, both copies of Shard 1 hold Doc 2] $ curl -XGET http://node1:9200/twitter/tweet/_search -d '{ "query": { "term": { "message": "ElasticSearch" } } }'
  • 65. Architecture [diagram: two copies of Shard 1 (Doc 2) on Node 2 and Node 4]
  • 66. How do users see search? [diagram: User, Query, Result: list of results]
  • 67. How does a search engine work? 1. Fetch the document field 2. Pick the configured analyzer 3. Parse the text into tokens 4. Apply token filters 5. Store into the index
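The five indexing steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Elasticsearch or Lucene is implemented: the regex tokenizer, the `STOPWORDS` list, and the `ANALYZER` and `index_document` names are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, chosen only for this illustration.
STOPWORDS = {"du", "de", "la", "le"}


def tokenize(text):
    """Step 3: parse the text into tokens (runs of word characters or apostrophes)."""
    return re.findall(r"[\w']+", text)


def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]


def remove_stopwords(tokens):
    """Token filter: drop tokens found in the stop-word list."""
    return [t for t in tokens if t not in STOPWORDS]


# Step 2: the "configured analyzer" is one tokenizer plus a chain of filters.
ANALYZER = {"tokenizer": tokenize, "filters": [lowercase, remove_stopwords]}


def analyze(text, analyzer=ANALYZER):
    tokens = analyzer["tokenizer"](text)       # step 3
    for token_filter in analyzer["filters"]:   # step 4
        tokens = token_filter(tokens)
    return tokens


def index_document(inverted_index, doc_id, doc, field):
    value = doc[field]                         # step 1: fetch the document field
    for token in analyze(value):               # steps 2 to 4
        inverted_index[token].add(doc_id)      # step 5: store into the index


inverted_index = defaultdict(set)
index_document(inverted_index, 1, {"message": "Introducing #ElasticSearch"}, "message")
index_document(inverted_index, 3, {"message": "A third message about #ElasticSearch"}, "message")

print(sorted(inverted_index["elasticsearch"]))  # [1, 3]
print(analyze("Édith Piaf vedette du feu"))     # ['édith', 'piaf', 'vedette', 'feu']
```

Because the same analysis chain lowercases tokens at index time, a lookup for the lowercase token `elasticsearch` finds both documents in this toy index.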
  • 69. Analyzer { "tokens" : [ { "token" : "édith", "start_offset" : 0, "end_offset" : 5, "type" : "<ALPHANUM>", "position" : 1 }, { "token" : "piaf", "start_offset" : 6, "end_offset" : 10, "type" : "<ALPHANUM>", "position" : 2 }, { "token" : "vedette", "start_offset" : 11, "end_offset" : 18, "type" : "<ALPHANUM>", "position" : 3 }, { "token" : "du", "start_offset" : 19, "end_offset" : 21, "type" : "<ALPHANUM>", "position" : 4 }, { "token" : "feu", "start_offset" : 22, "end_offset" : 25, "type" : "<ALPHANUM>", "position" : 5 }, { "token" : "d'artifice", "start_offset" : 26, "end_offset" : 36, "type" : "<ALPHANUM>", "position" : 6 } ] }
  • 70. Analyzer: composed of a single tokenizer and zero or more token filters
  • 71. Tokenizer: cuts a string into tokens. Whitespace tokenizer: «Édith Piaf» -> «Édith», «Piaf». Standard tokenizer: «Édith Piaf!» -> «Édith», «Piaf» (punctuation is dropped; lowercasing to «édith», «piaf» is done by a separate lowercase filter)
  • 72. Filters: modify, delete or add tokens. Asciifolding filter: «Édith Piaf» -> «Edith Piaf». Stemmer filter (English): «stemming» -> «stem»; «fishing», «fished», «fisher» -> «fish»; «cats», «catlike» -> «cat». Phonetic filter: «quick» -> «Q200», «quik» -> «Q200». Edge nGram filter: «Montreal» -> [«Mon», «Mont», «Montr»]
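Two of the filters above are simple enough to approximate in plain Python. The sketch below is an assumption-laden illustration, not the Lucene implementation: `ascii_fold` relies on Unicode decomposition, which covers accented Latin letters but not every character the real asciifolding filter handles, and the `min_gram=3`, `max_gram=5` defaults are chosen only because they reproduce the «Montreal» example.

```python
import unicodedata


def whitespace_tokenize(text):
    """Whitespace tokenizer: split on runs of whitespace, keep case and punctuation."""
    return text.split()


def ascii_fold(token):
    """Approximate asciifolding: decompose accented letters and drop the accents."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(token, min_gram=3, max_gram=5):
    """Edge n-grams: prefixes of the token between min_gram and max_gram characters."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def simple_analyzer(text):
    """One tokenizer followed by two filters (ASCII folding, then lowercase)."""
    tokens = whitespace_tokenize(text)
    tokens = [ascii_fold(t) for t in tokens]
    tokens = [t.lower() for t in tokens]
    return tokens


print(whitespace_tokenize("Édith Piaf"))  # ['Édith', 'Piaf']
print(simple_analyzer("Édith Piaf"))      # ['edith', 'piaf']
print(edge_ngrams("Montreal"))            # ['Mon', 'Mont', 'Montr']
```

Edge n-grams like these are what makes prefix-style autocomplete possible: typing "Mont" matches a token stored at index time.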
  • 73. Analyzer { "tokens" : [ { "token" : "edith", "start_offset" : 0, "end_offset" : 5, "type" : "<ALPHANUM>", "position" : 1 }, { "token" : "piaf", "start_offset" : 6, "end_offset" : 10, "type" : "<ALPHANUM>", "position" : 2 }, { "token" : "vedet", "start_offset" : 11, "end_offset" : 18, "type" : "<ALPHANUM>", "position" : 3 }, { "token" : "feu", "start_offset" : 22, "end_offset" : 25, "type" : "<ALPHANUM>", "position" : 5 }, { "token" : "artific", "start_offset" : 26, "end_offset" : 36, "type" : "<ALPHANUM>", "position" : 6 } ] }
  • 74. Regular search engine usage: 1. Documents get indexed 2. I often come back to the search page to run my query 3. I hope my document is ranked well enough to appear at the top of the results page 4. If not, I will never see my document
  • 75. Percolator: 1. Register my query 2. When a document gets indexed, the percolator looks for a match against the registered queries
  • 76. Percolator: real-time updates
  • 77. Percolator curl -XPUT 'http://node1:9200/twitter/.percolator/elasticsearch' -d '{ "query" : { "match" : { "message" : "elasticsearch" } } }'
  • 78. Percolator $ curl -X GET http://node1:9200/twitter/tweet/_percolate -d '{ "doc" : {     "user": "loicbertron",     "post_date": "2014-03-12T19:00:00",     "message": "A third message about #ElasticSearch" } }'
  • 79. Percolator {     "took" : 19,     "_shards" : {         "total" : 5,         "successful" : 5,         "failed" : 0     },     "total" : 1,     "matches" : [         {              "_index" : "twitter",              "_id" : "elasticsearch"         }     ] }
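The percolator inverts the usual flow: queries are stored first, and each incoming document is matched against them. The following is a minimal in-memory sketch of that idea only; it does not talk to Elasticsearch, and `register_query`, `percolate`, and the single-term matching rule are hypothetical names and simplifications introduced for this example.

```python
import re

# Registered queries: query id -> (field, lowercase term to look for).
registered_queries = {}


def register_query(query_id, field, term):
    """Step 1: register a query under an id, before any document arrives."""
    registered_queries[query_id] = (field, term.lower())


def percolate(doc):
    """Step 2: return the ids of all registered queries that match this document."""
    matches = []
    for query_id, (field, term) in registered_queries.items():
        # Very rough analysis: lowercase the field and split it into word tokens.
        tokens = re.findall(r"\w+", str(doc.get(field, "")).lower())
        if term in tokens:
            matches.append(query_id)
    return matches


register_query("elasticsearch", "message", "elasticsearch")
register_query("lucene", "message", "lucene")

doc = {
    "user": "loicbertron",
    "post_date": "2014-03-12T19:00:00",
    "message": "A third message about #ElasticSearch",
}
print(percolate(doc))  # ['elasticsearch']
```

As in the response shown on the slide, only the id of the matching registered query comes back, which is what makes notification-style, real-time updates possible.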
  • 80. Inner objects { "name": "Jules Verne", "biography": "One of the greatest authors", "books": [ { "title": "Vingt mille lieues sous les mers", "genre": "Novel", "publisher": "Hetzel" }, { "title": "Les Châteaux en Californie", "genre": "Drama", "publisher": "Marc Soriano" } ] }
  • 81. Parents / Children curl -XPUT 'node1:9200/authors/bare_author/1' -d '{ "name": "Jules Verne", "biography": "One of the greatest authors" }' curl -XPOST 'node1:9200/authors/book/1?parent=1' -d '{ "title": "Les Châteaux en Californie", "genre": "Drama", "publisher": "Marc Soriano" }' curl -XPOST 'node1:9200/authors/book/2?parent=1' -d '{ "title": "Vingt mille lieues sous les mers", "genre": "Novel", "publisher": "Hetzel" }'
  • 82. Other features • Suggest API: "Did you mean?", autocomplete, … • Result highlighting • More like this • Data backup: snapshot / restore (file system, Amazon S3, HDFS, Google Compute Engine, Microsoft Azure) • Hadoop connector
  • 83. Clients • Perl • Python • Ruby • PHP • JavaScript • Java • .NET • Scala • Clojure • Erlang • EventMachine • Bash • OCaml • Smalltalk • ColdFusion
  • 86. Thank you. Thanks to David Pilato for his presentation: https://speakerdeck.com/dadoonet/tours-jug-elasticsearch Thanks to Kevin Kluge for his presentation: https://speakerdeck.com/elasticsearch/elasticsearch-in-20-minutes
  • 88. Suggest curl -s -XPOST 'localhost:9200/_search?search_type=count' -d '{   "suggest" : {     "my-title-suggestions-1" : {       "text" : "devloping",       "term" : {         "size" : 3,         "field" : "title"         }     }   } }'
  • 89. Suggest "suggest": { "my-title-suggestions-1": [ { "text": "devloping", "offset": 0, "length": 9, "options": [ { "text": "developing", "freq": 77, "score": 0.8888889 }, { "text": "deloping", "freq": 1, "score": 0.875 }, { "text": "deploying", "freq": 2, "score": 0.7777778 } ] } ] }
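The "did you mean" behaviour of the suggest request and response above can be imitated locally with Python's standard library. This sketch swaps in `difflib.get_close_matches`, which ranks candidates by a sequence-similarity ratio, so it is a stand-in for the idea and not the scoring Elasticsearch uses; the `indexed_terms` list and the `suggest` helper are hypothetical.

```python
import difflib

# Hypothetical terms assumed to exist in the "title" field of the index.
indexed_terms = ["developing", "deploying", "deloping", "searching", "indexing"]


def suggest(text, terms, size=3):
    """Return up to `size` indexed terms that look most similar to `text`."""
    # cutoff=0.6 is difflib's default similarity threshold, spelled out here.
    return difflib.get_close_matches(text, terms, n=size, cutoff=0.6)


suggestions = suggest("devloping", indexed_terms)
print(suggestions)
```

A real term suggester can also take term frequency into account, which is why the response on the slide reports a `freq` value next to each `score`.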
  • 90. More Like This curl -XGET 'http://node1:9200/twitter/tweet/1/_mlt?mlt_fields=tag,content&min_doc_freq=1' {     "more_like_this" : {         "fields" : ["name.first", "name.last"],         "like_text" : "text like this one",         "min_term_freq" : 1,         "max_query_terms" : 12,         "percent_terms_to_match" : 0.95     } }
  • 92. Highlight { "query" : {...}, "highlight" : { "number_of_fragments" : 3, "fragment_size" : 150, "tag_schema" : "styled", "fields" : { "_all" : { "pre_tags" : ["<em>"], "post_tags" : ["</em>"] }, "bio.title" : { "number_of_fragments" : 0 }, "bio.author" : { "number_of_fragments" : 0 }, "bio.content" : { "number_of_fragments" : 5, "order" : "score" } } } }
  • 94. Hadoop • Java library for integrating Elasticsearch and Hadoop • Pig, Hive, Cascading, MapReduce • Search and Real Time Analytics with Elasticsearch, Hadoop as Data Lake • Scales with Hadoop