Elasticsearch
Mexico City JVM Group
April 2016
Thank you for being here!
The best-attended meetup
in our history!
What is the topic tonight?
Elastic search?
NO
Elastic Search?
NO
ElasticSearch?
NO
@superserch?
NO
Elasticsearch
YES
Search
Do I have to elaborate
why search is important?
A little history…
Lucene History
• Douglass Read "Doug" Cutting wrote Lucene in 1999
• Doug also co-created Hadoop
• Other projects grew out of Lucene
• Mahout
• Tika
• Nutch
Lucene?
• Lucene is an open-source Java full-text search
library which makes it easy to add search
functionality to an application or website.
Index
Query
Inverted index
• Lucene builds a data structure that maps each word
to the list of documents in which it appears.
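A minimal sketch of the idea, not Lucene's actual implementation: a plain Python dict that maps each lowercased word to the ids of the documents containing it. The function name `build_inverted_index` and the two sample documents are made up for illustration, and the whitespace tokenizer is a deliberate simplification.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenizer (assumption): split on whitespace only.
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical sample documents.
docs = {
    1: "Elasticsearch is built on Lucene",
    2: "Lucene is a search library",
}
index = build_inverted_index(docs)
print(index["lucene"])  # [1, 2]
print(index["search"])  # [2]
```

Answering a query then becomes a dictionary lookup instead of a scan over every document.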
Lucene-Based projects
• Solr
• Compass
• Elasticsearch
• Hibernate search
Elasticsearch
You Know, for Search.
Features
• Real-Time Data. I (Domingo) say near Real-Time Data.
• Massively Distributed
• High Availability
• Full-Text Search
• Document-Oriented
• Schema-Free
• Developer-Friendly, RESTful API
• Extensible via plugins
Concepts
• Cluster
• Node
• Index
• Shard & Replica
• Type
• Mapping
• Document
How data is
organized in Elasticsearch
Nodes & shards
Indexing documents
Sharding is crucial
• A shard is a physical Lucene index
• A Lucene index can hold at most about 2 billion documents.
• When you create an index you have to declare the
number of shards; you can't change it later. Beware!
• Don't over-shard your index! Beware!
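Why the shard count is fixed can be sketched with the routing rule Elasticsearch documents: a document goes to shard `hash(routing) % number_of_primary_shards`, where the routing value defaults to the document id. The sketch below uses CRC32 as a stand-in hash (an assumption; Elasticsearch uses its own hash function), and `pick_shard` and the `doc-N` ids are hypothetical names.

```python
import zlib


def pick_shard(routing: str, number_of_shards: int) -> int:
    """Choose a shard as hash(routing) modulo the number of primary shards."""
    # CRC32 is a stand-in hash for illustration only.
    return zlib.crc32(routing.encode("utf-8")) % number_of_shards


# Hypothetical document ids.
ids = [f"doc-{i}" for i in range(100)]

# Compare where each id lands with 3 shards versus 4 shards.
moved = [d for d in ids if pick_shard(d, 3) != pick_shard(d, 4)]
print(f"{len(moved)} of {len(ids)} ids would land on a different shard")
```

Because the shard count is part of the formula, changing it would send lookups for existing documents to the wrong shard, which is why the number is declared up front.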
Distributed indexing
URL
http://localhost:9200/{index}/{type}/{document_id}
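As a small illustration of the URL pattern, here is a hedged Python helper that fills in the three placeholders. The function name `document_url` and its defaults are assumptions. The sample values are taken from the indexing response later in the deck.

```python
from urllib.parse import quote


def document_url(index, doc_type, document_id, host="localhost", port=9200):
    """Build http://host:port/{index}/{type}/{document_id} with escaped segments."""
    # Percent-encode each path segment so ids with special characters stay valid.
    parts = [quote(str(p), safe="") for p in (index, doc_type, document_id)]
    return f"http://{host}:{port}/" + "/".join(parts)


print(document_url("my_index", "my_document1", "AVRaEeBK3Lbw2oDzSIWN"))
# http://localhost:9200/my_index/my_document1/AVRaEeBK3Lbw2oDzSIWN
```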
HTTPie for the samples
Creating an index
$ http PUT :9200/my_index/ settings:='{ "index" : { "number_of_shards" : 3, "number_of_replicas" : 0 } }' -v

PUT /my_index/ HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 73
Content-Type: application/json
Host: localhost:9200
User-Agent: HTTPie/0.9.3

{
    "settings": {
        "index": {
            "number_of_replicas": 0,
            "number_of_shards": 3
        }
    }
}

HTTP/1.1 200 OK
Content-Length: 21
Content-Type: application/json; charset=UTF-8

{
    "acknowledged": true
}
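For readers without HTTPie, the same request can be sketched with the Python standard library. This only builds the request object and does not send it, so it runs without a cluster. The helper name `create_index_request` and the `base` default are assumptions for illustration.

```python
import json
from urllib.request import Request


def create_index_request(index, shards, replicas, base="http://localhost:9200"):
    """Build (but do not send) a PUT request that creates an index."""
    body = json.dumps({
        "settings": {
            "index": {"number_of_shards": shards, "number_of_replicas": replicas}
        }
    }).encode("utf-8")
    return Request(
        f"{base}/{index}/",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = create_index_request("my_index", shards=3, replicas=0)
print(req.get_method(), req.full_url)  # PUT http://localhost:9200/my_index/
print(len(req.data))                   # 73
# To send it to a running node: urllib.request.urlopen(req)
```

The 73-byte body is the same size as the Content-Length in the capture above.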
Creating a type
$ http PUT :9200/my_index/_mapping/my_document1 properties:='{ "user_name": { "type": "string" } }' -v

PUT /my_index/_mapping/my_document1 HTTP/1.1
Accept: application/json
Content-Length: 49
Content-Type: application/json

{
    "properties": {
        "user_name": {
            "type": "string"
        }
    }
}

HTTP/1.1 200 OK
Content-Length: 21
Content-Type: application/json; charset=UTF-8

{
    "acknowledged": true
}
Indexing
$ http :9200/my_index/my_document1 user_name="Domingo Suarez" -v

POST /my_index/my_document1 HTTP/1.1
Content-Length: 31
Content-Type: application/json

{ "user_name": "Domingo Suarez" }

HTTP/1.1 201 Created
Content-Length: 149
Content-Type: application/json; charset=UTF-8

{
    "_id": "AVRaEeBK3Lbw2oDzSIWN",
    "_index": "my_index",
    "_shards": {
        "failed": 0,
        "successful": 1,
        "total": 1
    },
    "_type": "my_document1",
    "_version": 1,
    "created": true
}
Search
$ http ':9200/my_index/my_document1/_search?q=user_name:Domingo'

HTTP/1.1 200 OK
Content-Length: 657
Content-Type: application/json; charset=UTF-8

{
    "_shards": {
        "failed": 0,
        "successful": 3,
        "total": 3
    },
    "hits": {
        "hits": [
            {
                "_id": "AVRaEdPJ3Lbw2oDzSIWM",
                "_index": "my_index",
                "_score": 0.625,
                "_source": {
                    "user_name": "Domingo Suarez"
                },
                "_type": "my_document1"
            }
        ],
        "max_score": 0.625,
        "total": 1
    },
    "timed_out": false,
    "took": 5
}
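A client usually wants just the matching documents out of that response. The following is a minimal Python sketch that parses the JSON shown on this slide. The helper name `matching_sources` is hypothetical, and raising an error on failed shards or a timeout is one possible policy, not something the API requires.

```python
import json

# The search response from the slide, embedded so the sketch runs without a cluster.
response_text = """
{
  "_shards": {"failed": 0, "successful": 3, "total": 3},
  "hits": {
    "hits": [
      {
        "_id": "AVRaEdPJ3Lbw2oDzSIWM",
        "_index": "my_index",
        "_score": 0.625,
        "_source": {"user_name": "Domingo Suarez"},
        "_type": "my_document1"
      }
    ],
    "max_score": 0.625,
    "total": 1
  },
  "timed_out": false,
  "took": 5
}
"""


def matching_sources(raw):
    """Return (score, source) pairs, refusing partial results."""
    result = json.loads(raw)
    if result["_shards"]["failed"] or result["timed_out"]:
        raise RuntimeError("search returned partial results")
    return [(hit["_score"], hit["_source"]) for hit in result["hits"]["hits"]]


for score, source in matching_sources(response_text):
    print(score, source["user_name"])  # 0.625 Domingo Suarez
```

The original document lives under `_source`. Everything else in a hit is metadata added by Elasticsearch.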
