CURRENT VERSION 7.6
Elastic 101 - Index operations
ENG. ISMAIL ANJRINI
ELASTIC CERTIFIED ENGINEER
CURRENT VERSION 7.6Elastic-Saudi-Arabia
About Me
Ismail Anjrini
More than 15 years experience
Elasticsearch Certified Engineer
CURRENT VERSION 7.6
CURRENT VERSION 7.6
INDEX
Index
An index is like a ‘table’ in a relational database.
It has a mapping which defines multiple types.
An index is a logical namespace:
◦ Maps to one or more primary shards
◦ Can have zero or more replica shards
RDBMS
ES
Database
?
Table
Index
Columns/Rows
Document
CURRENT VERSION 7.6
CURRENT VERSION 7.6
Index settings
number_of_shards
number_of_replicas
refresh_interval
CURRENT VERSION 7.6
Index Operations – create index
We can update number of replicas for existed indexes
CURRENT VERSION 7.6
Index Operations – mapping
CURRENT VERSION 7.6
Index Operations – mapping
PUT names
{
"mappings":
{
"properties":
{
"name":
{
"type": "keyword“
},
"name_text": { "type": "text" }
}
}
}
CURRENT VERSION 7.6
Index Operations – mapping
CURRENT VERSION 7.6
Index Operations – list all indexes
GET _cat/indices
GET /_cat/indices/twi*?v
GET /_cat/indices/?v&health=green|yellow|red&h=col1,col2
CURRENT VERSION 7.6
Index Operations – read index details
GET big-index
GET big-index?format=yaml|json
CURRENT VERSION 7.6
Index Operations – create document
POST big-index/_doc/1
{
"name": "Ismail Anjrini",
"age": 27
}
POST big-index/_doc/2
{
"name": "Fadi Abdul Wahab",
"age": 45,
"country": "Saudi Arabia"
}
CURRENT VERSION 7.6
Index Operations – POST vs PUT
POST big-index/_doc/
{
"name": "Kasem",
"age": 46
}
PUT big-index/_doc/
{
"name": "Riyadh",
"age": 33
}
CURRENT VERSION 7.6
Index Operations – read document
GET big-index/_doc/2
CURRENT VERSION 7.6
Index Operations – update document
POST big-index/_update/1
{
"doc":
{
"name":"Ismail Hassan Anjrini" ,
"country": "Syria"
}
}
CURRENT VERSION 7.6
Index Operations – delete document
DELETE big-index/_doc/1 PUT big-index/_doc/1
{
"name":"Ismail Anjrini",
"age": 27
}
CURRENT VERSION 7.6
Index Operations - Index aliases
An index alias is a secondary name used to refer to one or more existing indices
POST index-1/_alias/index-alias
POST index-2/_alias/index-alias
POST index-3/_alias/index-alias
CURRENT VERSION 7.6
Index Operations - Index aliases
filter: If specified, the index alias only applies to documents returned by the filter.
POST index-*/_alias/index-Egypt
{
"filter":
{
"term":
{
"nationality": "egypt"
}
}
}
CURRENT VERSION 7.6
Index Operations - Index aliases
DELETE index-1/_alias/index-alias
DELETE index-*/_alias/index-alias
GET index-alias/_search
GET index-alias/_search
CURRENT VERSION 7.6
Index Template
Index templates define settings and mappings that you can automatically apply when creating
new indices
Elasticsearch applies templates to new indices based on an index pattern that matches the index
name
Changes to index templates do not affect existing indices
Settings and mappings specified in create index API requests override any settings or mappings
specified in an index template
CURRENT VERSION 7.6
Index Template
CURRENT VERSION 7.6
PUT elastic-log-sys1
Index Template - Order
Multiple index templates can potentially match an index
Both the settings and mappings are merged into the final configuration of the index
The order of the merging can be controlled using the order parameter
With lower order being applied first, and higher orders overriding them
CURRENT VERSION 7.6
Index Template - Order
CURRENT VERSION 7.6
PUT elastic-log-sys1
Index Operations - Reindex
Reindex the current data in old-index to new-index
It does not copy the settings/fields settings from the source index to destination
CURRENT VERSION 7.6
Index Operations - Reindex
version_type: internal or empty:
◦ Update any document that have the same _id regardless the version number in the target index
◦ Increase the version number for the documents with the same _id
CURRENT VERSION 7.6
Index Operations - Reindex
CURRENT VERSION 7.6
Index Operations - Reindex
version_type: external
◦ Elasticsearch to preserve the version from the source
◦ Create any documents that are missing
◦ The _id value is not matched
◦ Update any documents that have an older version in the destination index than they do in the source
index
◦ The document with older version will get the same version number from the source index
CURRENT VERSION 7.6
Index Operations - Reindex
Created index-1
Add data to index-1
Delete new-index-1
CURRENT VERSION 7.6
Index Operations - Reindex
Add document to index-1
Do reindex
CURRENT VERSION 7.6
Index Operations - Reindex
op_type: create
◦ _reindex to only create missing documents in the target index
◦ All existing documents will cause a version conflict
max_docs
◦ To limit the number of processed documents from source to dest
CURRENT VERSION 7.6
CURRENT VERSION 7.6
ElasticSaudi
ElasticSaudiArabia
Elastic-Saudi-Arabia

Elastic 101 index operations

  • 1.
  • 2.
    Elastic 101 -Index operations ENG. ISMAIL ANJRINI ELASTIC CERTIFIED ENGINEER CURRENT VERSION 7.6Elastic-Saudi-Arabia
  • 3.
    About Me Ismail Anjrini Morethan 15 years experience Elasticsearch Certified Engineer CURRENT VERSION 7.6
  • 4.
  • 5.
    Index An index islike a ‘table’ in a relational database. It has a mapping which defines multiple types. An index is a logical namespace: ◦ Maps to one or more primary shards ◦ Can have zero or more replica shards RDBMS ES Database ? Table Index Columns/Rows Document CURRENT VERSION 7.6
  • 6.
  • 7.
  • 8.
    Index Operations –create index We can update number of replicas for existed indexes CURRENT VERSION 7.6
  • 9.
    Index Operations –mapping CURRENT VERSION 7.6
  • 10.
    Index Operations –mapping PUT names { "mappings": { "properties": { "name": { "type": "keyword“ }, "name_text": { "type": "text" } } } } CURRENT VERSION 7.6
  • 11.
    Index Operations –mapping CURRENT VERSION 7.6
  • 12.
    Index Operations –list all indexes GET _cat/indices GET /_cat/indices/twi*?v GET /_cat/indices/?v&health=green|yellow|red&h=col1,col2 CURRENT VERSION 7.6
  • 13.
    Index Operations –read index details GET big-index GET big-index?format=yaml|json CURRENT VERSION 7.6
  • 14.
    Index Operations –create document POST big-index/_doc/1 { "name": "Ismail Anjrini", "age": 27 } POST big-index/_doc/2 { "name": "Fadi Abdul Wahab", "age": 45, "country": "Saudi Arabia" } CURRENT VERSION 7.6
  • 15.
    Index Operations –POST vs PUT POST big-index/_doc/ { "name": "Kasem", "age": 46 } PUT big-index/_doc/ { "name": "Riyadh", "age": 33 } CURRENT VERSION 7.6
  • 16.
    Index Operations –read document GET big-index/_doc/2 CURRENT VERSION 7.6
  • 17.
    Index Operations –update document POST big-index/_update/1 { "doc": { "name":"Ismail Hassan Anjrini" , "country": "Syria" } } CURRENT VERSION 7.6
  • 18.
    Index Operations –delete document DELETE big-index/_doc/1 PUT big-index/_doc/1 { "name":"Ismail Anjrini", "age": 27 } CURRENT VERSION 7.6
  • 19.
    Index Operations -Index aliases An index alias is a secondary name used to refer to one or more existing indices POST index-1/_alias/index-alias POST index-2/_alias/index-alias POST index-3/_alias/index-alias CURRENT VERSION 7.6
  • 20.
    Index Operations -Index aliases filter: If specified, the index alias only applies to documents returned by the filter. POST index-*/_alias/index-Egypt { "filter": { "term": { "nationality": "egypt" } } } CURRENT VERSION 7.6
  • 21.
    Index Operations -Index aliases DELETE index-1/_alias/index-alias DELETE index-*/_alias/index-alias GET index-alias/_search GET index-alias/_search CURRENT VERSION 7.6
  • 22.
    Index Template Index templatesdefine settings and mappings that you can automatically apply when creating new indices Elasticsearch applies templates to new indices based on an index pattern that matches the index name Changes to index templates do not affect existing indices Settings and mappings specified in create index API requests override any settings or mappings specified in an index template CURRENT VERSION 7.6
  • 23.
    Index Template CURRENT VERSION7.6 PUT elastic-log-sys1
  • 24.
    Index Template -Order Multiple index templates can potentially match an index Both the settings and mappings are merged into the final configuration of the index The order of the merging can be controlled using the order parameter With lower order being applied first, and higher orders overriding them CURRENT VERSION 7.6
  • 25.
    Index Template -Order CURRENT VERSION 7.6 PUT elastic-log-sys1
  • 26.
    Index Operations -Reindex Reindex the current data in old-index to new-index It does not copy the settings/fields settings from the source index to destination CURRENT VERSION 7.6
  • 27.
    Index Operations -Reindex version_type: internal or empty: ◦ Update any document that have the same _id regardless the version number in the target index ◦ Increase the version number for the documents with the same _id CURRENT VERSION 7.6
  • 28.
    Index Operations -Reindex CURRENT VERSION 7.6
  • 29.
    Index Operations -Reindex version_type: external ◦ Elasticsearch to preserve the version from the source ◦ Create any documents that are missing ◦ The _id value is not matched ◦ Update any documents that have an older version in the destination index than they do in the source index ◦ The document with older version will get the same version number from the source index CURRENT VERSION 7.6
  • 30.
    Index Operations -Reindex Created index-1 Add data to index-1 Delete new-index-1 CURRENT VERSION 7.6
  • 31.
    Index Operations -Reindex Add document to index-1 Do reindex CURRENT VERSION 7.6
  • 32.
    Index Operations -Reindex op_type: create ◦ _reindex to only create missing documents in the target index ◦ All existing documents will cause a version conflict max_docs ◦ To limit the number of processed documents from source to dest CURRENT VERSION 7.6
  • 33.

Editor's Notes

  • #6 Table  Type (deprecated)
  • #8 refresh_interval: How often to perform a refresh operation, which makes recent changes to the index visible to search. Defaults to 1s
  • #9 number_of_shards The number of primary shards that an index should have. Defaults to 1. This setting can only be set at index creation time
  • #13 Health values: green|yellow|red (Optional, string) Health status used to limit returned indices h: (Optional, string) Comma-separated list of column names to display. s: (Optional, string) Comma-separated list of column names or column aliases used to sort the response.
  • #15 Script 1: Where is the Nationality field? It is not here because we didn’t pass it during the document creation Script 2: Note the country column in the mappings section
  • #16 PUT 1 - updates a full document, not only the field you're sending. 2 - can not create document without id POST 1 - will do a partial update and only update the fields you're sending, and not touch the other ones already present in the document. 2 - creates document with/without id
  • #18 1 - Note that we didn’t touch the field age and still appears 2 – You can add new field to the document
  • #19 Check _version: 6 Versioning: Each document indexed is versioned. When deleting a document, the version can be specified to make sure the relevant document we are trying to delete is actually being deleted and it has not changed in the meantime. Every write operation executed on a document, deletes included, causes its version to be incremented. The version number of a deleted document remains available for a short time after deletion to allow for control of concurrent operations. The length of time for which a deleted document’s version remains available is determined by the index.gc_deletes index setting and defaults to 60 seconds.
  • #23 https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html
  • #24 https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html
  • #25 https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html
  • #26 https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html
  • #27 Great articles https://developers.soundcloud.com/blog/how-to-reindex-1-billion-documents-in-1-hour-at-soundcloud https://engineering.carsguide.com.au/elasticsearch-zero-downtime-reindexing-e3a53000f0ac Full reference https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
  • #29 1 – Reindex documents already exists in the dest index 2 – The version will be increased with the updated data