Elastic
@mcascallares
Elastic{ON} 2017 Recap
Matias Cascallares
85,000 Members
Community
Downloads: 100,000,000
[Chart: downloads for 2014, 2015, 2016; labels 25M, 45M, 100M]
Elasticsearch
• Faster
• Friendlier
• Smaller
• Smarter
• Safer
26 October 2016
Elasticsearch 5.0.0
Append-only indexing
Throughput with one replica on two nodes, with auto-generated IDs
[Bar chart: throughput in K docs/s (axis 0 to 30) for v2.4.2, v5.2.1, and master]
What’s new in 5.x?
Mappings
Range Fields & Queries
What’s on at Elasticon
tomorrow between 11am and 2pm?
Wednesday 11am - 2pm
Wednesday 11am - 2pm - INTERSECTS
Wednesday 11am - 2pm - CONTAINS
Wednesday 11am - 2pm - WITHIN
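The three relations named on these slides can be sketched in plain Python. This is a minimal model, assuming closed intervals and comparing each document's range value against the query range. The session names, times, and the `time_frame` field name are hypothetical, and `query_body` only illustrates the shape of a range query with a `relation` parameter; nothing here talks to a cluster.

```python
from datetime import datetime


def matches(doc_range, query_range, relation):
    """Return True if a document's range matches the query range."""
    d_start, d_end = doc_range
    q_start, q_end = query_range
    if relation == "intersects":  # the two ranges overlap at all
        return d_start <= q_end and q_start <= d_end
    if relation == "contains":    # document range fully covers the query range
        return d_start <= q_start and q_end <= d_end
    if relation == "within":      # document range lies fully inside the query range
        return q_start <= d_start and d_end <= q_end
    raise ValueError(f"unknown relation: {relation}")


def at(hour, minute=0):
    # Hypothetical conference day; only the time of day matters here.
    return datetime(2017, 3, 8, hour, minute)


# Hypothetical sessions, each stored as a (start, end) range.
sessions = {
    "keynote": (at(9), at(11, 30)),
    "workshop": (at(10), at(15)),
    "lunch talk": (at(12), at(13)),
}
query = (at(11), at(14))  # Wednesday 11am - 2pm


def search(relation):
    """Names of all sessions whose range matches the query under the relation."""
    return sorted(name for name, rng in sessions.items()
                  if matches(rng, query, relation))


# Shape of a comparable range query on a date_range field (field name assumed).
query_body = {"query": {"range": {"time_frame": {
    "gte": query[0].isoformat(),
    "lte": query[1].isoformat(),
    "relation": "within",
}}}}
```

In this model, `search("intersects")` is the broadest of the three, while `contains` and `within` each narrow the result in opposite directions.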
Keyword Normalizers
{
  "city": {
    "type": "string",
    "index": "analyzed",
    "fields": {
      "city.keyword": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}
{
  "city": {
    "type": "text",
    "fields": {
      "city.keyword": {
        "type": "keyword"
      }
    }
  }
}
"text": full text queries, full text analysis
"keyword": keyword queries, aggregations, sorting, no analysis
San Francisco
SAN FRANCISCO
san francisco
San franciscO
Normalizer output for all four: san francisco
Search & Aggregations
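A minimal sketch of the idea, assuming a custom normalizer built from the built-in `lowercase` filter. The normalizer name `lowercase_normalizer` and the mapping type `doc` are hypothetical, and `normalize` is only a local stand-in for what Elasticsearch would do at index and search time.

```python
from collections import Counter

# Index settings and mapping as they might be sent when creating an index.
# "lowercase_normalizer" and the "doc" type are hypothetical names.
index_body = {
    "settings": {"analysis": {"normalizer": {
        "lowercase_normalizer": {"type": "custom", "filter": ["lowercase"]},
    }}},
    "mappings": {"doc": {"properties": {
        "city": {"type": "keyword", "normalizer": "lowercase_normalizer"},
    }}},
}


def normalize(value):
    # Local stand-in for the lowercase filter applied by the normalizer.
    return value.lower()


cities = ["San Francisco", "SAN FRANCISCO", "san francisco", "San franciscO"]

# Without normalization each spelling would be its own bucket;
# with it, all four values fall into a single bucket.
buckets = Counter(normalize(city) for city in cities)
```

Because the same normalization is applied to the query term, a keyword search for any of the four spellings would hit the same normalized value.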
Multi-Word Synonyms
Synonyms: NY, NYC, New York, New York City
Phrase query: “NYC is OLD!”
Synonym Filter:
(ny|nyc|new), (is|york), (old|city)
Synonym Graph Filter:
ny is old
nyc
new york city
More Search Improvements
Query Optimizations
• Smarter query caching
• Faster geo, range, and nested queries
• Unified highlighter
• Field collapsing
• Cancellable searches
• Partitioned term aggs
Operational Improvements
When your cluster is RED…
/_cat/allocation
/_cat/indices
/_cat/nodes
/_cat/recovery
/_cat/shards
/_cluster/health
/_cluster/state
/{index}/_shard_stores
/_cluster/settings
/_nodes/stats
/{index}/_settings
/_nodes
When your cluster is RED…
/_cluster/allocation/explain
…
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
…
{
  "decider" : "filter",
  "decision" : "NO",
  "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
}
…
"unassigned_info" : {
  "reason" : "NODE_LEFT",
  "at" : "2017-01-04T18:03:28.464Z",
  "details" : "node_left[OIWe8UhhThCK0V5XfmdrmQ]",
  "last_allocation_status" : "no_valid_shard_copy"
},
"can_allocate" : "no_valid_shard_copy",
"allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster"
…
"rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
"node_allocation_decisions" : [
  {
    "node_id" : "oE3EGFc8QN-Tdi5FFEprIA",
    "node_name" : "node_t1",
    "transport_address" : "127.0.0.1:9401",
    "node_decision" : "worse_balance",
    "weight_ranking" : 1
  }
…
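One way to use such a response is to pull out only the deciders that said NO. The sketch below is a minimal example, not a client for the API: it assumes deciders are nested under each entry of `node_allocation_decisions` in a `deciders` list, the index name in `request_body` is hypothetical, and `sample` is a trimmed response assembled from the fields shown on the slides.

```python
def blocking_deciders(explain_response):
    """Collect (node_name, decider, explanation) for every NO decision."""
    blocked = []
    for node in explain_response.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blocked.append((node.get("node_name"),
                                decider["decider"],
                                decider["explanation"]))
    return blocked


# Body that could accompany a request to /_cluster/allocation/explain.
# The index name is hypothetical.
request_body = {"index": "my_index", "shard": 0, "primary": True}

# Trimmed sample response; the nesting of "deciders" is an assumption.
sample = {
    "allocate_explanation": "cannot allocate because allocation is not "
                            "permitted to any of the nodes",
    "node_allocation_decisions": [
        {
            "node_name": "node_t1",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "filter",
                    "decision": "NO",
                    "explanation": "node does not match index setting "
                                   "[index.routing.allocation.include] "
                                   "filters [_name:\"non_existent_node\"]",
                }
            ],
        }
    ],
}

for node_name, decider, explanation in blocking_deciders(sample):
    print(f"{node_name}: {decider} -> {explanation}")
```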
Java REST Client
Java REST Client - behind the scenes
• Came late to the party…
• Isn’t nearly as extensive as the Transport Client
• Maintaining a client based on the transport protocol causes massive engineering overhead
• The Transport Client is a “second” entry point into the system
• It complicates distinguishing between clients and nodes
Java low-level HTTP client
• Released in 5.0.0
• JSON strings only
• Resilient, but not user friendly due to the lack of a higher level API
Java high-level HTTP client
• IDE friendly
• Similar API to Transport Client - easy migration
• Based on low-level REST client
• Supports CRUD & Search
• Previews in 5.5
• Depends on elasticsearch-core
Tribe Node
How the Tribe Node works
Cluster Sales: Master Nodes, Data Node, Data Node, Data Node
Cluster R&D: Master Nodes, Data Node, Data Node, Data Node
Tribe Node:
tribe:
  t1:
    cluster.name: sales
  t2:
    cluster.name: r_and_d
t1 Node Client (Cluster Sales): Cluster State
t2 Node Client (Cluster R&D): Cluster State
Merged Cluster State
Kibana
Problems With How the Tribe Node works
Cluster Sales: Master Nodes, Data Node, Data Node, Data Node
Cluster R&D: Master Nodes, Data Node, Data Node, Data Node
Tribe Node: t1 Node Client, t2 Node Client, Merged Cluster State
Kibana
• Static configuration
tribe:
  t1:
    cluster.name: sales
  t2:
    cluster.name: r_and_d
• Connections to all nodes
• Frequent cluster state updates
• Index names must be unique
• No master node, no index creation
• Reduce results from many shards
The Tribe Node is Dead
Long Live
Cross-Cluster Search!
Minimal viable solution to supersede tribe
Reduces the problem domain to query execution
Cluster-related information is reduced to a namespace
How Cross-Cluster search works
Cluster Sales: Master Nodes, Data Node, Data Node, Data Node
Cluster R&D: Master Nodes, Data Node, Data Node, Data Node
Optional dedicated cross-cluster search cluster: Master/Data Node, Master/Data Node
Kibana
• Any node can perform cross-cluster search
• Dynamic settings
PUT _cluster/settings
{
  "transient": {
    "search.remote": {
      "sales.seeds": "10.0.0.1:9300",
      "r_and_d.seeds": "10.1.0.1:9300"
    }
  }
}
• No cluster state updates
• Can create indices
• Few lightweight connections
• Index namespacing
GET sales:*,r_and_d:logs*/_search
{
  "query": { … }
}
• With many shards: Batched Reduce Phase
Cross-Cluster Search
v5.3.0
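The settings body and the namespaced index expression from the slides can be built programmatically. The sketch below is a minimal example that assumes the 5.x `search.remote` setting names shown on the slides; the helper names `remote_seed_settings` and `cross_cluster_search_path` are hypothetical, and no request is sent.

```python
def remote_seed_settings(seeds_by_cluster, persistent=False):
    """Build a _cluster/settings body registering remote clusters by alias."""
    scope = "persistent" if persistent else "transient"
    remote = {f"{alias}.seeds": seed for alias, seed in seeds_by_cluster.items()}
    return {scope: {"search.remote": remote}}


def cross_cluster_search_path(patterns_by_cluster):
    """Build a search path, prefixing each index pattern with its cluster alias."""
    parts = [f"{alias}:{pattern}" for alias, pattern in patterns_by_cluster.items()]
    return "/" + ",".join(parts) + "/_search"


# Same aliases, seeds, and index patterns as on the slides.
settings = remote_seed_settings({
    "sales": "10.0.0.1:9300",
    "r_and_d": "10.1.0.1:9300",
})
path = cross_cluster_search_path({"sales": "*", "r_and_d": "logs*"})
```

The cluster alias is the only cluster-level information the search request carries, which is the "namespace" the slides refer to.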
Batched Reduce Phase
v5.4.0
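The idea of reducing in batches can be illustrated with a top-k merge. This is a simplified model, not Elasticsearch's implementation: shard results are plain lists of scores, and `batch_size` plays the role of the `batched_reduce_size` search parameter. The point is that the coordinating node never holds more than one batch of shard results plus one partial result.

```python
import heapq


def batched_reduce(shard_results, top_k, batch_size):
    """Merge per-shard hits into a global top_k, reducing every batch_size shards."""
    partial = []       # running top_k from the batches reduced so far
    buffered = []      # shard results waiting for the next partial reduce
    reduce_calls = 0   # number of partial reduces performed
    for hits in shard_results:
        buffered.append(hits)
        if len(buffered) == batch_size:
            merged = partial + [h for shard in buffered for h in shard]
            partial = heapq.nlargest(top_k, merged)
            buffered = []
            reduce_calls += 1
    # Final reduce over the last partial result and any leftover shards.
    merged = partial + [h for shard in buffered for h in shard]
    return heapq.nlargest(top_k, merged), reduce_calls


# Hypothetical scores from 10 shards with 5 hits each.
shards = [[(i * 7 + j * 3) % 101 for j in range(5)] for i in range(10)]
top, partial_reduces = batched_reduce(shards, top_k=3, batch_size=4)
```

The result is the same as reducing everything at once, because any hit in the global top k is also in the top k of whichever batch it arrived in.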
v6 and beyond
Doc Values
v2.x
Doc Values
• Columnar store
• Fast access to a field’s value for many documents
• Used for aggregations, sorting, scripting, and some queries
• Written to disk at index time
• Cached in the file-system cache
Doc Values - Dense Values
Segment 1
Docs  Field 1  Field 2
1     One      A
2     Two      B
3     Three    C
Segment 2
Docs  Field 1  Field 2
1     Four     D
Merged Segment 3
Docs  Field 1  Field 2
1     One      A
2     Two      B
3     Three    C
4     Four     D
Doc Values - Sparse Values
Segment 1
Docs  Field 1  Field 2
1     One      A
2     Two      B
3     Three    C
Segment 2
Docs  Field 3  Field 4  Field 5
1     Foo      Bar      Baz
Merged Segment 3
Docs  Field 1  Field 2  Field 3  Field 4  Field 5
1     One      A        Null     Null     Null
2     Two      B        Null     Null     Null
3     Three    C        Null     Null     Null
4     Null     Null     Foo      Bar      Baz
Sparse Doc Values
Lucene 7
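The cost of the Null padding in the merged sparse segment can be counted with a small model. This sketch is only an illustration of dense versus sparse column layouts, not Lucene's actual on-disk encoding; the documents are the four rows of the merged segment from the sparse example.

```python
# Merged segment from the sparse example: one dict per document,
# with missing fields simply omitted.
docs = [
    {"Field 1": "One", "Field 2": "A"},
    {"Field 1": "Two", "Field 2": "B"},
    {"Field 1": "Three", "Field 2": "C"},
    {"Field 3": "Foo", "Field 4": "Bar", "Field 5": "Baz"},
]
fields = sorted({name for doc in docs for name in doc})

# Dense layout: every field keeps one slot per document, padded with None.
dense = {f: [doc.get(f) for doc in docs] for f in fields}

# Sparse layout: every field keeps (doc_id, value) pairs only for
# the documents that actually have a value.
sparse = {f: [(i, doc[f]) for i, doc in enumerate(docs) if f in doc]
          for f in fields}

dense_slots = sum(len(column) for column in dense.values())
sparse_slots = sum(len(column) for column in sparse.values())
print(f"dense slots: {dense_slots}, sparse slots: {sparse_slots}")
```

In this model the dense layout grows with documents times fields, while the sparse layout grows only with the number of values actually present.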
Index Sorting
Lucene 7
Index sorting
• Sort index by e.g. weight, recency, or popularity
• Ultra-fast search - can terminate once enough hits found
• Even helps with total count and aggregations
• Sort index by low cardinality terms - faster search
• Better sparse index compression
• Slower indexing, good for static indices
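Early termination can be sketched as follows. This is a simplified model that assumes documents are stored in the order of the sort key (here a hypothetical `popularity` field), so the first k matches encountered are already the top k and the scan can stop.

```python
def first_k_matches(docs_in_index_order, predicate, k):
    """Scan documents in index order and stop as soon as k matches are found."""
    hits, visited = [], 0
    for doc in docs_in_index_order:
        visited += 1
        if predicate(doc):
            hits.append(doc)
            if len(hits) == k:
                break  # early termination: later docs cannot rank higher
    return hits, visited


# Hypothetical documents with a unique popularity score each.
docs = [{"id": i, "popularity": (i * 37) % 100, "in_stock": i % 2 == 0}
        for i in range(100)]

# Index-time sort: most popular documents first.
sorted_index = sorted(docs, key=lambda d: d["popularity"], reverse=True)

hits, visited = first_k_matches(sorted_index, lambda d: d["in_stock"], k=3)

# Reference result computed the slow way: filter everything, then sort.
expected = sorted((d for d in docs if d["in_stock"]),
                  key=lambda d: d["popularity"], reverse=True)[:3]
```

The trade-off is the one named in the list above: the sort is paid at index time, which is why it suits indices that change rarely.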
Sequence Numbers
v6.0.0
Sequence Numbers
• Internal Feature
• Every operation gets a sequence number
• In 6.0: Fast replica recovery on active indices
• Lays groundwork for:
  • Primary-Replica syncing when Primary fails
  • Cross Data-Centre Recovery
  • Changes API
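Why sequence numbers speed up replica recovery can be shown with a toy model. This is a simplified sketch under stated assumptions, not Elasticsearch's recovery code: the operation history and the `replica_checkpoint` value are hypothetical, and the only idea illustrated is that a replica needs just the operations above the last sequence number it processed rather than a full copy.

```python
def ops_to_replay(primary_ops, replica_checkpoint):
    """Operations the replica is missing: those above its checkpoint."""
    return [op for op in primary_ops if op["seq_no"] > replica_checkpoint]


# Hypothetical operation history on the primary;
# every operation carries a sequence number.
primary_ops = [{"seq_no": n, "op": "index", "id": f"doc-{n}"} for n in range(10)]

# The replica went offline after processing everything up to sequence number 6.
missing = ops_to_replay(primary_ops, replica_checkpoint=6)
print([op["seq_no"] for op in missing])
```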
Upgrading
Rolling Upgrades
v6.0.0
Rolling Upgrades
• Upgrade from 5.latest to 6.latest, without a full cluster restart
• Why now and not earlier?
  • Testing needs to be ready
  • The team and the code must be ready
  • Growing user-base and faster release cycles required less painful upgrades
• What is 5.latest?
  • It’s the latest release of 5.x that is GA once 6.0.0 goes GA
  • All 6.x releases will allow upgrading from that 5.x release
  • There might be subsequent 5.x releases that are also eligible for upgrades to 6.x
• Caveats:
  • If using security, must have TLS enabled
  • Reserve the right to require full cluster restart in the future, but only if absolutely necessary
  • All nodes must be upgraded to 5.latest in order to upgrade
  • Indices created in 2.x still need to be reindexed before upgrading to 6.x
Cross Major Version Search
v6.0.0
Cross Major Version Search
v5.2.0: Master Nodes, Data Node, Data Node
v6.0.0: Master Nodes, Data Node, Data Node
v5.latest: Cross Cluster Client
Kibana
Elasticsearch SQL
Elasticsearch SQL
• Support Elasticsearch features as much as possible
• Support standard SQL as much as possible
• Same experience out of the box as Elasticsearch, which means
  • Lightweight & fast
  • Easy to pick up
  • Still Elasticsearch :)
Kibana
Visualizations
Tag Cloud
Heatmaps
Vector Maps
Monitoring & Troubleshooting
Logstash Monitoring
Advanced View
Query Profiler
Kibana Canvas
Kibana Canvas
• New visualization application on top of Elasticsearch data
• Use cases:
  • Live infographics
  • Presentations with live data feeds
  • Highly customized reports
• Currently in the prototyping phase
• Release date: TBD
Beats
Beats
Packetbeat: Network data
Filebeat: Log files
Winlogbeat: Windows Event Logs
Heartbeat: Uptime monitoring
Metricbeat: Metrics
+40 community Beats
New Beat: Heartbeat
Heartbeat
host, Your app, OS
TCP/TLS connect
ICMP ping
HTTP/S request
Metricbeat
Metricbeat modules
MySQL, Memcache, PHP-FPM, CEPH, ZooKeeper, Golang, Docker, Apache, Kafka, HAProxy, System, Redis, Couchbase, NGINX, Postgres, Prometheus, Jolokia
Filebeat
Filebeat modules
Filebeat configuration
Ingest pipelines
Elasticsearch template
Kibana dashboards
Logstash
Persistent Queues
Persistent Queues
• Survive (temporary) machine failures
• Limited impact on performance
• View stats in Monitoring UI
Pipeline Visualizer
www.elastic.co
Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/
Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries.
Third party marks and brands are the property of their respective holders.
Please attribute Elastic with a link to elastic.co
