SlideShare a Scribd company logo
1 of 83
elasticsearch.
Florian Hopf
www.florian-hopf.de
@fhopf
Agenda
●
Suche
●
Verteilung
●
Elasticsearch und Java
●
Aggregationen
●
Zentralisiertes Logging
Suche
Suche
Installation
# download archive
wget https://artifacts.elastic.co/downloads/
elasticsearch/elasticsearch-5.0.0.zip
unzip elasticsearch-5.0.0.zip
# on windows: elasticsearch.bat
elasticsearch-5.0.0/bin/elasticsearch
Zugriff per HTTP
curl -XGET "http://localhost:9200"
{
"name" : "LI8ZN-t",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "UvbMAoJ8TieUqugCGw7Xrw",
"version" : {
"number" : "5.0.0",
"build_hash" : "253032b",
"build_date" : "2016-10-26T04:37:51.531Z",
"build_snapshot" : false,
"lucene_version" : "6.2.0"
},
"tagline" : "You Know, for Search"
}
Indizierung
POST /library/book
{
"title": "Elasticsearch in Action",
"author": [ "Radu Gheorghe",
"Matthew Lee Hinman",
"Roy Russo" ],
"published": "2015-06-30T00:00:00.000Z",
"publisher": {
"name": "Manning",
"country": "USA"
}
}
Suche
GET /library/book/_search?q=elasticsearch
{
[...]
"hits": {
"hits": [
{
"_index": "library",
"_type": "book",
"_source": {
"title": "Elasticsearch in Action",
[...]
Suche per Query DSL
POST /library/book/_search
{
"query": {
"bool": {
"must": {
"match": {
"title": "elasticsearch"
}
},
"filter": {
"term": {
"publisher.name": "manning"
}
}
}
}
}
Sprachspezifische Inhalte
POST /library/book/
{
"title": "Elasticsearch - Ein praktischer Einstieg",
"author": "Florian Hopf",
"published": "2015-10-26T00:00:00.000Z",
"publisher": {
"name": "dpunkt.verlag",
"country": "DE"
},
"tags": ["Lucene", "Elasticsearch"]
}
Sprachspezifische Inhalte
POST /library/book/_search
{
"query": {
"match": {
"title": "praktisch"
}
}
}
Analyzing
Term Document Id
Action 1
ein 2
Einstieg 2
Elasticsearch 1,2
in 1
praktischer 2
1. Tokenization
Elasticsearch
in Action
Elasticsearch:
Ein praktischer
Einstieg
Analyzing
Term Document Id
action 1
ein 2
einstieg 2
elasticsearch 1,2
in 1
praktischer 2
1. Tokenization
Elasticsearch
in Action
Elasticsearch:
Ein praktischer
Einstieg
2. Lowercasing
Suche
Term Document Id
action 1
ein 2
einstieg 2
elasticsearch 1,2
in 1
praktischer 2
1. Tokenization
2. LowercasingElasticsearch elasticsearch
Suche
Term Document Id
action 1
ein 2
einstieg 2
elasticsearch 1,2
in 1
praktischer 2
1. Tokenization
2. Lowercasingpraktisch praktisch
Analyzing
Term Document Id
action 1
ein 2
einstieg 2
elasticsearch 1,2
in 1
praktisch 2
1. Tokenization
Elasticsearch
in Action
Elasticsearch:
Ein praktischer
Einstieg
2. Lowercasing
3. Stemming
Suche
Term Document Id
action 1
ein 2
einstieg 2
elasticsearch 1,2
in 1
praktisch 2
1. Tokenization
2. Lowercasingpraktisch praktisch
3. Stemming
Mapping
PUT /library/book/_mapping
{
"book": {
"properties": {
"title": {
"type": "text",
"analyzer": "german"
}
}
}
}
Suchfeatures
●
Fertige Analyzer, Konfiguration von eigenen
●
Relevanzberechnung
●
Paginierung, Sortierung
●
Highlighting, Autovervollständigung, ...
●
Facettierung über Aggregationen
Recap
●
Java-basierter Suchserver
●
Kommunikation über HTTP und JSON
●
Dokumentenbasierte Speicherung
●
Unterstützung unterschiedlicher Datentypen
●
Suche über Query DSL auf invertiertem Index
Verteilung
Verteilung
Verteilung
Verteilung
Suche im Cluster
Suche im Cluster
Recap
●
Knoten können Cluster bilden
●
Datenverteilung durch Sharding
●
Replicas zur Lastverteilung und Ausfallsicherheit
●
Verteilte Suche
●
Cluster-Zustand auf jedem Knoten
Java
Java!
dependencies {
compile group: 'org.elasticsearch',
name: 'elasticsearch',
version: '2.4.0'
}
Java!
TransportAddress address =
new InetSocketTransportAddress("localhost", 9300);
Client client = new TransportClient().
addTransportAddress(address);
Suche per Query DSL
POST /library/book/_search
{
"query": {
"bool": {
"must": {
"match": {
"title": "elasticsearch"
}
},
"filter": {
"term": {
"publisher.name": "manning"
}
}
}
}
}
Suche über Java-Client
SearchResponse searchResponse = client
.prepareSearch("library")
.setQuery(
boolQuery().
must(matchQuery("title", "elasticsearch")).
filter(termQuery("publisher.name", "manning")))
.execute().actionGet();
assertEquals(1, searchResponse.getHits().getTotalHits());
SearchHit hit = searchResponse.getHits().getAt(0);
assertEquals("Elasticsearch in Action",
hit.getSource().get("title"));
Indizierung über Java-Client
XContentBuilder jsonBuilder = jsonBuilder()
.startObject()
.field("title", "Elasticsearch")
.field("author", "Florian Hopf")
.startObject("publisher")
.field("name", "dpunkt")
.endObject()
.endObject();
client.prepareIndex("library", "book")
.setSource(jsonBuilder)
.execute().actionGet();
Indizierung über Java-Client
Transport-Client
●
Verbindet sich mit bestehendem Cluster
TransportAddress address =
new InetSocketTransportAddress("localhost", 9300);
Client client = new TransportClient().
addTransportAddress(address);
Transport-Client
Node-Client
Node node = nodeBuilder().client(true).node();
Client client = node.client();
Node-Client
●
Startet eigenen Knoten
●
Anwendung wird Teil des Clusters
Node node = nodeBuilder().client(true).node();
Client client = node.client();
Node-Client
Standard-Client
●
Volle API-Unterstützung
●
Effiziente Kommunikation
●
Node-Client
●
Cluster-State spart unnötige Hops
Standard-Client Gotchas
●
Elasticsearch-/JDK-Version Client == Server
●
Node-Client
●
Bei Start und Stop wird Cluster-State übertragen
●
Speicherverbrauch
●
Elasticsearch-Dependency
Elasticsearch 5
dependencies {
compile group: 'org.elasticsearch.client',
name: 'transport',
version: '5.0.0'
}
Java!
TransportAddress address =
new InetSocketTransportAddress(
InetAddress.getByName("localhost"), 9300);
Client client = new PreBuiltTransportClient(Settings.EMPTY)
addTransportAddress(address);
Elasticsearch 5
●
TransportClient hat eigenes Artefakt
●
NodeClient nicht mehr empfohlen
Elasticsearch 5 REST-Client
dependencies {
compile group: 'org.elasticsearch.client',
name: 'rest',
version: '5.0.0'
}
Elasticsearch 5 REST-Client
+--- org.apache.httpcomponents:httpclient:4.5.2
+--- org.apache.httpcomponents:httpcore:4.4.5
+--- org.apache.httpcomponents:httpasyncclient:4.1.2
+--- org.apache.httpcomponents:httpcore-nio:4.4.5
+--- commons-codec:commons-codec:1.10
--- commons-logging:commons-logging:1.1.3
Elasticsearch 5 REST-Client
RestClient restClient = RestClient.builder(
new HttpHost("localhost", 9200, "http"),
new HttpHost("localhost", 9201, "http")).build();
Elasticsearch 5 REST-Client
try (RestClient restClient = createRestClient()) {
HttpEntity entity = new NStringEntity(
"{ "query": { "match_all": {}}}",
ContentType.APPLICATION_JSON);
// alternative: performRequestAsync
Response response = restClient.performRequest("POST",
"/_search", emptyMap(), entity);
String json = toString(response.getEntity());
// ...
}
Elasticsearch 5 REST-Client
●
Anwendung unabhängiger von Elasticsearch-Version
●
Saubere Trennung Elasticsearch-Cluster/Anwendung
●
Minimale Dependencies
●
Unterstützung für Node-Sniffing, Fehlerhandling,
Timeout-Config, Basic-Auth, Header, ...
Alternativen
●
Jest: Http-Client
●
Spring Data Elasticsearch
●
Große Auswahl für JVM-Sprachen
Aggregationen
Aggregations
●
Informationen über die Daten
●
Ersetzen und ermöglichen Facetten
●
Anwendungsfälle für Suchanwendungen und
Analytics
Aggregations
Invertierter Index
Term Document Id
Elasticsearch 1,2,3
Java 3
Lucene 2,3
Terms-Aggregation
POST /library/book/_search
{
"size": 0,
"aggs": {
"common-tags": {
"terms": {
"field": "tags.keyword"
}
}
}
}
Terms-Aggregation
"aggregations": {
"common-tags": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Elasticsearch",
"doc_count": 2
},
{
"key": "Lucene",
"doc_count": 2
},
{
"key": "Java",
"doc_count": 1
}]
[...]
Terms-Aggregation
●
Bucket mit Wert und Anzahl
●
Facettierung
●
Informationen aus Daten
Aggregationen kombinieren
POST /devoxx/tweet/_search
{
"aggs" : {
"hashtags" : {
"terms" : {
"field" : "hashtag.text"
}
}
}
}
Aggregationen kombinieren
"aggregations": {
"hashtags": {
"buckets": [
{
"key": "dartlang",
"doc_count": 229
},
{
"key": "java",
"doc_count": 216
},
[...]
Aggregationen kombinieren
POST /devoxx/tweet/_search
{
"aggs" : {
"hashtags" : {
"terms" : {
"field" : "hashtag.text"
},
"aggs" : {
"hashtagusers" : {
"terms" : {
"field" : "user.screen_name"
}
}
}
}
}
}
Aggregationen kombinieren
"key": "scala",
"doc_count": 130,
"hashtagusers": {
"buckets": [
{
"key": "jaceklaskowski",
"doc_count": 74
},
{
"key": "ManningBooks",
"doc_count": 3
},
[...]
Bucket-Aggregationen
●
Range-Aggregationen
●
Histogramme
●
Filter
●
Geo-Aggregationen
●
…
Metric-Aggregationen
●
Berechnen einen oder mehrere Werte
●
Meist auf numerischen Feldern
●
Stats, Percentiles, Min, Max, Sum, Avg, ...
Stats-Aggregationen
GET /library/book/_search
{
"aggs": {
"published_stats": {
"stats": {
"field": "published"
}
}
}
}
Stats-Aggregationen
"aggregations": {
"published_stats": {
"count": 5,
"min": 1419292800000,
"max": 1445990400000,
"avg": 1440547200000,
"sum": 7202736000000,
"min_as_string": "2014-12-23T00:00:00.000Z",
"max_as_string": "2015-10-28T00:00:00.000Z",
"avg_as_string": "2015-08-26T00:00:00.000Z",
"sum_as_string": "2198-03-31T00:00:00.000Z"
}
}
Significant Terms Aggregation
●
Findet Besonderheiten in Daten
●
Bewertet Datensatz gegen Hintergrundmenge
●
Beispiel: Häufige Hashtag-Kombinationen
Significant Terms Aggregation
POST /conf/_search
{
"aggs": {
"hashtag": {
"terms": {
"field": "hashtag.text"
},
"aggs": {
"associated_hashtags": {
"significant_terms": {
"field": "hashtag.text"
}
}
}
}
}
}
Significant Terms Aggregation
●
bbuzz (Berlin Buzzwords)
●
elasticsearch, devops, bigdata, cassandra
●
javaland
●
java8, javafx, javaee7, nighthacking
●
dwx14 (Developer Week)
●
nueww, nodejs, intelandroid, web, microsoft
Significant Terms Aggregation
●
Empfehlungen
●
Fraud-Detection
●
Geographische Häufungen
Recap
●
Aggregationen bieten unterschiedliche
Einblicke in die Daten
●
Facettierung
●
Kombination mehrerer Aggregationen
●
Grundlage für Visualisierungen
Zentralisiertes Logging
Zentralisiertes Logging
●
Zentralisierung Logs aus Anwendungen
●
Zentralisierung Logs über Maschinen
●
Auch ohne Zugriff
●
Leichte Durchsuchbarkeit
●
Real-Time-Analysis / Visualisierung
●
Zugang zu Daten
Zentralisiertes Logging
●
Einlesen
●
Beats/Logstash/Ingest Node
●
Speicherung
●
Elasticsearch
●
Auswertung
●
Kibana
Logfile-Analyse
Beats
●
Filebeat
●
Metricbeat
●
Packetbeat
●
...
Logstash-Config
input {
file {
path => "/var/log/apache2/access.log"
}
}
filter {
grok {
match => { message => "%{COMBINEDAPACHELOG}" }
}
}
output {
elasticsearch_http {
host => "localhost"
}
}
Kibana
Kibana
Recap
●
Einlesen, Anreichern, Speichern von Logevents
●
Zahlreiche Inputs in Logstash
●
Konsolidierung
●
Zentralisierung
●
Auswertung
Noch viel mehr!
●
Unterschiedliche Suchfeatures
●
Viele Aggregationen
●
Geodaten
●
Percolator
Elasticsearch 5.0
●
Text- und Keyword-Felder
●
Painless Scripting Language
●
Shrink-API
●
Verbesserugen Performance, Betrieb, …
●
http://elasticsearch-buch.de/2016/10/27/neues-
in-elasticsearch-5.html
Weitere Infos
Weitere Infos
●
http://elastic.co
●
Elasticsearch – The definitive Guide
●
https://www.elastic.co/guide/en/elasticsearch/guide/master
/index.html
●
Elasticsearch in Action
●
https://www.manning.com/books/elasticsearch-in-action
●
http://blog.florian-hopf.de

More Related Content

Viewers also liked

Risk Mitigation Trees - Review test handovers with stakeholders (2004)
Risk Mitigation Trees - Review test handovers with stakeholders (2004)Risk Mitigation Trees - Review test handovers with stakeholders (2004)
Risk Mitigation Trees - Review test handovers with stakeholders (2004)Neil Thompson
 
Isn power point copy
Isn power point   copyIsn power point   copy
Isn power point copypenopoly
 
A Decade of Crying Mothers
A Decade of Crying MothersA Decade of Crying Mothers
A Decade of Crying MothersDr Nahin Mamun
 
Owwl upgrades
Owwl upgradesOwwl upgrades
Owwl upgradesknollnook
 
ALIA NLS7 Career Planning Workshop Contributed Slides
ALIA NLS7 Career Planning Workshop Contributed SlidesALIA NLS7 Career Planning Workshop Contributed Slides
ALIA NLS7 Career Planning Workshop Contributed SlidesSue Hutley
 
'Best Practices' & 'Context-Driven' - Building a bridge (2003)
'Best Practices' & 'Context-Driven' - Building a bridge (2003)'Best Practices' & 'Context-Driven' - Building a bridge (2003)
'Best Practices' & 'Context-Driven' - Building a bridge (2003)Neil Thompson
 
La raccolta differenziata e la tarsu
La raccolta differenziata e la tarsuLa raccolta differenziata e la tarsu
La raccolta differenziata e la tarsuexlab
 
Holistic Test Analysis & Design (2007)
Holistic Test Analysis & Design (2007)Holistic Test Analysis & Design (2007)
Holistic Test Analysis & Design (2007)Neil Thompson
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearchFlorian Hopf
 
Can the Internet Change Bangladesh
Can the Internet Change BangladeshCan the Internet Change Bangladesh
Can the Internet Change BangladeshDr Nahin Mamun
 
ใบงานที่ ประวัติส่วนตัว2
ใบงานที่ ประวัติส่วนตัว2ใบงานที่ ประวัติส่วนตัว2
ใบงานที่ ประวัติส่วนตัว2princess Thirteenpai
 
12 practice paper_3_h_-_set_a_mark_scheme
12 practice paper_3_h_-_set_a_mark_scheme12 practice paper_3_h_-_set_a_mark_scheme
12 practice paper_3_h_-_set_a_mark_schemeclaire meadows-smith
 
Chapter 1
Chapter 1Chapter 1
Chapter 1Cel Koh
 
Social pres
Social presSocial pres
Social presadshock
 
English project (aviral gupta)
English project (aviral gupta)English project (aviral gupta)
English project (aviral gupta)aviralgupta14
 

Viewers also liked (18)

Risk Mitigation Trees - Review test handovers with stakeholders (2004)
Risk Mitigation Trees - Review test handovers with stakeholders (2004)Risk Mitigation Trees - Review test handovers with stakeholders (2004)
Risk Mitigation Trees - Review test handovers with stakeholders (2004)
 
Isn power point copy
Isn power point   copyIsn power point   copy
Isn power point copy
 
A Decade of Crying Mothers
A Decade of Crying MothersA Decade of Crying Mothers
A Decade of Crying Mothers
 
Presentation
PresentationPresentation
Presentation
 
Owwl upgrades
Owwl upgradesOwwl upgrades
Owwl upgrades
 
ALIA NLS7 Career Planning Workshop Contributed Slides
ALIA NLS7 Career Planning Workshop Contributed SlidesALIA NLS7 Career Planning Workshop Contributed Slides
ALIA NLS7 Career Planning Workshop Contributed Slides
 
'Best Practices' & 'Context-Driven' - Building a bridge (2003)
'Best Practices' & 'Context-Driven' - Building a bridge (2003)'Best Practices' & 'Context-Driven' - Building a bridge (2003)
'Best Practices' & 'Context-Driven' - Building a bridge (2003)
 
La raccolta differenziata e la tarsu
La raccolta differenziata e la tarsuLa raccolta differenziata e la tarsu
La raccolta differenziata e la tarsu
 
Holistic Test Analysis & Design (2007)
Holistic Test Analysis & Design (2007)Holistic Test Analysis & Design (2007)
Holistic Test Analysis & Design (2007)
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
 
Can the Internet Change Bangladesh
Can the Internet Change BangladeshCan the Internet Change Bangladesh
Can the Internet Change Bangladesh
 
ใบงานที่ ประวัติส่วนตัว2
ใบงานที่ ประวัติส่วนตัว2ใบงานที่ ประวัติส่วนตัว2
ใบงานที่ ประวัติส่วนตัว2
 
Crab fishing
Crab fishingCrab fishing
Crab fishing
 
Networking Basics
Networking BasicsNetworking Basics
Networking Basics
 
12 practice paper_3_h_-_set_a_mark_scheme
12 practice paper_3_h_-_set_a_mark_scheme12 practice paper_3_h_-_set_a_mark_scheme
12 practice paper_3_h_-_set_a_mark_scheme
 
Chapter 1
Chapter 1Chapter 1
Chapter 1
 
Social pres
Social presSocial pres
Social pres
 
English project (aviral gupta)
English project (aviral gupta)English project (aviral gupta)
English project (aviral gupta)
 

More from Florian Hopf

Modern Java Features
Modern Java Features Modern Java Features
Modern Java Features Florian Hopf
 
Einführung in Elasticsearch
Einführung in ElasticsearchEinführung in Elasticsearch
Einführung in ElasticsearchFlorian Hopf
 
Java clients for elasticsearch
Java clients for elasticsearchJava clients for elasticsearch
Java clients for elasticsearchFlorian Hopf
 
Einfuehrung in Elasticsearch
Einfuehrung in ElasticsearchEinfuehrung in Elasticsearch
Einfuehrung in ElasticsearchFlorian Hopf
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for ElasticsearchFlorian Hopf
 
Einführung in Elasticsearch
Einführung in ElasticsearchEinführung in Elasticsearch
Einführung in ElasticsearchFlorian Hopf
 
Anwendungsfälle für Elasticsearch JAX 2015
Anwendungsfälle für Elasticsearch JAX 2015Anwendungsfälle für Elasticsearch JAX 2015
Anwendungsfälle für Elasticsearch JAX 2015Florian Hopf
 
Anwendungsfälle für Elasticsearch JavaLand 2015
Anwendungsfälle für Elasticsearch JavaLand 2015Anwendungsfälle für Elasticsearch JavaLand 2015
Anwendungsfälle für Elasticsearch JavaLand 2015Florian Hopf
 
Anwendungsfaelle für Elasticsearch
Anwendungsfaelle für ElasticsearchAnwendungsfaelle für Elasticsearch
Anwendungsfaelle für ElasticsearchFlorian Hopf
 
Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)
Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)
Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)Florian Hopf
 
Search Evolution - Von Lucene zu Solr und ElasticSearch
Search Evolution - Von Lucene zu Solr und ElasticSearchSearch Evolution - Von Lucene zu Solr und ElasticSearch
Search Evolution - Von Lucene zu Solr und ElasticSearchFlorian Hopf
 
Akka Presentation Schule@synyx
Akka Presentation Schule@synyxAkka Presentation Schule@synyx
Akka Presentation Schule@synyxFlorian Hopf
 
Lucene Solr talk at Java User Group Karlsruhe
Lucene Solr talk at Java User Group KarlsruheLucene Solr talk at Java User Group Karlsruhe
Lucene Solr talk at Java User Group KarlsruheFlorian Hopf
 

More from Florian Hopf (13)

Modern Java Features
Modern Java Features Modern Java Features
Modern Java Features
 
Einführung in Elasticsearch
Einführung in ElasticsearchEinführung in Elasticsearch
Einführung in Elasticsearch
 
Java clients for elasticsearch
Java clients for elasticsearchJava clients for elasticsearch
Java clients for elasticsearch
 
Einfuehrung in Elasticsearch
Einfuehrung in ElasticsearchEinfuehrung in Elasticsearch
Einfuehrung in Elasticsearch
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for Elasticsearch
 
Einführung in Elasticsearch
Einführung in ElasticsearchEinführung in Elasticsearch
Einführung in Elasticsearch
 
Anwendungsfälle für Elasticsearch JAX 2015
Anwendungsfälle für Elasticsearch JAX 2015Anwendungsfälle für Elasticsearch JAX 2015
Anwendungsfälle für Elasticsearch JAX 2015
 
Anwendungsfälle für Elasticsearch JavaLand 2015
Anwendungsfälle für Elasticsearch JavaLand 2015Anwendungsfälle für Elasticsearch JavaLand 2015
Anwendungsfälle für Elasticsearch JavaLand 2015
 
Anwendungsfaelle für Elasticsearch
Anwendungsfaelle für ElasticsearchAnwendungsfaelle für Elasticsearch
Anwendungsfaelle für Elasticsearch
 
Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)
Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)
Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)
 
Search Evolution - Von Lucene zu Solr und ElasticSearch
Search Evolution - Von Lucene zu Solr und ElasticSearchSearch Evolution - Von Lucene zu Solr und ElasticSearch
Search Evolution - Von Lucene zu Solr und ElasticSearch
 
Akka Presentation Schule@synyx
Akka Presentation Schule@synyxAkka Presentation Schule@synyx
Akka Presentation Schule@synyx
 
Lucene Solr talk at Java User Group Karlsruhe
Lucene Solr talk at Java User Group KarlsruheLucene Solr talk at Java User Group Karlsruhe
Lucene Solr talk at Java User Group Karlsruhe
 

Einführung in Elasticsearch