2. About
● Horizontal scaling
● No single point of failure (if you set it up replicated)
● Communicates over HTTP (REST + JSON)
● No ACL: everyone who can connect to it can do everything
● Based on Apache Lucene, just like Apache Solr
● Alternative products: Apache Solr, Splunk, Apache Hadoop
● No schema. You don’t get the “alter table” problem with locking
● Second most popular search engine (Solr is #1)
● Automatically rebalances data
● Used by: Wikipedia, Mozilla, CERN, Foursquare, SoundCloud,
StumbleUpon, GitHub
● First release 2010
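To make the "HTTP (REST + JSON)" point concrete, here is a minimal sketch that builds a search request with the Python standard library but does not send it. The index name `twitter` is borrowed from a later slide; the field name `message` and the search text are made up for illustration.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, text):
    """Build (but do not send) a JSON search request for one index."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "message" and "hello" are hypothetical example values.
req = build_search_request("http://localhost:9200", "twitter", "message", "hello")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it to a running node you could do:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Any HTTP client works the same way, since the whole API is plain requests with JSON bodies.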
3. About II
● 1 master, with failover, controls the cluster
● No need for an external HA solution: it automatically connects to a node
that is working. Ask any node and it will ask all the other nodes required to
get the full answer
● Uses little CPU and memory, but lots of disk I/O. Large ES installations require more
memory
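Since any node can answer a request, a client only needs a list of nodes and can fall through to the next one when a node is down. The sketch below illustrates that idea on the client side; the host names and the `fake_fetch` stand-in are hypothetical, and this is not how any official client library is implemented.

```python
def first_answer(nodes, fetch):
    """Try each node in order; return (node, answer) from the first that responds."""
    errors = {}
    for node in nodes:
        try:
            return node, fetch(node)
        except OSError as exc:  # connection refused, timeout, DNS failure, ...
            errors[node] = exc
    raise ConnectionError(f"no node answered: {errors}")


def fake_fetch(node):
    """Hypothetical stand-in for an HTTP call; pretends es1 is down."""
    if node == "es1:9200":
        raise ConnectionRefusedError("node is down")
    return {"answered_by": node}


node, answer = first_answer(["es1:9200", "es2:9200", "es3:9200"], fake_fetch)
print(node, answer)
```

In real use `fetch` would wrap an HTTP request; network failures from `urllib` are also `OSError` subclasses, so the same loop applies.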
4. Installation on Ubuntu
● Logs: /var/log/elasticsearch
● Config: /etc/default/elasticsearch,
/etc/elasticsearch/elasticsearch.yml and
/usr/lib/systemd/system/elasticsearch.service
● Start/stop/restart/status: sudo start/stop/restart/status
elasticsearch
● If it stops, it has probably run out of memory. Increase it in the config
in that case. See the install doc…
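As a rough aid for the memory setting, the sketch below turns the "half of the machine's memory" rule from the "Nice to know" slide into a config line. It assumes an older Elasticsearch release that reads `ES_HEAP_SIZE` from /etc/default/elasticsearch; newer releases configure the heap elsewhere, so treat the variable name as an assumption and check the install doc for your version.

```python
def heap_setting(total_ram_mb, fraction=0.5):
    """Return a heap config line sized as a fraction of total RAM (default: half)."""
    heap_mb = int(total_ram_mb * fraction)
    # ES_HEAP_SIZE is an assumption that applies to older releases only.
    return f"ES_HEAP_SIZE={heap_mb}m"


# Example: a machine with 8 GB of RAM.
print(heap_setting(8192))
```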
5. Terms
● node = a machine
● index = a "table". It's recommended to create lots of indexes. Many
applications use one for each day
● shards = an index is distributed over x shards (default 5). One node can hold
multiple shards. It's not possible to change the number of shards of an existing
index later without an export and import.
● replica = number of copies you want of your data (default 1)
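The terms above can be tied together in a small sketch: one index per day, each created with explicit shard and replica counts. The `events` prefix and date format follow the `events-2015-01-*` pattern mentioned later in the slides. The helper names are made up, and `fnmatch` is only a local approximation of how a `*` in an index name selects indexes.

```python
from datetime import date, timedelta
from fnmatch import fnmatch


def daily_index(prefix, day):
    """Name of the index that holds one day's data, e.g. events-2015-01-30."""
    return f"{prefix}-{day:%Y-%m-%d}"


def index_settings(shards=5, replicas=1):
    """Request body for creating an index; the shard count is fixed at creation."""
    return {"settings": {"number_of_shards": shards, "number_of_replicas": replicas}}


start = date(2015, 1, 30)
names = [daily_index("events", start + timedelta(days=i)) for i in range(4)]
print(names)

# Restricting a search to January touches only two of the four indexes.
january = [n for n in names if fnmatch(n, "events-2015-01-*")]
print(january)

# With the defaults, 5 primary shards plus 1 replica each gives 10 shard copies.
s = index_settings()["settings"]
print(s["number_of_shards"] * (1 + s["number_of_replicas"]))
```

Daily indexes also soften the fixed shard count: a new count can take effect from the next day's index without an export and import.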
7. Operations tasks
● Take an index offline (close)
curl -XPOST 'localhost:9200/my_index/_close'
● Bring it back online (open)
curl -XPOST 'localhost:9200/my_index/_open'
● Delete index (wildcards are disabled)
curl -XDELETE 'http://localhost:9200/twitter/'
● cluster status
curl -XGET 'http://localhost:9200/_recovery?pretty=true'
● optimize (you probably only need to do this if you delete stuff in an index)
● Plus probably more
● https://www.elastic.co/guide/en/elasticsearch/reference/current/indices.html
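The index operations all follow the same shape: an HTTP method plus a path under the index name. As a sketch, they can be kept in a small table and turned into requests. The helper below only builds the requests and prints the equivalent curl lines. `index_request` and `OPERATIONS` are illustrative names, and `_open` is the counterpart of `_close` for bringing an index back online.

```python
import urllib.request

BASE = "http://localhost:9200"

# operation name -> (HTTP method, path template)
OPERATIONS = {
    "close": ("POST", "/{index}/_close"),   # take the index offline
    "open": ("POST", "/{index}/_open"),     # bring it back online
    "delete": ("DELETE", "/{index}"),       # remove the index and its data
}


def index_request(operation, index, base=BASE):
    """Build (but do not send) the HTTP request for one index operation."""
    method, path = OPERATIONS[operation]
    return urllib.request.Request(base + path.format(index=index), method=method)


for op in ("close", "open", "delete"):
    req = index_request(op, "my_index")
    print(f"curl -X{req.get_method()} '{req.full_url}'")
```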
8. Nice to know
● There is a limit on how many records a search returns (unless it has been changed)
● Searches are faster if you limit the number of indexes you search. You can use * as part of the
index name, e.g. events-2015-01-*
● Watch for out-of-memory errors and the max open files limit. These are usually problems in bigger installations
● They recommend giving ES half of the machine's memory
● https://www.elastic.co/products/watcher can be used to look for patterns in real time and send
alerts when they are detected
● You can see the shards on the file system, and which node holds which, like this:
/var/lib/elasticsearch/ndpelastic/nodes/0/indices/winlogbeat-2016.09.16
node 1 = 0 3
node 3 = 1 2 4
node 2 = 1 2
node 4 = 0 3 4
But at Experis everything is on 1 node
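The node listing above can be checked with a short sketch that inverts it from "node to shards" into "shard to nodes". The dictionary is copied from the slide. Reading each pair of holders as one primary and one replica is an interpretation, though it matches the defaults of 5 shards and 1 replica.

```python
from collections import defaultdict

# Shard numbers found on each node's file system, as listed on the slide.
layout = {
    "node 1": [0, 3],
    "node 2": [1, 2],
    "node 3": [1, 2, 4],
    "node 4": [0, 3, 4],
}


def copies_per_shard(layout):
    """Invert node -> shards into shard -> list of nodes holding a copy."""
    holders = defaultdict(list)
    for node, shards in layout.items():
        for shard in shards:
            holders[shard].append(node)
    return dict(holders)


holders = copies_per_shard(layout)
for shard in sorted(holders):
    print(f"shard {shard}: {', '.join(holders[shard])}")
```

Every shard ends up on two different nodes, so losing any single node still leaves one copy of each shard available.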