0
You know, for search  February 18th, 2011, RivieraJUG
About me●   Lukáš Vlček ( @lukasvlcek )●   Java developer since 2001●   Joined Red Hat (JBoss division) in 2010●   Member ...
In the beginning there was...
Zzzzz......Shay Banon is not sleeping but dreaming!
Highly-available                {JSON}   RESTfulSearch Engine                                     Distributed             ...
!                              *                                 :-)*        ... and @ElasticSearch                     wa...
Dreams do come truehttps://github.com/elasticsearch/elasticsearch        http://www.elasticsearch.org
What is ElasticSearch ?●   Distributed●   Highly-available●   RESTful search engine (on top of Lucene)●   Designed to spea...
Demo #1RESTful JSON teaser
RESTful●   Network interface for data indexing, searching    and administration.curl ­XGET http://localhost:9200/index1,in...
Highly available●   For each index you can specify:    ●   Number of shards        –   Each index has fixed number of shar...
Distributed●   Check next slides...
ZEN Discovery      Node 1                    Node 2              Node 3              Node 4                      A        ...
ZEN Discovery      Node 1                    Node 2              Node 3              Node 4                      A        ...
Talking to the cluster●   Native client in Java and Groovy          Client                    Node 1   Node 2    Node 3●  ...
Talking to the cluster●   REST client             Client                       Node 1      Node 2      Node 3●   Many clie...
Nodes do not have to be equal●   Can be a master●   Can be a data node●   Can allow for REST transport interface    ●     ...
Gateway●   Long time persistency allows for whole (and    partial) cluster backup and recovery.    Types:    ●   Local (de...
Distributed queries●   You can control the type of the search query    per search request:    ●   Query and Fetch    ●   Q...
Demo #2 Dynamic allocation of indices,shards, replicas and Health API
Admin API●   Indices       –   Status       –   CRUD operation       –   Mapping, Open/Close, Update settings       –   Fl...
Demo #3Admin API: getting JVM and OS stats
Rich query API●   There is rich Query DSL for search, includes:    ●   Queries         –   Boolean, Fuzzy, MLT, Prefix, Di...
Facets●   Facets allows to provide aggregated data on    the search request.    ●   query    ●   filter    ●   terms    ● ...
Scripting support●   There is a support for using scripting languages    in many places (for example for custom scoring,  ...
Demo #4      Java API: Indexing dataREST API: Faceted search, Highlighting
Parent / Child●   The parent/child support allows to define a    parent relationship from a child to a parent type.    ●  ...
River●   Lets listen on stream of changes and index the    data...    ●   CouchDB    ●   RabbitMQ    ●   Twitter    ●   Wi...
Versioning (new in 0.15)●   “update if current” functionality●   ie: I can get a document, change it and then put    it ba...
Percolator (new in 0.15)●   The percolator API allows to register queries    against an index, and then send a percolate  ...
Q&A
Thank you!
Elastic Search
Upcoming SlideShare
Loading in...5
×

Elastic Search

5,507

Published on

My talk about Elastic Search, February 18th, RivieraJUG 2011

Transcript of "Elastic Search"

  1. 1. You know, for search February 18th, 2011, RivieraJUG
  2. 2. About me● Lukáš Vlček ( @lukasvlcek )● Java developer since 2001● Joined Red Hat (JBoss division) in 2010● Member of JBoss.org team, focusing on search
  3. 3. In the beginning there was...
  4. 4. Zzzzz......Shay Banon is not sleeping but dreaming!
  5. 5. Highly-available {JSON} RESTfulSearch Engine Distributed CLOUD Asynchronous Open Sourced ... buzzword dreaming! (ASL2)
  6. 6. ! * :-)* ... and @ElasticSearch was born!
  7. 7. Dreams do come truehttps://github.com/elasticsearch/elasticsearch http://www.elasticsearch.org
  8. 8. What is ElasticSearch ?● Distributed● Highly-available● RESTful search engine (on top of Lucene)● Designed to speak JSON (JSON in, JSON out)● and more... ... but first, lets check simple examples.
  9. 9. Demo #1RESTful JSON teaser
  10. 10. RESTful● Network interface for data indexing, searching and administration.curl ­XGET http://localhost:9200/index1,index2/typeA,typeB/_search ­d {  “query“ : { “match_all“ : {} }}You can query one or more indices.Indices can have aliases, you can alsouse _all for all indices.Each index have one or more types, somethinglike columns in DB table.
  11. 11. Highly available● For each index you can specify: ● Number of shards – Each index has fixed number of shards ● Number of replicas – Each shard can have 0-many replicas, can be changed dynamically
  12. 12. Distributed● Check next slides...
  13. 13. ZEN Discovery Node 1 Node 2 Node 3 Node 4 A B CA: { shards: 3, replicas: 2 } B: { shards: 2, replicas: 3 } C: { shards: 1, replicas: 0 } A1 A1 A1 B1 B1 B1 B1 A2 A2 A2 C1 B2 B2 B2 B2 A3 A3 A3 Gateway (longterm persistency of cluster data & metadata)
  14. 14. ZEN Discovery Node 1 Node 2 Node 3 Node 4 A B CA: { shards: 3, replicas: 2 } B: { shards: 2, replicas: 3 } C: { shards: 1, replicas: 4 } A1 A1 A1 B1 B1 B1 B1 A2 A2 A2 C1 C1 C1 C1 C1 B2 B2 B2 B2 A3 A3 A3 Can not allocate all replicas! Check Health API Gateway (longterm persistency of cluster data & metadata)
  15. 15. Talking to the cluster● Native client in Java and Groovy Client Node 1 Node 2 Node 3● Client type: ● Node client ● Transport Client
  16. 16. Talking to the cluster● REST client Client Node 1 Node 2 Node 3● Many clients built on top of REST API ● Perl, PHP, Python, Ruby, Erlang, ... etc
  17. 17. Nodes do not have to be equal● Can be a master● Can be a data node● Can allow for REST transport interface ● Http, memcached, thrift● Index store (file, memory) Node 1● JMX enabled● Thread pool type● ... Node 2 Node 3
  18. 18. Gateway● Long time persistency allows for whole (and partial) cluster backup and recovery. Types: ● Local (default) ● NFS ● HDFS ● AWS: S3
  19. 19. Distributed queries● You can control the type of the search query per search request: ● Query and Fetch ● Query then Fetch ● Dfs, Query and Fetch ● Dfs, Query then Fetch
  20. 20. Demo #2 Dynamic allocation of indices,shards, replicas and Health API
  21. 21. Admin API● Indices – Status – CRUD operation – Mapping, Open/Close, Update settings – Flush, Refresh, Snapshot, Optimize● Cluster – Health – State – Node Info and stats – Shutdown
  22. 22. Demo #3Admin API: getting JVM and OS stats
  23. 23. Rich query API● There is rich Query DSL for search, includes: ● Queries – Boolean, Fuzzy, MLT, Prefix, DisMax, ... ● Filters – And/Or/Not, Boolean, Geo, Missing, Exists, ... ● Highlighting ● Sort ● Facets – on a next slide...
  24. 24. Facets● Facets allows to provide aggregated data on the search request. ● query ● filter ● terms ● range ● (date) histogram ● statistical ● geo distance
  25. 25. Scripting support● There is a support for using scripting languages in many places (for example for custom scoring, script fields, script key in facets ...) ● mvel (default) ● JS ● Groovy ● Python
  26. 26. Demo #4 Java API: Indexing dataREST API: Faceted search, Highlighting
  27. 27. Parent / Child● The parent/child support allows to define a parent relationship from a child to a parent type. ● has_child (query, filter) ● top_children (filter)
  28. 28. River● Lets listen on stream of changes and index the data... ● CouchDB ● RabbitMQ ● Twitter ● Wikipedia
  29. 29. Versioning (new in 0.15)● “update if current” functionality● ie: I can get a document, change it and then put it back in (referencing the version ID I fetched) and it will either index or fail (if the document has been modified in the interim)● Completely real-time
  30. 30. Percolator (new in 0.15)● The percolator API allows to register queries against an index, and then send a percolate request which includes a document, and getting back the queries that match on that document out of set of registered queries.
  31. 31. Q&A
  32. 32. Thank you!
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×