Real-time search in Drupal with Elasticsearch @Moldcamp

Published in: Technology

  1. 1. Real-time search in Drupal. Meet Elasticsearch By Alexei Gorobets asgorobets
  2. 2. Elasticsearch Flexible and powerful open source, distributed real-time search and analytics engine for the cloud
  3. 3. Why use Elasticsearch?
  4. 4. ● RESTful API ● Open Source ● JSON over HTTP ● based on Lucene ● distributed ● highly available ● schema free ● massively scalable
  5. 5. Setup in 2 steps: 1. Extract the archive 2. > bin/elasticsearch
  6. 6. How to use it?
  7. 7. > curl -XGET localhost:9200/?pretty
  8. 8. > curl -XGET localhost:9200/?pretty { "ok" : true, "status" : 200, "name" : "Infinity", "version" : { "number" : "0.90.1", "snapshot_build" : false, "lucene_version" : "4.3" }, "tagline" : "You Know, for Search" }
  9. 9. > curl -XGET localhost:9200/?pretty action (verb)
  10. 10. > curl -XGET localhost:9200/?pretty node + port
  11. 11. > curl -XGET localhost:9200/?pretty path
  12. 12. > curl -XGET localhost:9200/?pretty query string
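
The four parts labelled on the slides above (verb, node + port, path, query string) can be pulled apart programmatically. This is a minimal sketch using only the Python standard library; the `http://` scheme is added here as an assumption so the URL parses cleanly, and the `request` dictionary is purely illustrative.

```python
from urllib.parse import urlsplit

# The URL from the slides, with an explicit scheme (assumed) so it parses cleanly.
url = "http://localhost:9200/?pretty"
parts = urlsplit(url)

request = {
    "verb": "GET",                # the action, passed to curl as -XGET
    "node": parts.hostname,       # host of the Elasticsearch node
    "port": parts.port,           # HTTP port, 9200 in the slides
    "path": parts.path,           # "/" addresses the node root
    "query_string": parts.query,  # "pretty" asks for indented JSON
}

for name, value in request.items():
    print(f"{name:13} {value}")
```
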
  13. 13. Let's index some data
  14. 14. > PUT /index/type/id Where? The index. It is very similar to a database in SQL
  15. 15. > PUT /index/type/id What? The type. It is like a table in SQL: a content type, an entity type, or any kind of type you decide
  16. 16. > PUT /index/type/id Which? The ID: a node ID, an entity ID, or any kind of serial ID
  17. 17. > PUT /mysite/node/1 -d { "nid": "1", "status": "1", "title": "Hello elasticsearch", "body": "First elasticsearch document" }
  18. 18. > PUT /mysite/node/1 -d { "nid": "1", "status": "1", "title": "Hello elasticsearch", "body": "First elasticsearch document" } { "ok":true, "_index":"mysite", "_type":"node", "_id":"1", "_version":1 }
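
The same `PUT /index/type/id` request can be expressed from Python. This is a sketch under assumptions: `build_index_request` and the `base_url` default are hypothetical names chosen for illustration, and the request is only built, not sent, so no running node is needed to try it.

```python
import json
from urllib.request import Request


def build_index_request(index, doc_type, doc_id, document,
                        base_url="http://localhost:9200"):
    """Build (but do not send) a PUT /index/type/id request."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


node = {
    "nid": "1",
    "status": "1",
    "title": "Hello elasticsearch",
    "body": "First elasticsearch document",
}
request = build_index_request("mysite", "node", 1, node)
print(request.get_method(), request.full_url)
```

With a node listening on localhost:9200, passing `request` to `urllib.request.urlopen` would send it.
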
  19. 19. Let's GET some data
  20. 20. > GET /mysite/node/1 { "_index" : "mysite", "_type" : "node", "_id" : "1", "_version" : 1, "exists" : true, "_source" : { "nid":"1", "status":"1", "title":"Hello elasticsearch", "body":"First elasticsearch document" } }
  21. 21. > GET /mysite/node/1?fields=title,body Get specific fields
  22. 22. > GET /mysite/node/1?fields=title,body Get specific fields > GET /mysite/node/1/_source Get source only
  23. 23. Let's UPDATE some data
  24. 24. > PUT /mysite/node/1 -d { "status":"0" }
  25. 25. > PUT /mysite/node/1 -d { "status":"0" } { "ok":true, "_index":"mysite", "_type":"node", "_id":"1", "_version":2 }
  26. 26. UPDATE = DELETE + PUT
  27. 27. Let's DELETE some data
  28. 28. > DELETE /mysite/node/1
  29. 29. > DELETE /mysite/node/1 { "ok":true, "found":true, "_index":"mysite", "_type":"node", "_id":"1", "_version":3 }
  30. 30. Distributed, Highly Available
  31. 31. > PUT /new_index -d '{ "settings" : { "number_of_shards" : 3, "number_of_replicas" : 2 } }'
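
To see what the settings on the slide above imply: each primary shard gets its own set of replicas, so 3 shards with 2 replicas means 9 shard copies spread over the cluster. A small sketch of that arithmetic follows; the function names are made up for this example.

```python
import json


def index_settings(number_of_shards, number_of_replicas):
    """Request body for creating an index, as on the slide."""
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
        }
    }


def total_shard_copies(number_of_shards, number_of_replicas):
    """Every primary shard plus each of its replicas."""
    return number_of_shards * (1 + number_of_replicas)


print(json.dumps(index_settings(3, 2)))
print(total_shard_copies(3, 2), "shard copies in total")
```
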
  32. 32. Concurrency, Version control
  33. 33. > PUT /myapp/node/1?version=1 { "title": "hi girl" }
  34. 34. > PUT /myapp/node/1?version=1 { "title": "hi girl" } { "_index": "myapp", "_type": "node", "_id": "1", "_version": 1, "created": false }
  35. 35. > PUT /myapp/node/1?version=1 { "title": "hey boy" } # 200
  36. 36. > PUT /myapp/node/1?version=1 { "title": "hey boy" } # 409 > version conflict, current [2], provided [1]
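
The version check on the slides above is optimistic concurrency control: a write that names a version only succeeds if that is still the current version. The sketch below imitates the behaviour with an in-memory dictionary. `VersionedStore` and `VersionConflict` are hypothetical stand-ins, not Elasticsearch code.

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 response."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, document)

    def put(self, doc_id, document, version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        # Reject the write if the caller's version is stale.
        if version is not None and version != current:
            raise VersionConflict(
                f"version conflict, current [{current}], provided [{version}]"
            )
        self._docs[doc_id] = (current + 1, document)
        return current + 1


store = VersionedStore()
store.put("1", {"title": "hi girl"})              # creates version 1
store.put("1", {"title": "hey boy"}, version=1)   # accepted, now version 2

try:
    store.put("1", {"title": "hey boy"}, version=1)  # stale version
except VersionConflict as error:
    conflict_message = str(error)
    print("409:", conflict_message)
```
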
  37. 37. Let's SEARCH for something
  38. 38. > GET /_search
  39. 39. > GET /_search { "took" : 32, "timed_out" : false, "_shards" : { "total" : 20, "successful" : 20, "failed" : 0 }, "hits" : { results... } }
  40. 40. Let's SEARCH in multiple indices and types
  41. 41. > GET /index/_search > GET /index/type/_search > GET /index1,index2/_search > GET /myapp_*/type,entity_*/_search
  42. 42. Let's PAGINATE results
  43. 43. > GET /_search?size=10&from=20 size = results per page; from = offset of the first result
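
Turning a page number into `size` and `from` is simple arithmetic. A minimal sketch, assuming 1-based page numbers; `page_params` and `search_path` are illustrative names.

```python
from urllib.parse import urlencode


def page_params(page, per_page=10):
    """Translate a 1-based page number into size/from parameters."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"size": per_page, "from": (page - 1) * per_page}


def search_path(page, per_page=10):
    return "/_search?" + urlencode(page_params(page, per_page))


print(search_path(3))  # /_search?size=10&from=20
```

So the request on the slide is page 3 at 10 results per page.
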
  44. 44. Let's search oldschool
  45. 45. > GET /_search?q=title:elasticsearch > GET /_search?q=nid:60
  46. 46. +title:awesome +status:1 +created:[1369917354 TO *]
  47. 47. ?q=title:awesome%20%2Bcreated:[1369917354%20TO%20*]%2Bstatus:1 +title:awesome +status:1 +created:[1369917354 TO *] The ugly encoding =)
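
The encoding does not have to be done by hand. This sketch lets the standard library percent-encode the query string; it is stricter than the slide and also escapes colons and brackets, which decodes to the same query. The `+` signs must be escaped because form-style decoding commonly reads a literal `+` as a space.

```python
from urllib.parse import quote, unquote

lucene_query = "+title:awesome +status:1 +created:[1369917354 TO *]"

# safe="" escapes every reserved character, including "+", ":" and "[".
encoded = quote(lucene_query, safe="")
path = "/_search?q=" + encoded

print(path)
print(unquote(encoded))  # round-trips to the readable query
```
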
  48. 48. Query DSL style
  49. 49. > GET /_search -d { "query": { "match": { "title": "awesome" } } }
  50. 50. > GET /_search -d { "query": { "match" : { "title" : { "query" : "+awesome -poor", "boost" : 2.0 } } } }
  51. 51. Mappings and types
  52. 52. Core types * string * number * date * boolean
  53. 53. Complex types * array type * object type * nested type Others: ip type geo point geo shape attachments
  54. 54. Define type mapping
  55. 55. > PUT /myapp/node/_mapping -d { "node" : { "properties" : { "message" : { "type" : "string", "store" : true } } } }
  56. 56. Indexed fields
  57. 57. Full text: analyzed == is split into terms. Term: not analyzed == is stored as is
  58. 58. > PUT /myapp/node/_mapping -d { "node" : { "properties" : { "name" : { "type" : "string", "store" : true, "index": "not_analyzed" } } } }
  59. 59. Dynamic mapping
  60. 60. Analysis and indexing
  61. 61. Inverted index
      1. “The quick brown fox jumped over the lazy dog”
      2. “Quick brown foxes leap over lazy dogs in summer”
      Term    | Doc_1 | Doc_2
      --------+-------+------
      Quick   |       | X
      The     | X     |
      brown   | X     | X
      dog     | X     |
      dogs    |       | X
      fox     | X     |
      foxes   |       | X
      in      |       | X
      jumped  | X     |
      lazy    | X     | X
      leap    |       | X
      over    | X     | X
      quick   | X     |
      summer  |       | X
      the     | X     |
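
The table above can be rebuilt in a few lines. This is a deliberately naive sketch: it splits on whitespace and does no lowercasing or stemming, which is exactly why "Quick" and "quick", or "dog" and "dogs", end up as separate terms.

```python
from collections import defaultdict

documents = {
    1: "The quick brown fox jumped over the lazy dog",
    2: "Quick brown foxes leap over lazy dogs in summer",
}


def build_inverted_index(docs):
    """Map every term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


index = build_inverted_index(documents)
for term in sorted(index):
    print(f"{term:8} {sorted(index[term])}")
```
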
  62. 62. Analyzer Tokenizers: ● standard ● keyword ● whitespace ● ngram TokenFilters: ● standard ● lowercase ● stop ● truncate ● snowball
  63. 63. > GET /_analyze?analyzer=standard -d 'this is a test baby' { "tokens" : [ { "token" : "test", "start_offset" : 10, "end_offset" : 14, "type" : "<ALPHANUM>", "position" : 4 }, { "token" : "baby", "start_offset" : 15, "end_offset" : 19, "type" : "<ALPHANUM>", "position" : 5 } ] }
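
An analyzer is a tokenizer followed by token filters. The sketch below imitates the response above with a regex tokenizer, a lowercase filter and a stop filter. The three-word stop list is an assumption picked to match this one sentence; it is not the real analyzer's list.

```python
import re

STOPWORDS = {"this", "is", "a"}  # assumed, tiny stop list for the example


def analyze(text, stopwords=STOPWORDS):
    """Tokenize on word characters, lowercase, then drop stop words."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\w+", text), start=1):
        token = match.group().lower()
        if token in stopwords:
            continue  # the position counter still advances past stop words
        tokens.append({
            "token": token,
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        })
    return tokens


tokens = analyze("this is a test baby")
for token in tokens:
    print(token)
```

Dropped stop words still consume a position, which is why "test" is reported at position 4 rather than 1.
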
  64. 64. Autocomplete fields
  65. 65. Queries & Filters
  66. 66. Queries & Filters Queries: ● full text search ● relevance score ● heavy ● not cacheable Filters: ● exact match ● show or hide ● lightning fast ● cacheable
  67. 67. Combine Filters & Queries
  68. 68. > GET /_search -d { "query": { "filtered": { "query": { "match": { "title": "awesome" } }, "filter": { "term": { "type": "article" } } } } }
  69. 69. and Sorting
  70. 70. > GET /_search -d { "query": { "filtered": { "query": { "match": { "title": "awesome" } }, "filter": { "term": { "type": "article" } } } }, "sort": {"date":"desc"} }
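
Request bodies like the one above are easier to get right when built as data and serialized, rather than typed as raw JSON. A sketch, with `filtered_search` as a hypothetical helper; it targets the `filtered` query syntax of the Elasticsearch version shown in these slides (0.90.1).

```python
import json


def filtered_search(match_field, text, term_field, term_value, sort=None):
    """Combine a scored match query with an exact-match term filter."""
    body = {
        "query": {
            "filtered": {
                "query": {"match": {match_field: text}},
                "filter": {"term": {term_field: term_value}},
            }
        }
    }
    if sort:
        body["sort"] = sort
    return body


body = filtered_search("title", "awesome", "type", "article",
                       sort={"date": "desc"})
print(json.dumps(body, indent=2))
```
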
  71. 71. Relevance. Explain API
  72. 72. Term frequency: How often does the term appear in the field? The more often, the more relevant. Inverse document frequency: How often does each term appear in the index? The more often, the less relevant. Field norm: How long is the field? The longer it is, the less likely it is that words in the field will be relevant.
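
The three factors can be written down as small functions. The formulas below follow the classic Lucene TF-IDF similarity as commonly documented; treat the exact constants as an assumption, since the slides only describe the direction of each effect.

```python
import math


def term_frequency(freq):
    """More occurrences in the field -> higher score."""
    return math.sqrt(freq)


def inverse_document_frequency(num_docs, doc_freq):
    """Terms found in many documents -> lower score."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_norm(num_terms):
    """Longer fields -> lower score."""
    return 1.0 / math.sqrt(num_terms)


# A rare term in a short field outweighs a common term in a long one.
rare_short = term_frequency(1) * inverse_document_frequency(100, 1) * field_norm(4)
common_long = term_frequency(1) * inverse_document_frequency(100, 50) * field_norm(100)
print(rare_short > common_long)
```
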
  73. 73. and Facets
  74. 74. Facets on Amazon
  75. 75. > GET /_search -d { "facets": { "home_team": { "terms": { "field": "field_home_team" } } } }
  76. 76. > GET /_search -d { "facets": { "home_team": { "terms": { "field": "field_home_team" } } } } Give your facet a name
  77. 77. > GET /_search -d { "facets": { "home_team": { "terms": { "field": "field_home_team" } } } } Your facet filter can be: ● Terms ● Range ● Histogram ● Date Histogram ● Filter ● Query ● Statistical ● Terms Stats ● Geo Distance
  78. 78. "facets" : { "home_team" : { "_type" : "terms", "missing" : 203, "total" : 100, "other" : 42, "terms" : [ { "term" : "hou", "count" : 8 }, { "term" : "sln", "count" : 6 }, ...
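
To make the fields of that response concrete, here is a terms facet computed in memory. It is a sketch with made-up sample data: `missing` counts documents without the field, `total` counts the values seen, and `other` is whatever did not fit into the returned top terms.

```python
from collections import Counter


def terms_facet(documents, field, size=10):
    """Count the values of one field across documents, top `size` first."""
    values = [doc[field] for doc in documents if doc.get(field) is not None]
    top = Counter(values).most_common(size)
    shown = sum(count for _, count in top)
    return {
        "_type": "terms",
        "missing": len(documents) - len(values),
        "total": len(values),
        "other": len(values) - shown,
        "terms": [{"term": term, "count": count} for term, count in top],
    }


# Hypothetical sample documents, one of them without the field.
games = [
    {"field_home_team": "hou"},
    {"field_home_team": "hou"},
    {"field_home_team": "sln"},
    {"field_home_team": "nya"},
    {},
]
facet = terms_facet(games, "field_home_team", size=2)
print(facet)
```
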
  79. 79. STOP! I want this in Drupal?
  80. 80. Available modules: ● Elasticsearch ● Elasticsearch Connector ● Search API elasticsearch
  81. 81. Development directions: 1. Search API implementation 2. Field Storage API 3. Alternative backends Available modules: ● Elasticsearch ● Elasticsearch Connector ● Search API elasticsearch
  82. 82. Field Storage API implementation Elasticsearch field storage sandbox by Damien Tournoud Started in July 2011
  83. 83. Field Storage API implementation Elasticsearch field storage sandbox by Damien Tournoud Started in July 2011 Elasticsearch EntityFieldQuery sandbox https://drupal.org/sandbox/asgorobets/2073151
  84. 84. Let's DEMO
  85. 85. Let the Search be with you
