Elasticsearch - Devoxx France 2012 - English version

Elasticsearch presentation for Devoxx France 2012
English translation (feel free to correct my bad English ;-) )
French version is available here: http://www.slideshare.net/dadoonet/elasticsearch-devoxx-france-2012


  1. 1. Elasticsearch: a search engine designed for the cloud, by David Pilato (@dadoonet and @elasticsearchfr)
  2. 2. { "speaker" : "David Pilato" }
     $ curl http://localhost:9200/devoxx/speaker/dpilato
     {
       "name" : "David Pilato",
       "jobs" : [
         { "company" : "SRA Europe (SSII)", "mission" : "bon à tout faire", "duration" : 3 },
         { "company" : "SFR", "mission" : "touche à tout", "duration" : 3 },
         { "company" : "e-Brands / Vivendi", "mission" : "chef de projets", "duration" : 4 },
         { "company" : "DGDDI (customs)", "mission" : "mouton à 5 pattes", "duration" : 7 }
       ],
       "passions" : [ "family", "job", "deejay" ],
       "blog" : "http://dev.david.pilato.fr/",
       "twitter" : [ "@dadoonet", "@elasticsearchfr" ],
       "email" : "david@pilato.fr"
     }
     (The "mission" values are French idioms: jack of all trades, dabbles in everything, project manager, five-legged sheep, i.e. a rare bird. SSII is a French term for an IT services company.)
  3. 3. Abstract
     • Why do we need a search engine?
     • Elasticsearch: a complete, simple and fast solution
     • What about indexing Twitter?
     Make some noise on @DevoxxFR with the #elasticsearch hashtag!
  4. 4. DO WE NEED A SEARCH ENGINE? A search engine? What for?
  6. 6. Usual use case with "old school" SQL
     Given a document persisted in a database:
     • date attribute: 19/04/2012
     • coded attribute country: FR
     • association table code/label:
       • Code: FR
       • Label: France
     • comment attribute: "There is a type error in the comment for this product. We should call David."
     [Diagram: table doc (date, country, comment) linked to table country (code, label)]
     Sections: Engine, Elasticsearch, Rivers, Facets, Demo, Architecture, Community
  8. 8. Usual need with "old school" SQL
     • Find a document from December 2011 about France containing "error" and "david"
     • SQL (the identifiers are French: pays = country, libelle = label, commentaire = comment):
       SELECT doc.*, pays.*
       FROM doc, pays
       WHERE doc.pays_code = pays.code
         AND doc.date_doc >= to_date('2011-12', 'yyyy-mm')
         AND doc.date_doc < to_date('2012-01', 'yyyy-mm')
         AND lower(pays.libelle) = 'france'
         AND lower(doc.commentaire) LIKE '%error%'
         AND lower(doc.commentaire) LIKE '%david%';
  10. 10. Performance impact of LIKE '%'
      See also: http://www.cestpasdur.com/2012/04/01/elasticsearch-vs-mysql-recherche
  14. 14. What is a search engine?
      • A search engine is:
        • an indexing engine for documents
        • a search engine over those indexes
      • A search engine is more powerful for searching: it’s designed for it!
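To make the "index, then search the index" idea concrete, here is a minimal sketch of an inverted index, the data structure Lucene-style engines are built around. It is a toy under stated assumptions: the tokenizer, the function names and the second sample document are made up for illustration, and real analyzers do far more (stemming, stop words, scoring).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search_all(index, *terms):
    """Return the ids of documents containing every term (AND semantics)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Document 1 is the comment from the SQL example; document 2 is invented.
docs = {
    1: "There is a type error in the comment for this product. We should call David.",
    2: "Nothing to report for this product.",
}
index = build_index(docs)
print(search_all(index, "error", "david"))
```

Unlike `LIKE '%error%'`, which has to scan every row, each lookup here is a dictionary access followed by a set intersection.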
  16. 16. ELASTICSEARCH: Your Data, your Search!
  23. 23. Elasticsearch
      • Search engine for the NoSQL generation
      • Based on the standard Apache Lucene library
      • Hides the Java / Lucene complexity behind standard HTTP / RESTful / JSON services
      • You can use it from any language or platform
      • Adds the cloud layer that Lucene lacks
      • It’s an engine, not a graphical user interface!
  28. 28. Key points
      • Easy! In a few minutes (zero configuration), you get a full search engine ready to receive your documents and run your searches.
      • Efficient! Just start new Elasticsearch nodes to scale horizontally with replication and load balancing.
      • Powerful! A Lucene-based product, with parallel processing to get acceptable response times (mostly under 100 ms).
      • Complete! Many features: analysis and facets, percolation, rivers, plugins, …
  33. 33. Storing your data
      • Document: a full object containing all your data (in the NoSQL sense). To think "search", you have to forget RDBMS and think "documents".
        A tweet (the text is French for "Welcome to the #elasticsearch conference for #devoxxfr"):
        {
          "text": "Bienvenue à la conférence #elasticsearch pour #devoxxfr",
          "created_at": "2012-04-06T20:45:36.000Z",
          "source": "Twitter for iPad",
          "truncated": false,
          "retweet_count": 0,
          "hashtag": [
            { "text": "elasticsearch", "start": 27, "end": 40 },
            { "text": "devoxxfr", "start": 47, "end": 55 }
          ],
          "user": {
            "id": 51172224,
            "name": "David Pilato",
            "screen_name": "dadoonet",
            "location": "France",
            "description": "Soft Architect, Project Manager, Senior Developper.\r\nAt this time, enjoying NoSQL world : CouchDB, ElasticSearch.\r\nDeeJay 4 times a year, just for fun !"
          }
        }
      • Type: groups all documents of the same type
      • Index: logical storage of related document types
  41. 41. Playing with Elasticsearch
      REST API: http://host:port/[index]/[type]/[_action/id]
      HTTP methods: GET, POST, PUT, DELETE
      Documents
      • curl -XPUT http://localhost:9200/twitter/tweet/1
      • curl -XGET http://localhost:9200/twitter/tweet/1
      • curl -XDELETE http://localhost:9200/twitter/tweet/1
      Search
      • curl -XGET http://localhost:9200/twitter/tweet/_search
      • curl -XGET http://localhost:9200/twitter/_search
      • curl -XGET http://localhost:9200/_search
      Elasticsearch metadata
      • curl -XGET http://localhost:9200/twitter/_status
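The URL scheme above can be sketched as a small helper that assembles the path from its optional parts. This is only an illustration: `es_request` and `BASE_URL` are hypothetical names, and the request objects are built with Python's standard library but never sent, so no running node is needed.

```python
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumed local node, as on the slides


def es_request(method, index=None, doc_type=None, suffix=None):
    """Build (but do not send) a request for /[index]/[type]/[_action or id]."""
    parts = [part for part in (index, doc_type, suffix) if part]
    return Request(BASE_URL + "/" + "/".join(parts), method=method)


# The narrower the path, the narrower the search scope.
for args in (("twitter", "tweet", "_search"), ("twitter", None, "_search"), (None, None, "_search")):
    req = es_request("GET", *args)
    print(req.get_method(), req.full_url)
```

Dropping the type, then the index, widens the search from one type to one index to the whole cluster, which mirrors the three search commands on the slide.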
  43. 43. Let’s index a document
      $ curl -XPUT localhost:9200/twitter/tweet/1 -d '
      {
        "text": "Bienvenue à la conférence #elasticsearch pour #devoxxfr",
        "created_at": "2012-04-06T20:45:36.000Z",
        "source": "Twitter for iPad",
        "truncated": false,
        "retweet_count": 0,
        "hashtag": [
          { "text": "elasticsearch", "start": 27, "end": 40 },
          { "text": "devoxxfr", "start": 47, "end": 55 }
        ],
        "user": {
          "id": 51172224,
          "name": "David Pilato",
          "screen_name": "dadoonet",
          "location": "France",
          "description": "Soft Architect, Project Manager, Senior Developper.\r\nAt this time, enjoying NoSQL world : CouchDB, ElasticSearch.\r\nDeeJay 4 times a year, just for fun !"
        }
      }'
      Response:
      { "ok":true, "_index":"twitter", "_type":"tweet", "_id":"1" }
  49. 49. Let’s search for documents
      $ curl localhost:9200/twitter/tweet/_search?q=elasticsearch
      {
        "took" : 24,
        "timed_out" : false,
        "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
        "hits" : {
          "total" : 1,
          "max_score" : 0.227,
          "hits" : [ {
            "_index" : "twitter",
            "_type" : "tweet",
            "_id" : "1",
            "_score" : 0.227,
            "_source" : {
              "text": "Bienvenue à la conférence #elasticsearch pour #devoxxfr",
              "created_at": "2012-04-06T20:45:36.000Z",
              "source": "Twitter for iPad",
              […]
            }
          } ]
        }
      }
      Callouts on the slide: total number of documents (hits.total), location (_index, _type, _id), relevance (_score), document source (_source).
  52. 52. Search results
      • Elasticsearch returns the first 10 results (even when millions match): pagination
      • You can move through the result set:
        $ curl "localhost:9200/twitter/tweet/_search?q=elasticsearch&from=10&size=10"
      • Scoring is computed from the term frequency in a document relative to the term frequency in the index:
        $ curl "localhost:9200/twitter/tweet/_search?q=elasticsearch&explain=true"
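The two ideas on this slide, paging with `from`/`size` and scoring a term by how frequent it is in a document relative to the whole index, can be sketched as follows. This is a toy TF-IDF formula chosen for illustration, not Lucene's actual scoring function; the function names and the three sample documents are assumptions.

```python
import math


def tf_idf(term, doc_tokens, all_docs):
    """Toy score: term frequency in the document, damped by how common the term is."""
    tf = doc_tokens.count(term)
    df = sum(1 for tokens in all_docs if term in tokens)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(1 + len(all_docs) / df)


def page(hits, from_=0, size=10):
    """Mimic the from/size parameters: skip `from_` hits, return at most `size`."""
    return hits[from_:from_ + size]


# Three invented, already tokenized documents.
docs = [
    ["elasticsearch", "at", "devoxxfr"],
    ["elasticsearch", "elasticsearch", "rocks"],
    ["hello", "devoxxfr"],
]
scores = [tf_idf("elasticsearch", tokens, docs) for tokens in docs]
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranked)                                   # document positions, best match first
print(page(list(range(25)), from_=10, size=10)) # the second page of 25 hits
```

The document that mentions the term twice ranks first, and the one that never mentions it ranks last with a score of zero.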
  54. 54. Searches
      QueryDSL for advanced searches (Type: Description)
      • Match All: search for everything (useful combined with filters)
      • QueryString: search with term analysis, wildcards (Lucene syntax* +, -, FROM, TO, ^)
      • Term: search for an individual term without analysis
      • Text: search for a text with analysis (OR is applied between tokens by default)
      • Wildcard: wildcard search (*, ?)
      • Bool: combine many criteria (MUST, MUST NOT, SHOULD)
      • Range: range search (>, >=, <, <=)
      • Prefix: useful for autocomplete requirements
      • Filtered: filtering queries
      • Fuzzy like this: useful to find documents that are "like" provided text
      • More like this: useful to find documents that are "like" provided text, with a minimal constraint on found terms
      * http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/queryparsersyntax.html
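As a sketch of how these query types combine, the block below builds a request body for the need stated earlier in the deck (December 2011, France, containing "error" and "david") using a Bool query that wraps Term, Range and QueryString clauses. The field names `country`, `date` and `comment` are assumptions taken from the SQL example, the lowercase `fr` assumes the field was lowercased at index time, and exact parameter names may differ between Elasticsearch versions.

```python
import json


def december_2011_query():
    """Build a bool query: every clause under "must" has to match."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Exact, non-analyzed match on the country code.
                    {"term": {"country": "fr"}},
                    # Half-open date range covering December 2011.
                    {"range": {"date": {"gte": "2011-12-01", "lt": "2012-01-01"}}},
                    # Lucene syntax: both words are required in the comment.
                    {"query_string": {"default_field": "comment", "query": "+error +david"}},
                ]
            }
        }
    }


body = december_2011_query()
print(json.dumps(body, indent=2))
```

The printed JSON is what would be passed with `-d` to a `_search` request.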
  56. 56. AUTOMATIC DATA PULLING, or "life is a long quiet river!"
  68. 68. Pulling documents
      [Animated diagram: documents (Doc) flowing from a Database to the search Engine]
  76. 76. Rivers
      • CouchDB River
      • MongoDB River
      • Wikipedia River
      • Twitter River
      • RabbitMQ River
      • RSS River
      • Dick Rivers
  77. 77. RESULT ANALYSIS (IN NEAR REAL TIME): looking at your data from different points of view
  78. 78. Facets
      Some tweets:
      ID  Username         Date        Hashtags
      1   dadoonet         2012-04-18  1
      2   devoxxfr         2012-04-18  5
      3   elasticsearchfr  2012-04-18  2
      4   dadoonet         2012-04-18  2
      5   devoxxfr         2012-04-18  6
      6   elasticsearchfr  2012-04-19  3
      7   dadoonet         2012-04-19  3
      8   devoxxfr         2012-04-19  7
      9   elasticsearchfr  2012-04-20  4
  80. 80. Term Facet
      Counting the tweets above per username gives:
      Username         Count
      dadoonet         3
      devoxxfr         3
      elasticsearchfr  3
  82. 82. Term Facet
      Request:
      "facets" : {
        "users" : { "terms" : { "field" : "username" } }
      }
      Response:
      "facets" : {
        "users" : {
          "_type" : "terms",
          "missing" : 0,
          "total" : 9,
          "other" : 0,
          "terms" : [
            { "term" : "dadoonet", "count" : 3 },
            { "term" : "devoxxfr", "count" : 3 },
            { "term" : "elasticsearchfr", "count" : 3 }
          ]
        }
      }
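What a term facet computes can be reproduced on the nine sample tweets with a plain counter. This is a sketch of the idea only, not how Elasticsearch implements it; `terms_facet` is a hypothetical helper and the tie-breaking order (alphabetical) is an assumption.

```python
from collections import Counter

# Usernames of the nine sample tweets, in ID order.
USERNAMES = ["dadoonet", "devoxxfr", "elasticsearchfr"] * 3


def terms_facet(values):
    """Count occurrences of each value, most frequent first, ties sorted by term."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        "total": len(values),
        "terms": [{"term": term, "count": count} for term, count in ordered],
    }


facet = terms_facet(USERNAMES)
for entry in facet["terms"]:
    print(entry["term"], entry["count"])
```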
  84. 84. Date Histogram Facet
      Counting the tweets above per period gives:
      Per month:
      Date        Count
      2012-04     9
      Per day:
      Date        Count
      2012-04-18  5
      2012-04-19  3
      2012-04-20  1
  86. 86. Date Histogram Facet
      Request:
      "facets" : {
        "perday" : {
          "date_histogram" : { "field" : "date", "interval" : "day" }
        }
      }
      Response:
      "facets" : {
        "perday" : {
          "_type" : "date_histogram",
          "entries" : [
            { "time": 1334700000000, "count": 5 },
            { "time": 1334786400000, "count": 3 },
            { "time": 1334872800000, "count": 1 }
          ]
        }
      }
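The bucketing behind a date histogram can be sketched on the sample dates by truncating each date to the chosen interval and counting. Note the simplification: the real response keys each bucket by a timestamp in milliseconds, as in the entries above, whereas this hypothetical `date_histogram` helper keeps readable date strings.

```python
from collections import Counter

# Dates of the nine sample tweets.
DATES = ["2012-04-18"] * 5 + ["2012-04-19"] * 3 + ["2012-04-20"]


def date_histogram(dates, interval="day"):
    """Group ISO dates (YYYY-MM-DD) by day or by month and count each bucket."""
    width = {"day": 10, "month": 7}[interval]  # length of the date prefix to keep
    return sorted(Counter(date[:width] for date in dates).items())


print(date_histogram(DATES, "day"))
print(date_histogram(DATES, "month"))
```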
  88. 88. Ranges Facet
      Grouping the Hashtags column of the tweets above into ranges gives:
      Ranges       Count  Min  Max  Mean   Total
      x < 3        3      1    2    1.667  5
      3 <= x < 5   3      3    4    3.333  10
      x >= 5       3      5    7    6      18
  90. 90. Ranges Facet
      Request:
      "facets" : {
        "hashtags" : {
          "range" : {
            "field" : "hashtags",
            "ranges" : [
              { "to" : 3 },
              { "from" : 3, "to" : 5 },
              { "from" : 5 }
            ]
          }
        }
      }
      Response:
      "facets" : {
        "hashtags" : {
          "_type" : "range",
          "ranges" : [
            { "to": 3, "count": 3, "min": 1, "max": 2, "total": 5, "mean": 1.667 },
            { "from": 3, "to": 5, "count": 3, "min": 3, "max": 4, "total": 10, "mean": 3.333 },
            { "from": 5, "count": 3, "min": 5, "max": 7, "total": 18, "mean": 6 }
          ]
        }
      }
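The statistics in the range facet can be recomputed from the sample data. In this sketch `from` is treated as inclusive and `to` as exclusive, which is what the slide's figures imply; `range_facet` is a hypothetical helper, and the mean is rounded to three decimals only to match the slide.

```python
# Hashtag counts of the nine sample tweets, in ID order.
HASHTAGS = [1, 5, 2, 2, 6, 3, 3, 7, 4]


def range_facet(values, ranges):
    """Compute count, min, max, total and mean for each from/to bucket."""
    result = []
    for spec in ranges:
        low, high = spec.get("from"), spec.get("to")
        bucket = [v for v in values
                  if (low is None or v >= low) and (high is None or v < high)]
        entry = dict(spec, count=len(bucket))
        if bucket:  # statistics are only defined for non-empty buckets
            total = sum(bucket)
            entry.update(min=min(bucket), max=max(bucket), total=total,
                         mean=round(total / len(bucket), 3))
        result.append(entry)
    return result


buckets = range_facet(HASHTAGS, [{"to": 3}, {"from": 3, "to": 5}, {"from": 5}])
for bucket in buckets:
    print(bucket)
```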
  94. 94. Commerce site usage
      [Screenshot of a commerce site; callouts mark the facets in use: Ranges, Term, Term, Ranges]
  100. 100. Faceted navigation
       [Screenshot; callouts: Fixed criteria, Term, Ranges, Date histogram, Results]
  102. 102. Faceted navigation
       [Screenshot; callout: Criteria]
  103. 103. Near Real Time Data Visualization
       • Perform a matchAll search on all data
       • Refresh the screen every x seconds
       • While new documents are being indexed
       [Screenshot callouts: Date histogram, Term]
  104. 104. DEMO APPLICATION: did we make noise?
  111. 111. Demo architecture
       [Diagram components: Chrome, Twitter River, Twitter Streaming API]
       $ curl -XPUT localhost:9200/_river/twitter/_meta -d '
       {
         "type" : "twitter",
         "twitter" : {
           "user" : "twitter_user",
           "password" : "twitter_password",
           "filter" : { "tracks" : ["devoxxfr"] }
         }
       }'
  112. 112. ARCHITECTURE: let’s go further with sharding / replicas / scalability
  119. 119. Glossary • Node : an Elasticsearch instance (roughly, a server) • Cluster : a set of nodes • Shard : a slice of an index, across which documents are distributed • Replica : one or more copies of a shard in the cluster • Primary shard : the shard elected as primary in the cluster. Lucene indexes documents there. • Secondary shard : stores a replica of a primary shard
  120. 120. Let’s create an index Cluster : Node 1. Client : CURL
  121. 121. Let’s create an index $ curl -XPUT localhost:9200/twitter -d '{ "index" : { "number_of_shards" : 2, "number_of_replicas" : 1 } }' Cluster : Node 1 (Shard 0, Shard 1). Replication rule is not satisfied. Client : CURL
  122. 122. Let’s create an index $ curl -XPUT localhost:9200/twitter -d '{ "index" : { "number_of_shards" : 2, "number_of_replicas" : 1 } }' Cluster : Node 1 (Shard 0, Shard 1), Node 2 (Shard 0, Shard 1). Replication rule is satisfied. Client : CURL
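The two "replication rule" messages above can be reproduced with a little arithmetic. The sketch below is a simplified model, not Elasticsearch's actual allocator. It assumes only one rule: a copy of a shard is never placed on a node that already holds another copy of the same shard, so every copy can be assigned only when the cluster has at least 1 + number_of_replicas nodes. The function name `replication_status` is hypothetical; the words "green" and "yellow" mirror the cluster health statuses that Elasticsearch reports.

```shell
#!/bin/sh
# Simplified model (assumption): a replica is never allocated on a node
# that already holds another copy of the same shard.
# Usage: replication_status NODES REPLICAS
replication_status() {
  nodes="$1"
  replicas="$2"
  if [ "$nodes" -ge $(( replicas + 1 )) ]; then
    echo "green"    # every primary and every replica can be assigned
  else
    echo "yellow"   # primaries are assigned, some replicas are not
  fi
}

replication_status 1 1   # one node, one replica per shard: prints "yellow"
replication_status 2 1   # a second node joins the cluster: prints "green"
```

On a running cluster, the real status can be read with `curl localhost:9200/_cluster/health`.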
  123. 123. Dynamic reallocation Cluster : Node 1, Node 2. Shards : Shard 0, Shard 0, Shard 1, Shard 1
  124. 124. Dynamic reallocation Cluster : Node 1, Node 2, Node 3 (new node). Shards : Shard 0, Shard 0, Shard 1, Shard 1
  125. 125. Dynamic reallocation Cluster : Node 1, Node 2, Node 3. Shards : Shard 0, Shard 0, Shard 0, Shard 1, Shard 1 (a copy of Shard 0 is being moved)
  126. 126. Dynamic reallocation Cluster : Node 1, Node 2, Node 3. Shards : Shard 0, Shard 0, Shard 1, Shard 1 (move completed)
  127. 127. Dynamic reallocation Cluster : Node 1, Node 2, Node 3, Node 4 (new node). Shards : Shard 0, Shard 0, Shard 1, Shard 1
  128. 128. Dynamic reallocation Cluster : Node 1, Node 2, Node 3, Node 4. Shards : Shard 0, Shard 0, Shard 1, Shard 1, Shard 1 (a copy of Shard 1 is being moved)
  129. 129. Dynamic reallocation Cluster : Node 1, Node 2, Node 3, Node 4. Shards : Shard 0, Shard 0, Shard 1, Shard 1. Tuning means finding the right number of nodes, shards and replicas !
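To get a feel for the tuning question, it helps to count shard copies. The sketch below is an illustration under a stated assumption: shard copies are spread as evenly as possible over the nodes, which is a simplification of what the cluster really does. The function name `copies_per_node` is hypothetical. An index has number_of_shards × (1 + number_of_replicas) copies in total, and the function prints the largest number of copies any single node has to hold.

```shell
#!/bin/sh
# Assumption: shard copies are balanced as evenly as possible across nodes.
# Usage: copies_per_node SHARDS REPLICAS NODES
copies_per_node() {
  shards="$1"
  replicas="$2"
  nodes="$3"
  total=$(( shards * (replicas + 1) ))      # primaries plus all replicas
  echo $(( (total + nodes - 1) / nodes ))   # ceiling of total / nodes
}

# The index from the slides: 2 shards, 1 replica, on a growing cluster.
for n in 2 3 4 5; do
  echo "nodes=$n max_copies_per_node=$(copies_per_node 2 1 "$n")"
done
```

With 2 shards and 1 replica there are 4 copies, so the busiest node holds 2 copies on a 2 or 3 node cluster and 1 copy from 4 nodes on. A fifth node would hold nothing for this index, which is why the number of nodes, shards and replicas has to be chosen together.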
  130. 130. Let’s index a document Cluster : Node 1, Node 2, Node 3, Node 4. Shards : Shard 0, Shard 0, Shard 1, Shard 1. Client (CURL) sends Doc 1 : $ curl -XPUT localhost:9200/twitter/tweet/1 -d '{ "text": "Bienvenue à la conférence #elasticsearch pour #devoxxfr", "created_at": "2012-04-06T20:45:36.000Z", "source": "Twitter for iPad", ... }'
  131. 131. Let’s index a document Doc 1 is stored in one copy of Shard 0.
  133. 133. Let’s index a document Doc 1 is now in both copies of Shard 0.
  134. 134. Let’s index another document Cluster : Node 1, Node 2, Node 3, Node 4. Shards : Shard 0, Shard 0, Shard 1, Shard 1. Doc 1 is in both copies of Shard 0. Client (CURL) sends Doc 2 : $ curl -XPUT localhost:9200/twitter/tweet/2 -d '{ "text": "Je fais du bruit pour #elasticsearch à #devoxxfr", "created_at": "2012-04-06T21:12:52.000Z", "source": "Twitter for iPad", ... }'
  136. 136. Let’s index another document Doc 2 is stored in one copy of Shard 1.
  138. 138. Let’s index another document Doc 2 is now in both copies of Shard 1.
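Why does Doc 1 land in Shard 0 and Doc 2 in Shard 1? Elasticsearch picks the shard by hashing a routing value (the document id by default) and taking the result modulo the number of primary shards. The sketch below illustrates that idea only. It uses the POSIX `cksum` checksum as a stand-in hash, which is an assumption and not the hash Elasticsearch uses, so the shard numbers it prints may differ from the ones on the slides. The function name `shard_for` is hypothetical.

```shell
#!/bin/sh
# Illustration only: cksum is a stand-in for Elasticsearch's own hash function.
# Usage: shard_for DOC_ID NUMBER_OF_SHARDS
shard_for() {
  doc_id="$1"
  shards="$2"
  hash=$(printf '%s' "$doc_id" | cksum | cut -d ' ' -f 1)
  echo $(( hash % shards ))   # same id and shard count always give the same shard
}

for id in 1 2; do
  echo "doc $id -> shard $(shard_for "$id" 2)"
done
```

Because the shard number depends on the shard count, the same document id would map elsewhere if `number_of_shards` changed, which is one reason that value is chosen when the index is created.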
  139. 139. Let’s search for documents Cluster : Node 1, Node 2, Node 3, Node 4. Doc 1 is in both copies of Shard 0, Doc 2 is in both copies of Shard 1. Client (CURL) : $ curl localhost:9200/twitter/_search?q=elasticsearch
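The `?q=elasticsearch` form is a shortcut: the same search can be sent as a JSON request body using a `query_string` query. The sketch below prints such a body. The helper names `search_body` and `search_tweets` are hypothetical, and `search_tweets` is defined but not called here because it assumes a node listening on localhost:9200 with the `twitter` index from the slides.

```shell
#!/bin/sh
# JSON body equivalent to ?q=elasticsearch (a query_string query).
search_body() {
  cat <<'EOF'
{
  "query" : {
    "query_string" : { "query" : "elasticsearch" }
  }
}
EOF
}

# Hypothetical helper: sends the body to a node assumed to run on localhost:9200.
# The Content-Type header is required by recent Elasticsearch versions.
search_tweets() {
  search_body | curl -s -H 'Content-Type: application/json' \
    'localhost:9200/twitter/_search?pretty' -d @-
}

search_body   # print the body; run search_tweets against a live cluster
```

The request body form becomes useful as soon as the search needs more than a single query string, for example filters or facets.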

Editor's Notes

  • Topics covered: What needs are we trying to address? What would a search engine be good for in my information system? How Elasticsearch meets these needs, and many others. Live demo: indexing Twitter messages! Make some noise by tweeting on @devoxxfr and #elasticsearch
