Mastering ElasticSearch with Ruby and Tire

A tutorial on what ElasticSearch is and how to use it effectively in a real project.
The talk discusses how to integrate a search experience into an existing application, showing all the steps from downloading and configuring ElasticSearch to building the UI and wiring up the search logic (in a Rails application).

The talk was presented at RubyConf 2013.

Published in: Technology
1 Comment
  • karmiq pointed me to: https://github.com/rubygems/rubygems.org/pull/455

    It is a real integration of ElasticSearch into RubyGems (waiting to be merged into the RubyGems code base) and is well worth checking out. Awesome job, Karel!


  1. 1. Mastering ElasticSearch w/ Ruby (+ Tire) Luca Bonmassar, Gild RubyConf 2013 Tuesday, 5 November 13
  2. 2. Who am I? I’m Luca Bonmassar (@openmosix) # 31 # Italian living in San Francisco (and Stockholm) I work at Gild I love building products [for fun, profit and boredom]
  3. 3. Search use case You’re building a product User generated content Let (other) users find or discover this content
  4. 4. Search is NOT easy It usually starts as but then you want to support AND, OR, NOT, double quotes on multiple fields so and then it goes like
  5. 5. And so it goes...
  6. 6. Agenda Let’s define a “pet project” Boilerplate (download, install, scaffold, config, bla bla bla, yadda, yadda, yadda) Build a website w/ simple search Build a more advanced search What next (homework)
  7. 7. Project! (for fun and no profit)
  8. 8. Project: build a full-text search for RubyGems.org
  9. 9. RubyGems Search Architecture [diagram: RubyGems.org, Web Spider, MongoDB, Elastic Search, Rails App (Tire)]
  10. 10. RubyGems crawler github.com/openmosix/rubygems-crawler 1. Download all ruby gem names from https://rubygems.org/gems 2. Use gems API to download each gem info: read JSON - write JSON (MongoDB)
  11. 11. RubyGems Crawler II
  12. 12. RubyGems Search v1 Code: github.com/openmosix/rubygems-search Spoiler: http://rubyconf.bonmassar.it
  13. 13. ElasticSearch
  14. 14. ElasticSearch II Open source search+analytics engine Distributed [near] Realtime search Multi tenant Built on Apache Lucene REST APIs JSON documents
  15. 15. ElasticSearch III 1. Download / set up an ES cluster 2. Define settings and data mapping (opt) 3. Index data 4. Query index [diagram: MongoDB, Elastic Search] { "ok" : true, "status" : 200, "version" : { "number" : "0.90.5", "lucene_version" : "4.4" }, "tagline" : "You Know, for Search" } Curl: curl -X GET 'http://localhost:9200/ruby_gems/_search?from=0&size=25&pretty'
  16. 16. ES download + setup > wget http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.6.tar.gz > tar zxvf elasticsearch-0.90.6.tar.gz > sudo mv elasticsearch-0.90.6 /usr/local/ Hint #1: you need Java Hint #2: you need Oracle Java
  17. 17. ES config > ls elasticsearch-0.90.6/config/ logging.yml: where to log, how much to log. elasticsearch.yml: all server config. Defines: name of the cluster (change it!!!); node parameters (master/slave, store data/router); sharding and # replicas; paths; plugins; memory (JVM, heap, memory locking); network config; “gateway” (cluster backup); recovery; discovery; slow log + GC log. Default options are good enough for dev env
  18. 18. ES boot + test Start: bin/elasticsearch Test: curl http://localhost:9200/ { "ok" : true, "status" : 200, "name" : "Iron Fist", "version" : { "number" : "0.90.6", "lucene_version" : "4.5.1" }, "tagline" : "You Know, for Search" } Stop: curl -XPOST 'http://localhost:9200/_shutdown' {"cluster_name":"elasticsearch","nodes": {"jm2Z3J4dSzebjJ7Px2fAJg":{"name":"Iron Fist"}}}
  19. 19. Profit (you are now an ElasticSearch expert - go and tell the world)
  20. 20. ElasticSearch operations Create a “RubyGem” index Define a “RubyGem” index data mapping Index data (e.g. upload data from MongoDB to ES index = POST) Query (= GET)
  21. 21. Tire now Re-Tire ;( A ruby gem wrapping ElasticSearch REST APIs into a powerful ruby DSL ActiveModel integration Rake tasks and utilities to load and query ElasticSearch
  22. 22. Tire setup > echo "gem 'tire'" >> Gemfile && bundle install > cat config/initializers/elasticsearch.rb ... Tire::Configuration.url('http://localhost:9200') Tire.configure { logger "#{Rails.root}/log/elasticsearchqueries.log" } if ENV['ES_LOG']
  23. 23. Define an ES index (with Tire DSL)
  24. 24. Indexing Get a record Convert it to JSON format (to_indexed_json) Push it to Elastic Search (.update_index) ...under the hood...
  25. 25. Index (all data) Naive (POST on index for each record): Use bulk updates: ...under the hood...
  26. 26. Search I
  27. 27. Search II
  28. 28. Simple Search
  29. 29. Highlight matches Text ...add some CSS...
  30. 30. Advanced Search
  31. 31. Advanced Search II
  32. 32. Advanced Search III
  33. 33. Facets
  34. 34. ES facets
  35. 35. ES facets (running)
  36. 36. Facets - UI
  37. 37. Bonsai Cool I Search Suggesters (Did you mean... ?)
  38. 38. Bonsai Cool II “Similar to this” (aka “More Like This” API)
  39. 39. Bonsai Cool III Percolate API
  40. 40. Deployment I Run your own cluster Some learnings: at least 3 nodes; memory profiling / GC; install very good monitoring (github.com/karmi/elasticsearch-paramedic); more RAM is (always) better; check IOPS (if on AWS). Pros: total control; cheaper (a lot cheaper). Cons: can be a nightmare / requires dedicated devops
  41. 41. Deployment II ElasticSearch as a service: http://found.no http://searchly.com http://bonsai.io Pros: get cluster up & running in a minute; focus on dev, not troubleshooting; professional support. Cons: expensive; can be in the wrong region / hosting provider; expensive. Did I say expensive?
  42. 42. Thanks! Code: github.com/openmosix/rubygems-search github.com/openmosix/rubygems-crawler Demo (will be down by the end of rubyconf): http://rubyconf.bonmassar.it Say “hi”: Luca Bonmassar - luca@gild.com github.com/openmosix twitter.com/openmosix linkedin.com/in/lucabonmassar
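
The code shown on the indexing, search, highlight and facet slides (24 to 35) did not survive in this transcript. As a rough sketch only, and not the speaker's code or Tire's DSL, the request bodies that those steps send to ElasticSearch 0.90 can be built with plain Ruby and the standard `json` library. The type name `ruby_gem`, the document fields (`name`, `info`, `licenses`) and the helper names `bulk_payload` and `search_body` are assumptions made for illustration; only the `ruby_gems` index name appears in the slides.

```ruby
require "json"

# Assumed names: the slides show "ruby_gems" in a URL; the type is hypothetical.
INDEX = "ruby_gems"
TYPE  = "ruby_gem"

# Build the newline-delimited body for a POST to /_bulk:
# one action line followed by one document line per record.
def bulk_payload(records, index: INDEX, type: TYPE)
  lines = records.flat_map do |record|
    action = {
      "index" => { "_index" => index, "_type" => type, "_id" => record.fetch("name") }
    }
    [JSON.generate(action), JSON.generate(record)]
  end
  lines.join("\n") + "\n" # the bulk body must end with a newline
end

# Build a search body with a query_string query (supports AND, OR, NOT and
# quoted phrases), highlighting on one field, and a terms facet.
# Field names here are assumptions, not taken from the talk.
def search_body(text, from: 0, size: 25)
  {
    "query" => {
      "query_string" => {
        "query"            => text,
        "fields"           => ["name", "info"],
        "default_operator" => "AND"
      }
    },
    "highlight" => { "fields" => { "info" => {} } },
    "facets"    => { "licenses" => { "terms" => { "field" => "licenses" } } },
    "from"      => from,
    "size"      => size
  }
end

# Made-up sample documents, for illustration only.
gems = [
  { "name" => "tire",  "info" => "Ruby client for ElasticSearch", "licenses" => ["MIT"] },
  { "name" => "rails", "info" => "Full-stack web framework",      "licenses" => ["MIT"] }
]

puts bulk_payload(gems)
puts JSON.pretty_generate(search_body("elasticsearch AND ruby"))
```

The first string would be sent as the body of a POST to `http://localhost:9200/_bulk`, and the second as the body of a search request against `http://localhost:9200/ruby_gems/_search`. Nothing here contacts a server, so the sketch can be run without a cluster.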
