Solr & Lucene @ Etsy by Gregg Donovan

  1. Solr & Lucene at Etsy. Gregg Donovan, Technical Lead, Search (gregg@etsy.com)
  2. 1.5 years Solr & Lucene at Etsy.com; 3 years Solr & Lucene at TheLadders.com
  3. 8+ million members
  4. 9.3 million items
  5. 800k+ active sellers
  6. 1+ billion pageviews / month
  7. Maximize Solr out-of-the-box
  8. Hack at a low-level
  9. Know when to do each
  10. Or
  11. Don’t fear trunk
  12. builds.apache.org/job/Solr-trunk/changes
  13. http://localhost:8393/solr/placesuggest/select?q={!lucene}s*&sfield=latlong&pt=37.595804,-122.364521&sort=div(geodist(),sqrt(sum(population,50)))%20asc
  14. {!lucene} {!field} {!term} {!boost} {!func} {!dismax} {!edismax}
  15. Cheap ranking awesomeness
  16. ExternalFileField ftw!
  17. schema.xml:

          <fieldType name="file" keyField="treasury_id" defVal="0" stored="false" indexed="true" class="solr.ExternalFileField" valType="float"/>
          <field name="hotness" type="file"/>

      /search/data/treasury/external_hotness.1306390802088:

          1=2.3
          2=1.7
          3=1.1

      Solr query: sort={!func}hotness+desc
  18. ExternalFileField caveats
  19. More relevance: boost query
  20. http://localhost:8983/solr/listings/select?q={!boost b=$rel v=$qq}&rel=category:furniture^10+OR+((-material:acrylic)^5)&qq=desk
  21. Impression tracking
  22. etsy.com/search?q=desk&explain=1
  23. Side-by-Side testing
  24. Cheap performance wins
  25. Put off sharding till you must
  26. cat ${indexDir}/* > /dev/null
  27. Return IDs, minimize stored fields
  28. RAM: $10-20 / GB
  29. SSD: 0.1ms vs 10ms seek
  30. Custom?
  31. solr-user
  32. Tools for low-level hacking
  33. Continuous deployment
  34. One button. So easy a dog could do it.
  35. MTTR > MTBF
  36. github.com/etsy/logster
  37. Tracking GC
  38. export GC_DEBUG="-verbose:gc -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintAdaptiveSizePolicy -XX:AdaptiveSizePolicyOutputInterval=1 -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -Xloggc:/var/log/search/gc.log"
  39. Alerting
  40. Testing
  41. SaveAsFixture
  42. Profiling
  43. Java Primitive Library, fastutil, trove4j
  44. Know the hooks: SolrRequestHandler, SearchComponent, QParserPlugin, SolrEventListener, SolrCache, ValueSourceParser
  45. SolrIndexSearcher gotchas: reference counting; using it as a cache key: WeakHashMap<SolrIndexSearcher,MyValue> myCache...
  46. Example: personalized collections
  47. fq={!term f=id}123 OR {!term f=id}456
  48. Need a map of PK to docId
  49. Use custom SolrCache plus SolrEventListener to fill it
  50. github.com/giokincade/FastTermFilter
  51. i18n currency sorting and filtering
  52. currency.xml:

          <currencyConfig version="1.0">
            <currencies>
              <currency name="United States Dollar" symbol="$" code="USD"/>
              <currency name="Australian Dollar" symbol="$" code="AUD"/>
              <currency name="Canadian Dollar" symbol="$" code="CAD"/>
              <currency name="Czech Koruna" symbol="Kč" code="CZK"/>
              ...
            </currencies>
            <rates>
              <rate from="USD" to="AUD" rate="1.168750"/>
              <rate from="USD" to="CAD" rate="1.085000"/>
              <rate from="USD" to="CZK" rate="20.107500"/>
              <rate from="USD" to="DKK" rate="5.323750"/>
              ...
            </rates>
          </currencyConfig>

  53. Example price queries:

          price:[$10.00 to $50.00]
          price:[10.00USD to 50.00USD]
          price:20.00EUR
  54. MoneyFieldType.java:

          @Override
          public Query getRangeQuery(QParser parser, SchemaField field,
                                     String part1, String part2,
                                     final boolean minInclusive, final boolean maxInclusive) {
              // Parse both bounds; a bound without a currency code uses defaultCurrency.
              final MoneyValue p1 = MoneyValue.parse(part1, defaultCurrency);
              final MoneyValue p2 = MoneyValue.parse(part2, defaultCurrency);

              // Reject ranges whose bounds are in different currencies.
              if (!p1.getCurrencyCode().equals(p2.getCurrencyCode())) {
                  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
                      new ParseException("Cannot parse range query " + part1 + " to " + part2
                          + ": range queries only supported when upper and lower bound have same currency."));
              }

              // Filter on the field's value expressed in the currency of the bounds.
              String currencyCode = p1.getCurrencyCode();
              final MoneyValueSource vs = new MoneyValueSource(field, currencyCode, parser);
              return new SolrConstantScoreQuery(
                  new ValueSourceRangeFilter(vs, p1.getAmount() + "", p2.getAmount() + "",
                                             minInclusive, maxInclusive));
          }
  55. Replication gotcha
  56. SOLR-2202
  57. Related Searches
  58. Autosuggest!
  59. bjewlery dewelry ejewelry ejwelry ewelery ewerly ewlery fewelry fewlery fjewelery fjewelry gewerly gewlery hewelery hewelry hewerly hewlery hjewelry iewelry ijewelry jawelery jawlery jeawlery jeelery jeelry jeewelery jeewelry jeewlery jeewlry jefwelry jejelry jelelry jelery jellery jelwelery jelwelry jelwlery jemelry jemerly jemwelry jeqwelry jerelery jerelry jerely jererly jerlery jerwelery jerwelry jerwely jerwerly jeselery jeselry jevelry jeverly jewalery jewdelry jewedlry jeweelrry jeweelry jeweely jeweer jeweery jeweilry jeweiry jewejery jewejlry jewejrly jewejry jewekey jewekry jewelary jeweldy jewele jewelee jewelelry jewelera jewelerey jewelerly jewelert jewelerty jeweleru jeweleruy jeweleryl jewelerys jeweleryy jewelet jewelety jeweleya jewelfry jewelfy jeweliy jewellryp jewelltry jewelly jewelory jewelra jewelray jewelre jewelree jewelreyy jewelrfy jewelrh jewelri jewelrky jewelrly jewelrr jewelrs jewelrsy jewelrt jewelrty jewelru jewelruy jewelrye jewelryh jewelryl jewelrym jewelryr jewelrys jewelryt jewelryu jewelryuk jewelryy jewelrz jewelsry jewelsy jeweltry jewelty jewelw jewelwery jewelwey jewelwy jewelya jewelyj jewelyr jewelyry jewelyu jewelyy jewelzry jeweory jewerey jeweriy jewerky jewerlary jewerley jewerli jewerlly jewerls jewerlt jewerlu jewerlyh jewerlyr jewerlys jewerlyu jewerry jeweryl jewetry jewewlry jewewly jewewrly jewewry jeweylry jewiery jewilary jewkery jewlary jewledy jewleery jewlelery jewlely
  60. The TermDictionary is not a whitelist
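
The range-query slide calls `MoneyValue.parse(part, defaultCurrency)`, `getCurrencyCode()` and `getAmount()`, but the deck does not show `MoneyValue` itself. The following is a minimal, hypothetical sketch of what such a parser might look like for the three price forms on the preceding slide (`$10.00`, `10.00USD`, `20.00EUR`). The class body, the regular expression, the `sameCurrencyAs` helper, and the rule that a bare symbol falls back to the default currency are all assumptions made for illustration, not Etsy's implementation.

```java
import java.math.BigDecimal;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the MoneyValue class used in MoneyFieldType.java. */
final class MoneyValue {
    // Optional currency symbol, a decimal amount, then an optional 3-letter code.
    private static final Pattern MONEY =
        Pattern.compile("\\s*\\p{Sc}?\\s*(\\d+(?:\\.\\d+)?)\\s*([A-Za-z]{3})?\\s*");

    private final BigDecimal amount;
    private final String currencyCode;

    private MoneyValue(BigDecimal amount, String currencyCode) {
        this.amount = amount;
        this.currencyCode = currencyCode;
    }

    /**
     * Parses "$10.00", "10.00USD" or "20.00EUR".
     * Assumption: when no 3-letter code is present, defaultCurrency is used,
     * because a symbol such as "$" is ambiguous (USD, AUD, CAD all share it).
     */
    static MoneyValue parse(String text, String defaultCurrency) {
        Matcher m = MONEY.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Cannot parse money value: " + text);
        }
        String code = (m.group(2) == null)
            ? defaultCurrency
            : m.group(2).toUpperCase(Locale.ROOT);
        return new MoneyValue(new BigDecimal(m.group(1)), code);
    }

    BigDecimal getAmount() {
        return amount;
    }

    String getCurrencyCode() {
        return currencyCode;
    }

    /** Mirrors the same-currency check that getRangeQuery performs on its bounds. */
    boolean sameCurrencyAs(MoneyValue other) {
        return currencyCode.equals(other.currencyCode);
    }
}
```

With this sketch, `MoneyValue.parse("$10.00", "USD")` and `MoneyValue.parse("50.00USD", "USD")` share a currency code, while a range from `10.00USD` to `20.00EUR` would fail the same-currency check.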

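The SolrIndexSearcher gotchas slide mentions using the searcher as a cache key through `WeakHashMap<SolrIndexSearcher,MyValue>`. A minimal sketch of that pattern is shown here with a generic key type standing in for `SolrIndexSearcher`, so that it compiles without Solr on the classpath. `PerSearcherCache`, its `loader` function and its method names are hypothetical and are not taken from the talk.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/**
 * Hypothetical per-searcher cache. S stands in for SolrIndexSearcher and
 * V for the "MyValue" type named on the slide.
 */
final class PerSearcherCache<S, V> {
    // Weak keys: an entry becomes eligible for removal once nothing else
    // strongly references its searcher. The cached value must not hold a
    // strong reference back to the key, or the entry is never collected.
    private final Map<S, V> cache =
        Collections.synchronizedMap(new WeakHashMap<S, V>());

    private final Function<S, V> loader;

    PerSearcherCache(Function<S, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value for this searcher, computing it on first use. */
    V get(S searcher) {
        // Lock on the synchronized map itself so that the check and the put
        // happen as one atomic step.
        synchronized (cache) {
            V value = cache.get(searcher);
            if (value == null) {
                value = loader.apply(searcher);
                cache.put(searcher, value);
            }
            return value;
        }
    }

    int size() {
        return cache.size();
    }
}
```

The design choice is that a new searcher (for example after a commit) naturally misses the cache and triggers a fresh load, while data tied to a closed searcher can be garbage collected without explicit invalidation.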