Solr & Lucene @ Etsy by Gregg Donovan

Solr & Lucene at Etsy
       Gregg Donovan
    Technical Lead, Search
      gregg@etsy.com
1.5 years Solr & Lucene at Etsy.com

3 years Solr & Lucene at TheLadders.com
8+ million members
9.3 million items
800k+ active sellers
1+ billion pageviews / month
Maximize Solr out-of-the-box
Hack at a low-level
Know when to do each
Or
Don’t fear trunk
builds.apache.org/job/Solr-trunk/changes
http://localhost:8983/solr/placesuggest/select?
q={!lucene}s*
&sfield=latlong&pt=37.595804,-122.364521
&sort=div(geodist(),sqrt(sum(population,50)))%20asc
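The sort expression above divides distance by the square root of population plus 50, so a large city can outrank a closer small town. Below is a minimal sketch of that arithmetic in plain Java, outside Solr. The `Place` class, the haversine helper, the 6371 km radius, and the population figures are all illustrative assumptions, not values from the talk.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PlaceSuggestRanking {
    // Approximate mean Earth radius in km (an assumption for this sketch).
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Hypothetical stand-in for an indexed place document. */
    static final class Place {
        final String name;
        final double lat;
        final double lon;
        final long population;

        Place(String name, double lat, double lon, long population) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
            this.population = population;
        }
    }

    /** Great-circle distance in km between two lat/lon points (haversine). */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Mirrors div(geodist(),sqrt(sum(population,50))); smaller values sort first. */
    static double sortKey(Place p, double lat, double lon) {
        return haversineKm(lat, lon, p.lat, p.lon) / Math.sqrt(p.population + 50.0);
    }

    /** Returns a copy of the places ordered by ascending sort key. */
    static List<Place> rank(List<Place> places, double lat, double lon) {
        List<Place> sorted = new ArrayList<>(places);
        sorted.sort(Comparator.comparingDouble(p -> sortKey(p, lat, lon)));
        return sorted;
    }

    public static void main(String[] args) {
        // Illustrative coordinates and populations, not real index data.
        List<Place> places = List.of(
                new Place("San Bruno", 37.6305, -122.4111, 41_000L),
                new Place("San Francisco", 37.7749, -122.4194, 800_000L));
        for (Place p : rank(places, 37.595804, -122.364521)) {
            System.out.printf("%s %.4f%n", p.name, sortKey(p, 37.595804, -122.364521));
        }
    }
}
```

With these made-up inputs the farther but much larger city sorts first. The constant 50 keeps the divisor away from zero for places with no recorded population.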
{!lucene}
{!field}
{!term}
{!boost}
{!func}
{!dismax}
{!edismax}
Cheap ranking awesomeness
ExternalFileField ftw!
schema.xml:
    <fieldType name="file" keyField="treasury_id" defVal="0"
               stored="false" indexed="true" class="solr.ExternalFileField"
               valType="float"/>
    <field name="hotness" type="file"/>

/search/data/treasury/external_hotness.1306390802088:
1=2.3
2=1.7
3=1.1

Solr query:
sort={!func}hotness+desc
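The external file shown above is a list of `key=value` lines whose name ends in a numeric suffix. Below is a minimal sketch of a job that could produce such a file. It assumes the suffix is an epoch-millisecond timestamp and that writing to a temporary name and then renaming is desirable. The class name, method names, and the `tmp_` prefix are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.TreeMap;

public class ExternalHotnessWriter {

    /** Renders one key=value line per entry, sorted by key for stable output. */
    static String render(Map<String, Float> scores) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Float> e : new TreeMap<>(scores).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** File name in the style of the slide: external_<field>.<suffix>. */
    static String fileName(String fieldName, long timestampMillis) {
        return "external_" + fieldName + "." + timestampMillis;
    }

    /** Writes under a temporary name first so a reader never sees a half-written file. */
    static Path write(Path dataDir, String fieldName, long timestampMillis,
                      Map<String, Float> scores) throws IOException {
        Path target = dataDir.resolve(fileName(fieldName, timestampMillis));
        // The tmp_ prefix is an assumption: it keeps the partial file from matching external_*.
        Path tmp = dataDir.resolve("tmp_" + target.getFileName());
        Files.write(tmp, render(scores).getBytes(StandardCharsets.UTF_8));
        return Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("treasury");
        Path written = write(dir, "hotness", System.currentTimeMillis(),
                Map.of("1", 2.3f, "2", 1.7f, "3", 1.1f));
        System.out.println(written);
        System.out.print(new String(Files.readAllBytes(written), StandardCharsets.UTF_8));
    }
}
```

Keys here are the `treasury_id` values named by `keyField` in the schema, and ids missing from the file would fall back to `defVal`.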
ExternalFileField caveats
More relevance: boost query
http://localhost:8983/solr/listings/select?
q={!boost b=$rel v=$qq}
&rel=category:furniture^10+OR+((-material:acrylic)^5)
&qq=desk
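The request above keeps the user's text (`qq`) separate from the relevance expression (`rel`) and ties them together with `$` parameter references. Below is a minimal sketch of assembling such a URL from application code, using only the JDK. The class and method names are hypothetical; the parameter names come from the slide.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class BoostQueryUrl {

    /** Builds a select URL whose main query wraps $qq and is boosted by $rel. */
    static String build(String selectUrl, String userQuery, String boostExpr) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "{!boost b=$rel v=$qq}"); // dereferences the two params below
        params.put("rel", boostExpr);
        params.put("qq", userQuery);

        StringJoiner queryString = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            queryString.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return selectUrl + "?" + queryString;
    }

    private static String encode(String s) {
        // URLEncoder emits form encoding (space becomes '+').
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(build(
                "http://localhost:8983/solr/listings/select",
                "desk",
                "category:furniture^10 OR ((-material:acrylic)^5)"));
    }
}
```

Because the user's text travels only in `qq`, the boost expression can be changed or A/B tested without touching how user input is escaped.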
Impression tracking
etsy.com/search?q=desk&explain=1
Side-by-Side testing
Cheap performance wins
Put off sharding till you must
cat ${indexDir}/* > /dev/null
Return IDs, minimize stored fields
RAM: $10-20 / GB
SSD: 0.1ms vs 10ms seek
Custom?
solr-user
Tools for low-level hacking
Continuous deployment
One button.
So easy a dog could do it.
MTTR > MTBF
github.com/etsy/logster
Tracking GC
export GC_DEBUG="-verbose:gc -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC \
-XX:+PrintGCApplicationStoppedTime -XX:+PrintAdaptiveSizePolicy \
-XX:AdaptiveSizePolicyOutputInterval=1 -XX:+PrintTenuringDistribution \
-XX:+PrintGCDetails -Xloggc:/var/log/search/gc.log"
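With `-XX:+PrintGCApplicationStoppedTime` enabled, the GC log gains lines that report how long application threads were paused. Below is a minimal sketch of summarizing those pauses, the kind of number one might feed into a metrics pipeline. The exact line wording matched by the regex is an assumption about the JVM's output and can differ between JVM versions. The class and field names are hypothetical.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcPauseSummary {
    // Assumed wording of the stopped-time line; check it against your own gc.log.
    private static final Pattern STOPPED = Pattern.compile(
            "Total time for which application threads were stopped: ([0-9]+\\.[0-9]+) seconds");

    /** Count, total, and worst pause found in a batch of log lines. */
    static final class Summary {
        final int count;
        final double totalSeconds;
        final double maxSeconds;

        Summary(int count, double totalSeconds, double maxSeconds) {
            this.count = count;
            this.totalSeconds = totalSeconds;
            this.maxSeconds = maxSeconds;
        }
    }

    /** Scans the lines once and ignores everything that is not a stopped-time line. */
    static Summary summarize(List<String> lines) {
        int count = 0;
        double total = 0.0;
        double max = 0.0;
        for (String line : lines) {
            Matcher m = STOPPED.matcher(line);
            if (m.find()) {
                double seconds = Double.parseDouble(m.group(1));
                count++;
                total += seconds;
                max = Math.max(max, seconds);
            }
        }
        return new Summary(count, total, max);
    }

    public static void main(String[] args) {
        Summary s = summarize(List.of(
                "Total time for which application threads were stopped: 0.0100000 seconds",
                "some unrelated log line",
                "Total time for which application threads were stopped: 0.2500000 seconds"));
        System.out.printf("pauses=%d total=%.3fs max=%.3fs%n", s.count, s.totalSeconds, s.maxSeconds);
    }
}
```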
Alerting
Testing
SaveAsFixture
Profiling
Java Primitive Library
  fastutil
  trove4j
Know the hooks
  SolrRequestHandler
  SearchComponent
  QParserPlugin
  SolrEventListener
  SolrCache
  ValueSourceParser
SolrIndexSearcher gotchas
  reference counting
  using it as a cache key:
    WeakHashMap<SolrIndexSearcher,MyValue> myCache...
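The cache-key idea above can be sketched without Solr on the classpath: a `WeakHashMap` keyed by the searcher lets an entry become collectible once nothing else holds the searcher. In the sketch below the generic type `S` stands in for `SolrIndexSearcher`. The class name and the `get` method are hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

public class PerSearcherCache<S, V> {
    // Weak keys: an entry can be dropped once the searcher itself is unreachable.
    private final Map<S, V> cache = Collections.synchronizedMap(new WeakHashMap<>());

    /** Returns the cached value for this searcher, computing it on first use. */
    V get(S searcher, Function<? super S, ? extends V> loader) {
        return cache.computeIfAbsent(searcher, loader);
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        PerSearcherCache<Object, String> cache = new PerSearcherCache<>();
        Object searcher = new Object(); // stand-in for a SolrIndexSearcher
        System.out.println(cache.get(searcher, s -> "expensive value"));
        System.out.println(cache.get(searcher, s -> "never computed"));
    }
}
```

One caveat of `WeakHashMap` itself: if the cached value holds a strong reference back to its key, the entry is never cleared, so the value should not capture the searcher.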
Example:
personalized collections
fq={!term f=id}123 OR {!term f=id}456
Need a map of PK to docId
Use custom SolrCache plus SolrEventListener to fill it
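The idea above can be sketched in plain Java: build a primary-key to docId map once per searcher, then turn a user's list of ids into a bit set instead of a long boolean query. This is a sketch under stated assumptions, not the FastTermFilter implementation. `java.util.BitSet` and a `long[]` of keys stand in for Lucene's structures, and all names are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class PkDocIdFilter {

    /** pkByDocId[i] is the primary key stored in document i of the current searcher. */
    static Map<Long, Integer> buildPkToDocId(long[] pkByDocId) {
        Map<Long, Integer> map = new HashMap<>(pkByDocId.length * 2);
        for (int docId = 0; docId < pkByDocId.length; docId++) {
            map.put(pkByDocId[docId], docId);
        }
        return map;
    }

    /** Sets one bit per wanted key that exists in the index; unknown keys are skipped. */
    static BitSet filterFor(Map<Long, Integer> pkToDocId, long[] wantedPks, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (long pk : wantedPks) {
            Integer docId = pkToDocId.get(pk);
            if (docId != null) {
                bits.set(docId);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        long[] pkByDocId = {900L, 123L, 777L, 456L};
        Map<Long, Integer> pkToDocId = buildPkToDocId(pkByDocId);
        System.out.println(filterFor(pkToDocId, new long[] {123L, 456L, 999L}, pkByDocId.length));
    }
}
```

Lucene docIds are only valid for the searcher they came from, which is why the map would be rebuilt by a listener whenever a new searcher opens. A primitive-keyed map from fastutil or trove4j would avoid the boxing in this sketch.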
github.com/giokincade/FastTermFilter
i18n currency sorting and filtering
currency.xml:

<currencyConfig version="1.0">
  <currencies>
    <currency name="United States Dollar" symbol="$" code="USD"/>
    <currency name="Australian Dollar" symbol="$" code="AUD"/>
    <currency name="Canadian Dollar" symbol="$" code="CAD"/>
    <currency name="Czech Koruna" symbol="Kč" code="CZK"/>
...
  </currencies>
  <rates>
    <rate from="USD" to="AUD" rate="1.168750"/>
    <rate from="USD" to="CAD" rate="1.085000"/>
    <rate from="USD" to="CZK" rate="20.107500"/>
    <rate from="USD" to="DKK" rate="5.323750"/>
...
  </rates>
</currencyConfig>
price:[$10.00 TO $50.00]

price:[10.00USD TO 50.00USD]

price:20.00EUR
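The query forms above mix a bare `$` amount with amounts that carry a currency code. Below is a minimal sketch of how such values might be parsed and converted with the rates from currency.xml. It is not the talk's `MoneyValue` class. It assumes `$` alone is ambiguous (USD, AUD, and CAD share it) and so falls back to a default currency, and it assumes every rate is quoted from USD. All names are hypothetical, and only the rates visible on the slide are included.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MoneySketch {
    // Units of each currency per 1 USD, copied from the rates shown in currency.xml.
    static final Map<String, BigDecimal> USD_TO = Map.of(
            "USD", BigDecimal.ONE,
            "AUD", new BigDecimal("1.168750"),
            "CAD", new BigDecimal("1.085000"),
            "CZK", new BigDecimal("20.107500"),
            "DKK", new BigDecimal("5.323750"));

    static final class Money {
        final BigDecimal amount;
        final String currency;

        Money(BigDecimal amount, String currency) {
            this.amount = amount;
            this.currency = currency;
        }
    }

    // Optional leading '$', a decimal amount, then an optional three-letter code.
    private static final Pattern MONEY =
            Pattern.compile("^\\$?([0-9]+(?:\\.[0-9]+)?)([A-Z]{3})?$");

    /** Parses "$10.00", "10.00USD" or "20.00EUR"; no code means the default currency. */
    static Money parse(String text, String defaultCurrency) {
        Matcher m = MONEY.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Cannot parse money value: " + text);
        }
        String code = m.group(2) != null ? m.group(2) : defaultCurrency;
        return new Money(new BigDecimal(m.group(1)), code);
    }

    /** Converts through USD: divide by the source rate, multiply by the target rate. */
    static BigDecimal convert(Money money, String targetCurrency) {
        BigDecimal from = usdRate(money.currency);
        BigDecimal to = usdRate(targetCurrency);
        return money.amount.multiply(to).divide(from, 2, RoundingMode.HALF_UP);
    }

    private static BigDecimal usdRate(String code) {
        BigDecimal rate = USD_TO.get(code);
        if (rate == null) {
            throw new IllegalArgumentException("No rate configured for " + code);
        }
        return rate;
    }

    public static void main(String[] args) {
        System.out.println(convert(parse("$10.00", "USD"), "CAD"));
        System.out.println(convert(parse("21.70CAD", "USD"), "USD"));
    }
}
```

A currency whose rate is not in the table, such as EUR here, is rejected rather than guessed.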
MoneyFieldType.java:

  @Override
  public Query getRangeQuery(QParser parser, SchemaField field, String part1, String part2,
                             final boolean minInclusive, final boolean maxInclusive) {
    // Parse both bounds, passing the field's default currency to the parser.
    final MoneyValue p1 = MoneyValue.parse(part1, defaultCurrency);
    final MoneyValue p2 = MoneyValue.parse(part2, defaultCurrency);

    // Reject ranges whose bounds are in different currencies.
    if (!p1.getCurrencyCode().equals(p2.getCurrencyCode())) {
      throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
          new ParseException("Cannot parse range query " + part1 + " to " + part2 +
              ": range queries only supported when upper and lower bound have same currency."));
    }

    final String currencyCode = p1.getCurrencyCode();
    final MoneyValueSource vs = new MoneyValueSource(field, currencyCode, parser);

    // Filter on the value source's output, expressed in the bounds' currency.
    return new SolrConstantScoreQuery(new ValueSourceRangeFilter(vs,
        p1.getAmount() + "", p2.getAmount() + "", minInclusive, maxInclusive));
  }
Replication gotcha
SOLR-2202
Related Searches
Autosuggest!
bjewlery dewelry ejewelry ejwelry ewelery ewerly ewlery fewelry
fewlery fjewelery fjewelry gewerly gewlery hewelery hewelry hewerly
hewlery hjewelry iewelry ijewelry jawelery jawlery jeawlery jeelery
jeelry jeewelery jeewelry jeewlery jeewlry jefwelry jejelry jelelry
jelery jellery jelwelery jelwelry jelwlery jemelry jemerly jemwelry
jeqwelry jerelery jerelry jerely jererly jerlery jerwelery jerwelry
jerwely jerwerly jeselery jeselry jevelry jeverly jewalery jewdelry
jewedlry jeweelrry jeweelry jeweely jeweer jeweery jeweilry jeweiry
jewejery jewejlry jewejrly jewejry jewekey jewekry jewelary jeweldy
jewele jewelee jewelelry jewelera jewelerey jewelerly jewelert
jewelerty jeweleru jeweleruy jeweleryl jewelerys jeweleryy jewelet
jewelety jeweleya jewelfry jewelfy jeweliy jewellryp jewelltry
jewelly jewelory jewelra jewelray jewelre jewelree jewelreyy
jewelrfy jewelrh jewelri jewelrky jewelrly jewelrr jewelrs jewelrsy
jewelrt jewelrty jewelru jewelruy jewelrye jewelryh jewelryl
jewelrym jewelryr jewelrys jewelryt jewelryu jewelryuk
jewelryy jewelrz jewelsry jewelsy jeweltry jewelty jewelw jewelwery
jewelwey jewelwy jewelya jewelyj jewelyr jewelyry jewelyu jewelyy
jewelzry jeweory jewerey jeweriy jewerky jewerlary jewerley jewerli
jewerlly jewerls jewerlt jewerlu jewerlyh jewerlyr jewerlys jewerlyu
jewerry jeweryl jewetry jewewlry jewewly jewewrly jewewry jeweylry
jewiery jewilary jewkery jewlary jewledy jewleery jewlelery jewlely
The TermDictionary is not a whitelist