Solr @ eBay Kleinanzeigen

   Olaf Zschiedrich, eBay Classifieds Group
ozschiedrich@ebay-kleinanzeigen.de, 5/25/2011
Who am I?
!    Olaf Zschiedrich
!    eBay Classifieds Group
!    Head of Technology @ eBay Kleinanzeigen
!    Area of expertise/interest:
     •    High traffic web-applications
     •    Agile development
     •    Java/JEE
     •    Search technologies




Agenda
!    About eBay Classifieds Group/eBay Kleinanzeigen
!    Metrics & Traffic Numbers
!    Why Solr?
!    Solr Features in Action
!    Data Indexing
!    Solr in Production
!    Best Practices
!    Problems
!    Outlook
!    Questions
About eBay Classifieds Group






                 online classifieds company in the world



About eBay Kleinanzeigen
!  Typical classifieds ad platform (horizontal, local trading)
!  Launched 2009 after 4 months of development
!  Small agile team (using Scrum)
   •  12-15 people total
   •  5-7 developers
!  Leverages open source (Spring, Solr, MySQL, ActiveMQ)
!  Applications:
   •    Public website
   •    Customer support tool
   •    API (REST, supporting JSON and XML)
   •    iPhone app (~ 250,000 installations)
   •    Facebook App

Metrics & Traffic Numbers
!  Site metrics:
   •  ~ 3.2 M active ads
   •  16 – 24 M PVs per day
   •  Peak hours = 1.8 M PVs (~ 500 PVs per second)
!  Solr request metrics:
   •  ~ 60 M requests per day
   •  Peak hours = ~ 1500 requests per second
!  Avg. response time
   •  20 ms (search) and 3 ms for auto-suggest


            Site is growing rapidly!
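
The per-second figures can be cross-checked with simple arithmetic. The sketch below only restates the numbers from this slide; the class and method names are made up for illustration, and the daily average assumes the 60 M requests are spread over a full 24 hours.

```java
// Back-of-the-envelope check of the traffic numbers on this slide.
// Class and method names are illustrative and not part of any real code base.
final class TrafficMath {

    /** Converts a count observed over a time window into a per-second rate. */
    static double perSecond(long count, long windowSeconds) {
        return (double) count / windowSeconds;
    }

    public static void main(String[] args) {
        double peakPvPerSecond = perSecond(1_800_000L, 3_600L);     // 1.8 M PVs in a peak hour
        double avgSolrPerSecond = perSecond(60_000_000L, 86_400L);  // 60 M Solr requests per day

        System.out.printf("Peak page views per second:      %.0f%n", peakPvPerSecond);
        System.out.printf("Average Solr requests per second: %.0f%n", avgSolrPerSecond);
        // 1500 is the peak Solr request rate quoted on the slide.
        System.out.printf("Solr requests per PV at peak:     %.1f%n", 1_500 / peakPvPerSecond);
    }
}
```

Under these assumptions the peak Solr rate is a bit more than twice the daily average, and each page view at peak triggers about three Solr requests.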
Why Solr
!    Open Source
!    Good documentation / big community
!    Java-based (the language we know/use)
!    Widely used (especially Lucene)
!    Based on Lucene (de facto standard for full-text search in Java)
!    Feature-rich (including enterprise features)
!    Extensible (e.g. easy implementation of own tokenizers)
!    Easy to integrate (HTTP, SolrJ client)
!    Easy to set up (Java web application)

Most promising option we looked at. Due to very aggressive
  timelines, no time-consuming research was possible!

Solr Features in Action
!    Faceting
!    Language specific stemming
!    More Like This
!    Auto-Suggest based on TermsComponent
!    Spellchecking
!    Synonyms
!    Stopwords
!    Dynamic fields
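
To make two of these features concrete, here is a minimal sketch of the HTTP requests behind faceting and a TermsComponent-based auto-suggest. The parameters (`facet`, `facet.field`, `facet.mincount`, `terms.fl`, `terms.prefix`, `terms.limit`) are standard Solr request parameters. The base URL, the field names, the `/terms` handler path and the lowercasing of the prefix are assumptions for illustration, not details from the talk.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;

// Illustrative only: builds request URLs, does not send them.
final class SolrRequestSketch {
    // Assumed base URL; replace with the real host and core.
    static final String BASE = "http://localhost:8983/solr";

    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Keyword search that also returns facet counts for one field. */
    static String facetedSearchUrl(String keywords, String facetField) {
        return BASE + "/select?q=" + enc(keywords)
                + "&facet=true"
                + "&facet.field=" + enc(facetField)
                + "&facet.mincount=1";
    }

    /** Prefix lookup against indexed terms, usable for auto-suggest.
     *  Assumes a /terms handler is configured and the field is lowercased at index time. */
    static String suggestUrl(String field, String prefix, int limit) {
        return BASE + "/terms?terms.fl=" + enc(field)
                + "&terms.prefix=" + enc(prefix.toLowerCase(Locale.ROOT))
                + "&terms.limit=" + limit;
    }

    public static void main(String[] args) {
        System.out.println(facetedSearchUrl("BMW 323", "category"));
        System.out.println(suggestUrl("title", "BM", 5));
    }
}
```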



Data Indexing
!    Use of the DataImportHandler (delta import)
!    Delta import runs every 10 minutes
!    Full import only done when a schema change requires a full index rebuild
!    Index optimized once a day

MySQL Slave  --(JDBC, delta import)-->  Solr Master
Solr Master  --(HTTP / REST API, Replication Handler)-->  Solr Slave, Solr Slave, Solr Slave
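
The 10-minute delta import can be sketched as a scheduled HTTP call to Solr's DataImportHandler on the master. The `command`, `clean` and `commit` parameters are real DataImportHandler parameters. The `/dataimport` handler path, the host name and the use of a Java scheduler (rather than, say, cron) are assumptions for illustration; the talk does not say how the import is triggered.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch; names and URLs are hypothetical.
final class DeltaImportSketch {

    /** Builds the DataImportHandler URL for a delta or full import. */
    static String importUrl(String masterBaseUrl, boolean full) {
        String command = full ? "full-import" : "delta-import";
        // clean=true wipes the index first, so it is only set for full rebuilds.
        return masterBaseUrl + "/dataimport?command=" + command
                + "&clean=" + full + "&commit=true";
    }

    /** Runs the given trigger immediately and then every 10 minutes. */
    static ScheduledExecutorService scheduleEveryTenMinutes(Runnable trigger) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(trigger, 0, 10, TimeUnit.MINUTES);
        return scheduler;
    }

    public static void main(String[] args) {
        String master = "http://solr-master:8983/solr"; // assumed host
        System.out.println("Delta: " + importUrl(master, false));
        System.out.println("Full:  " + importUrl(master, true));
        // In a long-running process, a trigger that issues an HTTP GET to the
        // delta URL would be passed to scheduleEveryTenMinutes(...).
    }
}
```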




Solr In Production
!  2 datacenters
!  1 Master + 6 Slaves per datacenter
     Slaves show very low resource consumption. We could go down to 4
     slaves per datacenter and still have 50% overcapacity
!    Master only used for indexing
!    Load balancer in front of slaves
!    Varnish in front of slaves (for dedicated use cases)
!    Working closely with SITE-OPS Team
!    DEV-OPS are part of development process


Solr 3.1 in Production
!  Solr 3.1 in production since mid-May
!  Not plug and play. Needs a migration path because:
  •  Index format has changed
  •  Javabin format has changed
!  Two major problems:
  •  Bug in spellchecker (SOLR-2462)
    Leads to infinite GC loops
  •  Bug in replication handler (SOLR-2469)
    Leads to growing disk usage, as old index files are not removed
    when "replicateAfter=startup" is used.




Best Practices
!  Use Solr cores right from the beginning
     Allows you to run multiple indexes on one box in dev and distribute indexes to multiple boxes in production


!        Use filter queries
!        Use caching (FieldCache, QueryCache, Web Proxy Cache e.g. Varnish or Squid)
!        Tune JVM properly
!        Build a search layer that hides the usage of Solr
     // Callers only see application-level types; no Solr classes leak out
     SearchCommand cmd = new SearchCommand();

     cmd.setKeywords("BMW 323");

     ...

     SearchResult result = searchService.searchActiveAds(cmd);

     List<Ad> ads = result.getAds();


!  Create a QueryBuilder to ease query building
     SolrQueryBuilder sqb = new SolrQueryBuilder();

     sqb = sqb.freetext("freetext", "BMW").and().in("color", "RED", "BLACK");

     sqb = sqb.and().not().eq("fuel_type", "GAS").and().lt("price", "10000");

     ...

     String query = sqb.build();

     (Just an example. Normally, filter queries should be used for a query like this!)
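
As the note says, structured constraints like these usually belong in filter queries. The following is a minimal sketch of a hypothetical builder that keeps only the free text in `q` and emits each constraint as a separate `fq` parameter. It is not the `SolrQueryBuilder` from the slide; the class and method names are invented here, and query escaping and URL encoding are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder for illustration; not the SolrQueryBuilder from the talk.
final class FilterQuerySketch {
    private String q = "*:*";                                // match-all until free text is set
    private final List<String> filters = new ArrayList<>();  // one entry per fq parameter

    /** Free text goes into q, where it is scored. */
    FilterQuerySketch freetext(String field, String text) {
        q = field + ":(" + text + ")";
        return this;
    }

    /** Field must match any of the given values. */
    FilterQuerySketch in(String field, String... values) {
        filters.add(field + ":(" + String.join(" OR ", values) + ")");
        return this;
    }

    /** Field must not equal the given value. */
    FilterQuerySketch notEq(String field, String value) {
        filters.add("-" + field + ":" + value);
        return this;
    }

    /** Field must be strictly below the upper bound (exclusive range). */
    FilterQuerySketch lt(String field, String upperBound) {
        filters.add(field + ":{* TO " + upperBound + "}");
        return this;
    }

    /** Returns unencoded request parameters: one q followed by one fq per filter. */
    List<String> toParams() {
        List<String> params = new ArrayList<>();
        params.add("q=" + q);
        for (String f : filters) {
            params.add("fq=" + f);
        }
        return params;
    }

    public static void main(String[] args) {
        List<String> params = new FilterQuerySketch()
                .freetext("freetext", "BMW")
                .in("color", "RED", "BLACK")
                .notEq("fuel_type", "GAS")
                .lt("price", "10000")
                .toParams();
        for (String p : params) {
            System.out.println(p);
        }
    }
}
```

Filter queries do not affect scoring, and Solr caches each `fq` independently, so a common filter such as a color or price range can be reused across different keyword searches.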

Problems
!  Distance search including sorting
  •  Not supported in previous Solr versions
  •  LocalSolr
    Not working with Solr 1.4 final; GC issues, performance issues
  •  Solution:
    Dropped sort by distance. Implemented our own distance search
    based on bounding boxes and simple range queries.
  •  Solved in 3.1
!  Real-time updates
!  Deep paging of large result sets (SOLR-1726)
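
The bounding-box workaround for distance search can be sketched as follows. This is an approximation under stated assumptions, not the team's actual implementation. It assumes hypothetical numeric fields `lat` and `lon`, a spherical Earth, and locations away from the poles and the 180° meridian.

```java
import java.util.Locale;

// Illustrative sketch; field names and class name are assumptions.
final class BoundingBoxSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    /** Returns {minLat, maxLat, minLon, maxLon} in degrees for a box around the point. */
    static double[] boundingBox(double lat, double lon, double radiusKm) {
        // Angular distance in latitude is the same everywhere on a sphere.
        double dLat = Math.toDegrees(radiusKm / EARTH_RADIUS_KM);
        // Longitude degrees shrink with cos(latitude), so the box must widen.
        double dLon = Math.toDegrees(
                radiusKm / (EARTH_RADIUS_KM * Math.cos(Math.toRadians(lat))));
        return new double[] { lat - dLat, lat + dLat, lon - dLon, lon + dLon };
    }

    /** Turns the box into two plain range clauses usable as a filter query. */
    static String toFilterQuery(double[] box) {
        return String.format(Locale.ROOT,
                "lat:[%.5f TO %.5f] AND lon:[%.5f TO %.5f]",
                box[0], box[1], box[2], box[3]);
    }

    public static void main(String[] args) {
        // 10 km around central Berlin.
        double[] box = boundingBox(52.52, 13.405, 10.0);
        System.out.println(toFilterQuery(box));
    }
}
```

The box is slightly larger than the circle it encloses, so ads in the corners lie a little beyond the requested radius. For a classifieds search that is often acceptable, and it avoids any per-document distance computation.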




Outlook / Future Plans
!  Migrate further applications to Solr
  Most batch jobs and the customer support tool search against the DB,
  which is getting slower as the data grows.


!  Evaluate new features of Solr 3.1
   •  Spatial/distance search
   •  New auto-suggest component
   •  Extended dismax query parser




Questions?




Contact
!  Olaf Zschiedrich
  •  ozschiedrich@ebay-kleinanzeigen.de
  •  ozschiedrich@ebay.com
  •  www.ebay-kleinanzeigen.de





Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 

Recently uploaded (20)

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 

Solr @ eBay Kleinanzeigen

  • 1. Solr @ eBay Kleinanzeigen
       Olaf Zschiedrich, eBay Classifieds Group
       ozschiedrich@ebay-kleinanzeigen.de, 5/25/2011
  • 2. Who am I?
       - Olaf Zschiedrich
       - eBay Classifieds Group
       - Head of Technology @ eBay Kleinanzeigen
       - Area of expertise/interest:
           • High traffic web-applications
           • Agile development
           • Java/JEE
           • Search technologies
  • 3. Agenda
       - About eBay Classifieds Group/eBay Kleinanzeigen
       - Metrics & Traffic Numbers
       - Why Solr?
       - Solr Features in Action
       - Data Indexing
       - Solr in Production
       - Best Practices
       - Problems
       - Outlook
       - Questions
  • 5. About eBay Classifieds Group
       online classifieds company in the world
  • 6. About eBay Kleinanzeigen
       - Typical classifieds ad platform (horizontal, local trading)
       - Launched 2009 after 4 months of development
       - Small agile team (using Scrum)
           • 12-15 people total
           • 5-7 developers
       - Leverages open source (Spring, Solr, MySQL, ActiveMQ)
       - Applications:
           • Public website
           • Customer support tool
           • API (REST, supporting JSON and XML)
           • iPhone app (~ 250,000 installations)
           • Facebook app
  • 7. Metrics & Traffic Numbers
       - Site metrics:
           • ~ 3.2 M active ads
           • 16 – 24 M PVs per day
           • Peak hours = 1.8 M PVs (~ 500 PVs per second)
       - Solr request metrics:
           • ~ 60 M requests per day
           • Peak hours = ~ 1500 requests per second
       - Avg. response time
           • 20 ms (search) and 3 ms for auto-suggest
       Site is growing rapidly!
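The peak figures on this slide can be cross-checked with simple arithmetic. The following is a minimal sketch; the class and method names are illustrative and not from the talk.

```java
// Back-of-the-envelope check of the traffic figures quoted on the slide.
// TrafficMath and its methods are hypothetical helper names.
final class TrafficMath {
    private TrafficMath() {}

    // Converts an hourly count into a per-second rate.
    static double perSecond(double eventsPerHour) {
        return eventsPerHour / 3600.0;
    }

    // Converts a daily count into an average per-second rate.
    static double averagePerSecond(double eventsPerDay) {
        return eventsPerDay / 86_400.0;
    }
}
```

With the slide's numbers, `TrafficMath.perSecond(1_800_000)` gives the quoted 500 PVs per second, and `TrafficMath.averagePerSecond(60_000_000)` gives roughly 694 Solr requests per second on average, so the quoted peak of about 1500 requests per second is a little over twice the daily average.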
  • 8. Why Solr
       - Open Source
       - Good documentation / big community
       - Java-based (the language we know/use)
       - Widely used (especially Lucene)
       - Based on Lucene (de-facto standard for full text search in Java)
       - Feature-rich (including enterprise features)
       - Extensible (e.g. easy implementation of own tokenizers)
       - Easy to integrate (HTTP, SolrJ client)
       - Easy to set up (Java web application)
       Most promising option we looked at. Due to very aggressive timelines, no time-consuming research was possible!
  • 9. Solr Features in Action
       - Faceting
       - Language specific stemming
       - More Like This
       - Auto-Suggest based on TermsComponent
       - Spellchecking
       - Synonyms
       - Stopwords
       - Dynamic fields
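As an illustration of faceting over HTTP, the sketch below assembles the query string of a Solr select request from the standard `q`, `fq`, `facet`, `facet.field` and `facet.mincount` parameters. The class name and the field names in the usage note (`category`, `color`) are assumptions for illustration; the talk does not show the actual schema or requests.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds the query string for a faceted Solr request.
final class FacetRequest {
    private FacetRequest() {}

    // keywords    -> main query (q)
    // filterQuery -> restricting filter (fq), cached independently of q by Solr
    // facetField  -> field whose value counts should be returned
    static String queryString(String keywords, String filterQuery, String facetField) {
        return "q=" + enc(keywords)
                + "&fq=" + enc(filterQuery)
                + "&facet=true"
                + "&facet.field=" + enc(facetField)
                + "&facet.mincount=1"; // omit facet values with zero hits
    }

    // URL-encodes a single parameter value as UTF-8.
    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

For example, `FacetRequest.queryString("BMW 323", "category:cars", "color")` would be appended to a core's `/select?` URL to get matching ads plus a count per color.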
  • 10. Data Indexing
       - Use of Delta Import Handler
       - Delta import runs every 10 minutes
       - Full import only done in case schema change requires full index rebuild
       - Index optimized once a day
       [Diagram: MySQL -> (JDBC) -> Solr Master (Delta Import Handler) -> (HTTP / REST API, Replication Handler) -> 3 x Solr Slave]
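The talk does not say how the 10-minute delta import is triggered. One possible approach, sketched below under that assumption, is a small scheduler that periodically requests the Data Import Handler URL on the master with `command=delta-import`. The host name, core name and the injected `httpGet` callback are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical scheduler that triggers a delta import at a fixed interval.
final class DeltaImportScheduler {
    private DeltaImportScheduler() {}

    // Builds the Data Import Handler URL for a delta import on one core.
    static String deltaImportUrl(String masterBaseUrl, String core) {
        return masterBaseUrl + "/" + core + "/dataimport?command=delta-import&commit=true";
    }

    // Calls httpGet with the URL immediately and then every periodMinutes.
    // The caller supplies httpGet (any HTTP client) and shuts the scheduler down.
    static ScheduledExecutorService start(String url, Consumer<String> httpGet, long periodMinutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                httpGet.accept(url);
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                System.err.println("Delta import trigger failed: " + e.getMessage());
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

A call such as `DeltaImportScheduler.start(DeltaImportScheduler.deltaImportUrl("http://solr-master:8983/solr", "ads"), myHttpGet, 10)` would match the interval on the slide; a cron job issuing the same request would work equally well.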
  • 11. Solr in Production
       - 2 datacenters
       - 1 Master + 6 Slaves per datacenter
         Slaves show very low resource consumption. Could go down to 4 slaves per datacenter while still having 50% overcapacity
       - Master only used for indexing
       - Load balancer in front of slaves
       - Varnish in front of slaves (for dedicated use cases)
       - Working closely with SITE-OPS Team
       - DEV-OPS are part of development process
  • 12. Solr 3.1 in Production
       - Solr 3.1 in production since mid-May
       - Not plug and play. Needs a migration path as:
           • Index format has changed
           • Javabin format has changed
       - Two major problems:
           • Bug in spellchecker (SOLR-2462)
             Leads to infinite GC loops
           • Bug in replication handler (SOLR-2469)
             Leads to growing disk usage as old index files are not removed in case "replicateAfter=startup" is used.
  • 13. Best Practices
       - Use Solr cores right from the beginning
         Allows you to run multiple indexes on one box in dev and distribute indexes to multiple boxes in production
       - Use filter queries
       - Use caching (FieldCache, QueryCache, web proxy cache, e.g. Varnish or Squid)
       - Tune the JVM properly
       - Build a search layer that hides the usage of Solr

           SearchCommand cmd = new SearchCommand();
           cmd.setKeywords("BMW 323");
           ...
           SearchResult result = searchService.searchActiveAds(cmd);
           List<Ad> ads = result.getAds();

       - Create a QueryBuilder to ease query building

           SolrQueryBuilder sqb = new SolrQueryBuilder();
           sqb = sqb.freetext("freetext", "BMW").and().in("color", "RED", "BLACK");
           sqb = sqb.and().not().eq("fuel_type", "GAS").and().lt("price", "10000");
           ...
           String query = sqb.build();

         (Just an example. Normally filter queries should be used for a query like this!)
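The slides show only how the query builder is called, not how it is implemented. A minimal sketch of a fluent builder with the same method names might look like the following; the internals, the quoting rules and the generated Lucene query syntax are assumptions, not the team's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical implementation of the fluent API used on the slide.
final class SolrQueryBuilder {
    private final List<String> tokens = new ArrayList<>();

    // Matches text against a (typically analyzed) field.
    SolrQueryBuilder freetext(String field, String text) {
        tokens.add(field + ":" + quote(text));
        return this;
    }

    // Matches a field against one exact value.
    SolrQueryBuilder eq(String field, String value) {
        tokens.add(field + ":" + quote(value));
        return this;
    }

    // Matches a field against any of the given values.
    SolrQueryBuilder in(String field, String... values) {
        StringJoiner alternatives = new StringJoiner(" OR ", field + ":(", ")");
        for (String value : values) {
            alternatives.add(quote(value));
        }
        tokens.add(alternatives.toString());
        return this;
    }

    // Matches values strictly below the bound (exclusive range, open lower end).
    // The bound is inserted as-is and is assumed to be a plain number.
    SolrQueryBuilder lt(String field, String upperBound) {
        tokens.add(field + ":{* TO " + upperBound + "}");
        return this;
    }

    SolrQueryBuilder and() {
        tokens.add("AND");
        return this;
    }

    SolrQueryBuilder not() {
        tokens.add("NOT");
        return this;
    }

    // Joins all clauses and operators in call order.
    String build() {
        return String.join(" ", tokens);
    }

    // Wraps a value in double quotes, escaping backslashes and quotes.
    private static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

With this sketch, the call chain from the slide would build `freetext:"BMW" AND color:("RED" OR "BLACK") AND NOT fuel_type:"GAS" AND price:{* TO 10000}`. Keeping the builder behind the search layer means callers never see this syntax.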
  • 14. Problems
       - Distance search including sorting
           • Not supported in previous Solr versions
           • LocalSolr not working with Solr 1.4 final, GC issues, performance issues
           • Solution: Got rid of sort by distance. Implemented own distance search based on bounding boxes and simple range queries.
           • Solved in 3.1
       - Real time updates
       - Deep paging large result sets (SOLR-1726)
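The bounding-box workaround can be sketched as follows: convert a search radius into latitude/longitude deltas around the center point and express the box as two range clauses. The class, the field names and the spherical-earth approximation are assumptions for illustration; the talk does not show the actual implementation.

```java
import java.util.Locale;

// Hypothetical bounding box around a center point, usable as a Solr range filter.
final class BoundingBox {
    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, spherical approximation

    final double minLat;
    final double maxLat;
    final double minLon;
    final double maxLon;

    private BoundingBox(double minLat, double maxLat, double minLon, double maxLon) {
        this.minLat = minLat;
        this.maxLat = maxLat;
        this.minLon = minLon;
        this.maxLon = maxLon;
    }

    // Computes the box enclosing a circle of radiusKm around (lat, lon).
    // Does not handle the poles or wrap-around at the 180th meridian.
    static BoundingBox around(double lat, double lon, double radiusKm) {
        double dLat = Math.toDegrees(radiusKm / EARTH_RADIUS_KM);
        // Longitude degrees shrink with the cosine of the latitude.
        double dLon = dLat / Math.cos(Math.toRadians(lat));
        return new BoundingBox(lat - dLat, lat + dLat, lon - dLon, lon + dLon);
    }

    // Renders the box as two inclusive range clauses on numeric fields.
    String toFilterQuery(String latField, String lonField) {
        return String.format(Locale.ROOT, "%s:[%.5f TO %.5f] AND %s:[%.5f TO %.5f]",
                latField, minLat, maxLat, lonField, minLon, maxLon);
    }
}
```

For example, `BoundingBox.around(52.52, 13.405, 10.0).toFilterQuery("lat", "lon")` yields a filter for ads within roughly 10 km of central Berlin. The box is a superset of the circle, so ads near the corners are slightly farther away than the radius, which is an acceptable trade-off once sorting by distance has been dropped.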
  • 15. Outlook / Future Plans
       - Migrate further applications to Solr
         Most batch jobs and the customer support tool search against the database, which is getting slower as the data grows.
       - Evaluate new features of Solr 3.1
           • Spatial/distance search
           • New auto-suggest component
           • Extended dismax query parser
  • 17. Contact
       - Olaf Zschiedrich
           • ozschiedrich@ebay-kleinanzeigen.de
           • ozschiedrich@ebay.com
           • www.ebay-kleinanzeigen.de