Start Your Search Engines: Optimizing Solr to Improve Results

Advanced troubleshooting and optimization techniques to improve Solr search results.

  • Complex migration and integration projects are a backbone of our company. We are an ExactTarget Gold Partner with full integration between ExactTarget and Magento. We've built ecommerce sites from the ground up, handled complicated product catalog migrations for large B2B companies, and integrated email, ecommerce, digital experience, and business analytics solutions for B2C retail companies.
  • For the next hour we will be speaking about the integration of Solr and Magento and about making the setup work best for your ecommerce site. Today we are going to go over more advanced topics. Basic troubleshooting: useful Solr tools and common problems and solutions. Advanced optimization of search results: making changes in the Solr configuration to improve your results. In the previous presentation we covered modifications made directly in Magento; today we will cover changes made to Solr. Improving search speed: optimizing Magento to improve search.
  • Solr is an open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document handling. Magento Enterprise integrates with Solr right out of the box.
  • We gave a more in-depth introduction to Solr in September 2011. You can watch the full video by going to the URL displayed or by going to the webinar section on Magento's website. It covers setup, indexing, and fine-tuning search results through Magento.
  • So now let us go over useful Solr tools and common problems and solutions.
  • Web Interface (5 minutes)
      – Schema file: if you make file changes, you can confirm Solr has loaded them by looking for them in this file.
      – Show config file: if you make file changes, you can confirm Solr has loaded them by looking for them in this file.
      – Schema Browser: number of docs in the index; actual indexed fields and some statistics about them.
      – Ping URL: the URL used to test if Solr is running properly.
      – Solr Stats: request handlers used and other high-level stats and configurations.
      – readDir path
    Luke (5 minutes)
      – Lucene Index Browser
      – Tokenized terms for search
    Command Line (5 minutes)
      – Show logs during index
      – Show logs during query
  • Do you have the right URL and port? For example, the default port for Tomcat is 8080 and for Jetty is 8983. Show the test button. What the button actually does. The ping URL to Solr and the response. The PHP setting to fix it and why. (90% of the time it's fixed by this.)
  • What the problem is: bad data, Solr not committing changes. Final commit vs. partial commit. How to diagnose this issue: tail the log and look for a rollback. It tells you which product ID has the critical error.
  • Direct configuration changes in Solr to better suit your business needs. There are two different types of settings in Solr: query time and index time. Query time settings take effect when a query is run; they do not require a reindex of data. Index time settings are used during indexing; if you change an index time setting, you must reindex to see the change take effect.
  • When dealing with queries there are 3 types of "clauses" that Lucene knows about: mandatory, prohibited, and "optional" (aka "SHOULD"). By default all words or phrases specified in the "q" param are treated as "optional" clauses unless they are preceded by a "+" or a "-". When dealing with these "optional" clauses, the "mm" option makes it possible to say that a certain minimum number of those clauses must match. Specifying this minimum number can be done in complex ways, equating to ideas like:
      – At least 2 of the optional clauses must match, regardless of how many clauses there are: "2"
      – At least 75% of the optional clauses must match, rounded down: "75%"
      – If there are fewer than 3 optional clauses, they all must match; if there are 3 or more, then 75% must match, rounded up: "2<-25%"
      – If there are fewer than 3 optional clauses, they all must match; for 3 to 5 clauses, one less than the number of clauses must match; for 6 or more clauses, 80% must match, rounded down: "2<-1 5<80%"
    This is modified in the query time configuration file solrconfig.xml. This setting is language specific. No reindex will be needed.
  • There may be situations where products need to be promoted, or boosted, in your search results. Solr's "Boost Query" parameter makes this easy. This is modified in the query time configuration file solrconfig.xml. This setting is language specific. No reindex will be needed.
  • How Magento and Solr work together. First of all, it's not Solr, it's Magento. Solr returns product IDs, not data; Magento does the data grabbing.
  • Check the query time logging for the QTime, in milliseconds. There is further Solr optimization that we do not have time to cover here; go to: http://wiki.apache.org/solr/SolrPerformanceFactors
  • Make sure you have the most recent version of MySQL. Make sure your MySQL settings are tuned per Magento's recommendations. Use the Memory (HEAP) storage engine for temp tables. Leverage MySQL query caching as recommended by Magento.
  • Start Your Search Engines: Optimizing Solr to Improve Results
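
The "mm" formats described in the notes above can be worked through by hand, but a small calculator makes the rules easier to check. The following is a minimal Python sketch of those rules as the notes state them, not Solr's own implementation; the function name `min_should_match` is hypothetical, and the sketch assumes the spec is written without spaces around "<".

```python
def min_should_match(optional_clauses: int, spec: str) -> int:
    """Return how many optional clauses must match for a given mm spec.

    Illustrative only: follows the rules described in the notes
    ("2", "75%", "2<-25%", "2<-1 5<80%"), not Solr's source code.
    """
    spec = spec.strip()
    required = optional_clauses

    if "<" in spec:
        # Conditional form: each "N<expr" applies only when there are
        # more than N optional clauses; otherwise the result so far stands
        # (all clauses required if no earlier condition applied).
        for part in spec.split():
            bound, expr = part.split("<", 1)
            if optional_clauses <= int(bound):
                return required
            required = min_should_match(optional_clauses, expr)
        return required

    if spec.endswith("%"):
        raw = optional_clauses * int(spec[:-1]) / 100
    else:
        raw = int(spec)

    # A negative value means "all but this many"; int() truncates toward
    # zero, so "75%" rounds down and "-25%" effectively rounds up.
    amount = optional_clauses + int(raw) if raw < 0 else int(raw)
    return max(0, min(amount, optional_clauses))


if __name__ == "__main__":
    for spec in ("2", "75%", "2<-25%", "2<-1 5<80%"):
        counts = [min_should_match(n, spec) for n in range(1, 8)]
        print(f"{spec!r}: required matches for 1..7 clauses = {counts}")
```

Running the script prints, for each format, how many clauses must match for queries of one to seven optional words, which is a quick way to sanity-check a value before putting it in solrconfig.xml.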

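The notes place the "mm" and Boost Query settings in solrconfig.xml. While experimenting, the same DisMax parameters (`mm`, `bq`) can also be sent on a single request so that different values can be compared before the configuration file is changed. The sketch below only builds such a URL. The base URL, the field name `in_stock`, and the boost value are assumptions for illustration and are not taken from the presentation or from Magento's schema.

```python
from urllib.parse import urlencode


def build_boosted_query_url(base_url: str, user_query: str,
                            boost_query: str, min_match: str = "2<-25%") -> str:
    """Build a Solr select URL that uses DisMax with mm and bq.

    base_url, the boosted field, and the boost factor are hypothetical;
    substitute the values from your own Solr instance and schema.
    """
    params = {
        "defType": "dismax",   # use the DisMax query parser
        "q": user_query,       # the shopper's search terms
        "mm": min_match,       # minimum number of optional clauses to match
        "bq": boost_query,     # extra clause that raises matching docs' score
        "fl": "id,score",      # return only IDs and scores
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical example: favor products flagged as in stock.
    print(build_boosted_query_url(
        "http://localhost:8983/solr/select",
        "red running shoes",
        "in_stock:true^5",
    ))
```

Once a combination gives the ranking you want, the values can be moved into the request handler defaults in solrconfig.xml, as the notes describe.
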
    1. SOLR Facts: 65% of IT organizations were able to reduce the costs of developing and deploying their search application by 50% or more as a result of using SOLR. Source: Survey of 26 Solr/Lucene users conducted by TechValidate. 10/23/2012 © 2011 Crown Partners. All Rights Reserved.
    2. SOLR Facts: 43% of IT organizations index or update 1,000,001 to 5,000,000 or more documents each week with SOLR. Source: Survey of 26 Solr/Lucene users conducted by TechValidate.
    3. SOLR Facts: “We were able to decrease risk by allowing our catalog of 6 million-plus items and 50 million user profiles to be searched well beyond the possibilities with MySQL.” Source: Executive, Small Business Computer Software Company.
    4. SOLR Facts: “With SOLR’s Dis-Max query parser, we were able to drastically increase the relevance of returned search results.” Source: IT Architect, Small Business Media & Entertainment Company.
    5. SOLR: Optimizing SOLR to Improve Search. Presented by: Rob Miller, Jason Grim & Ryan Street. 08/15/2012
    6. About Crown: Certified Magento Development Team; SOLR experts in Advanced Search; Integration between SOLR and Magento, ERP.
    7. Agenda: Overview of SOLR. Basic Solr Troubleshooting – Common SOLR Troubleshooting and Solutions. Advanced optimization of search results – Making changes in Solr configuration to better your results. Improving search speed – Optimizing to improve search speed.
    9. Crown’s First SOLR Webinar: Crown’s SOLR 1.0 Webinar – Support for Spelling/Synonyms/Stop Words – Improved Layered Navigation – September 21, 2011 – http://bit.ly/solrmagentowebinar
    10. Basic Troubleshooting
    11. Useful SOLR Tools: Web Interface; Luke; Command Line.
    12. Magento Cannot Connect to SOLR: Do you have the right URL and port? Does your server allow SOLR and Magento to communicate?
    13. Magento and Solr Show Bad or No Results: Bad data. Change from Final to Partial Commit. Look at the command line for critical errors during indexing.
    14. Where to Find More Answers: Magento Forums – http://www.magentocommerce.com/boards/ Magento Answers – http://www.magentocommerce.com/answers/welcome Dr. Gento – http://www.drgento.com And of course… Crown! – http://www.crownpartners.com/
    15. Advanced Optimization of Search Results
    16. Configuration: Minimum Must Match. Must Match formats – 2 – 75% – 2<-25% – 2<-1 5<80%. Setting is language specific. Will NOT require reindex (query time parameter).
    17. Query Boosting Results: Boost individual product attributes. Query time configuration. Language specific.
    18. Where to Find More Answers: Apache’s Wiki – http://wiki.apache.org/solr/ Dr. Gento – http://www.drgento.com And of course… Crown! – http://www.crownpartners.com/
    19. Improving Search Speed
    20. SOLR and Magento Relationship: User submits a search query. Magento connects to SOLR and sends over the query. SOLR processes the query and returns Magento product IDs. Magento loads the product IDs and displays them to the user.
    21. Is SOLR the Problem? Check the QTime of a query – /select params={…} hits=79 status=0 QTime=48. Solr Performance Enhancements – http://wiki.apache.org/solr/SolrPerformanceFactors
    22. MySQL Optimization: Update your version of MySQL to the latest version. Make sure your MySQL settings are tuned per Magento’s recommendations – http://www.magentocommerce.com/whitepaper/ Using the Memory (HEAP) Storage Engine for Temp Tables – http://dev.mysql.com/doc//refman/5.0/en/memory-storage-engine.html Leverage MySQL query caching.
    23. Where to Find More Answers: Magento Forums – http://www.magentocommerce.com/boards/ Magento U Performance and Optimization for System Administrators – http://www.magentocommerce.com/services/training Dr. Gento – http://www.drgento.com And of course… Crown! – http://www.crownpartners.com/
    24. Questions?
    25. Thank You! Rob Miller rmiller@crownpartners.com; Jason Grim jgrim@crownpartners.com; Ryan Street rstreet@crownpartners.com
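
Slide 21 suggests checking the QTime of a query to decide whether Solr is the slow part. As a rough aid, the Python sketch below pulls `hits`, `status`, and `QTime` out of log lines shaped like the one on that slide and flags slow queries. The function names and the 500 ms threshold are assumptions for illustration; the log layout in your servlet container may differ from the slide's example.

```python
import re

# Matches the tail of a query log line such as the one shown on slide 21.
QUERY_STATS = re.compile(r"hits=(\d+)\s+status=(\d+)\s+QTime=(\d+)")


def parse_query_log_line(line: str):
    """Return hits, status and QTime (ms) from a log line, or None."""
    match = QUERY_STATS.search(line)
    if match is None:
        return None
    hits, status, qtime = (int(group) for group in match.groups())
    return {"hits": hits, "status": status, "qtime_ms": qtime}


def slow_queries(lines, threshold_ms: int = 500):
    """Yield (line, stats) for queries at or above a hypothetical threshold."""
    for line in lines:
        stats = parse_query_log_line(line)
        if stats is not None and stats["qtime_ms"] >= threshold_ms:
            yield line, stats


if __name__ == "__main__":
    sample = "/select params={…} hits=79 status=0 QTime=48"
    print(parse_query_log_line(sample))
```

If the QTime values stay low while pages are still slow, that points back at Magento and MySQL, which is where slides 20 and 22 direct attention.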
