Introduction to Apache Solr.

Slides of my Tech Talk on Apache Solr, at BarCamp 5, Chennai.

Published in: Technology

Transcript

  • 1. BarCamp 5, Chennai: Apache Solr – I can haz Search! Ashish Yadav (ashish_0x90)
  • 2. Agenda
      • Overview of Apache Solr
      • Why Solr?
      • Installing Apache Solr
      • Getting Solr configuration right
      • Solr query basics and not-so-basic stuff
      • Scaling Solr
      • Some tips on Solr caching
  • 3. Overview
      • Apache Solr is a standalone full-text search server built on Apache Lucene.
      • Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java.
      • In brief, Apache Solr exposes Lucene's Java API as REST-like APIs that can be called over HTTP from any programming language or platform.
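Because Solr is reached over plain HTTP, a client mostly needs to build a URL. The following is a minimal sketch in Python, assuming a Solr instance at the hypothetical address `http://localhost:8983/solr` that exposes the usual `/select` handler (multi-core setups normally add a core name to the path). The helper name `select_url` is made up for illustration, and the code only builds the URL without contacting a server.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance; adjust to your deployment.
SOLR_BASE = "http://localhost:8983/solr"

def select_url(query, base=SOLR_BASE, **params):
    """Build a URL for the /select handler from a query and extra parameters."""
    all_params = {"q": query, "wt": "json"}  # wt=json asks for a JSON response
    all_params.update(params)
    return base + "/select?" + urlencode(all_params)

# Any HTTP client (curl, a browser, urllib.request) could fetch this URL.
print(select_url("text:android", rows=5))
```

Since the request is just a URL, the same query can be issued from any language that has an HTTP client.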
  • 4. Features
      • Full-text search
      • Faceted navigation
      • More items like this (recommendation) / related searches
      • Spelling suggestions / auto-complete
      • Custom document ranking/ordering
      • Snippet generation/highlighting
      • And a lot more
  • 5. So, why would “I” need Solr?
      • You want greater control over your website's search.
      • Caching, replication, and distributed search.
      • Really fast indexing and searching; indexes can be merged and optimized (index compaction).
      • A great admin interface that can be used over HTTP.
      • Awesome community support, too.
      • Integration with various other products, such as the Drupal CMS.
  • 6. Products using Solr
      • E-commerce sites, CMSs, and blog sites.
      • Heavily used by LinkedIn, Twitter, CNET, Netflix, and Digg.
      • Many of them contribute back, such as LinkedIn's SNA (Search, Network, and Analytics) team.
  • 7. Installation
      • Minimum requirements:
      • A directory for storing index files.
      • A directory for storing configuration files.
      • Solr_Home, holding the other dependencies.
      • A servlet container (Tomcat, Jetty) with appropriate configuration.
  • 8. Configuring Solr
      • schema.xml: contains all of the details about document structure and index-time and query-time processing.
      • solrconfig.xml: contains most of the parameters for configuring Solr itself.
  • 9. Querying Solr: The basics
      • Plain text search:
      • q=text:"I love android"
      • Expanding the search to more fields:
      • q=title:android AND type:review AND price:[* TO 500]
      • Adding facets:
      • facet=true&facet.field=product&facet.field=rating
  • 10. Querying Solr: The basics
      • Adding facets for range queries:
      • facet.query=price:[* TO 100]&facet.query=price:[100 TO 200]&facet.query=price:[500 TO *]
      • Ordering results:
      • sort=score desc, price asc
      • Limiting results:
      • rows=15
      • Paginating results:
      • start=25&rows=10
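The basic parameters above can be assembled programmatically. This is a sketch rather than a definitive client: the helper names `page_params` and `basic_query` are hypothetical, and the field names (`title`, `product`, `rating`, `price`) are taken from the slides. It relies on `facet=true` to switch faceting on and on repeated `facet.field` / `facet.query` keys, and it turns a 1-based page number into `start`/`rows`.

```python
from urllib.parse import urlencode, parse_qsl

def page_params(page, per_page):
    """Translate a 1-based page number into Solr's start/rows parameters."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"start": (page - 1) * per_page, "rows": per_page}

def basic_query(text, facet_fields=(), price_ranges=(), sort=None, page=1, per_page=10):
    """Return an encoded query string; repeated keys are kept as separate pairs."""
    params = [("q", text)]
    if facet_fields or price_ranges:
        params.append(("facet", "true"))  # faceting must be enabled explicitly
    params += [("facet.field", field) for field in facet_fields]
    params += [("facet.query", f"price:[{low} TO {high}]") for low, high in price_ranges]
    if sort:
        params.append(("sort", sort))
    params += list(page_params(page, per_page).items())
    return urlencode(params)

qs = basic_query(
    "title:android",
    facet_fields=["product", "rating"],
    price_ranges=[("*", 100), (100, 200)],
    sort="score desc, price asc",
    page=3,
)
# Decode again to show the individual key/value pairs that Solr would receive.
for key, value in parse_qsl(qs):
    print(key, "=", value)
```

Building the parameters as a list of pairs, not a dict, is what allows `facet.field` and `facet.query` to appear more than once.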
  • 11. Querying Solr: The not-so-basic stuff
      • Advanced query parameters:
      • fq: filter query. Example: fq=type:review&fq=price:[* TO 500]
      • fl: restricts the fields returned with the result set.
      • Example: fl=id,title,text
  • 12. Querying Solr: The not-so-basic stuff
      • hl: highlights matches and generates snippets.
      • Example query: hl=true&hl.fl=title,text
      • Custom field boosting:
      • Example: q=samsung+awesome&defType=dismax&qf=product^20.0+text^0.3
      • Debugging a query: debug=true
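The filter, field-list, highlighting, and boosting parameters can be combined in one request. Below is a hedged sketch: `advanced_query` is a hypothetical helper, the field names come from the slides, and the boosts assume the DisMax parser, which takes plain user text in `q` and a space-separated `qf` list of `field^weight` entries.

```python
from urllib.parse import urlencode, parse_qsl

def advanced_query(user_text, filters=(), fields=(), highlight_fields=(), boosts=None):
    """Combine fq, fl, hl and DisMax boosts into one encoded query string."""
    params = [("q", user_text)]
    # Each filter goes into its own fq parameter.
    params += [("fq", clause) for clause in filters]
    if fields:
        params.append(("fl", ",".join(fields)))
    if highlight_fields:
        params += [("hl", "true"), ("hl.fl", ",".join(highlight_fields))]
    if boosts:
        qf = " ".join(f"{field}^{weight}" for field, weight in boosts.items())
        params += [("defType", "dismax"), ("qf", qf)]
    return urlencode(params)

qs = advanced_query(
    "samsung awesome",
    filters=["type:review", "price:[* TO 500]"],
    fields=["id", "title", "text"],
    highlight_fields=["title", "text"],
    boosts={"product": 20.0, "text": 0.3},
)
for key, value in parse_qsl(qs):
    print(key, "=", value)
```

Keeping each restriction in a separate `fq` lets Solr cache the filters independently of the main query.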
  • 13. Solr search: custom handlers
      • Request handlers:
      • DataImportHandler, DisMaxHandler
      • Response writers:
      • JSON, XML, and CSV format writers
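As an illustration of consuming the JSON response writer's output, here is a small sketch. The sample document below is made up for this example; its shape (a `response` object with `numFound` and `docs`, and facet counts as a flat list alternating value and count) is assumed to match the writer's default layout, so check it against your own server's output.

```python
import json

# Made-up sample in the shape the JSON response writer is assumed to return.
SAMPLE_RESPONSE = """
{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "1", "title": "Android phone review"},
      {"id": "2", "title": "Android tablet review"}
    ]
  },
  "facet_counts": {
    "facet_fields": {"product": ["samsung", 5, "htc", 3]}
  }
}
"""

def pair_up(flat):
    """Turn [value, count, value, count, ...] into {value: count}."""
    return dict(zip(flat[0::2], flat[1::2]))

def summarize(raw):
    """Extract the hit count, document ids and facet counts from a response."""
    data = json.loads(raw)
    body = data["response"]
    facets = data.get("facet_counts", {}).get("facet_fields", {})
    return {
        "total": body["numFound"],
        "ids": [doc["id"] for doc in body["docs"]],
        "facets": {field: pair_up(flat) for field, flat in facets.items()},
    }

summary = summarize(SAMPLE_RESPONSE)
print(summary)
```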
  • 14. External search components
      • SpellCheckComponent:
      • Uses Solr indexes, custom dictionaries, etc.
      • More Like This (term suggest, similar items, etc.)
      • Clustering component
      • TermVector component:
      • Returns advanced information about query terms, offsets, and positions.
      • Query Elevation component: sponsored results
  • 15. Scaling Solr (I feel the need for speed)
      • Distributed search, a.k.a. sharding.
      • Create separate indexes (rsync/scp)
      • OR
      • run the Solr index replication daemon.
      • Optimization/autocommit for the indexes.
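For distributed search, a request can name the shards to fan out to through the `shards` parameter, a comma-separated list of `host:port/path` entries without a URL scheme. A minimal sketch, with hypothetical host names and a hypothetical helper `distributed_query`:

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical shard addresses; each one hosts a separate slice of the index.
SHARDS = ["solr1.example.invalid:8983/solr", "solr2.example.invalid:8983/solr"]

def distributed_query(text, shard_hosts, rows=10):
    """Build a query string asking one Solr node to search all listed shards."""
    if not shard_hosts:
        raise ValueError("at least one shard is required")
    params = {"q": text, "rows": rows, "shards": ",".join(shard_hosts)}
    return urlencode(params)

qs = distributed_query("title:android", SHARDS)
print(dict(parse_qsl(qs))["shards"])
```

The node that receives the request queries each shard and merges the results, so the client still talks to a single URL.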
  • 16. Solr caching
      • Build your queries wisely.
      • External caching: Memcached, etc.
      • Internal caching
      • Different types of cache:
      • 1) FilterCache: used by filter queries (fq), and sometimes for faceting too.
      • 2) QueryResultCache: used for the results returned by generic queries.
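The external caching idea can be sketched without any Solr-specific API. In the example below, an in-process dict stands in for Memcached, and `QueryCache` and `fake_fetch` are hypothetical names invented for illustration. The cache key is built from the sorted parameters, so two requests that differ only in parameter order share one entry, which is one concrete way of "building queries wisely".

```python
import hashlib
from urllib.parse import urlencode

class QueryCache:
    """Cache search results keyed by a canonical form of the query parameters."""

    def __init__(self, fetch):
        self._fetch = fetch   # function that actually runs the search
        self._store = {}      # stand-in for an external cache such as Memcached
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(params):
        # Sorting makes the key independent of parameter order.
        canonical = urlencode(sorted(params.items()))
        return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

    def get(self, params):
        k = self.key(params)
        if k in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[k] = self._fetch(params)
        return self._store[k]

def fake_fetch(params):
    """Placeholder for a real Solr call; returns a dummy result."""
    return {"query": params["q"], "docs": []}

cache = QueryCache(fake_fetch)
cache.get({"q": "android", "rows": 10})
cache.get({"rows": 10, "q": "android"})  # same query, different order
print(cache.hits, cache.misses)
```

A real deployment would also need expiry or invalidation when the index changes, which this sketch leaves out.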
  • 17. Links and resources
      • http://wiki.apache.org/solr/
      • http://www.lucidimagination.com/developer/Articles
      • http://khaidoan.wikidot.com/solr
      • http://42bits.wordpress.com
  • 18. Thanks! This talk wouldn't have been possible without support from PayPal and the Apache Solr project.
      • Questions?
