Introduction to Apache Solr

Presented at the 4th Bangalore Apache Solr/Lucene Meetup Group

Published in: Software, Technology

  1. Introduction to Apache Lucene/Solr, Shalin Shekhar Mangar
  2. 4th Bangalore Lucene/Solr Meetup, 19th April 2014
     Who am I?
     ● Apache Lucene/Solr Committer and PMC member
     ● Contributor since January 2008
     ● Currently: Engineer at LucidWorks
     ● Formerly with AOL
     ● Email: shalin@apache.org
     ● Twitter: shalinmangar
     ● Blog: http://shal.in
  3. Apache Lucene
     ● http://lucene.apache.org/java
     ● Java-based API for adding search and indexing to your applications
     ● High-performance indexing: over 150GB/hour on modern hardware
     ● Fast and efficient scoring and indexing algorithms
     ● Support for multiple query types, hit highlighting, faceting, joins, grouping, typo-tolerant suggestions and multiple languages
     ● Most widely deployed search library on the planet
  4. Apache Lucene Work Pipeline
  5. Inverted Index
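An inverted index maps each term to the documents that contain it, so a term lookup does not need to scan every document. The sketch below is a toy in plain Python, not Lucene's implementation. The sample documents, the whitespace tokenizer, and the names `build_inverted_index` and `search_all` are assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of doc ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search_all(index, *terms):
    """Return ids of documents that contain every given term (an AND query)."""
    result = None
    for term in terms:
        ids = set(index.get(term.lower(), []))
        result = ids if result is None else result & ids
    return sorted(result or [])


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "red running shoes",
    2: "blue shoes",
    3: "red android phone",
}
index = build_inverted_index(docs)
```

Here `search_all(index, "red", "shoes")` intersects the posting lists for both terms, which is roughly what the query `+red +shoes` asks for.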
  6. Lucene Query Syntax
     ● +red +shoes = red AND shoes
     ● +shoes -red = shoes NOT red
     ● "android phone"
     ● "android phone" -samsung = "android phone" NOT samsung
     ● "android samsung"~4
     ● merced*
     ● createDate:[201301 TO 201401]
     ● author:shalin
     ● author:"shalin mangar"
     ● author:"shalin mangar" AND project:(lucene OR solr)
     ● title:samsung^5 category:phone
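The query forms listed on the Lucene Query Syntax slide are plain strings, so an application can assemble them with small helpers. The following is a minimal sketch, not part of any Lucene or Solr client. The helper names (`phrase`, `field`, `must`, `must_not`, `date_range`) are hypothetical, and the only escaping handled is a double quote inside a phrase.

```python
def phrase(text, slop=None):
    """Quote a phrase; an optional slop yields a proximity query like "a b"~4."""
    quoted = '"' + text.replace('"', '\\"') + '"'
    return quoted if slop is None else f"{quoted}~{slop}"


def field(name, clause, boost=None):
    """Restrict a clause to a field; an optional boost appends ^N."""
    scoped = f"{name}:{clause}"
    return scoped if boost is None else f"{scoped}^{boost}"


def must(clause):
    """Mark a clause as required."""
    return "+" + clause


def must_not(clause):
    """Mark a clause as prohibited."""
    return "-" + clause


def date_range(name, start, end):
    """Inclusive range query on a field."""
    return f"{name}:[{start} TO {end}]"


# Rebuild several of the queries from the slide.
queries = [
    f'{must("red")} {must("shoes")}',
    f'{phrase("android phone")} {must_not("samsung")}',
    phrase("android samsung", slop=4),
    date_range("createDate", "201301", "201401"),
    f'{field("author", phrase("shalin mangar"))} AND project:(lucene OR solr)',
    f'{field("title", "samsung", boost=5)} {field("category", "phone")}',
]
```

Building queries this way keeps quoting in one place. User-supplied text would still need the remaining query-parser special characters escaped before being passed in.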
  7. Apache Solr
     ● http://lucene.apache.org/solr
     ● Lucene-based search server + other features
     ● Access Lucene over HTTP:
       – Java, Ruby, Python, .NET, PHP over XML/JSON and other formats
     ● Most programming tasks in Lucene are configuration tasks in Solr
     ● Faceting (guided navigation, filters etc.)
     ● Replication and distributed search
     ● Lucene best practices
  8. Other features
     ● Data Import Handler
       – Index databases, mail, RSS, XML etc.
     ● Rich document support
       – PDF, MS Office, images etc.
     ● Replication for high query volume
     ● Distributed search for large indexes
       – Production systems with 1B+ documents
     ● Very extensible and customizable
       – Embedded in commercial search products from LucidWorks, DataStax, Cloudera, Hortonworks, Amazon CloudSearch and Riak
  9. Apache Solr
  10. Where does Solr fit?
  11. Solr block diagram
  12. /select?q=video&sort=price desc&fl=name,id,price&wt=json&indent=on
  13. /select?q=video+card&fl=name,id&hl=true&hl.fl=name,features
  14. /select?wt=json&indent=on&q=*:*&fl=name&facet=true&facet.field=cat
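The three /select requests on slides 12 to 14 (sorted search, highlighting, faceting) can be assembled from parameter dictionaries so that spaces and special characters are URL-encoded correctly. This is a sketch using only Python's standard library. The base URL (a local Solr on port 8983 with a core named `collection1`) and the helper name `select_url` are assumptions, and nothing here sends a request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed base URL: local Solr with a core named "collection1"; adjust to your setup.
BASE = "http://localhost:8983/solr/collection1"


def select_url(params, base=BASE):
    """Build a /select URL; a dict is used because names like hl.fl contain dots."""
    return f"{base}/select?{urlencode(params)}"


# Slide 12: search, sort by price descending, return selected fields as JSON.
sorted_search = select_url(
    {"q": "video", "sort": "price desc", "fl": "name,id,price",
     "wt": "json", "indent": "on"}
)

# Slide 13: search with hit highlighting on the name and features fields.
highlighted = select_url(
    {"q": "video card", "fl": "name,id", "hl": "true", "hl.fl": "name,features"}
)

# Slide 14: match all documents and facet on the cat field.
faceted = select_url(
    {"wt": "json", "indent": "on", "q": "*:*", "fl": "name",
     "facet": "true", "facet.field": "cat"}
)

# Decoding the query string recovers the original parameter values.
decoded = parse_qs(urlsplit(faceted).query)
```

The resulting strings can be fetched with any HTTP client. Encoding through `urlencode` avoids hand-writing the space in `sort=price desc`.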
  15. Indexing data using SolrJ, the official Solr Java client
  16. Searching using SolrJ, the official Solr Java client
  18. Bangalore Baby Apache Solr Meetup Group
      ● http://www.meetup.com/Bangalore-Baby-Apache-Solr-Group/
      ● Already had one successful meetup
      ● Great tutorial + hands-on workshop
      ● A must-join for all newcomers
      ● Planning to have another meetup next month
  19. Thank you. Shalin Shekhar Mangar, LucidWorks
