Intro to Apache Solr


Learn the capabilities of Apache Solr, including how to run it in standalone and cloud mode and how to contribute.

Published in: Software

  1. Apache Solr Introduction & Demo
  2. Agenda
     • What is Apache Solr?
     • Start/stop Solr
     • Indexing data to Solr
     • Searching data
     • Running a SolrCloud cluster
     • Hacking Solr
  3. What is Apache Solr?
     • Lucene-based search server + other features
     • Access Lucene over HTTP:
       • Java, Python, Ruby, .NET, PHP over XML/JSON and other formats
     • Faceting (guided navigation), suggestions, highlighting etc.
     • Replication and distributed search
     • Lucene best practices
  4. Running Solr
     • Extract:
       • tar xvf solr-5.1.0.tgz (Linux/Mac)
       • unzip or click+extract (Windows)
     • Run:
       • ./bin/solr start -e schemaless
       • ./bin/solr start -e schemaless -p 8983
       • ./bin/solr -help
       • ./bin/solr start -help
     • Stop:
       • ./bin/solr stop
  5. Indexing data
     • ./bin/post script
     • Using curl directly
     • Using the Admin UI
     • SolrJ and other indexing clients
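
As a rough illustration of the "using curl directly" route, the sketch below builds an equivalent HTTP request in Python with only the standard library. The host, the port, the collection name `gettingstarted`, and the document fields are assumptions for illustration and are not taken from the slides; the helper `build_update_request` is hypothetical. The request is only constructed, not sent, so it can be inspected without a running Solr.

```python
import json
import urllib.request

# Assumed endpoint: host, port and collection name are illustrative only.
SOLR_UPDATE_URL = "http://localhost:8983/solr/gettingstarted/update"


def build_update_request(docs, commit=True):
    """Build (but do not send) a POST request that indexes a list of documents."""
    url = SOLR_UPDATE_URL + ("?commit=true" if commit else "")
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical document; field names are made up for the example.
    docs = [{"id": "1", "title": "Intro to Apache Solr"}]
    req = build_update_request(docs)
    print(req.get_method(), req.full_url)
    # Sending needs a running Solr instance; uncomment to try it:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

Passing `commit=True` makes the documents searchable right away, at the cost of a commit per request; batching many documents per request is usually the cheaper choice.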
  6. Demo time
  7. Inverted index
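
The slide for the inverted index carries no text, so here is a toy sketch of the idea: each term maps to the list of documents that contain it, and an AND query becomes an intersection of those lists. This is a simplified illustration under the assumption of whitespace tokenization and lowercasing; it is not how Lucene stores or analyzes its index, and the function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search_all(index, *terms):
    """Return ids of documents that contain every given term (an AND query)."""
    postings = [set(index.get(term.lower(), ())) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


if __name__ == "__main__":
    docs = {1: "red shoes", 2: "blue shoes", 3: "red android phone"}
    index = build_inverted_index(docs)
    print(index["shoes"])                     # documents containing "shoes"
    print(search_all(index, "red", "shoes"))  # documents containing both terms
```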
  8. Lucene/Solr query syntax
     • +red +shoes = red AND shoes
     • +shoes -red = shoes NOT red
     • "android phone"
     • "android phone" -samsung = "android phone" NOT samsung
     • "android samsung"~4
     • merced*
     • createDate:[201301 TO 201401]
     • author:shalin
     • author:"shalin mangar"
     • author:"shalin mangar" AND project:(lucene OR solr)
     • title:samsung^5 category:phone
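
Queries like these contain quotes, colons, and spaces, so they need URL encoding before being sent over HTTP. The sketch below shows one way to do that with Python's standard library. The endpoint (host, port, and the collection name `gettingstarted`) is an assumption, and `build_select_url` is a hypothetical helper; `q`, `rows`, and `wt` are standard Solr request parameters.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint: host, port and collection name are illustrative only.
SOLR_SELECT_URL = "http://localhost:8983/solr/gettingstarted/select"


def build_select_url(query, rows=10):
    """Return a search URL with the query string safely URL-encoded."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_select_url('author:"shalin mangar" AND project:(lucene OR solr)')
    print(url)
    # Decoding the query string gives back the original query text.
    print(parse_qs(urlsplit(url).query)["q"][0])
```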
  9. Other features of Solr
     • DataImportHandler: index databases, email, RSS, XML etc.
     • Rich document support: PDF, MS Office, images etc.
     • Faceting, stats, analytics
     • Replication for high query volume
     • Production systems with billions of documents
     • Very extensible and customizable
     • Embedded in commercial search products from Lucidworks, DataStax, Cloudera, Hortonworks, Pivotal, Amazon CloudSearch, Riak etc.
  10. What is SolrCloud?
      • A subset of optional features in Solr that enable and simplify horizontally scaling a search index using sharding and replication
      • Goals: scalability, performance, high availability, simplicity, and elasticity
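
To make sharding and replication concrete, here is a toy router that assigns each document id to one shard and lists the replicas that would hold a copy. It is a simplification, not SolrCloud's implementation: SolrCloud's default router hashes the id into hash ranges owned by shards, whereas this sketch swaps in `zlib.crc32` with a plain modulo. The function names and the replica naming scheme are hypothetical.

```python
import zlib
from collections import Counter


def shard_for(doc_id, num_shards):
    """Pick a shard index deterministically from the document id (toy hash routing)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def replicas_for(shard, replication_factor):
    """Return hypothetical replica names for a shard, e.g. 'shard1_replica2'."""
    return [f"shard{shard + 1}_replica{r + 1}" for r in range(replication_factor)]


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(1000)]
    # Count how many documents land on each of 2 shards.
    print(Counter(shard_for(doc_id, 2) for doc_id in ids))
    # Every copy of shard 0 when each shard is replicated twice.
    print(replicas_for(0, 2))
```

Sharding splits the index so that each node holds only part of it, while replication keeps several copies of each shard for query throughput and availability.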
  11. Running SolrCloud
      • ./bin/solr -e cloud
      • Yeah, it’s that simple!
  12. SolrCloud demo
  13. Hacking Solr
      • Prerequisites:
        • git: git clone lucene-solr.git
        • GitHub: fork and clone apache/lucene-solr
        • ant 1.8.x or above
        • Eclipse or IntelliJ IDEA (I recommend IDEA)
        • Put svn/git and ant in your $PATH or %PATH%
  14. Hacking Solr
      • ant ivy-bootstrap (required only once)
      • ant idea or ant eclipse (generates a complete project that you can open in your favourite IDE)
      • Find an existing Jira issue or open a new one at http://
      • Make changes and write tests; once finished:
        • run ‘cd solr; ant server’ to build Solr, then start it via the bin/solr scripts
        • run ‘ant test’ (it can take a while) and ensure all tests pass
        • run ‘ant precommit’ from the checkout root and ensure it passes
      • Generate a patch with ‘svn diff’ or ‘git diff’ and attach it to Jira
  15. Resources
      • Apache+Solr+Reference+Guide
      • Ask me:
      • Ask other users:
      • Ask developers: (use sparingly)
  16. Thank you Shalin Shekhar Mangar,