Getting Started with Solr

Presentation at FOSSETCON 2015
http://www.fossetcon.org/2015/sessions/getting-started-solr-open-source-search-platform-0

Solr is a very popular open source search engine which builds upon the capabilities of Lucene. It's the perfect tool to index loads of text and make it easily searchable. And it's very fast!
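Much of that speed comes from Lucene's inverted index, which maps each term to the documents that contain it, so a lookup does not need to scan the text itself. The sketch below is a toy illustration in plain Java, not Solr's or Lucene's actual implementation; the class name `InvertedIndexSketch`, the simple tokenizer, and the sample titles are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy class: illustrates the idea only, not how Lucene stores its index.
class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();

    // Toy analysis step: lowercase, then split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Index time: record the document id under every term it contains.
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Query time: analyze the query the same way, then intersect the posting lists (AND).
    Set<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : tokenize(query)) {
            TreeSet<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Jurassic Park");
        index.add(2, "Jurassic World");
        index.add(3, "Boyz n the Hood");
        System.out.println(index.search("JURASSIC"));      // prints [1, 2]
        System.out.println(index.search("jurassic park")); // prints [1]
    }
}
```

Running the same analysis at index time and at query time is what lets `JURASSIC` match `Jurassic` here.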

Powerful features such as facets, typeahead, and "did you mean" help your users to quickly navigate through a very large dataset and find what they're looking for.
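As a rough illustration of what a facet is, the sketch below counts how many matching documents carry each value of a multi-valued field. It is plain Java rather than Solr code; the class name `FacetCountSketch` and the sample genre values are assumptions, and Solr computes these counts from its index rather than by looping over documents like this.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy class: shows what facet counts mean, not how Solr computes them.
class FacetCountSketch {
    // Each inner list holds the genre values of one matching document.
    static Map<String, Integer> facetCounts(List<List<String>> genresPerDoc) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by value name
        for (List<String> genres : genresPerDoc) {
            for (String genre : genres) {
                counts.merge(genre, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                List.of("Thriller", "Drama"),
                List.of("Drama"),
                List.of("Comedy"));
        System.out.println(facetCounts(docs)); // prints {Comedy=1, Drama=2, Thriller=1}
    }
}
```

A search UI would typically show these counts next to each value so that users can narrow the result set with one click.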

A REST-style JSON interface makes it language-agnostic; you can even work with it straight from the command line using curl!
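Because a query is just an HTTP GET, a client only has to assemble a URL. The sketch below builds one with the Java standard library, assuming the local `films` core and the `directed_by` query used in the slides; the class name `SolrSelectUrl` and its `build` helper are hypothetical, and `wt=json` is included on the assumption that JSON output is wanted.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: assembles a URL for a core's /select handler.
class SolrSelectUrl {
    static String build(String baseUrl, String core, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return baseUrl + "/" + core + "/select?" + query;
    }

    private static String encode(String s) {
        // URLEncoder.encode(String, Charset) requires Java 10 or later.
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "directed_by:(spielberg OR singleton)");
        params.put("fl", "name");
        params.put("wt", "json");
        // Assumes Solr is running locally on its default port, as in the slides.
        System.out.println(build("http://localhost:8983/solr", "films", params));
    }
}
```

The printed URL can be passed to curl in quotes. Note that `URLEncoder` also escapes the parentheses (`%28`, `%29`), which the hand-written URLs in the slides leave as-is.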

A flexible plugin mechanism lets you augment your searches with complementary tools such as rich document parsing, text analysis, or your own custom code.

In this session, learn the basics of making your content searchable with Solr.

Published in: Software

  1. Getting Started with Solr: an open source search platform. Travis Carlson, http://tcarlson.systems
  2. Search Engine Concepts
     ● Documents & Fields
     ● Inverted Index
     ● Analysis / Tokenization
     ● Precision vs. Recall
     ● Faceting
  3. Document Model: Solr indexes documents which contain fields (like a NoSQL database)
  4. Inverted Index (diagram from https://developer.apple.com/library/mac/documentation/UserExperience/Conceptual/SearchKitConcepts/searchKit_basics/searchKit_basics.html)
  5. How is it so fast? (diagram from https://developer.apple.com/library/mac/documentation/UserExperience/Conceptual/SearchKitConcepts/searchKit_basics/searchKit_basics.html)
  6. Analysis / Tokenization
  7. Analysis / Tokenization: Index, Query
  8. Precision vs. Recall: There is generally a tradeoff to be made between false positives and false negatives
  9. Precision vs. Recall
  10. Faceting
  11. Let’s take a tour
  12. Getting Started
      Start up Solr and create a new “core”:
          ./bin/solr start
          ./bin/solr create -c films -d basic_configs
      Customize the schema for our document type:
          edit server/solr/films/conf/schema.xml
          ./bin/solr restart
      Import our documents:
          ./bin/post -c films example/films/films.json
      Open the Admin UI to run some test searches:
          open http://localhost:8983/solr/#/films/query
          <field name="name" type="text_en"/>
          <field name="directed_by" type="text_general" multiValued="true"/>
          <field name="initial_release_date" type="tdate"/>
          <field name="genre" type="text_en" multiValued="true"/>
          <field name="genre-facet" type="string" multiValued="true" stored="false"/>
          <copyField source="genre" dest="genre-facet"/>
  13. Example Searches
      Directed by Steven Spielberg or John Singleton:
          http://localhost:8983/solr/films/select?q=directed_by%3A(spielberg+OR+singleton)&fl=name
      Faceted by genre:
          http://localhost:8983/solr/films/select?q=*%3A*&facet=true&facet.field=genre-facet&fl=id
      Thriller movies since 2010:
          http://localhost:8983/solr/films/select?q=genre%3Athriller+AND+initial_release_date%3A%5B2010-01-01T00%3A00%3A00Z+TO+NOW%5D
  14. APIs make things easier
          SolrClient solr = new HttpSolrClient("http://localhost:8983/solr/films");
          SolrQuery query = new SolrQuery();
          query.setQuery("directed_by:(spielberg OR singleton)");
          query.setFields("name");
          SolrDocumentList list = solr.query(query).getResults();
  15. Going Further
      ● Custom Request Handlers
      ● Field Boosts
      ● Matching Excerpts (highlighting)
      ● Auto-suggest / Spellcheck / “Did you mean”
      ● “More like this”
      ● Geospatial queries
  16. Going Further
      Download: http://lucene.apache.org/solr/
      Reference Guide: https://cwiki.apache.org/confluence/display/solr
  17. Open Source
      http://wiki.apache.org/solr/HowToContribute
      svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk
      Apache License, Version 2.0
      https://issues.apache.org/jira/browse/SOLR
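
The tradeoff on slide 8 can be made concrete with the standard definitions: precision is the share of returned results that are relevant, and recall is the share of relevant documents that were returned. The sketch below is a minimal worked example in plain Java; the class name `PrecisionRecall` and the counts are hypothetical, not figures from the talk.

```java
import java.util.Locale;

// Hypothetical helper: computes precision and recall from result counts.
class PrecisionRecall {
    // precision = true positives / (true positives + false positives)
    static double precision(int truePositives, int falsePositives) {
        int returned = truePositives + falsePositives;
        return returned == 0 ? 0.0 : (double) truePositives / returned;
    }

    // recall = true positives / (true positives + false negatives)
    static double recall(int truePositives, int falseNegatives) {
        int relevant = truePositives + falseNegatives;
        return relevant == 0 ? 0.0 : (double) truePositives / relevant;
    }

    public static void main(String[] args) {
        // Made-up counts: 8 relevant hits returned, 2 irrelevant hits returned,
        // and 4 relevant documents that the search missed.
        System.out.println(String.format(Locale.ROOT, "precision=%.2f", precision(8, 2))); // prints precision=0.80
        System.out.println(String.format(Locale.ROOT, "recall=%.2f", recall(8, 4)));       // prints recall=0.67
    }
}
```

Loosening the analysis (for example, more aggressive stemming) tends to reduce false negatives and raise recall, usually at the cost of more false positives and lower precision.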
