Solr in 5... minutes
     DOSUG Ignite Night
      February 2, 2010

       Mike Brevoort
    Avalon Consulting LLC
It was religion...

         that brought me to Solr
•Created by Yonik Seeley for CNET
•Contributed to Apache Jan 2006
•Version 1.4 released Nov 2009
Lucene = engine

      Lucene is a high-performance text search engine library
Solr = Serverization of Lucene++


          •Exposed over HTTP,
           REST-like interface
          •Java Web Application
Basic Config
•schema.xml
 •field types and fields
 •*dynamic fields
•solrconfig.xml
 •Lucene index parameters
 •request handler mappings
 •cache settings
 •plugins
Indexing Data - HTTP Post




    •Commit/Rollback
    •Global modification state
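
A minimal Python sketch of indexing over HTTP POST, assuming a local Solr at the default port and an XML update handler at /solr/update. The helper names and the field names "id" and "title" are hypothetical; real fields must be defined in schema.xml.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: a local Solr instance with the XML update handler at this path.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(doc):
    """Build a Solr <add> message from a dict of field name -> value."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_xml(xml_body, url=SOLR_UPDATE_URL):
    """POST an XML update message to Solr and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Typical use would be post_xml(build_add_xml({"id": "doc1", "title": "Solr in 5 minutes"})) followed by post_xml("<commit/>") to make the change searchable, or post_xml("<rollback/>") to discard uncommitted changes. Because modification state is global, a commit publishes every pending change, not only your own.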
Data Import Handler

•Index data from database or HTTP GET
•Full and incremental indexing
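
A small sketch of how the import commands might be triggered, assuming the handler is registered at /dataimport in solrconfig.xml (the path is whatever you configure, so treat it as an assumption). The helper name is hypothetical; full-import, delta-import and status are Data Import Handler commands.

```python
from urllib.parse import urlencode

# Assumption: the Data Import Handler is mapped to /dataimport on a local Solr.
DIH_URL = "http://localhost:8983/solr/dataimport"


def dih_command_url(command, **options):
    """Build the URL for a Data Import Handler command such as full-import."""
    params = {"command": command}
    params.update(options)
    return DIH_URL + "?" + urlencode(params)


full_import = dih_command_url("full-import")          # rebuild from the source
delta_import = dih_command_url("delta-import")        # incremental indexing
keep_existing = dih_command_url("full-import", clean="false")
```

Fetching one of these URLs with an HTTP GET starts the import; the status command reports progress.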
Index Binary Documents

•ExtractingRequestHandler a.k.a. “Solr Cell”
•MS Office, PDF, RTF, OpenDocument, Images, MP3, Zip, etc.



curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true' -F "myfile=@resume.pdf"
Searching
•http://localhost:8983/solr/select?q=query
 •&start=50
 •&rows=25
 •&fq=filter+query
 •&facet=on&facet.field=category
 •&sort=dist(2, point1, point2) desc
   *coming in Solr 1.5
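
The parameters above can be assembled in code rather than by hand. This is a sketch only: the function name and the example filter "inStock:true" are made up, and the base URL assumes a local Solr on the default port.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance with the standard /select handler.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(query, start=0, rows=10, filters=(), facet_fields=(), sort=None):
    """Build a /select URL; fq and facet.field may each repeat."""
    params = [("q", query), ("start", start), ("rows", rows)]
    params += [("fq", fq) for fq in filters]
    if facet_fields:
        params.append(("facet", "on"))
        params += [("facet.field", name) for name in facet_fields]
    if sort:
        params.append(("sort", sort))
    # urlencode escapes spaces, colons and other reserved characters.
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_search_url(
    "ipod", start=50, rows=25, filters=["inStock:true"], facet_fields=["category"]
)
```

Using a list of pairs instead of a dict keeps repeated parameters such as fq intact.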
Results
•default format is XML
•&wt=json
•&wt=php
•&wt=ruby
•&wt=python
  *wt = writer type
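
With &wt=json the response can be read with any JSON parser. The sketch below parses a hand-written sample in the shape Solr returns (responseHeader plus response with numFound, start and docs); the documents and the helper name are invented for illustration.

```python
import json

# Illustrative body only: the shape follows Solr's JSON writer, the data is made up.
SAMPLE_BODY = """
{
  "responseHeader": {"status": 0, "QTime": 1},
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      {"id": "doc1", "title": "Saint Mary"},
      {"id": "doc2", "title": "Saint John"}
    ]
  }
}
"""


def summarize(body):
    """Return (total hit count, ids on this page) from a wt=json response."""
    response = json.loads(body)["response"]
    return response["numFound"], [doc["id"] for doc in response["docs"]]


total, ids = summarize(SAMPLE_BODY)
```

Note that numFound is the total number of matches, while docs holds only the page selected by start and rows.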
Query Parser
•&defType=lucene (default)
 •   q=title:saint* AND zipcode:[80000 TO 81999]
 •   advanced syntax
•&defType=dismax
 •   q=ipod +shuffle -touch
 •   simplified syntax
 •   ideal for processing a query string typed by a user
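
A sketch of passing raw user input to the dismax parser. The parameter name is case-sensitive (defType). The qf parameter, which lists the fields dismax searches, is not on the slide, and the field names "title" and "description" are hypothetical.

```python
from urllib.parse import urlencode


def dismax_params(user_input, query_fields):
    """Encode a user-typed query for the dismax parser."""
    return urlencode({
        "defType": "dismax",
        "q": user_input,                  # passed through as typed
        "qf": " ".join(query_fields),     # fields to search (assumed names)
    })


encoded = dismax_params("ipod +shuffle -touch", ["title", "description"])
```

The literal plus sign has to be escaped as %2B, otherwise the server decodes it as a space and the "required term" meaning is lost; urlencode takes care of that.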
Facets (guided navigation)
&facet=on
&facet.field=listingTraditions
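
In the JSON response, facet counts for a field arrive by default as a flat list that alternates value and count. A sketch of pairing them up, using a made-up sample (the values and counts are invented, as is the helper name):

```python
import json

# Illustrative body only: the flat value/count layout is Solr's default for JSON.
SAMPLE_BODY = """
{
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "listingTraditions": ["Catholic", 12, "Orthodox", 5]
    }
  }
}
"""


def facet_pairs(body, field):
    """Turn [value, count, value, count, ...] into [(value, count), ...]."""
    flat = json.loads(body)["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


pairs = facet_pairs(SAMPLE_BODY, "listingTraditions")
```

Each pair can then drive one link in a guided-navigation menu, typically by adding an fq on that value.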
Spell Checking



Highlighting



More Like This
  q=saint&mlt=true&mlt.fl=title_t

  for each result, returns similar results based on &mlt.fl
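
The same request built in code, as a sketch. The helper name is hypothetical; mlt.count (how many similar documents to return per result) is a real parameter but is not on the slide.

```python
from urllib.parse import urlencode


def more_like_this_params(query, similarity_fields, count=None):
    """Encode a search that also asks for similar documents per result."""
    params = {"q": query, "mlt": "true", "mlt.fl": ",".join(similarity_fields)}
    if count is not None:
        params["mlt.count"] = count
    return urlencode(params)


mlt_query = more_like_this_params("saint", ["title_t"])
```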
SolrJs

•jQuery widgets framework
•Solr-powered UI
APIs
•HTTP GET/POST
•JSON
•SolrJ (Java)
•Ruby, Python, PHP, C#
•Integrations: Drupal, Rails, Grails (workin’ on it), etc.
Security
•(listen for crickets)
•Relies on server and container security
•TOTALLY OPEN BY DEFAULT - it’s up to you to secure it
•No standard document-level security model
Scaling: master/slave
•Index + configuration replication
•Load-balanced queries
•Supported out of the box
Scaling: sharding
•Massive indexes
•Relevancy computed per index, then results merged
•Some features not supported
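
A sketch of a distributed query. Solr's shards parameter takes a comma-separated list of host:port/path entries without the http:// prefix; the host names below and the helper name are made up.

```python
from urllib.parse import urlencode


def sharded_query_url(coordinator, shards, query):
    """Build a query that one Solr node fans out to every listed shard."""
    params = {"q": query, "shards": ",".join(shards)}
    return "http://" + coordinator + "/select?" + urlencode(params)


# Hypothetical two-shard setup on one machine.
url = sharded_query_url(
    "localhost:8983/solr",
    ["localhost:8983/solr", "localhost:7574/solr"],
    "saint",
)
```

The node that receives the request queries each shard and merges the results, which is why scores are computed per index.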
Resources
 • http://lucene.apache.org/solr/
 • solr-user@lucene.apache.org


              Mike Brevoort |
       brevoortm@avalonconsult.com
            twitter: @mbrevoort
