Solr
   Search at the Speed of Light


          JavaZone 2009
           September 10
               Oslo
  Erik Hatcher, Lucid Imagination
erik.hatcher@lucidimagination.com




Solr History

     • Created by Yonik Seeley for CNET
     • Contributed to Apache in January 2006
     • December 2006: Version 1.1 released
     • June 2007: Version 1.2 released
     • September 2008: Version 1.3 released
     • ~September 2009: Version 1.4
http://lucene.apache.org/solr
    © 2008-2009          Lucid Imagination, Inc.
Solr: Big Picture
[Diagram: data sources (DB, documents) flow into Solr, which returns search results]
Features

 • Lucene power exposed over HTTP
 • Scalability: caching, replication, distributed
      search
 • Faceting
 • And more: spell checking, highlighting,
      clustering, rich document and DB indexing,
      "more like this"


Lucene

 • Fast, scalable search library
 • Lucene index structure
  • Index contains documents
    • documents have fields
      • indexed fields have terms

Inverted Index

 • Commonly used search
      engine data structure
 • Efficient lookup of terms
      across a large number of
      documents
 • Usually stores positional
      information to enable
      phrase/proximity queries

 [Figure from "Taming Text" by Grant Ingersoll and Tom Morton]
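
As a rough illustration of the bullets above, here is a minimal sketch of an inverted index that keeps term positions. It is plain Python, not Lucene's actual on-disk structure; the function names and the sample documents are made up for the example.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} (toy, in-memory)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index

def phrase_match(index, first, second):
    """Return sorted doc ids where `second` directly follows `first`."""
    hits = []
    for doc_id, positions in index.get(first, {}).items():
        following = set(index.get(second, {}).get(doc_id, []))
        # Positions make the phrase check possible: second must sit at p + 1.
        if any(p + 1 in following for p in positions):
            hits.append(doc_id)
    return sorted(hits)

# Hypothetical sample documents
docs = {
    1: "the quick brown fox",
    2: "the brown quick fox",
    3: "a lazy dog",
}
index = build_inverted_index(docs)
print(sorted(index["quick"]))                 # doc ids containing "quick"
print(phrase_match(index, "quick", "brown"))  # docs with the phrase "quick brown"
```

Looking up a term is a single dictionary access regardless of how many documents exist, which is the point of the structure.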


Analysis Process




Analyzing the analyzer
                    Example phrase

      The quick brown fox jumps over the lazy dog.




WhitespaceAnalyzer
                Simplest built-in analyzer
      The quick brown fox jumps over the lazy dog.




  [The] [quick] [brown] [fox] [jumps] [over] [the]
                    [lazy] [dog.]

SimpleAnalyzer
          Lowercases, splits at non-letter boundaries
      The quick brown fox jumps over the lazy dog.




  [the] [quick] [brown] [fox] [jumps] [over] [the]
                    [lazy] [dog]

StopAnalyzer
              Lowercases and removes stop words


      The quick brown fox jumps over the lazy dog.




 [quick] [brown] [fox] [jumps] [over] [lazy] [dog]
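
The three analyzers shown so far can be approximated in a few lines. This is a sketch in plain Python, not the Lucene classes themselves; the stop word set is a small illustrative subset chosen for the example, and the letter test is ASCII-only.

```python
import re

# Illustrative subset of English stop words (an assumption, not Lucene's exact list).
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}

def whitespace_analyze(text):
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()

def simple_analyze(text):
    """Lowercase, then split at non-letter characters (ASCII letters only here)."""
    return re.findall(r"[a-z]+", text.lower())

def stop_analyze(text):
    """Same as simple_analyze, then drop stop words."""
    return [term for term in simple_analyze(text) if term not in STOP_WORDS]

phrase = "The quick brown fox jumps over the lazy dog."
print(whitespace_analyze(phrase))
print(simple_analyze(phrase))
print(stop_analyze(phrase))
```

Each function is a stricter version of the one before it, which mirrors how the token lists on the slides shrink from analyzer to analyzer.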




SnowballAnalyzer
                   Stemming algorithm
      The quick brown fox jumps over the lazy dog.




   [the] [quick] [brown] [fox] [jump] [over] [the]
                     [lazi] [dog]

What's in a token?




Relevance

 •    Term frequency (TF): number of times a term
      appears in a document

 •    Inverse document frequency (IDF): one over the
      number of documents in the index that contain the term (1/df)

 •    Field length normalization: controls the effect that field
      length, in number of terms, has on score

 •    Boost factors: terms, fields, or documents
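
To see how these factors interact, here is a toy single-term scorer. It is a simplified sketch, not Lucene's full scoring formula: the damping functions (square root for TF, a logarithm for IDF instead of the literal 1/df, inverse square root for length) are assumptions chosen for the example, and the documents are invented.

```python
import math

def score(term, doc_terms, all_docs, boost=1.0):
    """Toy score for one term in one document: tf * idf * length norm * boost."""
    tf = math.sqrt(doc_terms.count(term))              # more occurrences, higher score
    df = sum(1 for d in all_docs if term in d)         # documents containing the term
    idf = 1.0 + math.log(len(all_docs) / (df + 1))     # rarer terms weigh more
    length_norm = 1.0 / math.sqrt(len(doc_terms))      # shorter fields weigh more
    return tf * idf * length_norm * boost

# Hypothetical tokenized documents
docs = [
    ["ipod", "video", "ipod"],
    ["ipod", "case"],
    ["usb", "cable", "for", "music", "players"],
]
for doc in docs:
    print(doc, round(score("ipod", doc, docs), 3))
```

The first document wins for "ipod" because the extra occurrence outweighs its slightly longer field; the third scores zero because the term is absent.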



Lucene Scoring
[Figure: vector space model; document vector d1 and query vector q1 separated by angle Θ]
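
The figure's idea, that a smaller angle between query and document vectors means a better match, can be sketched with raw term counts as vector components. This is an illustration only, assuming unweighted counts; Lucene's real scoring adds the weighting factors from the Relevance slide.

```python
import math
from collections import Counter

def cosine_similarity(query_terms, doc_terms):
    """Cosine of the angle between two term-count vectors (1.0 = same direction)."""
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)                  # missing terms count as 0
    norm = math.sqrt(sum(v * v for v in q.values())) * \
           math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

q1 = ["quick", "fox"]
d1 = ["quick", "brown", "fox"]
print(round(cosine_similarity(q1, d1), 4))
```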




Solr APIs

 • HTTP GET/POST (curl or any other HTTP
      client)
 • JSON
 • SolrJ (embedded or HTTP)
 • solr-ruby
 • Python, PHP, solrsharp, XSLT

Solr in Production
[Diagram: incoming search requests pass through a load balancer to the Solr master tier; shard requests then go through per-shard load balancers to each of 1..n shards, where each shard has a shard master and several replicants]
Getting Started:
                 It's This Easy
1. Start Solr

  java -jar start.jar
2. Index your data

  java -jar post.jar *.xml
3. Search

  http://localhost:8983/solr
Configuration
 •    schema.xml

     •    field types and fields

 •    solrconfig.xml

     •    request handler mappings

     •    cache settings: filter, query, document

     •    warming listeners

     •    HTTP cache settings

     •    Lucene index parameters

     •    plugins: spell checking, highlighting


Solr add/update XML
<add><doc>
  <field name="id">MA147LL/A</field>
  <field name="name">Apple 60 GB iPod with Video Playback Black</field>
  <field name="manu">Apple Computer Inc.</field>
  <field name="cat">electronics</field>
  <field name="cat">music</field>
  <field name="features">iTunes, Podcasts, Audiobooks</field>
  <field name="features">Stores up to 15,000 songs, 25,000 photos, or 150 hours of
               video</field>
  <field name="features">2.5-inch, 320x240 color TFT LCD display
                         with LED backlight</field>
  <field name="features">Up to 20 hours of battery life</field>
  <field name="features">Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless,
                         H.264 video</field>
  <field name="features">Notes, Calendar, Phone book, Hold button, Date display,
      Photo wallet, Built-in games, JPEG photo playback, Upgradeable firmware,
      USB 2.0 compatibility, Playback speed control, Rechargeable capability,
      Battery level indication</field>
  <field name="includes">earbud headphones, USB cable</field>
  <field name="weight">5.5</field>
  <field name="price">399.00</field>
  <field name="popularity">10</field>
  <field name="inStock">true</field>
</doc></add>
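
A message in this format can also be generated rather than written by hand. The sketch below uses Python's standard library; `build_add_xml` is a hypothetical helper name, and list values are expanded into repeated fields the way `cat` and `features` repeat above.

```python
import xml.etree.ElementTree as ET

def build_add_xml(doc):
    """Return a Solr <add> message for one document given as a dict."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # A list becomes one <field> element per value (multi-valued field).
        values = value if isinstance(value, list) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")

xml = build_add_xml({
    "id": "MA147LL/A",
    "cat": ["electronics", "music"],
    "price": "399.00",
})
print(xml)
```

Building the tree with a library also takes care of escaping characters such as `&` and `<` in field values.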


Indexing Solr XML
 • Via curl:
      curl 'http://localhost:8983/solr/update?commit=true' \
        --data-binary @ipod_video.xml \
        -H 'Content-type:text/xml; charset=utf-8'

 • Via Solr's Java-based post tool:
      java -jar post.jar ipod_video.xml
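
The same POST can be issued from any HTTP client. As a sketch, here is how the request could be assembled with Python's standard library; `build_update_request` is a hypothetical helper, and the request is only built, not sent, so the snippet runs without a Solr server.

```python
import urllib.request

def build_update_request(xml_bytes, base_url="http://localhost:8983/solr"):
    """Build (but do not send) a POST to the XML update handler."""
    return urllib.request.Request(
        base_url + "/update?commit=true",
        data=xml_bytes,  # supplying a body makes this a POST
        headers={"Content-type": "text/xml; charset=utf-8"},
    )

request = build_update_request(b'<add><doc><field name="id">1</field></doc></add>')
print(request.get_method(), request.full_url)
# To send it against a running Solr: urllib.request.urlopen(request)
```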



Indexing CSV


curl 'http://localhost:8983/solr/update/csv?commit=true' \
  --data-binary @books.csv \
  -H 'Content-type:text/plain; charset=utf-8'




Content Streams

 •    Allows Solr server to fetch local or remote data
      itself. Must enable remote streaming in
      solrconfig.xml

 •    http://localhost:8983/solr/update?stream.file=<local
      Solr path to exampledocs>/ipod_video.xml

 •    &stream.url=<url to content>

 •    Security warning: allows Solr to fetch arbitrary
      server-side file or network URL content



Indexing Rich Documents


curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true&extractOnly=true&wt=ruby&indent=on' \
  -F "myfile=@tutorial.html"




Indexing with SolrJ

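import java.net.URL;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

SolrServer solr =
    new CommonsHttpSolrServer(new URL("http://localhost:8983/solr"));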

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "JAVAZONE_09");
doc.addField("title", "JavaZone 2009 SolrJ Example");
solr.add(doc);
solr.commit();     // after a batch, not per document
solr.optimize();   // periodically, when needed




Indexing with Ruby

require 'solr'

solr = Solr::Connection.new(
  'http://localhost:8983/solr',
  :autocommit => :on)

solr.add(:id => 123,
         :title => 'Solr in Action')

solr.optimize       # periodically, as needed




Data Import Handler


• Indexes relational database, XML data sources,
   e-mail, and more
• Supports full and incremental/delta indexing
• Extensible with custom data sources,
   transformers, etc
• http://wiki.apache.org/solr/DataImportHandler
DB Indexing



http://localhost:8983/solr/db/dataimport?
command=full-import




Example Search Request

 • http://localhost:8983/solr/select?q=query
  • &start=50
  • &rows=25
  • &fq=filter+query
  • &facet=on&facet.field=category
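
Putting those parameters together by hand invites encoding mistakes, so a client would normally build the URL. A minimal sketch with Python's standard library follows; `build_select_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

def build_select_url(params, base_url="http://localhost:8983/solr"):
    """Return a /select URL with properly encoded query parameters."""
    return base_url + "/select?" + urlencode(params)

url = build_select_url([
    ("q", "query"),
    ("start", 50),            # offset of the first result
    ("rows", 25),             # page size, so this is the third page
    ("fq", "filter query"),   # the space is encoded as "+"
    ("facet", "on"),
    ("facet.field", "category"),
])
print(url)
# http://localhost:8983/solr/select?q=query&start=50&rows=25&fq=filter+query&facet=on&facet.field=category
```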

Debug Query


 • &debugQuery=true is your friend
 • Includes parsed query, explanations, and
      search component timings in response




Query Parser

 • Controlled by defType parameter
  • &defType=lucene (actually a Solr
          extension of Lucene’s QueryParser)
     • &defType=dismax
 • Local {!..} override syntax

Solr Query Parser

 • http://lucene.apache.org/java/2_4_0/
      queryparsersyntax.html + Solr extensions
 • Kitchen sink parser, includes advanced user-
      unfriendly syntax
 • Syntax errors throw parse exceptions back
      to client
 • Example: title:ipod* AND price:[0 TO 100]
Dismax Query Parser

 • Simplified syntax:
      loose text “quote phrases” -prohibited
      +required
 • Spreads query terms across query fields
      (qf) with dynamic boosting per field, implicit
      phrase construction (pf), boosting function
      (bf), boosting query (bq), and minimum
      match (mm)
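
As an illustration, a dismax request might carry parameters like the following. The field names come from the iPod example document earlier in the deck; the boost, the phrase field and the `mm` value are made-up settings for the sketch, not recommendations.

```python
from urllib.parse import urlencode, parse_qs

dismax_params = {
    "defType": "dismax",
    "q": 'ipod "video playback" -case',  # loose text, quoted phrase, prohibited term
    "qf": "name^2 features",             # search both fields, name weighted double (assumed boost)
    "pf": "name",                        # reward documents where the terms form a phrase in name
    "mm": "2",                           # at least two optional clauses must match (assumed value)
}
query_string = urlencode(dismax_params)
print("http://localhost:8983/solr/select?" + query_string)
```

The user-facing query stays simple while the per-field weighting lives in the other parameters, which can also be set as defaults in solrconfig.xml.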


Searching with SolrJ


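import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

SolrServer server =
    new CommonsHttpSolrServer("http://localhost:8983/solr");
SolrQuery params = new SolrQuery("author:John");
params.setFields("*,score");   // all stored fields plus the relevance score
params.setRows(3);             // return at most three documents
QueryResponse response = server.query(params);
for (SolrDocument document : response.getResults()) {
      System.out.println("Doc: " + document);
}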




Searching with Ruby


require 'solr'

conn = Solr::Connection.new(
    'http://localhost:8983/solr')

conn.query('my query') do |hit|
  puts hit.inspect
end




delete, update, etc
 •    Delete:
     • <delete><id>05991</id></delete>
     •    <delete>
             <query>category:Unused</query>
          </delete>

     •    java -Ddata=args -jar post.jar
          "<delete><query>*:*</query></delete>"

 •    Update: simply <add> doc with same unique key

 •    Commit: <commit/>

 •    Optimize: <optimize/>
Faceting


• Counts per subset within results
• Facet on: field terms, queries, date
    ranges
• &facet=on
    &facet.field=cat
    &facet.query=price:[0 TO 100]
• http://wiki.apache.org/solr/
    SimpleFacetParameters
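
Conceptually, field and query facets are just counts over the matching documents. The sketch below computes them in plain Python over a hypothetical in-memory result list; Solr does this inside the server against the index, far more efficiently.

```python
from collections import Counter

def facet_counts(results, field):
    """Count documents per value of `field` (like facet.field)."""
    counts = Counter()
    for doc in results:
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]
        counts.update(values)
    return dict(counts)

def facet_query_count(results, predicate):
    """Count documents matching an arbitrary condition (like facet.query)."""
    return sum(1 for doc in results if predicate(doc))

# Hypothetical search results
results = [
    {"id": "1", "cat": ["electronics", "music"], "price": 399.00},
    {"id": "2", "cat": ["electronics"], "price": 19.95},
    {"id": "3", "cat": ["music"], "price": 74.99},
]
print(facet_counts(results, "cat"))
print(facet_query_count(results, lambda doc: 0 <= doc["price"] <= 100))
```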
Spell checking


•    Not enabled by default, see example config to wire it in

•    http://localhost:8983/solr/spell?
     q=epod&spellcheck=on&spellcheck.build=true

•    File or index-based dictionaries

•    Supports pluggable distance algorithms: Levenshtein and
     JaroWinkler

•    http://wiki.apache.org/solr/SpellCheckComponent
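
To make the distance idea concrete, here is a sketch of Levenshtein edit distance and a naive suggester built on it. This is a textbook dynamic-programming version for illustration, not Solr's implementation, and the dictionary is invented.

```python
def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]

def suggest(word, dictionary):
    """Return the dictionary entry closest to `word`."""
    return min(dictionary, key=lambda candidate: levenshtein(word, candidate))

print(levenshtein("epod", "ipod"))
print(suggest("epod", ["ipod", "video", "apple"]))
```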


Highlighting


 • http://localhost:8983/solr/select?
      q=ipod&hl=on&hl.fl=manu,name
 • http://wiki.apache.org/solr/
      HighlightingParameters




More Like This


 • http://localhost:8983/solr/select?
      q=ipod&mlt=true&mlt.fl=manu,cat
      &mlt.mindf=1&mlt.mintf=1&fl=id,score,name
 • http://wiki.apache.org/solr/MoreLikeThis


Scaling: Query Throughput

 • Replication
  • slaves poll master for index updates
  • transfers index files from master to slave
  • configuration files can also be transferred
  • entirely Java/HTTP-based in Solr 1.4
          (prior versions used rsync)



Scaling: Collection Size

 • Distribution
  • Index documents across shards
  • query single server with shards
          parameter
         • sends requests to each shard
         • aggregates result to a single response
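
The aggregation step can be pictured as merging several score-sorted lists and keeping the top rows. The sketch below shows only that merge, with made-up shard results; it assumes each shard already returns hits sorted by descending score and ignores details a real coordinator has to handle.

```python
import heapq
from itertools import islice

def merge_shard_results(shard_results, rows):
    """Merge per-shard hit lists (each sorted by descending score) into the top `rows`."""
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, rows))

# Hypothetical (score, document id) pairs from two shards
shard_a = [(0.9, "a1"), (0.4, "a2")]
shard_b = [(0.7, "b1"), (0.5, "b2"), (0.1, "b3")]
print(merge_shard_results([shard_a, shard_b], rows=3))
```

Because each input is already sorted, the merge never needs to look at more than the head of each list to produce the next result.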

Solr-powered UI

 • Solritas (from "celeritas"):
      VelocityResponseWriter
     • easily templated output
 • SolrJS: jQuery-based widgets
  • see http://solrjs.solrstuff.org/
 • Blacklight and Flare: RoR plugins

Lucene in Action, 2nd Edition




              http://www.manning.com/lucene
Search at Lucid
http://search.lucidimagination.com/?q=javazone




Who We Are

 • Lucid Imagination is the commercial entity of Apache
      Lucene/Solr ecosystem
 • Lucid team includes the largest collection of
      Lucene/Solr committers, contributors and influencers
 • Our mission is to serve as The Center of Excellence for
      Lucene/Solr-based search solutions
  • Help our customers get the most out of Lucene/Solr, [illegible]
      most widely used open source search software

Lucene, Solr and their logos are trademarks of the Apache Software Foundation
Lucid Imagination Technical Team

Unique Combination of Enterprise Search and Lucene Expertise

 • Yonik Seeley: Creator of Solr; Lucene/Solr committer, PMC member
 • Grant Ingersoll: Creator of "Lucene Boot Camp"; Lucene/Solr committer, Chair, PMC
 • Erik Hatcher: Co-author of "Lucene in Action" book; Lucene/Solr committer, PMC member
 • Mark Miller: Lucene/Solr committer, PMC member
 • Sami Siren: Nutch/Tika committer, PMC member
 • Andrzej Bialecki: Lucene/Nutch/Hadoop committer, PMC member
 • Doug Cutting (Advisor): Creator of Lucene, Nutch & Hadoop
 • Marc Krellenstein: Co-founder, CTO/SVP, Northern Light; VP Search, CTO, Elsevier
 • Brian Pinkerton: Developed WebCrawler, the web's first
      comprehensive search engine; Principal Architect at A9
 • Simon Rosenthal: Solutions architect, Northern Light
 • Jay Hill: Solutions Architect, Wells Fargo
 • Ryan McKinley: Lucene/Solr committer, PMC member
 • Chris Hostetter (Advisor): Lucene/Solr committer, PMC member;
      Member Apache Software Foundation
Lucid Imagination Business Model

[Diagram: business model; legible labels include Lucene, Apache.org, Free Download and Upgrade]
Thank you




              http://www.lucidimagination.com

Apache Solr Search Course Drupal 7 Acquia
 
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, LucidworksState of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-Webinar
 
High Performance Solr
High Performance SolrHigh Performance Solr
High Performance Solr
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 

Similar to Solr: Search at the Speed of Light

The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill lucenerevolution
 
The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill lucenerevolution
 
Games for the Masses (Jax)
Games for the Masses (Jax)Games for the Masses (Jax)
Games for the Masses (Jax)Wooga
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Lucidworks (Archived)
 
Building specialized industry apps using solr - By Rahul Agarwalla
Building specialized industry apps using solr - By Rahul Agarwalla   Building specialized industry apps using solr - By Rahul Agarwalla
Building specialized industry apps using solr - By Rahul Agarwalla lucenerevolution
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Lucidworks (Archived)
 
Tricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
Tricks And Tradeoffs Of Deploying My Sql Clusters In The CloudTricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
Tricks And Tradeoffs Of Deploying My Sql Clusters In The CloudMySQLConference
 
Performance evaluation of cloudera impala 0.6 beta with comparison to Hive
Performance evaluation of cloudera impala 0.6 beta with comparison to HivePerformance evaluation of cloudera impala 0.6 beta with comparison to Hive
Performance evaluation of cloudera impala 0.6 beta with comparison to HiveYukinori Suda
 
HBase and Hadoop at Adobe
HBase and Hadoop at AdobeHBase and Hadoop at Adobe
HBase and Hadoop at AdobeCosmin Lehene
 
Oracle+golden+gate+introduction
Oracle+golden+gate+introductionOracle+golden+gate+introduction
Oracle+golden+gate+introductionxiakaicd
 
Oslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
Oslo Enterprise MeetUp May 12th 2010 - Jan HøydahlOslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
Oslo Enterprise MeetUp May 12th 2010 - Jan HøydahlCominvent AS
 
Mule ESB - Integration Simplified
Mule ESB - Integration SimplifiedMule ESB - Integration Simplified
Mule ESB - Integration SimplifiedRich Software
 
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...jaxLondonConference
 
Ontology and semantic web (2016)
Ontology and semantic web (2016)Ontology and semantic web (2016)
Ontology and semantic web (2016)Craig Trim
 
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...cwensel
 
MarkLogic Server / NoSQL at ApacheCon
MarkLogic Server / NoSQL at ApacheConMarkLogic Server / NoSQL at ApacheCon
MarkLogic Server / NoSQL at ApacheConhunterhacker
 
Building Scale Free Applications with Hadoop and Cascading
Building Scale Free Applications with Hadoop and CascadingBuilding Scale Free Applications with Hadoop and Cascading
Building Scale Free Applications with Hadoop and Cascadingcwensel
 

Similar to Solr: Search at the Speed of Light (20)

The Seven Deadly Sins of Solr
The Seven Deadly Sins of SolrThe Seven Deadly Sins of Solr
The Seven Deadly Sins of Solr
 
The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill
 
The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill
 
Games for the Masses (Jax)
Games for the Masses (Jax)Games for the Masses (Jax)
Games for the Masses (Jax)
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...
 
Building specialized industry apps using solr - By Rahul Agarwalla
Building specialized industry apps using solr - By Rahul Agarwalla   Building specialized industry apps using solr - By Rahul Agarwalla
Building specialized industry apps using solr - By Rahul Agarwalla
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...
 
Tricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
Tricks And Tradeoffs Of Deploying My Sql Clusters In The CloudTricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
Tricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
 
Performance evaluation of cloudera impala 0.6 beta with comparison to Hive
Performance evaluation of cloudera impala 0.6 beta with comparison to HivePerformance evaluation of cloudera impala 0.6 beta with comparison to Hive
Performance evaluation of cloudera impala 0.6 beta with comparison to Hive
 
HBase and Hadoop at Adobe
HBase and Hadoop at AdobeHBase and Hadoop at Adobe
HBase and Hadoop at Adobe
 
Oracle+golden+gate+introduction
Oracle+golden+gate+introductionOracle+golden+gate+introduction
Oracle+golden+gate+introduction
 
Oslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
Oslo Enterprise MeetUp May 12th 2010 - Jan HøydahlOslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
Oslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
 
Mule ESB - Integration Simplified
Mule ESB - Integration SimplifiedMule ESB - Integration Simplified
Mule ESB - Integration Simplified
 
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
 
Ontology and semantic web (2016)
Ontology and semantic web (2016)Ontology and semantic web (2016)
Ontology and semantic web (2016)
 
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
 
Solr @ eBay Kleinanzeigen
Solr @ eBay KleinanzeigenSolr @ eBay Kleinanzeigen
Solr @ eBay Kleinanzeigen
 
MarkLogic Server / NoSQL at ApacheCon
MarkLogic Server / NoSQL at ApacheConMarkLogic Server / NoSQL at ApacheCon
MarkLogic Server / NoSQL at ApacheCon
 
Building Scale Free Applications with Hadoop and Cascading
Building Scale Free Applications with Hadoop and CascadingBuilding Scale Free Applications with Hadoop and Cascading
Building Scale Free Applications with Hadoop and Cascading
 
Pig programming is fun
Pig programming is funPig programming is fun
Pig programming is fun
 

More from Erik Hatcher

Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Erik Hatcher
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksErik Hatcher
 
Solr Powered Libraries
Solr Powered LibrariesSolr Powered Libraries
Solr Powered LibrariesErik Hatcher
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query ParsingErik Hatcher
 
"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - ChicagoErik Hatcher
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and TricksErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0Erik Hatcher
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development TutorialErik Hatcher
 

More from Erik Hatcher (20)

Ted Talk
Ted TalkTed Talk
Ted Talk
 
Solr Payloads
Solr PayloadsSolr Payloads
Solr Payloads
 
it's just search
it's just searchit's just search
it's just search
 
Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis Tricks
 
Solr Powered Libraries
Solr Powered LibrariesSolr Powered Libraries
Solr Powered Libraries
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query Parsing
 
"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and Tricks
 
Solr 4
Solr 4Solr 4
Solr 4
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Solr Flair
Solr FlairSolr Flair
Solr Flair
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development Tutorial
 

Recently uploaded

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Recently uploaded (20)

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Solr: Search at the Speed of Light

  • 1. Solr: Search at the Speed of Light. JavaZone 2009, September 10, Oslo. Erik Hatcher, Lucid Imagination, erik.hatcher@lucidimagination.com
  • 2. Solr History • Created by Yonik Seeley for CNET • Contributed to Apache in January 2006 • December 2006: Version 1.1 released • June 2007: Version 1.2 released • September 2008: Version 1.3 released • ~September 2009: Version 1.4 • http://lucene.apache.org/solr © 2008-2009 Lucid Imagination, Inc.
  • 3. Solr: Big Picture [Diagram: Data / DB → Documents → Solr → Search Results]
  • 4. Features • Lucene power exposed over HTTP • Scalability: caching, replication, distributed search • Faceting • And more: spell checking, highlighting, clustering, rich document and DB indexing, "more like this"
  • 5. Lucene • Fast, scalable search library • Lucene index structure • Index contains documents • Documents have fields • Indexed fields have terms
  • 6. Inverted Index • Commonly used search engine data structure • Efficient lookup of terms across a large number of documents • Usually stores positional information to enable phrase/proximity queries • Figure from "Taming Text" by Grant Ingersoll and Tom Morton
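The inverted index described on this slide can be sketched in plain Java. This is a toy illustration only, not Lucene's actual index format: the class name `InvertedIndexSketch`, its methods, and the naive lowercase-and-split tokenization are all assumptions made for the example. It maps each term to the documents containing it and to the positions of the term in each document, which is what makes a two-term phrase check possible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy index: term -> (document id -> positions of the term).
class InvertedIndexSketch {
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    // Tokenize naively (lowercase, split on non-word characters) and record positions.
    void add(int docId, String text) {
        int position = 0;
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) continue;
            postings.computeIfAbsent(token, t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(position++);
        }
    }

    // Look up the postings for one term; empty map if the term is unknown.
    Map<Integer, List<Integer>> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(), new TreeMap<>());
    }

    // Two-term phrase query: documents where `second` directly follows `first`.
    List<Integer> phrase(String first, String second) {
        List<Integer> hits = new ArrayList<>();
        Map<Integer, List<Integer>> followers = lookup(second);
        for (Map.Entry<Integer, List<Integer>> entry : lookup(first).entrySet()) {
            List<Integer> next = followers.get(entry.getKey());
            if (next == null) continue;
            for (int pos : entry.getValue()) {
                if (next.contains(pos + 1)) {
                    hits.add(entry.getKey());
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "The quick brown fox");
        index.add(2, "The brown quick dog");
        System.out.println(index.lookup("quick"));
        System.out.println(index.phrase("quick", "brown"));
    }
}
```

Storing positions costs extra space, but without them the phrase check above would have to re-read the original documents.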
  • 7. Analysis Process
  • 8. Analyzing the analyzer • Example phrase: The quick brown fox jumps over the lazy dog.
  • 9. WhitespaceAnalyzer • Simplest built-in analyzer • Input: The quick brown fox jumps over the lazy dog. • Tokens: [The] [quick] [brown] [fox] [jumps] [over] [the] [lazy] [dog.]
  • 10. SimpleAnalyzer • Lowercases, splits at non-letter boundaries • Input: The quick brown fox jumps over the lazy dog. • Tokens: [the] [quick] [brown] [fox] [jumps] [over] [the] [lazy] [dog]
  • 11. StopAnalyzer • Lowercases and removes stop words • Input: The quick brown fox jumps over the lazy dog. • Tokens: [quick] [brown] [fox] [jumps] [over] [lazy] [dog]
  • 12. SnowballAnalyzer • Stemming algorithm • Input: The quick brown fox jumps over the lazy dog. • Tokens: [the] [quick] [brown] [fox] [jump] [over] [the] [lazi] [dog]
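The first three analyzers above can be approximated in a few lines of plain Java to see where the token differences come from. This sketch does not use Lucene's analyzer classes: `AnalyzerSketch`, its method names, and the tiny stop-word list are assumptions for illustration, and stemming (the Snowball step) is left out because it needs a real stemming algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for the whitespace, simple, and stop analyzers.
class AnalyzerSketch {
    // Small illustrative stop list; Lucene's own list differs.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "of", "the", "to"));

    // Split on whitespace only; case and punctuation are preserved.
    static List<String> whitespace(String text) {
        return nonEmpty(text.trim().split("\\s+"));
    }

    // Lowercase, then split at every run of non-letter characters.
    static List<String> simple(String text) {
        return nonEmpty(text.toLowerCase().split("[^\\p{L}]+"));
    }

    // Same as simple(), with stop words dropped.
    static List<String> stop(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : simple(text)) {
            if (!STOP_WORDS.contains(token)) kept.add(token);
        }
        return kept;
    }

    private static List<String> nonEmpty(String[] parts) {
        List<String> out = new ArrayList<>();
        for (String part : parts) {
            if (!part.isEmpty()) out.add(part);
        }
        return out;
    }

    public static void main(String[] args) {
        String phrase = "The quick brown fox jumps over the lazy dog.";
        System.out.println(whitespace(phrase));
        System.out.println(simple(phrase));
        System.out.println(stop(phrase));
    }
}
```

The same analysis has to run at index time and at query time, otherwise a query for "Dog" would never match the indexed token "dog".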
  • 13. What's in a token?
  • 14. Relevance • Term frequency (TF): number of times a term appears in a document • Inverse document frequency (IDF): one over the number of documents in the index that contain the term (1/df) • Field length normalization: controls the effect that field length, in number of terms, has on the score • Boost factors: terms, fields, or documents
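The TF and IDF factors from this slide can be worked through on a tiny corpus. The sketch below multiplies the raw term frequency by 1/df exactly as the slide states it; Lucene's real similarity additionally dampens and normalizes these factors, which is not shown here. The class name `TfIdfSketch` and the sample documents are made up for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of score = tf * (1 / df) for a single term.
class TfIdfSketch {
    // TF: how many times the term occurs in one document.
    static int termFrequency(String term, List<String> docTerms) {
        int count = 0;
        for (String t : docTerms) {
            if (t.equals(term)) count++;
        }
        return count;
    }

    // DF: how many documents contain the term at least once.
    static int documentFrequency(String term, List<List<String>> docs) {
        int df = 0;
        for (List<String> doc : docs) {
            if (doc.contains(term)) df++;
        }
        return df;
    }

    // Simplified score: frequent in this document, rare across the index.
    static double score(String term, List<String> doc, List<List<String>> docs) {
        int df = documentFrequency(term, docs);
        if (df == 0) return 0.0;
        return termFrequency(term, doc) * (1.0 / df);
    }

    public static void main(String[] args) {
        List<String> d0 = Arrays.asList("solr", "search", "solr");
        List<String> d1 = Arrays.asList("lucene", "search");
        List<List<String>> docs = Arrays.asList(d0, d1);
        System.out.println(score("solr", d0, docs));
        System.out.println(score("search", d0, docs));
    }
}
```

In this corpus "solr" outscores "search" for the first document because it occurs more often there and in fewer documents overall.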
  • 15. Lucene Scoring [Figure: vectors d1 and q1 separated by angle Θ]
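The surviving labels of this figure (d1, q1, Θ) suggest the vector space view of scoring, in which a document and a query are term-weight vectors and the score relates to the angle between them. Assuming that reading, the cosine of the angle can be computed as below; `CosineSketch` and the two-dimensional sample vectors are illustrative only and are not Lucene's scoring code.

```java
// Hypothetical sketch: cosine of the angle between a document and a query vector.
class CosineSketch {
    static double cosine(double[] d, double[] q) {
        if (d.length != q.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double dot = 0.0, dNormSq = 0.0, qNormSq = 0.0;
        for (int i = 0; i < d.length; i++) {
            dot += d[i] * q[i];
            dNormSq += d[i] * d[i];
            qNormSq += q[i] * q[i];
        }
        // A zero vector has no direction; treat it as no match.
        if (dNormSq == 0.0 || qNormSq == 0.0) return 0.0;
        return dot / (Math.sqrt(dNormSq) * Math.sqrt(qNormSq));
    }

    public static void main(String[] args) {
        double[] d1 = {1.0, 2.0}; // weights for two terms in a document
        double[] q1 = {1.0, 0.0}; // query mentions only the first term
        System.out.println(cosine(d1, q1));
    }
}
```

A smaller angle (cosine closer to 1) means the document's term weights point in nearly the same direction as the query's.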
  • 16. Solr APIs • HTTP GET/POST (curl or any other HTTP client) • JSON • SolrJ (embedded or HTTP) • solr-ruby • Python, PHP, solrsharp, XSLT
  • 17. Solr in Production [Diagram labels: Incoming Search Requests, Load Balancer, Solr, Shard Request, Load Balancer (per shard), Shard Master 1..n, Replicant shards]
  • 18. Getting Started: It's This Easy • 1. Start Solr: java -jar start.jar • 2. Index your data: java -jar post.jar *.xml • 3. Search: http://localhost:8983/solr
  • 19. Configuration • schema.xml: field types and fields • solrconfig.xml: request handler mappings; cache settings (filter, query, document); warming listeners; HTTP cache settings; Lucene index parameters; plugins (spell checking, highlighting)
  • 20. Solr add/update XML
      <add><doc>
        <field name="id">MA147LL/A</field>
        <field name="name">Apple 60 GB iPod with Video Playback Black</field>
        <field name="manu">Apple Computer Inc.</field>
        <field name="cat">electronics</field>
        <field name="cat">music</field>
        <field name="features">iTunes, Podcasts, Audiobooks</field>
        <field name="features">Stores up to 15,000 songs, 25,000 photos, or 150 hours of video</field>
        <field name="features">2.5-inch, 320x240 color TFT LCD display with LED backlight</field>
        <field name="features">Up to 20 hours of battery life</field>
        <field name="features">Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless, H.264 video</field>
        <field name="features">Notes, Calendar, Phone book, Hold button, Date display, Photo wallet, Built-in games, JPEG photo playback, Upgradeable firmware, USB 2.0 compatibility, Playback speed control, Rechargeable capability, Battery level indication</field>
        <field name="includes">earbud headphones, USB cable</field>
        <field name="weight">5.5</field>
        <field name="price">399.00</field>
        <field name="popularity">10</field>
        <field name="inStock">true</field>
      </doc></add>
  • 21. Indexing Solr XML • Via curl: curl 'http://localhost:8983/solr/update?commit=true' --data-binary @ipod_video.xml -H 'Content-type:text/xml; charset=utf-8' • Via Solr's Java-based post tool: java -jar post.jar ipod_video.xml
  • 22. Indexing CSV • curl 'http://localhost:8983/solr/update/csv?commit=true' --data-binary @books.csv -H 'Content-type:text/plain; charset=utf-8'
  • 23. Content Streams • Allow the Solr server to fetch local or remote data itself; remote streaming must be enabled in solrconfig.xml • http://localhost:8983/solr/update?stream.file=<local Solr path to exampledocs>/ipod_video.xml • &stream.url=<url to content> • Security warning: this lets Solr fetch arbitrary server-side files or network URL content
  • 24. Indexing Rich Documents • curl 'http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true&extractOnly=true&wt=ruby&indent=on' -F "myfile=@tutorial.html"
  • 25. Indexing with SolrJ
      import java.net.URL;
      import org.apache.solr.client.solrj.SolrServer;
      import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
      import org.apache.solr.common.SolrInputDocument;

      // Connect to a Solr instance over HTTP
      SolrServer solr = new CommonsHttpSolrServer(new URL("http://localhost:8983/solr"));
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "JAVAZONE_09");
      doc.addField("title", "JavaZone 2009 SolrJ Example");
      solr.add(doc);
      solr.commit();   // after a batch, not per document
      solr.optimize(); // periodically, when needed
  • 26. Indexing with Ruby
      solr = Connection.new('http://localhost:8983/solr', :autocommit => :on)
      solr.add(:id => 123, :title => 'Solr in Action')
      solr.optimize # periodically, as needed
  • 27. Data Import Handler • Indexes relational database, XML data sources, e-mail, and more • Supports full and incremental/delta indexing • Extensible with custom data sources, transformers, etc. • http://wiki.apache.org/solr/DataImportHandler
  • 29. Example Search Request • http://localhost:8983/solr/select?q=query • &start=50 • &rows=25 • &fq=filter+query • &facet=on&facet.field=category
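The request on this slide is an ordinary URL with query parameters, so it can be assembled with the JDK alone. The sketch below is one way to do that while letting `URLEncoder` handle the escaping (the space in the filter query becomes `+`, as on the slide). The helper class `SearchUrlSketch` is hypothetical; the parameter names and values are the ones shown above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assembles a Solr select URL from parameters.
class SearchUrlSketch {
    static String build(String base, Map<String, String> params) {
        StringBuilder url = new StringBuilder(base);
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(encode(param.getKey()))
               .append('=')
               .append(encode(param.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "query");
        params.put("start", "50");          // offset of the first result
        params.put("rows", "25");           // page size
        params.put("fq", "filter query");   // filter query
        params.put("facet", "on");
        params.put("facet.field", "category");
        System.out.println(build("http://localhost:8983/solr/select", params));
    }
}
```

With `start=50` and `rows=25` the request asks for the third page of 25 results.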
  • 30. Debug Query • &debugQuery=true is your friend • Includes parsed query, explanations, and search component timings in response
  • 31. Query Parser • Controlled by defType parameter • &defType=lucene (actually a Solr extension of Lucene’s QueryParser) • &defType=dismax • Local {!..} override syntax
  • 32. Solr Query Parser • http://lucene.apache.org/java/2_4_0/queryparsersyntax.html + Solr extensions • Kitchen sink parser, includes advanced, user-unfriendly syntax • Syntax errors throw parse exceptions back to client • Example: title:ipod* AND price:[0 TO 100]
  • 33. Dismax Query Parser • Simplified syntax: loose text “quote phrases” -prohibited +required • Spreads query terms across query fields (qf) with dynamic boosting per field, implicit phrase construction (pf), boosting function (bf), boosting query (bq), and minimum match (mm)
  • 34. Searching with SolrJ
      SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
      SolrQuery params = new SolrQuery("author:John");
      params.setFields("*,score");
      params.setRows(3);
      QueryResponse response = server.query(params);
      for (SolrDocument document : response.getResults()) {
        System.out.println("Doc: " + document);
      }
  • 35. Searching with Ruby
      conn = Connection.new('http://localhost:8983/solr')
      conn.query('my query') do |hit|
        puts hit.inspect
      end
  • 36. delete, update, etc. • Delete: • <delete><id>05991</id></delete> • <delete><query>category:Unused</query></delete> • java -Ddata=args -jar post.jar "<delete><query>*:*</query></delete>" • Update: simply <add> a doc with the same unique key • Commit: <commit/> • Optimize: <optimize/>
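The update messages above are plain XML, so a client can build them as strings and post them to the update handler. The following is a minimal sketch under that assumption; `UpdateXmlSketch` and its methods are hypothetical helper names, not SolrJ API, and the field names and values in `main` are made up for illustration. The one thing the sketch adds is XML escaping, which matters as soon as an ID or query contains `&` or `<`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ: builds Solr XML update messages as strings.
final class UpdateXmlSketch {

    // Escapes the characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    static String deleteById(String id) {
        return "<delete><id>" + escapeXml(id) + "</id></delete>";
    }

    static String deleteByQuery(String query) {
        return "<delete><query>" + escapeXml(query) + "</query></delete>";
    }

    // An <add> whose unique key matches an existing document replaces that document.
    static String addDocument(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(f.getKey())).append("\">")
               .append(escapeXml(f.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        System.out.println(deleteById("05991"));
        System.out.println(deleteByQuery("category:Unused"));

        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "05991");               // illustrative unique key
        doc.put("title", "Replaced title");   // illustrative field value
        System.out.println(addDocument(doc));
        System.out.println("<commit/>");      // changes become searchable after a commit
    }
}
```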
  • 37. Faceting • Counts per subset within results • Facet on: field terms, queries, date ranges • &facet=on&facet.field=cat&facet.query=price:[0 TO 100] • http://wiki.apache.org/solr/SimpleFacetParameters
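To make "counts per subset within results" concrete, here is a small in-memory sketch of what a field facet reports: for the documents that matched the query, how many carry each value of the chosen field. This is only an illustration of the idea, not how Solr computes facets internally; `FacetCountSketch`, the sample documents, and their `cat` values are all made up.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual illustration only: counts field values across a list of matching documents.
final class FacetCountSketch {

    // Returns value -> number of matching documents that have that value in the field.
    static Map<String, Integer> facetCounts(List<Map<String, String>> results, String field) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> doc : results) {
            String value = doc.get(field);
            if (value != null) {                      // documents without the field are skipped
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for the documents matching a query such as q=ipod (sample data).
        List<Map<String, String>> results = List.of(
                Map.of("id", "1", "cat", "electronics"),
                Map.of("id", "2", "cat", "music"),
                Map.of("id", "3", "cat", "electronics"),
                Map.of("id", "4"));                   // no cat field, not counted
        System.out.println(facetCounts(results, "cat"));
    }
}
```

A `facet.query` works the same way in spirit, except the subset is defined by a query (for example a price range) rather than by a field value.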
  • 38. Spell checking • Not enabled by default; see the example config to wire it in • http://localhost:8983/solr/spell?q=epod&spellcheck=on&spellcheck.build=true • File or index-based dictionaries • Supports pluggable distance algorithms: Levenshtein and Jaro-Winkler • http://wiki.apache.org/solr/SpellCheckComponent
  • 39. Highlighting • http://localhost:8983/solr/select?q=ipod&hl=on&hl.fl=manu,name • http://wiki.apache.org/solr/HighlightingParameters
  • 40. More Like This • http://localhost:8983/solr/select?q=ipod&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score,name • http://wiki.apache.org/solr/MoreLikeThis
  • 41. Scaling: Query Throughput • Replication • slaves poll master for index updates • transfers index files from master to slave • configuration files can also be transferred • entirely Java/HTTP-based in Solr 1.4 (prior versions used rsync)
  • 42. Scaling: Collection Size • Distribution • index documents across shards • query a single server with the shards parameter • that server sends requests to each shard • and aggregates the results into a single response
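The aggregation step can be pictured as a merge of per-shard result lists. The sketch below is a simplified illustration under stated assumptions, not Solr's actual distributed search code: it assumes each shard has already returned its own hits with scores, and it simply keeps the `rows` best hits overall. `ShardMergeSketch` and `Hit` are hypothetical names, and the IDs and scores are sample data. In a real request the shards would be listed in the `shards` parameter, for example `&shards=host1:8983/solr,host2:8983/solr` with hypothetical host names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Conceptual illustration only: merges per-shard hits into one ranked list.
final class ShardMergeSketch {

    // One search hit as returned by a shard: document ID plus relevance score.
    static final class Hit {
        final String id;
        final double score;

        Hit(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    // Collects the hits from every shard, sorts by descending score, keeps the top rows.
    static List<Hit> mergeTopN(List<List<Hit>> perShard, int rows) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perShard) {
            all.addAll(hits);
        }
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(rows, all.size())));
    }

    public static void main(String[] args) {
        List<Hit> shard1 = Arrays.asList(new Hit("a1", 0.9), new Hit("a2", 0.2));
        List<Hit> shard2 = Arrays.asList(new Hit("b1", 0.5), new Hit("b2", 0.4));
        for (Hit h : mergeTopN(Arrays.asList(shard1, shard2), 3)) {
            System.out.println(h.id + " " + h.score);
        }
    }
}
```

Sorting the concatenated lists keeps the sketch short; a priority-queue merge would avoid sorting hits that cannot make the final page.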
  • 43. Solr-powered UI • Solritas (from "celeritas"): VelocityResponseWriter • easily templated output • SolrJS: jQuery-based widgets • see http://solrjs.solrstuff.org/ • Blacklight and Flare: RoR plugins
  • 44. Lucene in Action, 2nd Edition http://www.manning.com/lucene
  • 46. Who We Are • Lucid Imagination is the commercial entity of the Apache Lucene/Solr ecosystem • Lucid team includes the largest collection of Lucene/Solr committers, contributors and influencers • Our mission is to serve as The Center of Excellence for Lucene/Solr-based search solutions • Help our customers get the most out of Lucene/Solr - […] most widely used open source search software • Lucene, Solr and their logos are trademarks of the Apache Software Foundation
  • 47. Lucid Imagination Technical Team: Unique Combination of Enterprise Search and Lucene Expertise
      • Yonik Seeley: Creator of Solr; Lucene/Solr committer, PMC member
      • Marc Krellenstein: Co-founder, CTO/SVP, Northern Light; VP Search, CTO, Elsevier
      • Grant Ingersoll: Creator of 'Lucene Boot Camp'; Lucene/Solr committer, Chair, PMC
      • Brian Pinkerton: Developed WebCrawler, the web's first comprehensive search engine; Principal Architect at A9
      • Erik Hatcher: Co-author of 'Lucene in Action' book; Lucene/Solr committer, PMC member
      • Simon Rosenthal: Solutions architect, Northern Light
      • Mark Miller: Lucene/Solr committer, PMC member
      • Jay Hill: Solutions Architect, Wells Fargo
      • Sami Siren: Nutch/Tika committer, PMC member
      • Ryan McKinley: Lucene/Solr committer, PMC member
      • Andrzej Bialecki: Lucene/Nutch/Hadoop committer, PMC member
      • Chris Hostetter (Advisor): Lucene/Solr committer, PMC member
      • Doug Cutting (Advisor): Creator of Lucene, Nutch & Hadoop; Member, Apache Software Foundation
  • 48. Lucid Imagination Business Model (diagram labels: Free Download; Enterprise Version; Certified Distributions; Free; Value-add […]; Lucene; Upgrade; Apache.org; Commercial Support, Training, Consulting)
  • 49. Thank you http://www.lucidimagination.com