SlideShare a Scribd company logo
1 of 50
Download to read offline
Solr
   Search at the Speed of Light


          JavaZone 2009
           September 10
               Oslo
  Erik Hatcher, Lucid Imagination
erik.hatcher@lucidimagination.com




                                    1
Solr History

     • Created by Yonik Seeley for CNET
     • Contributed to Apache in January 2006
     • December 2006:Version 1.1 released
     • June 2007:Version 1.2 released
     • September 2008:Version 1.3 released
     • ~September 2009:Version 1.4
http://lucene.apache.org/solr
    © 2008-2009          Lucid Imagination, Inc.
                                                   2
Solr: Big Picture
                                   Data


                                                       DB


              Document
               Document
                 Documents




                                Solr




                               Search Results




© 2008-2009                  Lucid Imagination, Inc.
                                                            3
Features

 • Lucene power exposed over HTTP
 • Scalability: caching, replication, distributed
      search
 • Faceting
 • And more: spell checking, highlighting,
      clustering, rich document and DB indexing,
      "more like this"


© 2008-2009            Lucid Imagination, Inc.
                                                    4
Lucene

 • Fast, scalable search library
 • Lucene index structure
  • Index contains documents
    • documents have fields
      • indexed fields have terms

© 2008-2009        Lucid Imagination, Inc.
                                             5
Inverted Index

 • Commonly used search
      engine data structure
 • Efficient lookup of terms
      across large number of
      documents
 • Usually stores positional
      information to enable From "Taming Text" by Grant Ingersoll and Tom Morton
      phrase/proximity queries


© 2008-2009                     Lucid Imagination, Inc.
                                                                                   6
Analysis Process




© 2008-2009         Lucid Imagination, Inc.
                                              7
Analyzing the analyzer
                    Example phrase

      The quick brown fox jumps over the lazy dog.




© 2008-2009            Lucid Imagination, Inc.
                                                     8
WhitespaceAnalyzer
                Simplest built-in analyzer
      The quick brown fox jumps over the lazy dog.




  [The] [quick] [brown] [fox] [jumps] [over] [the]
                    [lazy] [dog.]

© 2008-2009             Lucid Imagination, Inc.
                                                     9
SimpleAnalyzer
          Lowercases, splits at non-letter boundaries
      the quick brown fox jumps over the lazy dog.




  [the] [quick] [brown] [fox] [jumps] [over] [the]
                    [lazy] [dog]

© 2008-2009               Lucid Imagination, Inc.
                                                        10
StopAnalyzer
              Lowercases and removes stop words


      The quick brown fox jumps over the lazy dog.




 [quick] [brown] [fox] [jumps] [over] [lazy] [dog]




© 2008-2009               Lucid Imagination, Inc.
                                                     11
SnowballAnalyzer
                   Stemming algorithm
      The quick brown fox jumps over the lazi dog.




   [the] [quick] [brown] [fox] [jump] [over] [the]
                     [lazi] [dog]

© 2008-2009            Lucid Imagination, Inc.
                                                     12
What's in a token?




© 2008-2009          Lucid Imagination, Inc.
                                               13
Relevance

 •    Term frequency (TF): number of times a term
      appears in a document

 •    Inverse document frequency (IDF): One over
      number of times term appears in the index (1/df)

 •    Field length normalization: control affect field
      length, in number of terms, has on score

 •    Boost factors: terms, fields, or documents



© 2008-2009               Lucid Imagination, Inc.
                                                         14
Lucene Scoring
                                  d1




                                                q1
                  Θ




© 2008-2009           Lucid Imagination, Inc.
                                                     15
Solr APIs

 • HTTP GET/POST (curl or any other HTTP
      client)
 • JSON
 • SolrJ (embedded or HTTP)
 • solr-ruby
 • python, PHP, solrsharp, XSLT

© 2008-2009         Lucid Imagination, Inc.
                                              16
Solr in Production
                                              Incoming Search
                                                  Requests




                                               Load Balancer




                                                  Solr
                                                 Solr Master
                                                  Solr Master


                              Shard Request                    Shard Request


                   Load Balancer                                          Load Balancer



                      Shard                                                    Shard
          Shard                                                  Shard
          Master                                 1..n            Master
                          Replicant             shards                            Replicant
                           Replicant                                               Replicant
                            Replicant                                               Replicant
                              Replicant                                               Replicant




© 2008-2009                                    Lucid Imagination, Inc.
                                                                                                  17
Getting Started:
                 It's This Easy
1.Start Solr

  java -jar start.jar
2.Index your data

  java -jar post.jar *.xml
3.Search

  http://localhost:8983/solr
  © 2008-2009         Lucid Imagination, Inc.
                                                18
Configuration
 •    schema.xml

     •    field types and fields

 •    solrconfig.xml

     •    request handler mappings

     •    cache settings: filter, query, document

     •    warming listeners

     •    HTTP cache settings

     •    Lucene index parameters

     •    plugins: spell checking, highlighting


© 2008-2009                      Lucid Imagination, Inc.
                                                           19
Solr add/update XML
<add><doc>
  <field name="id">MA147LL/A</field>
  <field name="name">Apple 60 GB iPod with Video Playback Black</field>
  <field name="manu">Apple Computer Inc.</field>
  <field name="cat">electronics</field>
  <field name="cat">music</field>
  <field name="features">iTunes, Podcasts, Audiobooks</field>
  <field name="features">Stores up to 15,000 songs, 25,000 photos, or 150 hours of
               video</field>
  <field name="features">2.5-inch, 320x240 color TFT LCD display
                         with LED backlight</field>
  <field name="features">Up to 20 hours of battery life</field>
  <field name="features">Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless,
                         H.264 video</field>
  <field name="features">Notes, Calendar, Phone book, Hold button, Date display,
      Photo wallet, Built-in games, JPEG photo playback, Upgradeable firmware,
      USB 2.0 compatibility, Playback speed control, Rechargeable capability,
      Battery level indication</field>
  <field name="includes">earbud headphones, USB cable</field>
  <field name="weight">5.5</field>
  <field name="price">399.00</field>
  <field name="popularity">10</field>
  <field name="inStock">true</field>
</doc></add>


     © 2008-2009                     Lucid Imagination, Inc.
                                                                                     20
Indexing Solr XML
 • Via curl:'http://localhost:8983/
   curl
      solr/update?commit=true' --
      data-binary @ipod_video.xml -
      H 'Content-type:text/xml;
      charset=utf-8'

 • Via Solr's Java-based post tool:
      java -jar post.jar ipod_video.xml



© 2008-2009            Lucid Imagination, Inc.
                                                 21
Indexing CSV


curl 'http://localhost:8983/solr/update/
csv?commit=true' --data-binary @books.csv -
H 'Content-type:text/plain; charset=utf-8'




   © 2008-2009       Lucid Imagination, Inc.
                                               22
Content Streams

 •    Allows Solr server to fetch local or remote data
      itself. Must enable remote streaming in
      solrconfig.xml

 •    http://localhost:8983/solr/update?stream.file=<local
      Solr path to exampledocs>/ipod_video.xml

 •    &stream.url=<url to content>

 •    Security warning: allows Solr to fetch arbitrary
      server-side file or network URL content



© 2008-2009                Lucid Imagination, Inc.
                                                            23
Indexing Rich Documents


curl 'http://localhost:8983/solr/update/
extract?
literal.id=doc1&commit=true&extractOnly=true
&wt=ruby&indent=on' -F
"myfile=@tutorial.html"




    © 2008-2009     Lucid Imagination, Inc.
                                               24
Indexing with SolrJ

SolrServer solr =
    new CommonsHttpSolrServer(new URL("http://localhost:8983/solr"));

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "JAVAZONE_09");
doc.addField("title", "JavaZone 2009 SolrJ Example");
solr.add(doc);
solr.commit();     // after a batch, not per document
solr.optimize();   // periodically, when needed




    © 2008-2009                Lucid Imagination, Inc.
                                                                        25
Indexing with Ruby

solr = Connection.new(
  'http://localhost:8983/solr',
  :autocommit => :on)

solr.add(:id => 123,
         :title => 'Solr in Action')

solr.optimize       # periodically, as needed




  © 2008-2009           Lucid Imagination, Inc.
                                                  26
Data Import Handler


• Indexes relational database, XML data sources,
   e-mail, and more
• Supports full and incremental/delta indexing
• Extensible with custom data sources,
   transformers, etc
• http://wiki.apache.org/solr/DataImportHandler
 © 2008-2009           Lucid Imagination, Inc.
                                                   27
DB Indexing



http://localhost:8983/solr/db/dataimport?
command=full-import




  © 2008-2009       Lucid Imagination, Inc.
                                              28
Example Search Request

 • http://localhost:8983/solr/select?q=query
  • &start=50
  • &rows=25
  • &fq=filter+query
  • &facet=on&facet.field=category

© 2008-2009         Lucid Imagination, Inc.
                                               29
Debug Query


 • &debugQuery=true is your friend
 • Includes parsed query, explanations, and
      search component timings in response




© 2008-2009           Lucid Imagination, Inc.
                                                30
Query Parser

 • Controlled by defType parameter
  • &defType=lucene (actually a Solr
          extension of Lucene’s QueryParser)
     • &defType=dismax
 • Local {!..} override syntax

© 2008-2009             Lucid Imagination, Inc.
                                                  31
Solr Query Parser

 • http://lucene.apache.org/java/2_4_0/
      queryparsersyntax.html + Solr extensions
 • Kitchen sink parser, includes advanced user-
      unfriendly syntax
 • Syntax errors throw parse exceptions back
      to client
 • Example: title:ipod* AND price:[0 TO 100]
© 2008-2009               Lucid Imagination, Inc.
                                                    32
Dismax Query Parser

 • Simplified syntax:
      loose text “quote phrases” -prohibited
      +required
 • Spreads query terms across query fields
      (qf) with dynamic boosting per field, implicit
      phrase construction (pf), boosting function
      (bf), boosting query (bq), and minimum
      match (mm)


© 2008-2009            Lucid Imagination, Inc.
                                                      33
Searching with SolrJ


SolrServer server = new CommonsHttpSolrServer("http://
  localhost:8983/solr");
SolrQuery params = new SolrQuery("author:John");
params.setFields("*,score");
params.setRows(3);
QueryResponse response = server.query(params);
for (SolrDocument document : response.getResults()) {
      System.out.println("Doc: " + document);
}




   © 2008-2009            Lucid Imagination, Inc.
                                                         34
Searching with Ruby


conn = Connection.new(
    'http://localhost:8983/solr')

conn.query('my query') do |hit|
  puts hit.inspect
end




© 2008-2009           Lucid Imagination, Inc.
                                                35
delete, update, etc
 •    Delete:
     • <delete><id>05991</id></delete>
     •    <delete>
             <query>category:Unused</query>
          </delete>

     •    java -Ddata=args -jar post.jar
          "<delete><query>*:*</query></delete>"

 •    Update: simply <add> doc with same unique key

 •    Commit: <commit/>

 •    Optimize: <optimize/>
© 2008-2009              Lucid Imagination, Inc.
                                                      36
Faceting


• Counts per subset within results
• Facet on: field terms, queries, date
    ranges
• &facet=on
    &facet.field=cat
    &facet.query=price:[0 TO 100]
• http://wiki.apache.org/solr/
    SimpleFacetParameters
© 2008-2009          Lucid Imagination, Inc.
                                               37
Spell checking


•    Not enabled by default, see example config to wire it in

•    http://localhost:8983/solr/spell?
     q=epod&spellcheck=on&spellcheck.build=true

•    File or index-based dictionaries

•    Supports pluggable distance algorithms: Levenstein and
     JaroWinkler

•    http://wiki.apache.org/solr/SpellCheckComponent


© 2008-2009                Lucid Imagination, Inc.
                                                               38
Highlighting


 • http://localhost:8983/solr/select?
      q=ipod&hl=on&hl.fl=manu,name
 • http://wiki.apache.org/solr/
      HighlightingParameters




© 2008-2009           Lucid Imagination, Inc.
                                                39
More Like This


 • http://localhost:8983/solr/select?
      q=ipod&mlt=true&mlt.fl=manu,cat&mlt.min
      df=1&mlt.mintf=1&fl=id,score,name
 • http://wiki.apache.org/solr/MoreLikeThis


© 2008-2009          Lucid Imagination, Inc.
                                               40
Scaling: Query Throughput

 • Replication
  • slaves poll master for index updates
  • transfers index files from master to slave
  • configuration files can also be transferred
  • entirely Java/HTTP-based in Solr 1.4
          (prior versions used rsync)



© 2008-2009              Lucid Imagination, Inc.
                                                   41
Scaling: Collection Size

 • Distribution
  • Index documents across shards
  • query single server with shards
          parameter
         • sends requests to each shard
         • aggregates result to a single response

© 2008-2009             Lucid Imagination, Inc.
                                                    42
Solr-powered UI

 • Solritas (from "celeritas"):
      VelocityResponseWriter
     • easily templated output
 • SolrJS: jQuery-based widgets
  • see http://solrjs.solrstuff.org/
 • Blacklight and Flare: RoR plugins

© 2008-2009           Lucid Imagination, Inc.
                                                43
Lucene in Action, 2nd Edition




              http://www.manning.com/lucene
© 2008-2009               Lucid Imagination, Inc.
                                                    44
Search at Lucid
http://search.lucidimagination.com/?q=javazone




© 2008-2009         Lucid Imagination, Inc.
                                                 45
/")$/#$0(#
            !"#$%&'()*$+),$-+&$0&,12&#-((23#$)4&2+,$,5&-6 78)#12&
            !"#2+29:-43&2#-050,2(
            !"#$%&,2)(&$+#4"%20&,12&4)3*20,&#-442#,$-+&-6&
            !"#2+29:-43&#-(($,,230.&#-+,3$;",-30&)+%&$+64"2+#230&
            <"3&($00$-+&$0&,-&023=2&)0&!"#$%#&'#($)*$+,-#..#&-#$6-3&
            !"#2+29:-43>;)02%&02)3#1&0-4",$-+0
                 ?248&-"3&#"0,-(230&*2,&,12&(-0,&-",&-6&!"#2+29:-43&> !"#$%&'(
                 (-0,&@$%245&"02%&-82+&0-"3#2&02)3#1&0-6,@)32&&&




  A&BCCD>BCCE
   © 2008-2009                     !"#$%&'()*$+),$-+.&'+#/Inc.
                                   Lucid Imagination,            !"#$%$&'()*+',%-'./$0+'*)1)2',+$'.+,-$3,+42')5'./$'67,#/$'()5.8,+$'9)"%-,.0)%

                                                                                                                                                 46
!"#$%&'()*$+),$-+&./#0+$#)1&./)(
                          ! 2-+$3&4//1/56                                          ! <)8#&F8/11/+9,/$+6
                                     012),-1&-3&4-51&&
     Unique                          !"#2+264-51&#-(($,,21.&780&(2(921
                                                                                                 0-;3-"+%21.&0=G64H7.&<-1,:21+&!$*:,
 Combination of           ! 78)+,&'+*/89-116
                                                                                                 H7&42)1#:.&0=G.&I5J2K$21
Enterprise Search                    !"#$%&"'&(')*+,#-#'.&&%'!$/01                 ! @8$)+&G$+3/8,-+6
   and Lucene                        !"#2+264-51&#-(($,,21.&0:)$1.&780                           L2K25-@2%&M2901)N521.&,:2&N29OJ&3$1J,&
                          ! :8$3&;),#0/86                                                        #-(@12:2+J$K2&J2)1#:&2+*$+2&
    Expertise
                                     0-;$+%2&"'&(')*+,#-#'3-'4,%3&-1'5&&6                        71$+#$@)5&P1#:$,2#,&),&PF
                                     !"#2+264-51&#-(($,,21.&780&(2(921             ! 4$(-+&H-9/+,0)16
                          ! <)83&<$11/8                                                          4-5",$-+J&)1#:$,2#,.&<-1,:21+&!$*:,
                                     !"#2+264-51&#-(($,,21.&780&
                                     (2(921                                        ! I)5&;$116
                          ! 4)($&4$8/+                                                           4-5",$-+J&P1#:$,2#,.&M255J&Q)1*-
                                     <",#:6=$>)&#-(($,,21.&780&(2(921
                                                                                   ! H5)+&<#F$+1/56
                          ! =+%8>/?&@$1)1/#3$&
                                                                                                 !"#2+264-51&#-(($,,21.&&780&(2(921
                                     !"#2+26<",#:6?)%--@&#-(($,,21.&780&
                                     (2(921&
                                                                                   ! B08$9&;-9,/,,/86&C=%D$9-8E
                          ! A-"*&B",,$+*6&C=%D$9-8E
                                                                                                 !"#2+264-51&#-(($,,21.&&780&(2(921
                                     012),-1&-3&!"#2+2.&<",#:&A&?)%--@
                                                                                                 82(921&P@)#:2&4-3,N)12&Q-"+%),$-+


       B&CDDE;CDDF
           © 2008-2009                                   !"#$%&'()*$+),$-+.&'+#/
                                                         Lucid Imagination, Inc.
                                                                                                                                          47
!"#$%&'()*$+),$-+&."/$+0//&1-%02
  ;:00
<-=+2-)%
                                                                                  ()*+,-,./+"0+,/.1)
                       2+,*.3.+4"5./*,.67*.1)/
                             & 8,++"&

                        3)2"04)%%&567

     !"#0+0
                                                   89*:)%0
   >9)#?0@-:*




      2199+,:.;<""=7--1,*>" ?,;.).)@>" 21)/7<*.)@"


  !"#$$%&#$$'
         © 2008-2009                        A7:.4"B9;@.);*.1) 21)3.4+)*.;<   !"#$%$&'()*+',%-'./$0+'*)1)2',+$'.+,-$3,+42')5'./$'67,#/$'()5.8,+$'9)"%-,.0)%
                                                   Lucid Imagination, Inc.
                                                                                                                                                             48
Thank you




              http://www.lucidimagination.com
© 2008-2009                Lucid Imagination, Inc.
                                                                 49
© 2008-2009   Lucid Imagination, Inc.
                                        50

More Related Content

What's hot

Near RealTime search @Flipkart
Near RealTime search @FlipkartNear RealTime search @Flipkart
Near RealTime search @FlipkartUmesh Prasad
 
ElasticSearch at berlinbuzzwords 2010
ElasticSearch at berlinbuzzwords 2010ElasticSearch at berlinbuzzwords 2010
ElasticSearch at berlinbuzzwords 2010Elasticsearch
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1Maruf Hassan
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearchpmanvi
 
엘라스틱 서치 세미나
엘라스틱 서치 세미나엘라스틱 서치 세미나
엘라스틱 서치 세미나종현 김
 
Batching and Java EE (jdk.io)
Batching and Java EE (jdk.io)Batching and Java EE (jdk.io)
Batching and Java EE (jdk.io)Ryan Cuprak
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
2021.laravelconf.tw.slides1
2021.laravelconf.tw.slides12021.laravelconf.tw.slides1
2021.laravelconf.tw.slides1LiviaLiaoFontech
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataDataWorks Summit
 
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...Willy Lulciuc
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
What is in a Lucene index?
What is in a Lucene index?What is in a Lucene index?
What is in a Lucene index?lucenerevolution
 
Continuous Deployment: The Dirty Details
Continuous Deployment: The Dirty DetailsContinuous Deployment: The Dirty Details
Continuous Deployment: The Dirty DetailsMike Brittain
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.Jurriaan Persyn
 
Hive and Apache Tez: Benchmarked at Yahoo! Scale
Hive and Apache Tez: Benchmarked at Yahoo! ScaleHive and Apache Tez: Benchmarked at Yahoo! Scale
Hive and Apache Tez: Benchmarked at Yahoo! ScaleDataWorks Summit
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query ParsingErik Hatcher
 
Hive Anatomy
Hive AnatomyHive Anatomy
Hive Anatomynzhang
 
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...Rahul K Chauhan
 

What's hot (20)

Near RealTime search @Flipkart
Near RealTime search @FlipkartNear RealTime search @Flipkart
Near RealTime search @Flipkart
 
ElasticSearch at berlinbuzzwords 2010
ElasticSearch at berlinbuzzwords 2010ElasticSearch at berlinbuzzwords 2010
ElasticSearch at berlinbuzzwords 2010
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
 
엘라스틱 서치 세미나
엘라스틱 서치 세미나엘라스틱 서치 세미나
엘라스틱 서치 세미나
 
Batching and Java EE (jdk.io)
Batching and Java EE (jdk.io)Batching and Java EE (jdk.io)
Batching and Java EE (jdk.io)
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
2021.laravelconf.tw.slides1
2021.laravelconf.tw.slides12021.laravelconf.tw.slides1
2021.laravelconf.tw.slides1
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
 
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-bas...
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
What is in a Lucene index?
What is in a Lucene index?What is in a Lucene index?
What is in a Lucene index?
 
Continuous Deployment: The Dirty Details
Continuous Deployment: The Dirty DetailsContinuous Deployment: The Dirty Details
Continuous Deployment: The Dirty Details
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
 
Count min sketch
Count min sketchCount min sketch
Count min sketch
 
Hive and Apache Tez: Benchmarked at Yahoo! Scale
Hive and Apache Tez: Benchmarked at Yahoo! ScaleHive and Apache Tez: Benchmarked at Yahoo! Scale
Hive and Apache Tez: Benchmarked at Yahoo! Scale
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query Parsing
 
Hive Anatomy
Hive AnatomyHive Anatomy
Hive Anatomy
 
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...What I learnt: Elastic search & Kibana : introduction, installtion & configur...
What I learnt: Elastic search & Kibana : introduction, installtion & configur...
 

Viewers also liked

Using Apache Solr
Using Apache SolrUsing Apache Solr
Using Apache Solrpittaya
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineTrey Grainger
 
Enterprise Search in Practice: A Presentation of Survey Results and Areas for...
Enterprise Search in Practice: A Presentation of Survey Results and Areas for...Enterprise Search in Practice: A Presentation of Survey Results and Areas for...
Enterprise Search in Practice: A Presentation of Survey Results and Areas for...Findwise
 
Solr introduction
Solr introductionSolr introduction
Solr introductionLap Tran
 
New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache SolrEdureka!
 
Enterprise Search Using Apache Solr
Enterprise Search Using Apache SolrEnterprise Search Using Apache Solr
Enterprise Search Using Apache Solrsagar chaturvedi
 
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...Sematext Group, Inc.
 
Spark overview
Spark overviewSpark overview
Spark overviewLisa Hua
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Apache Solr Search Course Drupal 7 Acquia
Apache Solr Search Course Drupal 7 AcquiaApache Solr Search Course Drupal 7 Acquia
Apache Solr Search Course Drupal 7 AcquiaDropsolid
 
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, LucidworksState of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, LucidworksLucidworks
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-WebinarEdureka!
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache SolrChristos Manios
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development TutorialErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 

Viewers also liked (20)

Using Apache Solr
Using Apache SolrUsing Apache Solr
Using Apache Solr
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engine
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Enterprise Search in Practice: A Presentation of Survey Results and Areas for...
Enterprise Search in Practice: A Presentation of Survey Results and Areas for...Enterprise Search in Practice: A Presentation of Survey Results and Areas for...
Enterprise Search in Practice: A Presentation of Survey Results and Areas for...
 
Solr introduction
Solr introductionSolr introduction
Solr introduction
 
New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache Solr
 
Enterprise Search Using Apache Solr
Enterprise Search Using Apache SolrEnterprise Search Using Apache Solr
Enterprise Search Using Apache Solr
 
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
Large Scale Performance Monitoring for ElasticSearch, HBase, Solr, SenseiDB, ...
 
How Solr Search Works
How Solr Search WorksHow Solr Search Works
How Solr Search Works
 
Spark overview
Spark overviewSpark overview
Spark overview
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Apache Solr Search Course Drupal 7 Acquia
Apache Solr Search Course Drupal 7 AcquiaApache Solr Search Course Drupal 7 Acquia
Apache Solr Search Course Drupal 7 Acquia
 
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, LucidworksState of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-Webinar
 
High Performance Solr
High Performance SolrHigh Performance Solr
High Performance Solr
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development Tutorial
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 

Similar to Solr: Search at the Speed of Light

The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill lucenerevolution
 
The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill lucenerevolution
 
Games for the Masses (Jax)
Games for the Masses (Jax)Games for the Masses (Jax)
Games for the Masses (Jax)Wooga
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Lucidworks (Archived)
 
Building specialized industry apps using solr - By Rahul Agarwalla
Building specialized industry apps using solr - By Rahul Agarwalla   Building specialized industry apps using solr - By Rahul Agarwalla
Building specialized industry apps using solr - By Rahul Agarwalla lucenerevolution
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Lucidworks (Archived)
 
Tricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
Tricks And Tradeoffs Of Deploying My Sql Clusters In The CloudTricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
Tricks And Tradeoffs Of Deploying My Sql Clusters In The CloudMySQLConference
 
Performance evaluation of cloudera impala 0.6 beta with comparison to Hive
Performance evaluation of cloudera impala 0.6 beta with comparison to HivePerformance evaluation of cloudera impala 0.6 beta with comparison to Hive
Performance evaluation of cloudera impala 0.6 beta with comparison to HiveYukinori Suda
 
HBase and Hadoop at Adobe
HBase and Hadoop at AdobeHBase and Hadoop at Adobe
HBase and Hadoop at AdobeCosmin Lehene
 
Oracle+golden+gate+introduction
Oracle+golden+gate+introductionOracle+golden+gate+introduction
Oracle+golden+gate+introductionxiakaicd
 
Oslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
Oslo Enterprise MeetUp May 12th 2010 - Jan HøydahlOslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
Oslo Enterprise MeetUp May 12th 2010 - Jan HøydahlCominvent AS
 
Mule ESB - Integration Simplified
Mule ESB - Integration SimplifiedMule ESB - Integration Simplified
Mule ESB - Integration SimplifiedRich Software
 
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...jaxLondonConference
 
Ontology and semantic web (2016)
Ontology and semantic web (2016)Ontology and semantic web (2016)
Ontology and semantic web (2016)Craig Trim
 
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...cwensel
 
MarkLogic Server / NoSQL at ApacheCon
MarkLogic Server / NoSQL at ApacheConMarkLogic Server / NoSQL at ApacheCon
MarkLogic Server / NoSQL at ApacheConhunterhacker
 
Building Scale Free Applications with Hadoop and Cascading
Building Scale Free Applications with Hadoop and CascadingBuilding Scale Free Applications with Hadoop and Cascading
Building Scale Free Applications with Hadoop and Cascadingcwensel
 

Similar to Solr: Search at the Speed of Light (20)

The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill
 
The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill The Seven Deadly Sins of Solr - By Jay Hill
The Seven Deadly Sins of Solr - By Jay Hill
 
The Seven Deadly Sins of Solr
The Seven Deadly Sins of SolrThe Seven Deadly Sins of Solr
The Seven Deadly Sins of Solr
 
Games for the Masses (Jax)
Games for the Masses (Jax)Games for the Masses (Jax)
Games for the Masses (Jax)
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...
 
Building specialized industry apps using solr - By Rahul Agarwalla
Building specialized industry apps using solr - By Rahul Agarwalla   Building specialized industry apps using solr - By Rahul Agarwalla
Building specialized industry apps using solr - By Rahul Agarwalla
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...
 
Tricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
Tricks And Tradeoffs Of Deploying My Sql Clusters In The CloudTricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
Tricks And Tradeoffs Of Deploying My Sql Clusters In The Cloud
 
Performance evaluation of cloudera impala 0.6 beta with comparison to Hive
Performance evaluation of cloudera impala 0.6 beta with comparison to HivePerformance evaluation of cloudera impala 0.6 beta with comparison to Hive
Performance evaluation of cloudera impala 0.6 beta with comparison to Hive
 
HBase and Hadoop at Adobe
HBase and Hadoop at AdobeHBase and Hadoop at Adobe
HBase and Hadoop at Adobe
 
Oracle+golden+gate+introduction
Oracle+golden+gate+introductionOracle+golden+gate+introduction
Oracle+golden+gate+introduction
 
Oslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
Oslo Enterprise MeetUp May 12th 2010 - Jan HøydahlOslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
Oslo Enterprise MeetUp May 12th 2010 - Jan Høydahl
 
Mule ESB - Integration Simplified
Mule ESB - Integration SimplifiedMule ESB - Integration Simplified
Mule ESB - Integration Simplified
 
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
The Java Virtual Machine is Over - The Polyglot VM is here - Marcus Lagergren...
 
Ontology and semantic web (2016)
Ontology and semantic web (2016)Ontology and semantic web (2016)
Ontology and semantic web (2016)
 
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
SAM SIG: Hadoop architecture, MapReduce patterns, and best practices with Cas...
 
Solr @ eBay Kleinanzeigen
Solr @ eBay KleinanzeigenSolr @ eBay Kleinanzeigen
Solr @ eBay Kleinanzeigen
 
MarkLogic Server / NoSQL at ApacheCon
MarkLogic Server / NoSQL at ApacheConMarkLogic Server / NoSQL at ApacheCon
MarkLogic Server / NoSQL at ApacheCon
 
Building Scale Free Applications with Hadoop and Cascading
Building Scale Free Applications with Hadoop and CascadingBuilding Scale Free Applications with Hadoop and Cascading
Building Scale Free Applications with Hadoop and Cascading
 
Pig programming is fun
Pig programming is funPig programming is fun
Pig programming is fun
 

More from Erik Hatcher

Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Erik Hatcher
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksErik Hatcher
 
Solr Powered Libraries
Solr Powered LibrariesSolr Powered Libraries
Solr Powered LibrariesErik Hatcher
 
"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - ChicagoErik Hatcher
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and TricksErik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0Erik Hatcher
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes WorkshopErik Hatcher
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 

More from Erik Hatcher (20)

Ted Talk
Ted TalkTed Talk
Ted Talk
 
Solr Payloads
Solr PayloadsSolr Payloads
Solr Payloads
 
it's just search
it's just searchit's just search
it's just search
 
Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis Tricks
 
Solr Powered Libraries
Solr Powered LibrariesSolr Powered Libraries
Solr Powered Libraries
 
"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and Tricks
 
Solr 4
Solr 4Solr 4
Solr 4
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Solr Flair
Solr FlairSolr Flair
Solr Flair
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes Workshop
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 

Recently uploaded

Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 

Recently uploaded (20)

Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 

Solr: Search at the Speed of Light

  • 1. Solr Search at the Speed of Light JavaZone 2009 September 10 Oslo Erik Hatcher, Lucid Imagination erik.hatcher@lucidimagination.com 1
  • 2. Solr History • Created by Yonik Seeley for CNET • Contributed to Apache in January 2006 • December 2006:Version 1.1 released • June 2007:Version 1.2 released • September 2008:Version 1.3 released • ~September 2009:Version 1.4 http://lucene.apache.org/solr © 2008-2009 Lucid Imagination, Inc. 2
  • 3. Solr: Big Picture Data DB Document Document Documents Solr Search Results © 2008-2009 Lucid Imagination, Inc. 3
  • 4. Features • Lucene power exposed over HTTP • Scalability: caching, replication, distributed search • Faceting • And more: spell checking, highlighting, clustering, rich document and DB indexing, "more like this" © 2008-2009 Lucid Imagination, Inc. 4
  • 5. Lucene • Fast, scalable search library • Lucene index structure • Index contains documents • documents have fields • indexed fields have terms © 2008-2009 Lucid Imagination, Inc. 5
  • 6. Inverted Index • Commonly used search engine data structure • Efficient lookup of terms across large number of documents • Usually stores positional information to enable From "Taming Text" by Grant Ingersoll and Tom Morton phrase/proximity queries © 2008-2009 Lucid Imagination, Inc. 6
  • 7. Analysis Process © 2008-2009 Lucid Imagination, Inc. 7
  • 8. Analyzing the analyzer Example phrase The quick brown fox jumps over the lazy dog. © 2008-2009 Lucid Imagination, Inc. 8
  • 9. WhitespaceAnalyzer Simplest built-in analyzer The quick brown fox jumps over the lazy dog. [The] [quick] [brown] [fox] [jumps] [over] [the] [lazy] [dog.] © 2008-2009 Lucid Imagination, Inc. 9
  • 10. SimpleAnalyzer Lowercases, splits at non-letter boundaries the quick brown fox jumps over the lazy dog. [the] [quick] [brown] [fox] [jumps] [over] [the] [lazy] [dog] © 2008-2009 Lucid Imagination, Inc. 10
  • 11. StopAnalyzer Lowercases and removes stop words The quick brown fox jumps over the lazy dog. [quick] [brown] [fox] [jumps] [over] [lazy] [dog] © 2008-2009 Lucid Imagination, Inc. 11
  • 12. SnowballAnalyzer Stemming algorithm The quick brown fox jumps over the lazi dog. [the] [quick] [brown] [fox] [jump] [over] [the] [lazi] [dog] © 2008-2009 Lucid Imagination, Inc. 12
  • 13. What's in a token? © 2008-2009 Lucid Imagination, Inc. 13
  • 14. Relevance • Term frequency (TF): number of times a term appears in a document • Inverse document frequency (IDF): One over number of times term appears in the index (1/df) • Field length normalization: control affect field length, in number of terms, has on score • Boost factors: terms, fields, or documents © 2008-2009 Lucid Imagination, Inc. 14
  • 15. Lucene Scoring d1 q1 Θ © 2008-2009 Lucid Imagination, Inc. 15
  • 16. Solr APIs • HTTP GET/POST (curl or any other HTTP client) • JSON • SolrJ (embedded or HTTP) • solr-ruby • python, PHP, solrsharp, XSLT © 2008-2009 Lucid Imagination, Inc. 16
  • 17. Solr in Production Incoming Search Requests Load Balancer Solr Solr Master Solr Master Shard Request Shard Request Load Balancer Load Balancer Shard Shard Shard Shard Master 1..n Master Replicant shards Replicant Replicant Replicant Replicant Replicant Replicant Replicant © 2008-2009 Lucid Imagination, Inc. 17
  • 18. Getting Started: It's This Easy 1.Start Solr java -jar start.jar 2.Index your data java -jar post.jar *.xml 3.Search http://localhost:8983/solr © 2008-2009 Lucid Imagination, Inc. 18
  • 19. Configuration • schema.xml • field types and fields • solrconfig.xml • request handler mappings • cache settings: filter, query, document • warming listeners • HTTP cache settings • Lucene index parameters • plugins: spell checking, highlighting © 2008-2009 Lucid Imagination, Inc. 19
  • 20. Solr add/update XML <add><doc> <field name="id">MA147LL/A</field> <field name="name">Apple 60 GB iPod with Video Playback Black</field> <field name="manu">Apple Computer Inc.</field> <field name="cat">electronics</field> <field name="cat">music</field> <field name="features">iTunes, Podcasts, Audiobooks</field> <field name="features">Stores up to 15,000 songs, 25,000 photos, or 150 hours of video</field> <field name="features">2.5-inch, 320x240 color TFT LCD display with LED backlight</field> <field name="features">Up to 20 hours of battery life</field> <field name="features">Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless, H.264 video</field> <field name="features">Notes, Calendar, Phone book, Hold button, Date display, Photo wallet, Built-in games, JPEG photo playback, Upgradeable firmware, USB 2.0 compatibility, Playback speed control, Rechargeable capability, Battery level indication</field> <field name="includes">earbud headphones, USB cable</field> <field name="weight">5.5</field> <field name="price">399.00</field> <field name="popularity">10</field> <field name="inStock">true</field> </doc></add> © 2008-2009 Lucid Imagination, Inc. 20
  • 21. Indexing Solr XML • Via curl:'http://localhost:8983/ curl solr/update?commit=true' -- data-binary @ipod_video.xml - H 'Content-type:text/xml; charset=utf-8' • Via Solr's Java-based post tool: java -jar post.jar ipod_video.xml © 2008-2009 Lucid Imagination, Inc. 21
  • 22. Indexing CSV curl 'http://localhost:8983/solr/update/ csv?commit=true' --data-binary @books.csv - H 'Content-type:text/plain; charset=utf-8' © 2008-2009 Lucid Imagination, Inc. 22
  • 23. Content Streams • Allows Solr server to fetch local or remote data itself. Must enable remote streaming in solrconfig.xml • http://localhost:8983/solr/update?stream.file=<local Solr path to exampledocs>/ipod_video.xml • &stream.url=<url to content> • Security warning: allows Solr to fetch arbitrary server-side file or network URL content © 2008-2009 Lucid Imagination, Inc. 23
  • 24. Indexing Rich Documents curl 'http://localhost:8983/solr/update/ extract? literal.id=doc1&commit=true&extractOnly=true &wt=ruby&indent=on' -F "myfile=@tutorial.html" © 2008-2009 Lucid Imagination, Inc. 24
  • 25. Indexing with SolrJ SolrServer solr = new CommonsHttpSolrServer(new URL("http://localhost:8983/solr")); SolrInputDocument doc = new SolrInputDocument(); doc.addField("id", "JAVAZONE_09"); doc.addField("title", "JavaZone 2009 SolrJ Example"); solr.add(doc); solr.commit(); // after a batch, not per document solr.optimize(); // periodically, when needed © 2008-2009 Lucid Imagination, Inc. 25
  • 26. Indexing with Ruby solr = Connection.new( 'http://localhost:8983/solr', :autocommit => :on) solr.add(:id => 123, :title => 'Solr in Action') solr.optimize # periodically, as needed © 2008-2009 Lucid Imagination, Inc. 26
  • 27. Data Import Handler • Indexes relational database, XML data sources, e-mail, and more • Supports full and incremental/delta indexing • Extensible with custom data sources, transformers, etc • http://wiki.apache.org/solr/DataImportHandler © 2008-2009 Lucid Imagination, Inc. 27
  • 29. Example Search Request • http://localhost:8983/solr/select?q=query • &start=50 • &rows=25 • &fq=filter+query • &facet=on&facet.field=category © 2008-2009 Lucid Imagination, Inc. 29
  • 30. Debug Query • &debugQuery=true is your friend • Includes parsed query, explanations, and search component timings in response © 2008-2009 Lucid Imagination, Inc. 30
  • 31. Query Parser • Controlled by defType parameter • &defType=lucene (actually a Solr extension of Lucene’s QueryParser) • &defType=dismax • Local {!..} override syntax © 2008-2009 Lucid Imagination, Inc. 31
  • 32. Solr Query Parser • http://lucene.apache.org/java/2_4_0/ queryparsersyntax.html + Solr extensions • Kitchen sink parser, includes advanced user- unfriendly syntax • Syntax errors throw parse exceptions back to client • Example: title:ipod* AND price:[0 TO 100] © 2008-2009 Lucid Imagination, Inc. 32
  • 33. Dismax Query Parser • Simplified syntax: loose text “quote phrases” -prohibited +required • Spreads query terms across query fields (qf) with dynamic boosting per field, implicit phrase construction (pf), boosting function (bf), boosting query (bq), and minimum match (mm) © 2008-2009 Lucid Imagination, Inc. 33
  • 34. Searching with SolrJ SolrServer server = new CommonsHttpSolrServer("http:// localhost:8983/solr"); SolrQuery params = new SolrQuery("author:John"); params.setFields("*,score"); params.setRows(3); QueryResponse response = server.query(params); for (SolrDocument document : response.getResults()) { System.out.println("Doc: " + document); } © 2008-2009 Lucid Imagination, Inc. 34
  • 35. Searching with Ruby conn = Connection.new( 'http://localhost:8983/solr') conn.query('my query') do |hit| puts hit.inspect end © 2008-2009 Lucid Imagination, Inc. 35
  • 36. delete, update, etc • Delete: • <delete><id>05991</id></delete> • <delete> <query>category:Unused</query> </delete> • java -Ddata=args -jar post.jar "<delete><query>*:*</query></delete>" • Update: simply <add> doc with same unique key • Commit: <commit/> • Optimize: <optimize/> © 2008-2009 Lucid Imagination, Inc. 36
  • 37. Faceting • Counts per subset within results • Facet on: field terms, queries, date ranges • &facet=on &facet.field=cat &facet.query=price:[0 TO 100] • http://wiki.apache.org/solr/ SimpleFacetParameters © 2008-2009 Lucid Imagination, Inc. 37
  • 38. Spell checking • Not enabled by default, see example config to wire it in • http://localhost:8983/solr/spell? q=epod&spellcheck=on&spellcheck.build=true • File or index-based dictionaries • Supports pluggable distance algorithms: Levenstein and JaroWinkler • http://wiki.apache.org/solr/SpellCheckComponent © 2008-2009 Lucid Imagination, Inc. 38
  • 39. Highlighting • http://localhost:8983/solr/select? q=ipod&hl=on&hl.fl=manu,name • http://wiki.apache.org/solr/ HighlightingParameters © 2008-2009 Lucid Imagination, Inc. 39
  • 40. More Like This • http://localhost:8983/solr/select? q=ipod&mlt=true&mlt.fl=manu,cat&mlt.min df=1&mlt.mintf=1&fl=id,score,name • http://wiki.apache.org/solr/MoreLikeThis © 2008-2009 Lucid Imagination, Inc. 40
  • 41. Scaling: Query Throughput • Replication • slaves poll master for index updates • transfers index files from master to slave • configuration files can also be transferred • entirely Java/HTTP-based in Solr 1.4 (prior versions used rsync) © 2008-2009 Lucid Imagination, Inc. 41
  • 42. Scaling: Collection Size • Distribution • Index documents across shards • query single server with shards parameter • sends requests to each shard • aggregates result to a single response © 2008-2009 Lucid Imagination, Inc. 42
  • 43. Solr-powered UI • Solritas (from "celeritas"): VelocityResponseWriter • easily templated output • SolrJS: jQuery-based widgets • see http://solrjs.solrstuff.org/ • Blacklight and Flare: RoR plugins © 2008-2009 Lucid Imagination, Inc. 43
  • 44. Lucene in Action, 2nd Edition http://www.manning.com/lucene © 2008-2009 Lucid Imagination, Inc. 44
  • 46. /")$/#$0(# !"#$%&'()*$+),$-+&$0&,12&#-((23#$)4&2+,$,5&-6 78)#12& !"#2+29:-43&2#-050,2( !"#$%&,2)(&$+#4"%20&,12&4)3*20,&#-442#,$-+&-6& !"#2+29:-43&#-(($,,230.&#-+,3$;",-30&)+%&$+64"2+#230& <"3&($00$-+&$0&,-&023=2&)0&!"#$%#&'#($)*$+,-#..#&-#$6-3& !"#2+29:-43>;)02%&02)3#1&0-4",$-+0 ?248&-"3&#"0,-(230&*2,&,12&(-0,&-",&-6&!"#2+29:-43&> !"#$%&'( (-0,&@$%245&"02%&-82+&0-"3#2&02)3#1&0-6,@)32&&& A&BCCD>BCCE © 2008-2009 !"#$%&'()*$+),$-+.&'+#/Inc. Lucid Imagination, !"#$%$&'()*+',%-'./$0+'*)1)2',+$'.+,-$3,+42')5'./$'67,#/$'()5.8,+$'9)"%-,.0)% 46
  • 47. !"#$%&'()*$+),$-+&./#0+$#)1&./)( ! 2-+$3&4//1/56 ! <)8#&F8/11/+9,/$+6 012),-1&-3&4-51&& Unique !"#2+264-51&#-(($,,21.&780&(2(921 0-;3-"+%21.&0=G64H7.&<-1,:21+&!$*:, Combination of ! 78)+,&'+*/89-116 H7&42)1#:.&0=G.&I5J2K$21 Enterprise Search !"#$%&"'&(')*+,#-#'.&&%'!$/01 ! @8$)+&G$+3/8,-+6 and Lucene !"#2+264-51&#-(($,,21.&0:)$1.&780 L2K25-@2%&M2901)N521.&,:2&N29OJ&3$1J,& ! :8$3&;),#0/86 #-(@12:2+J$K2&J2)1#:&2+*$+2& Expertise 0-;$+%2&"'&(')*+,#-#'3-'4,%3&-1'5&&6 71$+#$@)5&P1#:$,2#,&),&PF !"#2+264-51&#-(($,,21.&780&(2(921 ! 4$(-+&H-9/+,0)16 ! <)83&<$11/8 4-5",$-+J&)1#:$,2#,.&<-1,:21+&!$*:, !"#2+264-51&#-(($,,21.&780& (2(921 ! I)5&;$116 ! 4)($&4$8/+ 4-5",$-+J&P1#:$,2#,.&M255J&Q)1*- <",#:6=$>)&#-(($,,21.&780&(2(921 ! H5)+&<#F$+1/56 ! =+%8>/?&@$1)1/#3$& !"#2+264-51&#-(($,,21.&&780&(2(921 !"#2+26<",#:6?)%--@&#-(($,,21.&780& (2(921& ! B08$9&;-9,/,,/86&C=%D$9-8E ! A-"*&B",,$+*6&C=%D$9-8E !"#2+264-51&#-(($,,21.&&780&(2(921 012),-1&-3&!"#2+2.&<",#:&A&?)%--@ 82(921&P@)#:2&4-3,N)12&Q-"+%),$-+ B&CDDE;CDDF © 2008-2009 !"#$%&'()*$+),$-+.&'+#/ Lucid Imagination, Inc. 47
  • 48. !"#$%&'()*$+),$-+&."/$+0//&1-%02 ;:00 <-=+2-)% ()*+,-,./+"0+,/.1) 2+,*.3.+4"5./*,.67*.1)/ & 8,++"& 3)2"04)%%&567 !"#0+0 89*:)%0 >9)#?0@-:* 2199+,:.;<""=7--1,*>" ?,;.).)@>" 21)/7<*.)@" !"#$$%&#$$' © 2008-2009 A7:.4"B9;@.);*.1) 21)3.4+)*.;< !"#$%$&'()*+',%-'./$0+'*)1)2',+$'.+,-$3,+42')5'./$'67,#/$'()5.8,+$'9)"%-,.0)% Lucid Imagination, Inc. 48
  • 49. Thank you http://www.lucidimagination.com © 2008-2009 Lucid Imagination, Inc. 49
  • 50. © 2008-2009 Lucid Imagination, Inc. 50