Solr: Enterprise Search Server

Published in: Technology
Transcript

  • 1. Presentation: Solr Enterprise Search Server, by Armen Polischuk
  • 2. Introduction
    • Java web application (HTTP/XML)
    • Uses Apache Lucene as its text search engine
    • Inverted index data structure
    November 30, 2010
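To make the inverted index idea concrete, here is a minimal sketch in Python. It is not how Lucene stores its index; the function names and the sample documents are assumptions made for illustration only.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, term):
    """Look up a single term; returns the ids of matching documents."""
    return index.get(term.lower(), [])


# Hypothetical sample documents, keyed by document id.
docs = {1: "Java web developer", 2: "Python developer", 3: "Java architect"}
index = build_inverted_index(docs)

print(search(index, "java"))       # [1, 3]
print(search(index, "Developer"))  # [1, 2]
```

The lookup cost depends on the number of distinct terms rather than on the total size of the text, which is why this structure suits full-text search.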
  • 5. Common Usage
  • 6. Lucene Features
    • A text-based inverted index with persistent storage, for efficient retrieval of documents by indexed terms
    • A rich set of text analyzers to transform a string of text into a series of terms (words), which are the fundamental units indexed and searched
    • A query syntax with a parser and a variety of query types, from term lookup to exotic fuzzy matches
    • A good scoring algorithm based on sound Information Retrieval (IR) principles to produce the more likely candidates first, with flexible means to affect the scoring
    • A highlighter feature to show words found in context
    • A query spellchecker based on indexed content
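The analyzer step listed above can be pictured as a small text-to-terms pipeline. The sketch below is a simplified stand-in written in Python, not a Lucene analyzer; the stop-word list and the `analyze` name are assumptions.

```python
import re

# Illustrative stop-word list (an assumption, not Lucene's default set).
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def analyze(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


print(analyze("The Enterprise Search Server, by Armen"))
# ['enterprise', 'search', 'server', 'by', 'armen']
```

The same analysis has to be applied at index time and at query time, otherwise a query term may never match the stored term.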
  • 15. Solr Features
    • HTTP request processing for indexing and querying documents
    • Several caches for faster query responses
    • A web-based administrative interface
    • Configuration files for the schema and the server itself
    • The disjunction-max query handler
    • A more-like-this plugin to list documents that are similar to a chosen document
    • A distributed Solr server model
  • 22. Indexing Data
    • Solr's native XML
    • CSV, JSON
    • Direct database and XML import through Solr's DataImportHandler
    • Rich documents through Solr Cell (PDF, DOC, XLS, PPT)
  • 26. Indexing XML Request
      Adding documents:
      <add allowDups="false">
        <doc boost="2.0">
          <field name="doc_id">1</field>
          <field name="type">PERSON</field>
          <field name="first_name" boost="2.5">Armen</field>
          <field name="last_name">Polischuk</field>
        </doc>
        <doc>
          <field name="doc_id">2</field>
          <field name="type">PERSON</field>
          <field name="first_name">John</field>
          <field name="last_name">Smith</field>
        </doc>
      </add>
      Deleting documents:
      <delete><id>doc_id:2</id><id>doc_id:3</id></delete>
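An add request like the one on this slide can be generated rather than written by hand. The following is a minimal Python sketch using only the standard library; `build_add_request` is a hypothetical helper, it omits the `boost` attributes, and it only builds the XML body without sending it to a server.

```python
import xml.etree.ElementTree as ET


def build_add_request(docs, allow_dups=False):
    """Build a Solr <add> XML message from a list of field dictionaries."""
    add = ET.Element("add", allowDups=str(allow_dups).lower())
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# The two documents from the slide.
people = [
    {"doc_id": 1, "type": "PERSON", "first_name": "Armen", "last_name": "Polischuk"},
    {"doc_id": 2, "type": "PERSON", "first_name": "John", "last_name": "Smith"},
]
xml_body = build_add_request(people)
print(xml_body)
```

Building the message through an XML library also takes care of escaping characters such as `&` and `<` in field values.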
  • 30. Basic Searching
    • Using the web interface
    • Using HTTP request/response
    • Using the SolrJ client
    Example:
    http://localhost:8983/solr/select?indent=on&version=2.2&q=*%3A*&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=
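For the HTTP route, the request is just a URL with encoded parameters. Below is a small Python sketch that builds such a URL with paging; the base URL and parameter names come from the slide's example, while `select_url` and its `page` argument are assumptions for illustration. It does not contact a server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SOLR_SELECT = "http://localhost:8983/solr/select"  # base URL from the slide's example


def select_url(query, page=0, rows=10, fields="*,score"):
    """Build a select URL; 'page' is converted into Solr's 'start' offset."""
    params = {
        "q": query,
        "start": page * rows,
        "rows": rows,
        "fl": fields,
        "indent": "on",
    }
    return SOLR_SELECT + "?" + urlencode(params)


url = select_url("*:*", page=2)
print(url)
print(parse_qs(urlsplit(url).query)["start"])  # ['20']
```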
  • 33. Searching XML Response
      <response>
        <lst name="responseHeader">
          <int name="status">0</int>
          <int name="QTime">392</int>
          <lst name="params">
            <str name="explainOther"/>
            <str name="fl">*,score</str>
            <str name="start">0</str>
            <str name="q">*:*</str>
            <str name="hl.fl"/>
          </lst>
        </lst>
        <result name="response" numFound="1002272" start="0" maxScore="1.0">
          <doc>
            <float name="score">1.0</float>
            <str name="id">PERSON:1</str>
            <str name="first_name">Armen</str>
            <str name="last_name">Polischuk</str>
          </doc>
        </result>
      </response>
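A client that does not use SolrJ can read this response with any XML parser. The sketch below, in Python, parses a shortened copy of the response shown on this slide; `parse_response` is a hypothetical helper, and every field value is returned as a string.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response from the slide.
SAMPLE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">392</int>
  </lst>
  <result name="response" numFound="1002272" start="0" maxScore="1.0">
    <doc>
      <float name="score">1.0</float>
      <str name="id">PERSON:1</str>
      <str name="first_name">Armen</str>
      <str name="last_name">Polischuk</str>
    </doc>
  </result>
</response>"""


def parse_response(xml_text):
    """Return (numFound, list of documents as name -> text dictionaries)."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    docs = [
        {child.get("name"): child.text for child in doc}
        for doc in result.findall("doc")
    ]
    return int(result.get("numFound")), docs


num_found, docs = parse_response(SAMPLE)
print(num_found)              # 1002272
print(docs[0]["first_name"])  # Armen
```

Note that `numFound` is the total number of matches, not the number of documents returned on the current page.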
  • 34. Features
    • And, or, not:
      Java AND Developer NOT swing
      (Java OR Python) AND Developer
    • Field qualifier:
      first_name:John AND last_name:Doe
    • Phrase and term proximity:
      "Web Developer"
      "Web Developer"~3
    • Wildcards:
      Java* AND Developer
    • Boosting:
      Java^10 AND Web^5 AND Developer
    • Filtering and sorting:
      q=Java&fq=type%3APERSON&sort=score+asc
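The filtering and sorting line above is a URL-encoded query string. As a quick illustration in Python, the standard library produces the same string from readable parameter values; the `params` dictionary is just a convenient way to hold them.

```python
from urllib.parse import urlencode

# q is the scored query, fq restricts the result set, sort orders the results.
params = {"q": "Java", "fq": "type:PERSON", "sort": "score asc"}
query_string = urlencode(params)
print(query_string)  # q=Java&fq=type%3APERSON&sort=score+asc
```

The same call handles the other query forms on this slide, such as `(Java OR Python) AND Developer`, whose parentheses and spaces also need encoding.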
  • 35. Advanced Features
    • Highlighting
    • Query elevation
    • Spell checking, aka "Did you mean..."
    • The more-like-this search
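The "Did you mean..." idea can be sketched as a lookup of the closest indexed term. The Python example below uses the standard library's `difflib` similarity matching as a stand-in; it is not Solr's spellcheck component, and the term list, the `did_you_mean` name, and the cutoff value are assumptions.

```python
import difflib


def did_you_mean(word, indexed_terms, cutoff=0.8):
    """Suggest the closest indexed term, or None if the word is already indexed
    or nothing is similar enough."""
    word = word.lower()
    matches = difflib.get_close_matches(word, indexed_terms, n=1, cutoff=cutoff)
    if matches and matches[0] != word:
        return matches[0]
    return None


# Hypothetical terms taken from an index.
terms = ["java", "python", "developer", "architect"]
print(did_you_mean("devloper", terms))  # developer
print(did_you_mean("java", terms))      # None
```

Drawing suggestions from indexed content means the suggested word is guaranteed to return at least one result.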
  • 39. Scaling - approaches
    • Optimizing a single Solr server
    • Split data by doc type
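Splitting data by document type amounts to routing each document to the index that holds its type. The following Python sketch groups documents into per-index batches; the index URLs, the `COMPANY` type, and the `group_by_core` helper are all hypothetical and not taken from the slides.

```python
# Hypothetical mapping from document type to a dedicated Solr index URL.
CORES_BY_TYPE = {
    "PERSON": "http://localhost:8983/solr/people",
    "COMPANY": "http://localhost:8983/solr/companies",
}


def group_by_core(docs):
    """Group documents into per-index batches based on their 'type' field."""
    batches = {}
    for doc in docs:
        core = CORES_BY_TYPE[doc["type"]]
        batches.setdefault(core, []).append(doc)
    return batches


docs = [
    {"doc_id": 1, "type": "PERSON"},
    {"doc_id": 2, "type": "COMPANY"},
    {"doc_id": 3, "type": "PERSON"},
]
batches = group_by_core(docs)
for core, batch in batches.items():
    print(core, [d["doc_id"] for d in batch])
```

Each batch could then be sent as its own add request, so that every index stays smaller and can be tuned for its document type.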
  • 43. Scaling - whole picture
  • 44. Documentation
      • http://wiki.apache.org/lucene-java/FrontPage
      • http://wiki.apache.org/solr/FrontPage
