
Search with Solr

With Google constantly pushing the customer expectations of searching, is it time to move away from our database full-text search in pursuit of a more targeted platform? Can implementing Solr offer more than an answer to a search? Implementing a search platform isn’t always suitable for all applications, but in this talk we’ll look at identifying the right search solution, choosing the best way to integrate it into our application and exploring all the benefits a search server can offer.

Published in: Technology
  • Twitter: @paulmatthews86. Personal blog: 86p (technical and non-tech). Software Engineer at Ibuildings. Techportal: MongoDB, Solr (May 2011). Solr projects: travel company, media company.
  • This talk: What is Solr? When is the right time? Why search? How to start the journey: investigate, explain to the business, integrate. Who is this talk aimed at? Developers toying with search, using DB search, or starting with search.
  • This talk: When: identifying the right time. Why: search benefits, the dark horse. How: start the journey (investigate, explain to the business, integrate). Who is this talk aimed at? Developers toying with search, using DB search, or starting with search.
  • What is search? Text-based navigation to content / products. Customers describing something: capture queries. Sorting: organising content. Examples: quick search, category listing, advanced search.
  • The power of search: from LIKE to Solr.
  • First up: DB LIKE.
  • Pros: little effort to use or understand. Cons: not good with user data; poor at more than one word.
  • Full text: lots of people use it.
  • Pros: some power; convenient; in the DB. Cons: feature poor; slow.
  • Basic / easy-to-use proper search.
  • Pros: can be very fast; often simple to set up. Cons: feature poor; less accurate; more application code? Google Custom Search Engine: crawls the site. Xapian: simple search solution.
  • Pros: powerful; feature rich; relatively simple; lots of plugins. Cons: could be overkill; different language.
  • On Java: standalone; requires a servlet container (Tomcat, Jetty standalone). Lucene: search library; offers full text; high performance; Java, with other implementations available.
  • Who? Traffic: not for Facebook; works for average traffic. Features: it has many; no need to use them all. When? Designed in from the beginning: easily used to enrich site navigation. Implementation as a post-live project. Implementation into existing open source software: Drupal, Magento.
  • Spending time / effort / money on the search box: fixing bugs, endless tuning, adding functionality. Customers complaining: not finding content, high bounce rates, site is slow, not finding the *right* content.
  • Large data sets: 10000 records; speed; LIKE queries. MySQL full-text: site performance (slow log?); results inaccurate or missing. Graceful degradation: important for quality; low cost.
  • Is Solr right for me? Before answering: Terms: find materials, communicate to people. Functionality: to make the most use of it, know the functionality; don't re-invent the wheel.
  • Main 2: Database tables: Data Import Handler; easy, just config. API: anything can publish to the API; hooked into content. Also: CSV & XML; Solr Cell for rich docs (PDF, MS Office).
  • Parse: text to generate the index; removes junk; improves matches. Half now, half later: reduces time spent searching.
  • Analyzer: groups the actions of parsing. Important to do the same / similar when searching.
  • Tokenizer: strings to tokens. Example ones: Whitespace: splits on whitespace. Keyword: keeps the whole input as a single token. Standard: general purpose, adds context.
  • Token filters: transform tokens. Lower case. Stop: filters out stop words (a, if, to, and). Standard: removes dots and ’s (context only). Synonym.
  • Hit highlighting: remember to set the delimiter; not everything is a web page.
  • Spell checking: configure spellings, names (Flickr), keywords.
  • Autocomplete: common queries.
  • Phrase queries: "search for a phrase". Wildcard queries: match with wildcards, ? single, * multiple. Fuzzy queries: Levenshtein distance, words similar to the given one, word~. Proximity queries: words close together, "two words"~12. Range queries: between two values; started:[20110101 TO 20120101] is inclusive, name:{Paul TO Jeff} is exclusive.
  • Fields: single field: target the search. Multiple fields: build queries.
  • Faceted: set counts; filter data; multiple classifications.
  • Ordered results based on best match, or order by any field.
  • Simultaneous update and search
  • Blog post to explain. Configure: container, Solr. Index: documents from any source. Search: default search, advanced search.
  • Container setup: choose, configure, make accessible.
  • Define the data: define what is indexed; define what is stored. Integral to returning relevant search responses. Requires tweaking to get right. Be conscious of space: the size of the index affects speed.
  • Docs to schema spec. Indexing by database or API.
  • Partial words: analyzing? Search all fields: possibly just the main ones. Response: less data; stay clear of additional queries; consider caching.
  • Consider using stemming analyzers to return more results. Increase matching columns. Use session data to affect results; consider caching effects. More response data required.
  • Users modify their search: specify fields. For enriching the results, consider bloated storage: tradeoff with additional queries; tweak later? Advanced options for returning more / less results: search more of the document; filter on property.
  • Transcript

    • 1. Searching with Solr: When, Why and How?
      By Paul Matthews
    • 2. 86p
      @paulmatthews86
      86p.paul-matthews.co.uk
      pmatthews@ibuildings.com
      techportal.ibuildings.com
      Projects:
      Travel companies
      Media corporations
    • 3. Searching…
      What?
      When?
      Why?
      How?
    • 4. Searching…
      What?
      When?
      Why?
      How?
    • 5. What is search?
      Text navigation
      Customers describing
      Sorting
      Examples
      Quick search
      Category listings
    • 6. The power of search
    • 7. Database Like
    • 8. Database Like
      Very little effort
      A very basic search
      Poor at: > 1 word
    • 9. Database Full-Text
    • 10. Database Full-Text
      Some power
      Convenient
      Feature poor
      Often very slow
    • 11. Basic Search Systems
    • 12. Basic Search Systems
      Rapid search
      Simple to setup
      Feature poor
      Accuracy
      Require more application code
    • 13. Solr Search
    • 14. Solr Search
      Very powerful
      Feature rich
      Relatively simple
      Lots of plugins (community)
      Overkill?
      Java
    • 15. Things you need to know
    • 16. Searching…
      What?
      When?
      Why?
      How?
    • 17. Applicable to me?
      Who is Solr designed for?
      Traffic
      Features
      When is a good time to implement it?
      Creation
      Post-live
      Open Source projects
    • 18. Business indicators
      Money / Time / Effort spent
      Bugs
      Tuning
      Features
      Customers
    • 19. Development indicators
      Data
      MySQL Full Text
      Degradation
    • 20. Searching…
      What?
      When?
      Why?
      How?
    • 21. Is Solr right for me?
      Know your enemy
      With great functionality comes great responsibility
    • 22. Data sources
      Database
      Easy
      API
      Features
      CSV & XML
      Solr Cell - Rich Documents
      PDF
      MS Office
    • 23. Indexing
      Parsing
      Half now, half later
    • 24. Analyzer
      Process documents
      The query gets analyzed too
    • 25. Tokenizer
    • 26. TokenizerFilter
      Synonym
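The analyzer, tokenizer and filter slides describe a chain: a tokenizer splits text into tokens, then filters transform them one after another. A minimal Python sketch of that idea follows. It is not Solr's implementation; the function names and the synonym table are hypothetical, and only the stop words come from the talk notes.

```python
import re

STOP_WORDS = {"a", "if", "to", "and"}   # stop words listed in the talk notes
SYNONYMS = {"uk": ["uk", "britain"]}    # hypothetical synonym table


def whitespace_tokenizer(text):
    """Split a string into tokens on runs of whitespace."""
    return text.split()


def lowercase_filter(tokens):
    """Lower-case every token so matching is case-insensitive."""
    return [t.lower() for t in tokens]


def strip_punctuation_filter(tokens):
    """Remove leading and trailing punctuation; drop tokens left empty."""
    cleaned = [re.sub(r"^\W+|\W+$", "", t) for t in tokens]
    return [t for t in cleaned if t]


def stop_filter(tokens):
    """Drop stop words."""
    return [t for t in tokens if t not in STOP_WORDS]


def synonym_filter(tokens):
    """Expand each token into its synonyms, if any are configured."""
    out = []
    for t in tokens:
        out.extend(SYNONYMS.get(t, [t]))
    return out


def analyze(text):
    """Run the tokenizer, then each filter in order."""
    tokens = whitespace_tokenizer(text)
    for token_filter in (lowercase_filter, strip_punctuation_filter,
                         stop_filter, synonym_filter):
        tokens = token_filter(tokens)
    return tokens


print(analyze("Travel to the UK, and Search."))
```

Running the same `analyze` function over both documents and queries mirrors the point on the Analyzer slide that the query gets analyzed too.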
    • 27. Stemming
      Matching similar words
      Reduce to Stem
      Searching, Searches, Searched, Searchers → Search
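Reducing the words on this slide to a shared stem can be sketched with a toy suffix stripper. This is an illustrative assumption only: it is not the algorithm Solr's stemming filters use, and the suffix list is chosen just to cover the slide's example words.

```python
# Suffixes to strip, longest first. Hypothetical list for illustration only.
SUFFIXES = ("ers", "ing", "ed", "es", "s")


def naive_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


for w in ["Searching", "Search", "Searches", "Searched", "Searchers"]:
    print(w, "->", naive_stem(w))
```

A rule this crude mangles many ordinary words, which is why real stemmers carry far more rules and exceptions.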
    • 28. Hit Highlighting
      “Hit” ==> “This is a <em>Hit</em> test.”
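The highlighting example above can be reproduced with a small Python function. This is a sketch of the idea rather than Solr's highlighter; the `highlight` function and its `pre` and `post` arguments are hypothetical names. Solr exposes comparable delimiter settings through its `hl.simple.pre` and `hl.simple.post` request parameters.

```python
import re


def highlight(text, term, pre="<em>", post="</em>"):
    """Wrap whole-word, case-insensitive matches of term in pre/post."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre + m.group(0) + post, text)


print(highlight("This is a Hit test.", "hit"))
print(highlight("This is a Hit test.", "hit", pre="*", post="*"))
```

Keeping the delimiters configurable reflects the speaker's note that not everything is a web page: a plain-text email or a console tool needs something other than `<em>` tags.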
    • 29. Spell Check
      Spelchk
      Did you mean …?
      “flickr”
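A "Did you mean …?" suggestion can be sketched with Python's standard `difflib` module as a stand-in for a real spell checker. This is an assumption for illustration: Solr builds its suggestions from its own dictionaries, and the `did_you_mean` helper and word lists here are hypothetical.

```python
import difflib


def did_you_mean(query, dictionary, cutoff=0.6):
    """Return the closest dictionary word, or None if the query is already known."""
    query = query.lower()
    matches = difflib.get_close_matches(query, dictionary, n=1, cutoff=cutoff)
    if matches and matches[0] != query:
        return matches[0]
    return None


words = ["spellcheck", "search", "flicker", "flickr"]
print(did_you_mean("spelchk", words))
print(did_you_mean("flickr", words))
```

The second call shows why names need configuring: with "flickr" in the dictionary no correction is offered, whereas a dictionary containing only "flicker" would "correct" the brand name.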
    • 30. Autocomplete
    • 31. By the power of Queries!
      Phrase “Search for a phrase”
      Wildcards Look*familiar?
      Fuzzy fuzzy~
      Proximity “two words”~12
      Range name:{Paul TO Jeff}
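The query forms listed on this slide are plain strings, so an application can assemble them before sending them to the search server. The helpers below are a hypothetical sketch: the function names are made up, no escaping of special characters is attempted, and only the query syntax itself comes from the slide and notes.

```python
from urllib.parse import parse_qs, urlencode


def phrase(text):
    """Phrase query: match the words as a unit."""
    return f'"{text}"'


def fuzzy(term):
    """Fuzzy query: match terms similar to the given one."""
    return f"{term}~"


def proximity(text, distance):
    """Proximity query: the words must occur within `distance` positions."""
    return f'"{text}"~{distance}'


def range_query(field, low, high, inclusive=True):
    """Range query: square brackets are inclusive, curly braces exclusive."""
    open_b, close_b = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{open_b}{low} TO {high}{close_b}"


def select_params(q, rows=10):
    """URL-encode a query string for use as request parameters."""
    return urlencode({"q": q, "rows": rows})


print(proximity("two words", 12))
print(range_query("started", 20110101, 20120101))
print(range_query("name", "Paul", "Jeff", inclusive=False))
print(select_params("name:paul AND location:uk"))
```

URL-encoding matters because colons, quotes and spaces are all meaningful in the query syntax and would otherwise be mangled in transit.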
    • 32. name:paul AND location:uk
      A single field
      Multiple Fields
    • 33. Faceting (21)
      Pre-fetching (11)
      Results (37)
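The bracketed numbers on this slide are facet counts: how many results fall under each label. A minimal in-memory sketch of counting and filtering follows, using made-up documents and hypothetical function names. In Solr the equivalent work is requested with the `facet=true`, `facet.field` and `fq` parameters rather than done in application code.

```python
from collections import Counter

# Made-up documents for illustration.
DOCS = [
    {"id": 1, "category": "travel", "colour": "red"},
    {"id": 2, "category": "travel", "colour": "blue"},
    {"id": 3, "category": "media", "colour": "red"},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of a field."""
    return Counter(d[field] for d in docs if field in d)


def filter_by(docs, field, value):
    """Narrow the result set to documents with the chosen facet value."""
    return [d for d in docs if d.get(field) == value]


print(facet_counts(DOCS, "category"))
red = filter_by(DOCS, "colour", "red")
print(facet_counts(red, "category"))
```

Counting again after filtering shows how multiple classifications combine: picking one facet value updates the counts shown for the others.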
    • 34. Ranked Search
      Ordered
      Any field
    • 35. Simultaneous update & search
      Hold on a minute!
      Actually, I don’t have to…
    • 36. Searching…
      What?
      When?
      Why?
      How?
    • 37. Flow
    • 38. Container
      Choose container
      Make accessible
      http://<host>:<port>/solr/admin
    • 39. SolrConfig
      Cores ~ Database Schema
      schema.xml ~ Schema definition
    • 40. Fields
      Define the data
      indexed
      Stored
      Important to model accurately
      Tweak to achieve functionality
      Conscious of space and index
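The difference between an indexed field and a stored field can be shown with a toy in-memory index. This is an illustrative sketch under assumed names (`TinyIndex`, the field names and the schema dictionary are all hypothetical) and is not how Solr stores data internally.

```python
class TinyIndex:
    """Toy index: indexed fields are searchable, stored fields are returned."""

    def __init__(self, schema):
        self.schema = schema   # field name -> {"indexed": bool, "stored": bool}
        self.postings = {}     # (field, token) -> set of document ids
        self.stored = {}       # document id -> stored fields only

    def add(self, doc_id, doc):
        self.stored[doc_id] = {}
        for field, value in doc.items():
            options = self.schema[field]
            if options["indexed"]:
                for token in str(value).lower().split():
                    self.postings.setdefault((field, token), set()).add(doc_id)
            if options["stored"]:
                self.stored[doc_id][field] = value

    def search(self, field, term):
        ids = sorted(self.postings.get((field, term.lower()), set()))
        return [self.stored[i] for i in ids]


schema = {
    "title": {"indexed": True, "stored": True},
    "body": {"indexed": True, "stored": False},       # searchable, not returned
    "thumbnail": {"indexed": False, "stored": True},  # returned, not searchable
}
index = TinyIndex(schema)
index.add(1, {"title": "Searching with Solr", "body": "solr is a search server",
              "thumbnail": "img.png"})
print(index.search("body", "solr"))
print(index.search("thumbnail", "img.png"))
```

Leaving a large field such as `body` unstored keeps responses and storage small, which is the space trade-off the slide asks you to stay conscious of.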
    • 41. Index
      Create documents to Schema Spec
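Creating documents to the schema spec can be sketched by building an XML update message of the `<add><doc><field name="…">` form that Solr accepts. The helper name and the field names below are hypothetical, and actually sending the message to a server is left out.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build an <add> message holding one <doc> per dictionary."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # A list or tuple becomes one <field> element per value.
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)
    return ET.tostring(add, encoding="unicode")


print(build_add_message([{"id": 1, "title": "Searching with Solr"}]))
```

Whether the documents come from a database or an API, the field names in the message have to match the fields defined in the schema.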
    • 42. Search
      Quick Search
      Default Search
      Advanced Search
    • 43. Quick Search
      Partial words
      Search all fields?
      Required response data
    • 44. Default Search
      Consider useful Analyzers
      Potentially match on more fields
      Enrich or refine results with personal data
      More in depth results
    • 45. Advanced Search
      Offer user control
      Consider search storage
      Data size vs Additional queries
      To return more / less results
      “Search entire document”
      “Filter by Colour”
    • 46. Searching…
      What?
      When?
      Why?
      How?
    • 47. Questions?
    • 48. We’re Hiring
      NL
      Vlissingen
      Utrecht
      UK
      London
      Sheffield
      Liverpool
      Speak to me at the end…
      pmatthews@ibuildings.com
    • 49. Thank you
      Resources Links:
      http://www.delicious.com/paulm86/solr
      This talk:
      http://joind.in/3221
      Contact Me:
      @paulmatthews86
      http://about.me/paul.matthews