Search with Solr
With Google constantly pushing the customer expectations of searching, is it time to move away from our database full-text search in pursuit of a more targeted platform? Can implementing Solr offer more than an answer to a search? Implementing a search platform isn’t always suitable for all applications, but in this talk we’ll look at identifying the right search solution, choosing the best way to integrate it into our application and exploring all the benefits a search server can offer.

Published in: Technology
  • Twitter: @paulmatthews86. Personal blog: 86p (technical, non-tech). Software Engineer at Ibuildings. Techportal: MongoDB; Solr (May 2011). Solr projects: travel company, media company.
  • This talk. What is Solr? When is the right time? Why search? How: start the journey (investigate), explain to the business, integrate. Who is this talk aimed at? Developers: toying with search, DB search, starting with search.
  • This talk. When: identifying the right time. Why: search benefits, the dark horse. How: start the journey (investigate), explain to the business, integrate. Who is this talk aimed at? Developers: toying with search, DB search, starting with search.
  • What is search? Text-based navigation to content / products. Customers describing something; capture queries. Sorting: organising content. Examples: quick search, category listing, advanced search.
  • The power of search: from LIKE to Solr.
  • First up: DB LIKE.
  • Pros: little effort to use or understand. Cons: not good. User data: poor with more than one word.
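
The "more than one word" weakness noted above can be sketched with Python's built-in `sqlite3` module. This is only an illustration: SQLite stands in for whatever database the site actually uses, and the `articles` table, the sample titles and the `like_search` helper are all made up for the example.

```python
import sqlite3

def like_search(titles, term):
    """Run a naive LIKE '%term%' search over an in-memory table of titles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (title TEXT)")
    conn.executemany(
        "INSERT INTO articles (title) VALUES (?)", [(t,) for t in titles]
    )
    cursor = conn.execute(
        "SELECT title FROM articles WHERE title LIKE ? ORDER BY title",
        (f"%{term}%",),
    )
    results = [row[0] for row in cursor]
    conn.close()
    return results

# Hypothetical sample data.
TITLES = ["Search with Solr", "Solr for beginners", "Cooking with gas"]

print(like_search(TITLES, "solr"))         # single word: finds both Solr titles
print(like_search(TITLES, "solr search"))  # two words in a different order: finds nothing
```

The second query fails because LIKE matches the literal substring "solr search"; it has no notion of separate words or word order.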
  • Full text: lots of people use it.
  • Pros: some power; convenient; in the DB. Cons: feature poor; slow.
  • Basic, easy-to-use proper search.
  • Pros: can be very fast; often simple to set up. Cons: feature poor; less accurate; more application code? Google Custom Search Engine: crawls the site. Xapian: a simple search solution.
  • Pros: powerful; feature rich; relatively simple; lots of plugins. Cons: could be overkill; different language.
  • On Java, stand-alone. Requires a servlet container: Tomcat, or Jetty (stand-alone). Lucene: search library; offers full text; high performance; Java, other implementations available.
  • Who? Traffic: not for Facebook; works for the average site. Features: it has many; no need to use them all. When? Designed in from the beginning: easily used to enrich site navigation. Implementation as a post-live project. Implementation into existing open source software: Drupal, Magento.
  • Spending time / effort / money on the search box: fixing bugs, endless tuning, adding functionality. Customers complaining: not finding content; high bounce rates; site is slow; not finding the *right* content.
  • Large data sets: 10,000 records. Speed: LIKE queries; MySQL full-text. Site performance: slow log? Results: inaccurate; missing. Graceful degradation: important for quality; low cost.
  • Is Solr right for me? Before answering. Terms: find materials; communicate to people. Functionality: most use, so know the functionality; re-inventing the wheel.
  • Main two. Database tables: Data Import Handler; easy, just config. API: anything can publish to the API; hooked into content. Also CSV & XML, and Solr Cell for rich docs: PDF, MS Office.
  • Parse: text to generate the index; removes junk; improves matches. Half now, half later: reduces time spent searching.
  • Analyzer: groups the actions of parsing. Important to do the same / similar when searching.
  • Tokenizer: strings to tokens. Example ones: Whitespace (splits on whitespace); Keyword (keeps the whole input as a single token); Standard (general purpose, adds context).
  • Token filter: transforms tokens. Lower case. Stop: filters out stop words (a, if, to, and). Standard: removes dots and 's (context only). Synonym.
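
The tokenizer and filter notes above can be pictured as a small pipeline. The sketch below is plain Python, not Solr's own implementation: the function names are invented for illustration, and the stop-word set is just the four words listed in the notes. It also shows why the query should go through the same analysis as the document.

```python
# Stop words taken from the notes above; a real list would be longer.
STOP_WORDS = {"a", "if", "to", "and"}

def whitespace_tokenizer(text):
    """Tokenizer step: split a string into tokens on whitespace."""
    return text.split()

def lowercase_filter(tokens):
    """Filter step: normalise every token to lower case."""
    return [token.lower() for token in tokens]

def stop_filter(tokens, stop_words=STOP_WORDS):
    """Filter step: drop tokens that are stop words."""
    return [token for token in tokens if token not in stop_words]

def analyze(text):
    """Analyzer: one tokenizer followed by a chain of filters."""
    tokens = whitespace_tokenizer(text)
    tokens = lowercase_filter(tokens)
    return stop_filter(tokens)

def matches(document, query):
    """True when every analyzed query token appears in the analyzed document."""
    return set(analyze(query)) <= set(analyze(document))

print(analyze("A Guide to Search and Solr"))
print(matches("A Guide to Search and Solr", "SOLR guide"))
```

Because both sides are lower-cased and stripped of stop words, the query "SOLR guide" matches even though neither its case nor its word order agrees with the document.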
  • Hit highlighting. Remember to set the delimiter: not everything is a web page.
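
A minimal sketch of the delimiter point above, in plain Python rather than Solr's highlighter. The `highlight` function and its `pre` / `post` parameters are hypothetical names; the idea is only that the markers wrapped around a hit should be configurable, since `<em>` tags make sense only when the output is HTML.

```python
import re

def highlight(text, term, pre="<em>", post="</em>"):
    """Wrap each whole-word, case-insensitive occurrence of term in pre/post."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda match: f"{pre}{match.group(0)}{post}", text)

print(highlight("This is a Hit test.", "hit"))
print(highlight("This is a Hit test.", "hit", pre="[", post="]"))
```

The second call shows the same hit marked up for a plain-text target such as an email or a console.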
  • Spell checking: configure spellings. Names: Flickr. Keywords.
  • Autocomplete: common queries.
  • Phrase queries: "search for a phrase". Wildcard queries: match with wildcards, ? for a single character, * for multiple. Fuzzy queries: Levenshtein distance; similar to a word, using ~. Proximity queries: words close together, "two words"~12. Range queries: between two values; started:[20110101 TO 20120101] is inclusive, name:{Paul TO Jeff} is exclusive.
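
The fuzzy-query note above mentions Levenshtein distance: the number of single-character insertions, deletions or substitutions needed to turn one word into another. The function below is a textbook dynamic-programming sketch of that measure, written for illustration; it is not how Solr or Lucene compute it internally.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            replace_cost = previous[j - 1] + (char_a != char_b)
            current.append(min(insert_cost, delete_cost, replace_cost))
        previous = current
    return previous[-1]

print(levenshtein("solr", "sold"))
print(levenshtein("kitten", "sitting"))
```

A fuzzy search can then be thought of as "accept terms whose distance from the query term is small", which is what lets a misspelt query still find its target.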
  • Fields. Single field: targeted search. Multiple fields: build queries.
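
A fielded query such as `name:paul AND location:uk` (the example used on the slide) is sent to Solr as the `q` parameter of a select request. The sketch below only builds the URL and does not contact a server. The base URL is an assumption, since host, port and path depend on the install, and `field_query` / `select_url` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust to your own install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

def field_query(**fields):
    """Join field:value pairs with AND, e.g. name:paul AND location:uk."""
    return " AND ".join(f"{name}:{value}" for name, value in fields.items())

def select_url(query, rows=10):
    """Build a select URL carrying the query, a row limit and a JSON response type."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"

query = field_query(name="paul", location="uk")
print(query)
print(select_url(query))
```

Letting `urlencode` do the escaping matters here, because the colons, spaces, braces and quotes in query syntax are not safe to paste into a URL by hand.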
  • Faceted: set counts; filter data; multiple classifications.
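
Faceting boils down to two operations: counting how many results fall under each value of a field, and filtering the results when the user picks one of those values. The sketch below shows both on an in-memory list of dictionaries; the documents and field names are made up, and Solr does this work inside the index rather than in application code.

```python
from collections import Counter

def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in docs if field in doc)

def filter_by(docs, field, value):
    """Keep only the documents whose field equals the chosen facet value."""
    return [doc for doc in docs if doc.get(field) == value]

# Hypothetical search results.
RESULTS = [
    {"title": "Red shirt", "colour": "red", "type": "shirt"},
    {"title": "Red hat", "colour": "red", "type": "hat"},
    {"title": "Blue shirt", "colour": "blue", "type": "shirt"},
]

print(facet_counts(RESULTS, "colour"))
print(facet_counts(RESULTS, "type"))
print(filter_by(RESULTS, "colour", "red"))
```

Counting over more than one field, as with `colour` and `type` here, is what the notes mean by multiple classifications.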
  • Ordered results based on best match, or order by any field.
  • Simultaneous update and search
  • Blog post to explain. Configure: container, Solr. Index: documents, from any source. Search: default search, advanced search.
  • Container setup: choose, configure, make accessible.
  • Define the data: define what is indexed; define what is stored. Integral to returning relevant search responses. Requires tweaking to get right. Be conscious of space: size of the index, speed.
  • Docs to schema spec. Indexing by database or API.
  • Partial words: analyzing? Search all fields, or possibly just the main ones. Response: less data; stay clear of additional queries; consider caching.
  • Consider using stemming analyzers to return more results. Increase matching columns. Use session data to affect results; consider caching effects. More response data required.
  • Users modify their search: specify fields. For enriching the results: consider bloated storage; tradeoff with additional queries; tweak later? Advanced options for returning more / fewer results: search more of the document; filter on a property.
  • Transcript

    • 1. Searching with Solr: When, Why and How? By Paul Matthews
    • 2. 86p; @paulmatthews86; 86p.paul-matthews.co.uk; pmatthews@ibuildings.com; techportal.ibuildings.com; Projects: travel companies, media corporations
    • 3. Searching… What? When? Why? How?
    • 4. Searching… What? When? Why? How?
    • 5. What is search? Text navigation; Customers describing; Sorting; Examples: quick search, category listings
    • 6. The power of search
    • 7. Database Like
    • 8. Database Like: Very little effort; A very basic search; Poor at: > 1 word
    • 9. Database Full-Text
    • 10. Database Full-Text: Some power; Convenient; Feature poor; Often very slow
    • 11. Basic Search Systems
    • 12. Basic Search Systems: Rapid search; Simple to set up; Feature poor; Accuracy; Require more application code
    • 13. Solr Search
    • 14. Solr Search: Very powerful; Feature rich; Relatively simple; Lots of plugins (community); Overkill?; Java
    • 15. Things you need to know
    • 16. Searching… What? When? Why? How?
    • 17. Applicable to me? Who is Solr designed for? Traffic; Features. When is a good time to implement it? Creation; Post-live; Open Source projects
    • 18. Business indicators: Money / Time / Effort spent; Bugs; Tuning; Features; Customers
    • 19. Development indicators: Data; MySQL Full Text; Degradation
    • 20. Searching… What? When? Why? How?
    • 21. Is Solr right for me? Know your enemy; With great functionality comes great responsibility
    • 22. Data sources: Database (Easy); API (Features); CSV & XML; Solr Cell - Rich Documents (PDF, MS Office)
    • 23. Indexing: Parsing; Half now, half later
    • 24. Analyzer: Process documents; The query gets analyzed too
    • 25. Tokenizer
    • 26. TokenizerFilter: Synonym
    • 27. Stemming: Matching similar words; Reduce to stem. Searching, Search, Searches, Searched, Searchers → Search
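
The stemming slide above reduces Searching, Searches, Searched and Searchers to Search. A deliberately naive suffix-stripping sketch reproduces that behaviour for these words. It is not the Porter algorithm or any stemmer that ships with Solr, and both the suffix list and the minimum stem length of three characters are arbitrary choices made for the example.

```python
# Longest suffixes first, so "ers" is tried before "s".
SUFFIXES = ("ers", "ing", "es", "ed", "er", "s")

def naive_stem(word):
    """Lower-case the word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for word in ["Searching", "Search", "Searches", "Searched", "Searchers"]:
    print(word, "->", naive_stem(word))
```

Applying the same function to both indexed text and the user's query is what lets a search for "searched" find a document that only contains "searching".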
    • 28. Hit Highlighting: “Hit” ==> “This is a <em>Hit</em> test.”
    • 29. Spell Check: Spelchk; Did you mean …? “flickr”
    • 30.
    • 31. By the power of Queries! Phrase: “Search for a phrase”; Wildcards: Look*familiar?; Fuzzy: fuzzy~; Proximity: “two words”~12; Range: name:{Paul TO Jeff}
    • 32. name:paul AND location:uk; A single field; Multiple fields
    • 33. Faceting (21); Pre-fetching (11); Results (37)
    • 34. Ranked Search: Ordered; Any field
    • 35. Simultaneous update & search: Hold on a minute! Actually, I don’t have to…
    • 36. Searching… What? When? Why? How?
    • 37. Flow
    • 38. Container: Choose container; Make accessible; http://<host>:<port>/solr/admin
    • 39. SolrConfig: Cores ~ Database Schema; schema.xml ~ Schema definition
    • 40. Fields: Define the data (indexed, stored); Important to model accurately; Tweak to achieve functionality; Conscious of space and index
    • 41. Index: Create documents to Schema Spec
    • 42. Search: Quick Search; Default Search; Advanced Search
    • 43. Quick Search: Partial words; Search all fields?; Required response data
    • 44. Default Search: Consider useful Analyzers; Potentially match on more fields; Enrich or refine results with personal data; More in-depth results
    • 45. Advanced Search: Offer user control; Consider search storage; Data size vs additional queries; To return more / less results: “Search entire document”, “Filter by Colour”
    • 46. Searching… What? When? Why? How?
    • 47. Questions?
    • 48. We’re Hiring: NL (Vlissingen, Utrecht); UK (London, Sheffield, Liverpool); Speak to me at the end…; pmatthews@ibuildings.com
    • 49. Thank you. Resources links: http://www.delicious.com/paulm86/solr; This talk: http://joind.in/3221; Contact me: @paulmatthews86, http://about.me/paul.matthews
