Fulltext search pres

  1. SilverStripe and Full Text Search
     Giving the people what they want
     24 August, 2011 • SilverStripe Wellington Meetup • Hamish Friedlander
  2. What we’re covering
     Big topic, not much time
     • What does search give you
     • Three ways to get it
        • Built in db backed search
        • Sphinx module
        • Full text search module
  3. What we’re not
     But that doesn’t mean they’re not important
     • Search result visualization
     • Search refinement
     • Boost, result pre-calculation, faceting, spell checking, real-time results
     • Integrating search with IA
     • Measuring search usefulness
     • 3rd party modules
  4. Why add search?
  5. What are you trying to do?
     Be aware of the goals of your users
     • Most people use navigation by preference
        • Stats depend on site, but average 70-95% navigation
     • Search is primarily used to locate stuff that’s not obvious how to navigate to
        • Deeply nested pages
        • Cross-cutting information not provided as a taxonomic structure
        • Re-discovering remembered items
     • If search doesn’t give immediate results, users fall back to navigation again
  6. Getting you there quicker
     To be used, search has to give interesting pages faster than navigation
     • Interesting is relative
        • Ideally return the page the user is after
        • But failing that, at least return a page the user is interested in
     • Speed is perception
        • Raw speed is rarely noticed (except when it is)
        • Ability to understand results is as important as accuracy of results
        • A second click is OK, as long as there’s a likely payoff: “did you mean” is fine, disambiguation is OK, paging is useless
  7. Technology & Tools
  8. Database internal full text search
     It’s just another index
     • Most databases come with some full text search built in
     • Generally work by adding new indexes to a table column
     • Can easily combine full text queries with other filters
     • But databases aren’t really designed for it
        • Poor query language - no booleans
        • Poor language processing
        • Limited feature set - no field boost, spell checking, search suggestions, faceting, result fragments, ...
        • Sometimes costly technically (MyISAM)
  9. External full text indexers
     Solr, Sphinx, Elastic Search
     • Given a schema, and a set of documents, builds an index
        • Schema gives both text processing and result relevancy rules
        • Different engines either retrieve documents themselves or have documents sent to them
        • Indexes might be write-once (rebuild entire index to add changes)
     • Gives a language to query those indexes
        • Generally query language is engine-specific
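As a rough illustration of "given a schema, and a set of documents, builds an index", here is a minimal Python sketch of a toy inverted index. It is not how Solr, Sphinx or Elastic Search are implemented. The `SCHEMA`, `DOCS`, `tokenize`, `build_index` and `search` names are all made up for this example, and the "text processing" is reduced to lowercasing and splitting on whitespace.

```python
from collections import defaultdict

# Hypothetical schema: field name -> relevancy boost. Real engines also
# configure tokenizers, stemmers and so on per field.
SCHEMA = {"title": 2.0, "content": 1.0}


def tokenize(text):
    """Very crude text processing: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(documents, schema):
    """Map each term to {doc_id: score}, weighting hits by field boost."""
    index = defaultdict(lambda: defaultdict(float))
    for doc_id, doc in documents.items():
        for field, boost in schema.items():
            for term in tokenize(doc.get(field, "")):
                index[term][doc_id] += boost
    return index


def search(index, query):
    """Sum the scores of every query term and return doc ids, best first."""
    scores = defaultdict(float)
    for term in tokenize(query):
        for doc_id, score in index.get(term, {}).items():
            scores[doc_id] += score
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up documents for the example.
DOCS = {
    1: {"title": "Sphinx module", "content": "Easy full text search"},
    2: {"title": "Built-in search", "content": "Database dependent default"},
}
INDEX = build_index(DOCS, SCHEMA)
```

With this data, `search(INDEX, "search")` ranks document 2 above document 1, because the term appears in the boosted title field of document 2 and only in the content of document 1. That is the kind of relevancy rule the schema is carrying.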
  10. External engines + SilverStripe
     A tale of two abstractions
     • Building schemas is hard, time consuming, annoying when model changes
     • Can build schemas directly off models
        • Effectively free - all the necessary information is already present
        • Flexible search - can change form structure without index changes
        • Inefficient - includes information you won’t search against
     • Or can build schemas off query design
        • Needs more thought around design of query up front
        • More efficient, leads to some more powerful abilities
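The trade-off between the two abstractions can be sketched in a few lines of Python. This is only an illustration under assumptions: `PAGE_MODEL` is a made-up stand-in for a model's field list, and `schema_from_model` and `schema_from_query` are hypothetical helpers, not functions from any SilverStripe module.

```python
# Hypothetical model definition, loosely in the spirit of a model's
# database fields. Field names and types are made up for illustration.
PAGE_MODEL = {
    "Title": "Varchar",
    "Content": "HTMLText",
    "MetaDescription": "Text",
    "Sort": "Int",
    "ShowInMenus": "Boolean",
}


def schema_from_model(model):
    """Index every field on the model: free and flexible, but bloated."""
    return sorted(model)


def schema_from_query(model, searched_fields):
    """Index only the fields the planned query will actually touch."""
    unknown = set(searched_fields) - set(model)
    if unknown:
        raise KeyError(f"fields not on model: {sorted(unknown)}")
    return sorted(searched_fields)
```

Building off the model yields all five fields, including `Sort` and `ShowInMenus`, which nobody is likely to search against. Building off the query design yields only what was asked for, at the cost of deciding the searched fields up front.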
  11. SilverStripe Integration
  12. Built-in search
     Your database-dependent, barely acceptable default
     ✓ No external dependencies, separate indexes, schema files or setup
     - Can only search SiteTree and File objects, and only specific fields
     - Quality of results is heavily database dependent
  13. Sphinx module
     Easy, quality full text search
     ✓ Very little configuration gives great results on moderate sized sites
     ✓ Can search any DataObject, but...
     - Combining search over multiple DataObjects doesn’t really work
     - Limited real-time update support
     - No exact match string mode makes filtering tricky
  14. Fulltext search module
     Powerful (eventually) search engine independent toolkit
     ✓ Schemas generated from query structure
        • More flexible and efficient than generating from model structure
        • Closer to how external engines work natively
     ✓ Eventually multiple search backend support
        • Currently: Solr
        • In future: Sphinx, Elastic Search, Zend_Lucene
        • Not intended to allow code-less swapping of backends.
     - Currently needs Solr, which is a Java app
        • Loves memory, hates empty disk space
  15. Full text search module example
  16. Define an index
     Schema gets generated from this index
  17. Define a form
     Standard SilverStripe stuff
  18. Build a query & apply to an index
     Filters and excludes can be built & nested
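The code shown on this slide did not survive in the transcript. As a stand-in, here is a small Python sketch of the general idea of building nested filters and excludes and applying them to an index. It is deliberately not the fulltextsearch module's API: the `Query` class, `run_query` and the `DOCUMENTS` data are hypothetical, and a plain list of dictionaries plays the role of the index.

```python
class Query:
    """Hypothetical query object; not the module's real API."""

    def __init__(self):
        self.requires = []  # conditions every result must satisfy
        self.excludes = []  # conditions no result may satisfy

    def filter(self, field_or_query, value=None):
        """Require a field value, or require a nested Query to match."""
        self.requires.append((field_or_query, value))
        return self

    def exclude(self, field_or_query, value=None):
        """Reject a field value, or reject whatever a nested Query matches."""
        self.excludes.append((field_or_query, value))
        return self

    def _test(self, doc, condition):
        target, value = condition
        if isinstance(target, Query):  # nested query: recurse
            return target.matches(doc)
        return doc.get(target) == value

    def matches(self, doc):
        return (all(self._test(doc, c) for c in self.requires)
                and not any(self._test(doc, c) for c in self.excludes))


def run_query(query, documents):
    """Apply the query to a (toy) index and return matching ids."""
    return [doc["ID"] for doc in documents if query.matches(doc)]


# Made-up index contents for the example.
DOCUMENTS = [
    {"ID": 1, "ClassName": "Page", "Status": "Published", "Section": "News"},
    {"ID": 2, "ClassName": "Page", "Status": "Draft", "Section": "News"},
    {"ID": 3, "ClassName": "Page", "Status": "Published", "Section": "Archive"},
    {"ID": 4, "ClassName": "File", "Status": "Published", "Section": "News"},
]

# All pages, excluding anything that is both published and in the archive.
archived = Query().filter("Status", "Published").filter("Section", "Archive")
query = Query().filter("ClassName", "Page").exclude(archived)
```

Here `run_query(query, DOCUMENTS)` keeps pages 1 and 2: page 3 is dropped by the nested exclude and item 4 fails the class filter. A real engine would translate such a structure into its own query language rather than scanning documents in a loop.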
  19. Final thoughts
  20. Search without searching
     Search engines as fuzzy matchers
     • Looks like navigation, acts like search
     • Instant taxonomies
     • Deal with inconsistent data
     • Encourages exploration
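To give a feel for "deal with inconsistent data", here is a small Python sketch that maps messy labels onto a fixed set of categories. It swaps in plain string similarity from the standard library's `difflib` rather than a real search engine, so it only hints at the idea. The `CANONICAL` list, the `normalise` helper and the 0.8 cutoff are assumptions made for the example.

```python
import difflib

# Hypothetical canonical taxonomy terms.
CANONICAL = ["wellington", "auckland", "christchurch"]


def normalise(label, choices=CANONICAL, cutoff=0.8):
    """Return the closest canonical term for a messy label, or None."""
    cleaned = label.strip().lower()
    matches = difflib.get_close_matches(cleaned, choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A misspelt or oddly cased label such as `"Welington "` resolves to `"wellington"`, while something unrelated returns `None` and can be left for a human to sort out.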
  21. Links
     Modules I’ve covered + some other stuff
     • https://github.com/silverstripe/silverstripe-sphinx
     • https://github.com/silverstripe-labs/silverstripe-fulltextsearch
     • http://sphinxsearch.com/
     • http://lucene.apache.org/solr/
     • http://www.elasticsearch.org/
     • https://github.com/nyeholt/silverstripe-solr
     • http://code.google.com/p/lucene-silverstripe-plugin/
  22. Thank you!
     Twitter: @hafriedlander
     Email: hamish@silverstripe.com
