Fulltext search pres

Presentation Transcript

  • SilverStripe and Full Text Search
    Giving the people what they want
    Wednesday, 24 August 2011 • SilverStripe Wellington Meetup • Hamish Friedlander
  • What we’re covering
    Big topic, not much time
      • What does search give you
      • Three ways to get it
          • Built in db backed search
          • Sphinx module
          • Full text search module
  • What we’re not
    But that doesn’t mean they’re not important
      • Search result visualization
      • Search refinement
      • Boost, result pre-calculation, faceting, spell checking, real-time results
      • Integrating search with IA
      • Measuring search usefulness
      • 3rd party modules
  • Why add search?
  • What are you trying to do?
    Be aware of the goals of your users
      • Most people use navigation by preference
          • Stats depend on site, but average 70-95% navigation
      • Search is primarily used to locate stuff that’s not obvious how to navigate to
          • Deeply nested pages
          • Cross-cutting information not provided as a taxonomic structure
          • Re-discovering remembered items
      • If search doesn’t give immediate results, users fall back to navigation again
  • Getting you there quicker
    To be used, search has to give interesting pages faster than navigation
      • Interesting is relative
          • Ideally return the page the user is after
          • But failing that, at least return a page the user is interested in
      • Speed is perception
          • Raw speed is rarely noticed (except when it is)
          • Ability to understand results is as important as accuracy of results
          • A second click is OK, as long as there’s a likely payoff: “did you mean” is fine, disambiguation is OK, paging is useless
  • Technology & Tools
  • Database internal full text search
    It’s just another index
      • Most databases come with some full text search built in
          • Generally works by adding a new index to a table column
          • Can easily combine full text queries with other filters
      • But databases aren’t really designed for it
          • Poor query language - no booleans
          • Poor language processing
          • Limited feature set - no field boost, spell checking, search suggestions, faceting, result fragments, ...
          • Sometimes technically costly (MyISAM)
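To make "combine full text queries with other filters" concrete, here is a minimal sketch using SQLite's FTS5 extension through Python's standard `sqlite3` module. Everything in it is an assumption for illustration: the talk does not mention SQLite, the `page` table and its columns are invented, and SQLite exposes full text search as a virtual table rather than as an index added to an existing column. It also assumes a SQLite build with FTS5 enabled.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a full text indexed table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # 'section' is stored but not full text indexed, so it behaves like an ordinary filter column.
    conn.execute("CREATE VIRTUAL TABLE page USING fts5(title, content, section UNINDEXED)")
    conn.executemany(
        "INSERT INTO page (title, content, section) VALUES (?, ?, ?)",
        [
            ("Installing modules", "How to install a search module", "docs"),
            ("Meetup notes", "We talked about search and navigation", "blog"),
            ("Templates", "Writing page templates", "docs"),
        ],
    )
    return conn


def search_pages(conn: sqlite3.Connection, terms: str, section: str) -> list:
    """Combine a full text match with a plain SQL filter in a single query."""
    rows = conn.execute(
        "SELECT title FROM page WHERE page MATCH ? AND section = ? ORDER BY rank",
        (terms, section),
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    print(search_pages(build_demo_db(), "search", "docs"))
```

The full text condition and the `section` filter sit in the same `WHERE` clause, which is the main convenience of keeping search inside the database.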
  • External full text indexers
    Solr, Sphinx, Elastic Search
      • Given a schema, and a set of documents, builds an index
          • Schema gives both text processing and result relevancy rules
          • Different engines either retrieve documents themselves or have documents sent to them
          • Indexes might be write-once (rebuild entire index to add changes)
      • Gives a language to query those indexes
          • Generally query language is engine-specific
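The "schema plus documents gives an index" idea can be sketched as a toy inverted index. This is not how Solr, Sphinx or Elastic Search are implemented; the `SCHEMA` weights, the tokenizer and the function names are all assumptions chosen to show a schema carrying both a text processing rule and a relevancy rule.

```python
import re
from collections import defaultdict

# Hypothetical schema: which fields get indexed, and how much each counts towards relevancy.
SCHEMA = {"title": 2.0, "content": 1.0}


def tokenize(text):
    """Text processing rule: lowercase, then split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents, schema=SCHEMA):
    """Map each term to {doc_id: weighted term count}, using only the fields named in the schema."""
    index = defaultdict(lambda: defaultdict(float))
    for doc_id, doc in documents.items():
        for field, weight in schema.items():
            for term in tokenize(doc.get(field, "")):
                index[term][doc_id] += weight
    return index


def search(index, query):
    """Return ids of documents that contain every query term, highest score first."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    matching = set(postings[0]).intersection(*postings[1:])
    scores = {doc_id: sum(p[doc_id] for p in postings) for doc_id in matching}
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    docs = {
        1: {"title": "Sphinx module", "content": "Easy full text search"},
        2: {"title": "Full text search", "content": "Search with Solr"},
        3: {"title": "Navigation", "content": "Menus and breadcrumbs"},
    }
    print(search(build_index(docs), "full text"))
```

In this sketch the index is treated as write-once: adding a document means calling `build_index` again over the whole document set.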
  • External engines + SilverStripe
    A tale of two abstractions
      • Building schemas is hard, time consuming, annoying when model changes
      • Can build schemas directly off models
          • Effectively free - all the necessary information is already present
          • Flexible search - can change form structure without index changes
          • Inefficient - includes information you won’t search against
      • Or can build schemas off query design
          • Needs more thought around design of query up front
          • More efficient, leads to some more powerful abilities
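The trade-off between the two abstractions can be sketched in a few lines. The model definition, the query descriptions and both function names below are hypothetical; they stand in for whatever the ORM and the search module actually expose.

```python
# Hypothetical model definition: field name -> field type, as an ORM might describe it.
PAGE_MODEL = {
    "Title": "Varchar",
    "Content": "HTMLText",
    "MetaDescription": "Text",
    "Sort": "Int",
    "ShowInMenus": "Boolean",
}


def schema_from_model(model):
    """Index every field on the model: no design work, but much of it is never searched."""
    return sorted(model)


def schema_from_queries(model, queries):
    """Index only the fields that the planned queries actually search or filter on."""
    wanted = set()
    for query in queries:
        wanted.update(query.get("search", []))
        wanted.update(query.get("filter", []))
    unknown = wanted - set(model)
    if unknown:
        raise KeyError(f"queries reference fields missing from the model: {sorted(unknown)}")
    return sorted(wanted)


if __name__ == "__main__":
    planned = [{"search": ["Title", "Content"], "filter": ["ShowInMenus"]}]
    print(schema_from_model(PAGE_MODEL))
    print(schema_from_queries(PAGE_MODEL, planned))
```

The model-driven schema never needs updating when the search form changes, while the query-driven schema is smaller but has to be regenerated whenever a new query touches a field that was not planned for.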
  • SilverStripe Integration
  • Built-in search
    Your database-dependent, barely acceptable default
      ✓ No external dependencies, separate indexes, schema files or setup
      - Can only search SiteTree and File objects, and only specific fields
      - Quality of results is heavily database dependent
  • Sphinx module
    Easy, quality full text search
      ✓ Very little configuration gives great results on moderate sized sites
      ✓ Can search any DataObject, but...
      - Combining search over multiple DataObjects doesn’t really work
      - Limited real-time update support
      - No exact match string mode makes filtering tricky
  • Fulltext search module
    Powerful (eventually), search-engine-independent toolkit
      ✓ Schemas generated from query structure
          • More flexible and efficient than generating from model structure
          • Closer to how external engines work natively
      ✓ Eventually multiple search backend support
          • Currently: Solr
          • In future: Sphinx, Elastic Search, Zend_Lucene
          • Not intended to allow code-less swapping of backends.
      - Currently needs Solr, which is a Java app
          • Loves memory, hates empty disk space
  • Full text search module example
  • Define an index
    Schema gets generated from this index
  • Define a form
    Standard SilverStripe stuff
  • Build a query & apply to an index
    Filters and excludes can be built & nested
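The code shown on the example slides is not part of this transcript. As a stand-in, the sketch below shows one way filters and excludes can be built and nested before a query is applied. It is a toy in Python, not the fulltext search module's API: `Query`, `field_is`, `any_of`, `all_of` and `run_query` are hypothetical names, and a plain list of dictionaries plays the role of the index.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical query: free text, filters (must all match), excludes (none may match)."""
    text: str = ""
    filters: list = field(default_factory=list)
    excludes: list = field(default_factory=list)


def field_is(name, value):
    """Condition: the document's field equals the given value."""
    return lambda doc: doc.get(name) == value


def any_of(*conditions):
    """Nested condition: true when at least one inner condition holds."""
    return lambda doc: any(cond(doc) for cond in conditions)


def all_of(*conditions):
    """Nested condition: true only when every inner condition holds."""
    return lambda doc: all(cond(doc) for cond in conditions)


def run_query(query, documents):
    """Apply the query to the documents and return the titles of the matches, in input order."""
    words = query.text.lower().split()
    results = []
    for doc in documents:
        # Naive substring matching stands in for a real full text match.
        haystack = (doc.get("title", "") + " " + doc.get("content", "")).lower()
        if not all(word in haystack for word in words):
            continue
        if not all(cond(doc) for cond in query.filters):
            continue
        if any(cond(doc) for cond in query.excludes):
            continue
        results.append(doc["title"])
    return results


if __name__ == "__main__":
    docs = [
        {"title": "Search tips", "content": "How to search", "type": "Page", "status": "published"},
        {"title": "Search draft", "content": "Unfinished search notes", "type": "Page", "status": "draft"},
        {"title": "Search logo", "content": "search icon", "type": "File", "status": "published"},
    ]
    query = Query(
        text="search",
        filters=[any_of(field_is("type", "Page"), field_is("type", "File"))],
        excludes=[all_of(field_is("type", "Page"), field_is("status", "draft"))],
    )
    print(run_query(query, docs))
```

Representing each condition as a small function keeps nesting cheap: `any_of` and `all_of` accept other conditions, including each other, so arbitrarily deep filter trees need no extra machinery.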
  • Final thoughts
  • Search without searching
    Search engines as fuzzy matchers
      • Looks like navigation, acts like search
      • Instant taxonomies
      • Deal with inconsistent data
      • Encourages exploration
  • Links
    Modules I’ve covered + some other stuff
      • https://github.com/silverstripe/silverstripe-sphinx
      • https://github.com/silverstripe-labs/silverstripe-fulltextsearch
      • http://sphinxsearch.com/
      • http://lucene.apache.org/solr/
      • http://www.elasticsearch.org/
      • https://github.com/nyeholt/silverstripe-solr
      • http://code.google.com/p/lucene-silverstripe-plugin/
  • Thank you!
    Twitter: @hafriedlander
    Email: hamish@silverstripe.com