Making your Drupal fly with Apache SOLR

    Presentation Transcript

    • Making your Drupal fly with Apache SOLR
      • Kalle Virta, Exove
    • In this presentation
      About Exove and myself
      The problem – and the solution (and some cowboys)
      SOLR to do the site-wide search
      SOLR to help with Views
      SOLR to help with custom modules
      And the Fine Print
    • We deliver business-driven web services that enable our customers to conduct better business on the Internet
      We base our work on our customers’ strategy and needs
    • About me, Kalle Virta
      Software architect and developer
      High performance and complex integrations
      Almost 10 years in the business
      Seen Drupal from version 3
      A lot of big Drupal sites / systems under my belt
    • Your regular stack
      Linux + Apache
    • Damn, dude
      MySQL server
      is on
    • New guys to the rescue
    • Apache SOLR
    • Your enhanced stack
      Linux + Apache
      Did you notice?
      It’s still blue.
    • The new guys
      Varnish is an HTTP cache and does it well – but it doesn’t help at all on your customized-for-every-person social media site
      Memcached is a good idea, and you can even use it with cache router to cache Drupal stuff, including your own modules, but… it still just caches stuff
      SOLR however, is a different story…
    • SOLR
      Apache SOLR is a search server around Lucene (which is a search library) written in Java
      It needs a Java container, e.g. Jetty or Tomcat
      Put simply, you can store your data in it as XML documents and then search them
      SOLR will tokenize and do all kinds of (configurable) magic to the data when indexing it, but it can also store the original data (not always possible with search indexers)
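
      To make the "stuff in XML form" idea concrete: Solr's XML update handler accepts an <add> message containing <doc> elements with named fields. Below is a minimal sketch in Python of building such a message. The field names (id, title, tags) are assumptions for illustration only; real field names have to match what your schema.xml defines.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build a Solr XML <add> message from a list of plain dicts.

    List or tuple values become repeated <field> elements, which is how
    multi-valued fields are sent to Solr.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document; the field names are not taken from any real schema.
message = build_add_message(
    [{"id": "node/1", "title": "Hello", "tags": ["a", "b"]}]
)
print(message)
```

      The resulting string would be POSTed to the Solr update handler and followed by a commit before the document becomes searchable.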
    • SOLR for searching
      Obviously all the features of SOLR make it optimal for sitewide searching functionality
      You can actually find stuff with SOLR: every field in the search can be biased, that is, you can tune which fields raise the score most when they contain a hit
      SOLR also does one really neat thing for searching…
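
      Field biasing can be sketched as follows. With Solr's DisMax query parser, the qf parameter lists the fields to search together with a boost for each (field^boost). The field names title and body and the boost values are assumptions, not taken from the apachesolr schema.

```python
from urllib.parse import urlencode


def build_search_params(keywords, field_boosts):
    """Return a URL query string for a DisMax search with per-field boosts."""
    # "title^5 body^1" means a hit in title counts five times as much.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return urlencode(
        {"q": keywords, "defType": "dismax", "qf": qf, "wt": "json"}
    )


# Hypothetical fields: hits in the title weigh more than hits in the body.
query_string = build_search_params("laser mouse", {"title": 5, "body": 1})
print(query_string)
```

      The query string would be appended to the select handler URL of your Solr server.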
    • ?
      Ever heard
      of a
    • The old advanced search
      Product category
      Product sub-cat
      Price range
      Too many search results (794),
      narrow your search and try again
    • The faceted search
      Order by price
      Results:
      Logitech LS1 Laser Mouse, 29 €: A cheap laser mouse that’ll get you through even the most problematic of PowerPoint presentations.
      Logitech G3 Gaming Mouse, 59 €: A great laser mouse with more buttons than you’ll ever have time to configure. A steal.
      Microsoft Super Mouse, 49 €: A great mouse from the company that brought you the best product of all times, Windows Me.
      Apple Mighty Mouse, 129 €: The mouse the image happens to be of. Never tried it. Looks pretty nice, though.
      page 1 2 3 4 5 6 7 8 9 10
      Facets:
      Current search
      wireless mice (296), wired mice (96), laser mice (163), Show all
      Microsoft (36), HP (3), Show all
      Price range: 0-50 € (384), 50-100 € (129), 100-300 € (50)
    • SOLR for faceted searching
      Apache SOLR lets you facet search results – that is, show the possible search filters and give counts for each of them
      Faceting with SOLR can also be achieved in Drupal – and this is where a Drupal contrib module comes into play
      With the apachesolr module (http://drupal.org/project/apachesolr) you can do all this with a couple of clicks in your Drupal installation
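
      Under the hood, faceting is requested with the facet=true and facet.field parameters, and in the default JSON response format the counts for a field come back as a flat list that alternates value and count. A rough sketch of turning that into (value, count) pairs follows. The field name manufacturer and the sample response are made up for illustration, with the numbers borrowed from the mock-up slide.

```python
def facet_params(fields):
    """Request parameters that switch faceting on for the given fields."""
    params = [("facet", "true"), ("facet.mincount", "1")]
    params += [("facet.field", field) for field in fields]
    return params


def parse_facet_field(flat):
    """Turn Solr's flat [value, count, value, count, ...] list into pairs."""
    return list(zip(flat[0::2], flat[1::2]))


# Hypothetical excerpt of a Solr JSON response.
sample_response = {
    "facet_counts": {
        "facet_fields": {"manufacturer": ["Microsoft", 36, "HP", 3]}
    }
}

facets = parse_facet_field(
    sample_response["facet_counts"]["facet_fields"]["manufacturer"]
)
for value, count in facets:
    print(f"{value} ({count})")
```

      Each pair maps directly to one line in a facet block such as "Microsoft (36)".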
    • SOLRfy your Drupal search 1/3
      Download SOLR package from http://www.apache.org/dyn/closer.cgi/lucene/solr/
      Unpack it and check your server’s firewall settings to allow traffic to port 8983
      Check that you have Java (RE) installed
    • SOLRfy your Drupal search 2/3
      Then get Drupal’s “apachesolr” module; there are two XML files in the package, solrconfig.xml and schema.xml
      Go back to your SOLR directory and rename the example directory to “drupal” so you’ll find it more easily
      Drop the two XML files into the drupal/solr/conf directory
      Go to that drupal directory and fire up Apache SOLR with “java -jar start.jar”
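
      Once the server is started, it is worth checking that it answers before touching the Drupal side. The sketch below assumes the bundled example setup listening on port 8983 and a ping handler at /solr/admin/ping; the path is an assumption and may differ between SOLR versions and configurations.

```python
import urllib.request


def solr_ping_url(host="localhost", port=8983, path="/solr/admin/ping"):
    """Build the ping URL; the default path is an assumption."""
    return f"http://{host}:{port}{path}"


def solr_is_up(url, opener=urllib.request.urlopen, timeout=5):
    """Return True if the URL answers with HTTP 200, False otherwise.

    The opener is injectable so the check can be exercised without a
    running server.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.getcode() == 200
    except OSError:
        # Covers connection refused, timeouts and urllib's URLError/HTTPError.
        return False
```

      Calling solr_is_up(solr_ping_url()) on the SOLR host should return True once Jetty has finished starting. From the Drupal server, pass the SOLR host name instead of localhost, which also confirms that the firewall lets port 8983 through.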
    • SOLRfy your Drupal search 3/3
      Now you can turn on “apachesolr” module in Drupal
      Tune the SOLR server settings in Drupal, reindex all content and then start clicking on those filtering/faceting settings on apachesolr
      You’ll have to turn the facets on as blocks
      But your search experience will be something else entirely
      …and once you see how searching with SOLR works, you’re not going back
    • The apachesolr module
      Automatically creates facets for taxonomy terms, for every vocabulary – you can just turn them on
      Automatically creates facets for CCK fields using dropdown/radio widgets (i.e. with a set of options)
      Exposes hooks for CCK fields (to make facets out of them)
      Exposes a hook for altering the query (to some extent)
      Easy to use
    • Faceting without SOLR
      You can do faceting without SOLR too
      “Faceted search” module will do it for you
      But at only 10K nodes, SOLR is three times as fast
      With 100K+ nodes, faceted search without SOLR is practically unusable
      …but for small sites, SOLR is not necessary for faceting
    • So you can SEARCH with SOLR
      …but my site does A LOT more
    • SOLRify the rest of your Drupal universe
      You probably know your performance problems on your site
      If it’s somehow personalized, you usually can’t do anything about it with caching
      How about using SOLR for it?
      The Apache Solr Views module (at a very mature “dev” state ;) and Views 3 (dev too) will talk to each other and integrate with the apachesolr module and its SOLR index
      When this is stable and fully functional…
    • It’ll make your Views
    • SELECT
      FROM media
      media_type_id = type_id
      media_tag.mid = media.id
      name LIKE ‘%s’
      description LIKE ‘%s’
      OR media.id
      IN (SELECT mid FROM promoted_media)
      But my problems are in my custom modules
    • Custom modules
      Custom modules can be designed with ApacheSOLR in mind
      When you realize all the potential there is in an indexer that can index XML files, the sky is the limit
      Whenever you have a data structure that’s too complex for MySQL to search – and that’s not too rare – you might benefit from indexing that data into SOLR and using SOLR as the read-only “db”
    • Custom modules – making SOLR do the reading
      A single “row” for SOLR to index
    • Custom modules – making SOLR do the reading
      You know you need a better structure when you can’t circumvent running LEFT JOIN or subqueries – and running them gets too slow
      When you’ve optimized your code several times and restructuring your database would mean creating a read-optimized cache of everything
      Then SOLR might be just the thing to get you through
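
      The single "row" idea can be sketched as flattening joined tables into one document per item at index time, so that nothing needs to be joined at read time. The table and column names below are hypothetical, loosely modelled on the query fragment shown earlier (media, media type, tags, promoted media), and are not the author's actual schema.

```python
def flatten_media(media, media_type, tags, promoted_ids):
    """Collapse one media row and its related rows into a flat document.

    Tags become a multi-valued field and the promoted_media subquery
    becomes a simple boolean flag.
    """
    return {
        "id": f"media/{media['id']}",
        "name": media["name"],
        "description": media["description"],
        "type": media_type["name"],
        "tags": [t["name"] for t in tags if t["mid"] == media["id"]],
        "promoted": media["id"] in promoted_ids,
    }


# Hypothetical rows as they might come out of the database.
doc = flatten_media(
    media={"id": 7, "name": "Demo clip", "description": "A short demo"},
    media_type={"id": 2, "name": "video"},
    tags=[{"mid": 7, "name": "drupal"}, {"mid": 8, "name": "other"}],
    promoted_ids={7, 12},
)
print(doc)
```

      Searching by type, tag or promoted status then becomes a filter on a single indexed document instead of a LEFT JOIN or a subquery.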
    • Custom modules – making SOLR do the reading
    • Libraries to use with custom modules
      The apachesolr module uses a SOLR library written in PHP and licensed under the New BSD license (http://code.google.com/p/solr-php-client/)
      There’s also a PECL extension, but I’m not aware of any speed comparisons
      There are also contrib Drupal modules that give you an API for accessing SOLR
    • It’s no magic
    • Not a magic bullet 1/2
      Apache SOLR is a hassle with all the Java containers and such; you’ll probably have to run it on a separate server
      You should always run stuff through Drupal or a script that will authenticate and authorize calls to SOLR (SOLR shouldn’t be exposed, unless all the data is public)
      Sometimes the extra server would be better used as an extra MySQL node
      Sometimes you can just fix your stuff and make it as fast as it would be on Apache SOLR
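
      The point about never exposing SOLR directly can be illustrated with a small gatekeeper that sits between user input and the SOLR request: it escapes query syntax, accepts only whitelisted sort orders and always adds an access filter. This is a sketch only; the access_role and price field names and the role model are assumptions, not how the apachesolr module implements access control.

```python
import re

# Sort orders the application is willing to pass on (hypothetical fields).
ALLOWED_SORTS = {"score desc", "price asc", "price desc"}

# Characters and operators with special meaning in the Lucene query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_term(text):
    """Backslash-escape Lucene query syntax in user-supplied text."""
    return _SPECIAL.sub(r"\\\1", text)


def build_safe_params(user_query, sort, user_roles):
    """Build request parameters that the caller cannot widen."""
    if sort not in ALLOWED_SORTS:
        sort = "score desc"
    roles = sorted(user_roles) or ["anonymous"]
    role_filter = " OR ".join(f"access_role:{escape_term(r)}" for r in roles)
    return {
        "q": escape_term(user_query),
        "sort": sort,
        "fq": role_filter,  # always applied, never taken from the request
        "wt": "json",
    }


params = build_safe_params("mouse (cheap)", "id asc; drop", {"member"})
print(params)
```

      Because the filter query is built on the server side from the logged-in user's roles, a visitor cannot remove it by editing the request.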
    • Not a magic bullet 2/2
      And then there’s the fact that SOLR is built mainly for the English language
      So make sure SOLR will do what you want for you in the language you want it to do it in
    • Recap
      SOLR will right now give your Drupal site a fast, faceted search with really easy setup (thanks to apachesolr module)
      SOLR will soon give a boost to the performance and search abilities of your views
      SOLR will right now give you a lot more power for searching your custom databases and complicated content types, if used by a module developer
      It’s still not a magic bullet – it has its downsides
    • Sounds
      Been there, done that?
      is recruiting
      Send your CV to jobs@exove.com
    • Thank you for your time
      If you’d rather ask me in private,
      drop a mail to kalle@exove.com