APACHE SOLR CHANGES THE
   WAY YOU BUILD SITES
 How to build dynamic navigation for dynamic content


                Jaco...
RELEVANT AND FAST
FILTERS
SORTABLE
MORE LIKE THIS
SITE NAVIGATION
INFORMATION ARCHITECTURE
• Is the science and art of guessing
 what your users want to see or do
 on your site and helping t...
ARCHETYPES - WHO
CONTENT - YOU GOT IT
ORGANIZE AND PROSPER!
NAVIGATION
SO? WHAT’S WRONG WITH
        THAT?
(1996)


    Site Map!
WEBSITES GROW.


• Sometimes really fast.
• And always in ways you didn’t
  expect
WHY DOES THIS HAPPEN?
VISITORS ARE TOO VARIED TO
     THINK OF THEM ALL




 YOUR CONTENT IS TOO
 VARIED FOR YOUR MENU
WHERE ARE BEER HATS?
YOUR CONTENT IS TOO VAST
  AND TAKES TOO MANY
     CLICKS TO FIND
TOOK 7 CLICKS TO FIND
      DRUPAL
YOUR CONTENT IS
CONTRIBUTED BY USERS
 AND UNPREDICTABLE.
DRUPAL IS A SOCIAL
PUBLISHING PLATFORM
YOU USE IT SO USERS
WILL ADD CONTENT
           That’s the point!
IF WE HAVE DYNAMIC USER
  CONTRIBUTED CONTENT


 WHY DO WE INSIST ON
 STATIC ADMIN DEFINED
     NAVIGATION?
WEB 2.0 JARGON
          TO SAVE THE DAY

• Tag Clouds
• Content Recommendation
• Social Networking
• Social Bookmark...
HOW SEARCH DIED
       AND
HOW WE BRING IT BACK
SEARCH IS NOT A PRIORITY
MOST DRUPAL WEBSITES
REALLY IGNORE SEARCH
CAN YOU FIND THE SEARCH
          BOX?




 HINT: I’VE CIRCLED IT IN
       ORANGE
WHAT DO THE BIG SITES DO?
WHY DID SEARCH DIE?


• It was too slow
• It wasn’t smart enough
• Users learned not to trust it
LANGUAGE IS IMPORTANT
GOLDEN RULE:
   No Dead Ends!
SEARCH ON G.D.O
FIND OUT WHERE TO LOOK
FOUND ‘EM
I LEFT MY TIME MACHINE AT
           HOME
SMARTER MATCHES
STEMMING
SPELLING SUGGESTIONS
BUT I’M THE ADMIN!
ALL THE COOL KIDS ARE
      DOING IT
DAMZ EN L’MASSION
A TAILOR MADE NAVIGATION
      FOR EVERY USER

      You want it, right?
The Apache Solr Project
• Stable and proven.
 – Used by Netflix, CNET, CitySearch, StubHub!, GameSpot, AOL
 – Full time mai...
Apache Solr Search Integration
• Very active project on drupal.org.
• Takes advantage of latest Solr features.
• Exposes a...
Feature highlights
• Taxonomy, user, and language facets.
• Node type faceting, weighting, and exclusion.
• Node property (...
Integration with other modules/data
• Webmail: http://drupal.org/project/webmail_plus
• Attachments:
 http://drupal.org/pr...
You Can Run                               Yourself. Easy!
1. Get a dedicated server or a VPS and get Solr loaded on it.
2....
Or... use Acquia Search
• Sign up on acquia.com.
• Free 30 day trial subscriptions for anyone.
• You must be running a Dru...
How Acquia Search Works
                                                   Search master server


                        ...
Proving the platform




• Benchmarking our servers, on the search server
 itself, most searches run in < 200 ms, even
 un...
Who Is This For?
• Small and medium size sites - easy access to
 enterprise search for every Drupal site.
 – No hardware, ...
How-to Screencasts
• http://www.jeffnoyes.com/content/how-use-acquia-search-apache-solr
• http://www.jeffnoyes.com/conten...
New modules to
 enhance the
  experience

                 © 2008 Acquia, Inc.
Searching with Views 3
 • Define filters, sorts in Views like normal
 • Solr, not MySQL, gets the query
 • Views is responsi...
Javascript interface
  • http://drupal.org/project/solrjs




Autocomplete
 http://drupal.org/project/apachesolr_autocomplete




Stats for administrators
    http://drupal.org/project/apachesolr_stats




                                              ...
DRUPAL-6--2 Branch




• Researching more substantial architectural
  changes to query building.
• Looking at ways to supp...
Learn and Contribute
• Find us at the code sprint (Saturday) if you have
  questions about the code or roadmap.
• Come to ...

Apache Solr Changes the Way You Build Sites


By Jacob Singh and Peter Wolanin about the Apache Solr Search Integration module for Drupal. Presented at Drupalcon Paris, 2009.

Published in: Technology, Travel
  • I’m going to talk about Apache Solr, a revolutionary search technology
  • Which provides relevant and fast search results
  • that can be filtered
  • and sorted
  • and provides brilliant content recommendations
    Changing the way you think about
  • Classic Information Architecture and how you structure your menus and site navigation
  • I’ve built a few websites
    Usually, we start with something called IA
    What is IA?
  • Start with who your visitors are and what they want to do
  • Take a look at the inventory of content the site provides
  • Group that content into categories which make sense.
    Often called card sorting, mapping, etc.
  • Name the groups, and they become menus and highlights (navigation)
  • What’s wrong with these conventions? Our forefathers have used them since time immemorial (1996)
  • The web is a lot more complicated now.
    Content comes from a lot more sources (other sites and users)
    And websites do more things
    Also...
  • It isn’t 1996 anymore
    In a short time your content may start looking like... {flip}
  • And your menus become more like {flip}
  • And now your site is hard to organize and impossible to use.
  • You made up archetypes, but your real visitor base is more varied and unique than that.
  • Did you think of the user who wanted handmade paper?
    Which category do you think handmade paper is in?
  • It literally took me 7 clicks to find Drupal on Dmoz
    No one except for someone desperately trying to prove a point during a presentation will spend this long
  • This is the most important
    And the main reason I’m speaking to you today.
  • To deal with this paradigm shift, this generation of the internet has a few new devices / patterns to address the bloat of content. They all seek to handle the issue of unpredictable content and unknown users.
  • Search is Web 0.5
    Remember, Yahoo made millions with lists of websites to visit
    Google made billions letting people find the websites they wanted
  • Most people building websites just think of search as a checkbox on their requirements list
  • I’ve taken a totally biased sampling from Dries’s blog of large and newish Drupal sites
  • I took a sample of some of the top sites on the web.
  • Search was largely abandoned by site owners because the quality of results was not good enough.
    It was too slow.
    Users learned not to trust it, and in turn, site builders learned not to prioritize it.
  • If the same word doesn’t mean the same thing, your intentions, delivery and content are worthless.
    When a user wants something from your website, they are looking for a keyword.
    You, as a site builder, tried to think of those keywords and make links.
  • Users leave only when the content they want doesn’t exist. Never before.
  • In this case, the user chose to search for Drupal Work, not Drupal Jobs...
    The disadvantage of most search engines is that this doesn’t work. The user is just presented with a deluge of information they are not looking for and no obvious way to refine the set.
  • The advantage of a good search engine is that the user can use whatever vocabulary they want and will find what they are looking for.
    What you see here is called “Faceted Search.” Because Solr is aware not just of the text of your nodes, but all of their metadata, it can provide a much richer way to filter down to just what you are looking for.
  • Newer is better in this case. No one wants taken jobs.
  • Solr allows you to sort
  • Just as vocabulary is important, spelling is too.
  • Solr spellcheck does not come from a dictionary;
    it uses the actual content in your index
    If you have a funny name for your product
  • Just the tip of the iceberg in terms of customization
  • Buytaert.net
  • reduce width of browser + increase font
  • Let’s stop trying to think for our users
    Let’s give them tools that allow them to think the way they want
    AND find what they are looking for.

    Now, I’m going to hand over the floor to Peter Wolanin, who has been the driving force behind recent development of the Apache Solr module. He and I are Acquia’s experts on the Solr server.
    He’ll be speaking to you about the Solr project in a little more detail and the communities involved, and will show you some of the really amazing features we’ve got planned.
  • All the Drupal module code is on drupal.org and available to everyone.
    Solr is an Apache Foundation project, available free under the Apache 2.0 license.
  • Yes, it’s doable, but using Acquia’s hosted service allows:
    1. Small to medium sites to get rolling in 15 minutes with no special knowhow or hardware
    2. Large sites to not worry about scaling or securing yet another service and the opportunity cost that comes with it.
  • The Acquia Search module configures the Apache Solr Drupal module and handles authentication.
  • If you have an Acquia Subscription, you will have search.
    The beta, which starts today, is totally free. Also, Apache Solr does not stop core search, which means you can fall back to standard Drupal search at any time. No risk!
    Setup takes about 5-10 minutes. What are you waiting for!?
    Put pictures of us up. Bring your laptop. Right now!
  • Learn more about the admin interface and the boosting features
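
    The filtering, sorting, faceting, and spelling features described in the notes above all correspond to request parameters on Solr's search handler. As a rough sketch only, and not the Drupal module's actual code, a query like the "Drupal work" example might be assembled as shown below. The host, the field names `type`, `uid`, and `created`, and the helper `build_solr_query` are assumptions made for illustration; `spellcheck=true` also presumes that a spellcheck component is configured on the handler.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_solr_query(base_url, keywords, facet_fields, filters=None, sort=None):
    """Build a Solr select URL with facets, filters, sorting and spellcheck.

    Hypothetical helper: field names are whatever your schema defines.
    """
    params = [
        ("q", keywords),          # free-text keywords typed by the user
        ("wt", "json"),           # ask for a JSON response
        ("facet", "true"),        # turn faceting on
        ("spellcheck", "true"),   # ask for spelling suggestions (if configured)
    ]
    # One facet.field parameter per field we want counts for.
    for field in facet_fields:
        params.append(("facet.field", field))
    # Each clicked facet becomes a filter query that narrows the result set.
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    if sort:
        params.append(("sort", sort))
    return f"{base_url}/select?{urlencode(params)}"


# Assumed local Solr instance and assumed field names.
url = build_solr_query(
    "http://localhost:8983/solr",
    "drupal work",
    facet_fields=["type", "uid"],
    filters={"type": "job"},
    sort="created desc",
)
print(url)
print(parse_qs(urlsplit(url).query)["fq"])
```

    Keeping each facet selection as its own `fq` parameter means a user can add or remove one filter without the keyword query changing, which is what lets the facet links act as navigation.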
  • Apache Solr Changes the Way You Build Sites

    1. 1. APACHE SOLR CHANGES THE WAY YOU BUILD SITES How to build dynamic navigation for dynamic content Jacob Singh and Peter Wolanin Drupalcon Paris, September 3rd 2009
    2. 2. RELEVANT AND FAST
    3. 3. FILTERS
    4. 4. SORTABLE
    5. 5. MORE LIKE THIS
    6. 6. SITE NAVIGATION
    7. 7. INFORMATION ARCHITECTURE • Is the science and art of guessing what your users want to see or do on your site and helping them get there • Often done without actually consulting visitors or proper understanding of the target market
    8. 8. ARCHETYPES - WHO
    9. 9. CONTENT - YOU GOT IT
    10. 10. ORGANIZE AND PROSPER!
    11. 11. NAVIGATION
    12. 12. SO? WHAT’S WRONG WITH THAT?
    13. 13. (1996) Site Map!
    14. 14. WEBSITES GROW. • Sometimes really fast. • And always in ways you didn’t expect
    15. 15. WHY DOES THIS HAPPEN?
    16. 16. VISITORS ARE TOO VARIED TO THINK OF THEM ALL YOUR CONTENT IS TOO VARIED FOR YOUR MENU
    17. 17. WHERE ARE BEER HATS?
    18. 18. YOUR CONTENT IS TOO VAST AND TAKES TOO MANY CLICKS TO FIND
    19. 19. TOOK 7 CLICKS TO FIND DRUPAL
    20. 20. YOUR CONTENT IS CONTRIBUTED BY USERS AND UNPREDICTABLE.
    21. 21. DRUPAL IS A SOCIAL PUBLISHING PLATFORM
    22. 22. YOU USE IT SO USERS WILL ADD CONTENT That’s the point!
    23. 23. IF WE HAVE DYNAMIC USER CONTRIBUTED CONTENT
    24. 24. IF WE HAVE DYNAMIC USER CONTRIBUTED CONTENT WHY DO WE INSIST ON STATIC ADMIN DEFINED NAVIGATION?
    25. 25. WEB 2.0 JARGON TO SAVE THE DAY • Tag Clouds • Content Recommendation • Social Networking • Social Bookmarking
    26. 26. HOW SEARCH DIED AND HOW WE BRING IT BACK
    27. 27. SEARCH IS NOT A PRIORITY
    28. 28. MOST DRUPAL WEBSITES REALLY IGNORE SEARCH
    29. 29. CAN YOU FIND THE SEARCH BOX? HINT: I’VE CIRCLED IT IN ORANGE
    30. 30. WHAT DO THE BIG SITES DO?
    31. 31. WHY DID SEARCH DIE? • It was too slow • It wasn’t smart enough • Users learned not to trust it
    32. 32. LANGUAGE IS IMPORTANT
    33. 33. GOLDEN RULE: No Dead Ends!
    34. 34. SEARCH ON G.D.O
    35. 35. FIND OUT WHERE TO LOOK
    36. 36. FOUND ‘EM
    37. 37. I LEFT MY TIME MACHINE AT HOME
    38. 38. SMARTER MATCHES
    39. 39. STEMMING
    40. 40. SPELLING SUGGESTIONS
    41. 41. BUT I’M THE ADMIN!
    42. 42. ALL THE COOL KIDS ARE DOING IT
    43. 43. DAMZ EN L’MASSION
    44. 44. A TAILOR MADE NAVIGATION FOR EVERY USER You want it, right?
    45. 45. The Apache Solr Project • Stable and proven. – Used by Netflix, CNET, CitySearch, StubHub!, GameSpot, AOL – Full time maintainers – VERY active mailing list (about 1k messages per month) • Fast: written in Java. • Uses Lucene: the top open source search library. • Distributed: scales out in multiple directions.
    46. 46. Apache Solr Search Integration • Very active project on drupal.org. • Takes advantage of latest Solr features. • Exposes an API to modify search and display behavior. • Supported by engineers at Acquia. • All Acquia code improvements have been contributed back to the Drupal.org project. • Many of the Drupalcon sponsors and attendees are already involved and using it.
    47. 47. Feature highlights • Taxonomy, user, and language facets. • Node type faceting, weighting, and exclusion. • Node property (e.g. sticky) and date weighting. • Date facets on content creation or change. • OG facets (optional sub-module). • Node access respected (optional sub-module). • More-like-this content recommendations. • Customizable (see drupal.org module browsing features).
    48. 48. Integration with other modules/data • Webmail: http://drupal.org/project/webmail_plus • Attachments: http://drupal.org/project/apachesolr_attachments • Views: http://drupal.org/project/apachesolr_views • Services: http://drupal.org/project/solr_service • Nodequeue: http://drupal.org/project/nodequeue • RDF: http://drupal.org/project/apachesolr_rdf • Project: http://drupal.org/project/project • Ubercart: http://drupal.org/project/apachesolr_ubercart
    49. 49. You Can Run Yourself. Easy! 1. Get a dedicated server or a VPS and get Solr loaded on it. 2. Find a Java server administrator or get some books. 3. Get the Drupal module, install the PHP library, and configure it. 4. Replace the stock Solr configuration files with Drupal ones. 5. Learn about Solr replication and how to configure it. 6. Set up log management, alerting, monitoring, etc. 7. Implement regular upgrades or patches to Solr, which will sometimes require getting your Java development environment set up and building from source. 8. Keep up to date with the Drupal module. 9. Implement a security regime to protect data transfer (i.e. so spammers can’t add Viagra ads to your search results). 10. Harden your servers; set up firewalls and IP-based, password-based, or other security. 11. Figure out how to handle updates and versioning of Solr and your schema. 12. *Recommended: Get on the solr-user and solr-developer mailing lists to get updates and alerts on the Apache Solr project. Don’t worry, it’s only 50 or so mails a day if you don’t count the commit messages.
    50. 50. Or... use Acquia Search • Sign up on acquia.com. • Free 30 day trial subscriptions for anyone. • You must be running a Drupal 6.x site, with PHP 5.2.0+ (5.1.4+ possible as well). • Use Acquia Drupal or install our search module package. • It leaves Drupal core search intact, so you can go back anytime. • Convert your site and start impressing users! • We will worry about everything else.
    51. 51. How Acquia Search Works [Diagram labels: Your webserver; Acquia Network; Search master server; Search slave servers; authenticated request (content to index); index replication; authenticated search request; results; SSL, HMAC]
    52. 52. Proving the platform • Benchmarking our servers, on the search server itself, most searches run in < 200 ms, even under high load.
    53. 53. Who Is This For? • Small and medium size sites - easy access to enterprise search for every Drupal site. – No hardware, no experience, fast setup, low cost. • Large sites and Acquia partners - the same solution you’d deploy, but faster and easier. – Don’t consume your engineering resources. – Why load your own servers? – We handle the security and availability. – Impress your users and clients.
    54. 54. How-to Screencasts • http://www.jeffnoyes.com/content/how-use-acquia-search-apache-solr • http://www.jeffnoyes.com/content/enabling-acquia-search-and-apache-solr Here’s a quick look at the admin interface:
    55. 55. New modules to enhance the experience © 2008 Acquia, Inc.
    56. 56. Searching with Views 3 • Define filters, sorts in Views like normal • Solr, not MySQL, gets the query • Views is responsible for rendering results: • you configure the visible fields • grid views • carousels • search results in blocks • Faceting works like in current Apache
    57. 57. Javascript interface • http://drupal.org/project/solrjs
    58. 58. Javascript interface • http://drupal.org/project/solrjs
    59. 59. Autocomplete http://drupal.org/project/apachesolr_autocomplete
    60. 60. Stats for administrators http://drupal.org/project/apachesolr_stats
    61. 61. DRUPAL-6--2 Branch
    62. 62. DRUPAL-6--2 Branch • Researching more substantial architectural changes to query building. • Looking at ways to support multi-site and better support multi-lingual content.
    63. 63. Learn and Contribute • Find us at the code sprint (Saturday) if you have questions about the code or roadmap. • Come to the Acquia table any time to learn more about Acquia Search. Thank you! jacob.singh@acquia.com peter.wolanin@acquia.com © 2009, Acquia, Inc.
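
    The "SSL, HMAC" label on the How Acquia Search Works slide refers to authenticating each request with a keyed hash. The following is a generic illustration of HMAC request signing using Python's standard library, offered as a sketch; it is not Acquia's actual protocol. The message layout (timestamp, nonce, body), the choice of SHA-256, and the function names are all assumptions.

```python
import hashlib
import hmac


def sign_request(secret: bytes, body: str, nonce: str, timestamp: int) -> str:
    """Return a hex HMAC over the request, keyed with a shared secret.

    Hypothetical layout: "<timestamp>:<nonce>:<body>".
    """
    message = f"{timestamp}:{nonce}:{body}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, body: str, nonce: str, timestamp: int,
                   signature: str) -> bool:
    """Recompute the HMAC on the server side and compare in constant time."""
    expected = sign_request(secret, body, nonce, timestamp)
    return hmac.compare_digest(expected, signature)


# The web server signs an index or search request with its shared key...
secret_key = b"example-shared-secret"  # assumed value, for illustration only
signature = sign_request(secret_key, "q=drupal+work", "nonce-123", 1251936000)

# ...and the search server accepts it only if the signature matches.
print(verify_request(secret_key, "q=drupal+work", "nonce-123", 1251936000, signature))
print(verify_request(secret_key, "q=tampered", "nonce-123", 1251936000, signature))
```

    Because the signature covers the body, a third party who cannot see the shared key cannot alter the content being indexed without the check failing, which is the concern the "spammers can’t add Viagra ads to your search results" step raises for self-hosted setups.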