Apache Solr Changes the Way You Build Sites

By Jacob Singh and Peter Wolanin about the Apache Solr Search Integration module for Drupal. Presented at Drupalcon Paris, 2009.

  • I’m going to talk about Apache Solr, a revolutionary search technology
  • Which provides relevant and fast search results
  • that can be filtered
  • and sorted
  • and provides brilliant content recommendations. Changing the way you think about
  • Classic Information Architecture and how you structure your menus and site navigation
  • I’ve built a few websites. Usually, we start with something called IA. What is IA?
  • Start with who your visitors are and what they want to do
  • Take a look at the inventory of content the site provides
  • Group that content into categories which make sense. Often called card sorting, mapping, etc.
  • Name the groups, and they become menus and highlights (Navigation)
  • What’s wrong with these conventions? Our forefathers have used them since time immemorial (1996)
  • The web is a lot more complicated now. Content comes from a lot more sources (other sites and users), and websites do more things. Also...
  • It isn’t 1996 anymore. In a short time your content may start looking like... {flip}
  • And your menus become more like {flip}
  • And now your site is hard to organize and impossible to use.
  • You made up archetypes, but your real visitor base is more varied and unique than that.
  • Did you think of the user who wanted handmade paper? Which category do you think handmade paper is in?
  • It literally took me 7 clicks to find Drupal on Dmoz. No one except for someone desperately trying to prove a point during a presentation will spend this long
  • This is the most important, and the main reason I’m speaking to you today.
  • To deal with this paradigm shift, this generation of the internet has a few new devices / patterns to address the bloat of content. They all seek to handle the issue of unpredictable content and unknown users.
  • Search is Web 0.5. Remember, Yahoo made millions with lists of websites to visit. Google made billions letting people find the websites they wanted
  • Most people building websites just think of search as a checkbox on their requirements list
  • I’ve taken a totally biased sampling from Dries’s blog of large and newish Drupal sites
  • I took a sample of some of the top sites on the web.
  • Search was largely abandoned by site owners because the quality of results was not good enough. It was too slow. Users learned not to trust it, and in turn, site builders learned not to prioritize it.
  • If the same word doesn’t mean the same thing, your intentions, delivery and content are worthless. When a user wants something from your website, they are looking for a keyword. You, as a site builder, tried to think of them and make links
  • Users should leave only when the content they want doesn’t exist. Never before
  • In this case, the user chose to search for Drupal Work, not Drupal Jobs... The disadvantage of most search engines is that this doesn’t work. The user is just presented with a deluge of information they are not looking for and no obvious way to refine the set.
  • The advantage of a good search engine is that the user can use whatever vocabulary they want and will find what they are looking for. What you see here is called “Faceted Search.” Because Solr is aware not just of the text of your nodes, but all of their metadata, it can provide a much richer way to filter down to just what you are looking for.
  • Newer is better in this case. No one wants taken jobs.
  • Solr allows you to sort
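The faceting, filtering, and sorting described in these notes map onto ordinary Solr request parameters (`q`, `facet`, `facet.field`, `fq`, `sort`). The following is a minimal sketch of how such a request could be assembled; the endpoint URL and the field names `type`, `language`, and `created` are assumptions for illustration only, and the Drupal module builds its queries in its own way.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; a real deployment has its own host and path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(keywords, facet_fields, filters=None, sort=None):
    """Return a Solr select URL with faceting, optional filters, and sorting."""
    params = [
        ("q", keywords),          # the user's own vocabulary
        ("wt", "json"),           # ask for a JSON response
        ("facet", "true"),        # ask Solr to count matches per facet value
        ("facet.mincount", "1"),  # hide facet values with no matches
    ]
    # One facet.field parameter per metadata field to count on (assumed names).
    params += [("facet.field", field) for field in facet_fields]
    # Each filter query (fq) narrows the result set without changing scoring.
    params += [("fq", f"{field}:{value}") for field, value in (filters or {}).items()]
    if sort:
        params.append(("sort", sort))
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_search_url(
    "drupal work",
    facet_fields=["type", "language"],
    filters={"type": "job"},
    sort="created desc",  # newest first
)
print(url)
```

Clicking a facet link in the interface amounts to adding one more `fq` entry, which is why the user can keep narrowing without ever hitting a dead end.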
  • Just as vocabulary is important, spelling is too.
  • Solr spellcheck does not come from a dictionary. It uses the actual content in your index. If you have a funny name for your product
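The idea of drawing spelling suggestions from indexed content rather than from a dictionary can be illustrated with a toy example. This sketch uses Python's `difflib` for the string matching, which is an assumption made for illustration and is not how Solr computes suggestions; the sample documents are invented.

```python
import difflib
import re


def build_vocabulary(documents):
    """Collect the distinct lowercase words that appear in the indexed text."""
    vocabulary = set()
    for text in documents:
        vocabulary.update(re.findall(r"[a-z0-9]+", text.lower()))
    return vocabulary


def suggest(word, vocabulary, cutoff=0.75):
    """Return the closest indexed word, or None if the word is known or nothing is close."""
    word = word.lower()
    if word in vocabulary:
        return None  # spelled the way the site's own content spells it
    matches = difflib.get_close_matches(word, sorted(vocabulary), n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Invented sample content standing in for the site's index.
indexed_content = [
    "Install the Ubercart module for Drupal",
    "Drupal jobs board",
]
vocab = build_vocabulary(indexed_content)
print(suggest("ubercrat", vocab))
```

Because the vocabulary comes from the content itself, an unusual product name such as "Ubercart" becomes a valid suggestion as soon as it appears on the site.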
  • Just the tip of the iceberg in terms of customization
  • Buytaert.net
  • reduce width of browser + increase font
  • Let’s stop trying to think for our users. Let’s give them tools that allow them to think the way they want AND find what they are looking for. Now, I’m going to hand over the floor to Peter Wolanin, who has been the driving force behind recent development of the Apache Solr module. He and I are Acquia’s experts on the Solr server. He’ll be speaking to you about the Solr project in a little more detail and the communities involved, and will show you some of the really amazing features we’ve got planned.
  • All the Drupal module code is on drupal.org and available to everyone. Solr is an Apache Foundation project, available free under the Apache 2.0 license.
  • Yes, it’s doable, but using the Acquia hosted service allows: 1. Small to medium sites to get rolling in 15 minutes with no special know-how or hardware. 2. Large sites to not worry about scaling or securing yet another service and the opportunity cost that comes with it.
  • The Acquia Search module configures the Apache Solr Drupal module and handles authentication.
  • If you have an Acquia Subscription, you will have search. The Beta which starts today is totally free. Also, ApacheSolr does not stop core search, which means you can fall back to standard Drupal search any time. No Risk! Setup takes about 5-10 minutes. What are you waiting for!? Put pictures of us up. Bring your laptop. Right Now!
  • Learn more about the admin interface and the boosting features

Apache Solr Changes the Way You Build Sites: Presentation Transcript

  • APACHE SOLR CHANGES THE WAY YOU BUILD SITES. How to build dynamic navigation for dynamic content. Jacob Singh and Peter Wolanin, Drupalcon Paris, September 3rd 2009
  • RELEVANT AND FAST
  • FILTERS
  • SORTABLE
  • MORE LIKE THIS
  • SITE NAVIGATION
  • INFORMATION ARCHITECTURE
      • Is the science and art of guessing what your users want to see or do on your site and helping them get there
      • Often done without actually consulting visitors or proper understanding of the target market
  • ARCHETYPES - WHO
  • CONTENT - YOU GOT IT
  • ORGANIZE AND PROSPER!
  • NAVIGATION
  • SO? WHAT’S WRONG WITH THAT?
  • (1996) Site Map!
  • WEBSITES GROW.
      • Sometimes really fast.
      • And always in ways you didn’t expect
  • WHY DOES THIS HAPPEN?
  • VISITORS ARE TOO VARIED TO THINK OF THEM ALL. YOUR CONTENT IS TOO VARIED FOR YOUR MENU
  • WHERE ARE BEER HATS?
  • YOUR CONTENT IS TOO VAST AND TAKES TOO MANY CLICKS TO FIND
  • TOOK 7 CLICKS TO FIND DRUPAL
  • YOUR CONTENT IS CONTRIBUTED BY USERS AND UNPREDICTABLE.
  • DRUPAL IS A SOCIAL PUBLISHING PLATFORM
  • YOU USE IT SO USERS WILL ADD CONTENT That’s the point!
  • IF WE HAVE DYNAMIC USER CONTRIBUTED CONTENT WHY DO WE INSIST ON STATIC ADMIN DEFINED NAVIGATION?
  • WEB 2.0 JARGON TO SAVE THE DAY
      • Tag Clouds
      • Content Recommendation
      • Social Networking
      • Social Bookmarking
  • HOW SEARCH DIED AND HOW WE BRING IT BACK
  • SEARCH IS NOT A PRIORITY
  • MOST DRUPAL WEBSITES REALLY IGNORE SEARCH
  • CAN YOU FIND THE SEARCH BOX? HINT: I’VE CIRCLED IT IN ORANGE
  • WHAT DO THE BIG SITES DO?
  • WHY DID SEARCH DIE?
      • It was too slow
      • It wasn’t smart enough
      • Users learned not to trust it
  • LANGUAGE IS IMPORTANT
  • GOLDEN RULE: No Dead Ends!
  • SEARCH ON G.D.O
  • FIND OUT WHERE TO LOOK
  • FOUND ‘EM
  • I LEFT MY TIME MACHINE AT HOME
  • SMARTER MATCHES
  • STEMMING
  • SPELLING SUGGESTIONS
  • BUT I’M THE ADMIN!
  • ALL THE COOL KIDS ARE DOING IT
  • DAMZ EN L’MASSION
  • A TAILOR MADE NAVIGATION FOR EVERY USER You want it, right?
  • The Apache Solr Project
      • Stable and proven.
          – Used by Netflix, CNET, CitySearch, StubHub!, GameSpot, AOL
          – Full time maintainers
          – VERY active mailing list (about 1k messages per month)
      • Fast: written in Java.
      • Uses Lucene: the top open source search library.
      • Distributed: scales out in multiple directions.
  • Apache Solr Search Integration
      • Very active project on drupal.org.
      • Takes advantage of latest Solr features.
      • Exposes an API to modify search and display behavior.
      • Supported by engineers at Acquia.
      • All Acquia code improvements have been contributed back to the Drupal.org project.
      • Many of the Drupalcon sponsors and attendees are already involved and using it.
  • Feature highlights
      • Taxonomy, user, and language facets.
      • Node type faceting, weighting, and exclusion.
      • Node property (e.g. sticky) and date weighting.
      • Date facets on content creation or change.
      • OG facets (optional sub-module).
      • Node access respected (optional sub-module).
      • More-like-this content recommendations.
      • Customizable (see drupal.org module browsing features).
  • Integration with other modules/data
      • Webmail: http://drupal.org/project/webmail_plus
      • Attachments: http://drupal.org/project/apachesolr_attachments
      • Views: http://drupal.org/project/apachesolr_views
      • Services: http://drupal.org/project/solr_service
      • Nodequeue: http://drupal.org/project/nodequeue
      • RDF: http://drupal.org/project/apachesolr_rdf
      • Project: http://drupal.org/project/project
      • Ubercart: http://drupal.org/project/apachesolr_ubercart
  • You Can Run It Yourself. Easy!
      1. Get a dedicated server or a VPS and get Solr loaded on it.
      2. Find a Java server administrator or get some books.
      3. Get the Drupal module, install the PHP library, and configure it.
      4. Replace the stock Solr configuration files with Drupal ones.
      5. Learn about Solr replication and configuring it.
      6. Set up log management, alerting, monitoring, etc.
      7. Implement regular upgrades or patches to Solr, which will sometimes require getting your Java development environment set up and building from source.
      8. Keep up to date with the Drupal module.
      9. Implement a security regime to protect data transfer (i.e. so spammers can’t add Viagra ads to your search results).
      10. Harden your servers, set up firewalls and IP-based, password-based, or other security.
      11. Figure out how to handle updates and versioning of Solr and your schema.
      12. *Recommended: Get on the solr-user and solr-developer mailing lists to get updates and alerts on the Apache Solr project. Don’t worry, it’s only 50 or so mails a day if you don’t count the commit messages.
  • Or... use Acquia Search
      • Sign up on acquia.com.
      • Free 30 day trial subscriptions for anyone.
      • You must be running a Drupal 6.x site, with PHP 5.2.0+ (5.1.4+ possible as well).
      • Use Acquia Drupal or install our search module package.
      • It leaves Drupal core search intact, so you can go back anytime.
      • Convert your site and start impressing users!
      • We will worry about everything else.
  • How Acquia Search Works [Diagram labels: Your webserver; authenticated request (content to index); Search master server; index replication; Search slave servers; authenticated search request; results; SSL, HMAC; Acquia Network]
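The slide names SSL and HMAC as the mechanisms behind its authenticated requests but gives no further detail. As a generic illustration of HMAC request signing, and not a description of Acquia's actual scheme, here is a minimal sketch; the message layout, the SHA-256 hash, the five-minute freshness window, and the function names are all assumptions.

```python
import hashlib
import hmac
import time


def sign_request(secret_key: bytes, path: str, body: bytes, timestamp: int) -> str:
    """Return a hex HMAC-SHA256 signature over the timestamp, path, and body."""
    # Hypothetical message layout: timestamp, path, then the raw request body.
    message = f"{timestamp}\n{path}\n".encode("utf-8") + body
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify_request(secret_key, path, body, timestamp, signature, max_age=300, now=None):
    """Check the signature in constant time and reject stale timestamps."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > max_age:
        return False  # too old or too far in the future: possible replay
    expected = sign_request(secret_key, path, body, timestamp)
    return hmac.compare_digest(expected, signature)


# Client side: sign an index update before sending it over SSL.
key = b"shared-secret-known-to-site-and-search-server"
stamp = int(time.time())
payload = b"<add><doc><field name='id'>node/1</field></doc></add>"
signature = sign_request(key, "/solr/update", payload, stamp)

# Server side: accept only requests whose signature matches.
print(verify_request(key, "/solr/update", payload, stamp, signature))
```

A shared secret of this kind lets the search server reject index updates from anyone who does not hold the key, which is the property the "spammers can’t add Viagra ads to your search results" point depends on.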
  • Proving the platform
      • In benchmarks measured on the search server itself, most searches run in < 200 ms, even under high load.
  • Who Is This For?
      • Small and medium size sites: easy access to enterprise search for every Drupal site.
          – No hardware, no experience, fast setup, low cost.
      • Large sites and Acquia partners: the same solution you’d deploy, but faster and easier.
          – Don’t consume your engineering resources.
          – Why load your own servers?
          – We handle the security and availability.
          – Impress your users and clients.
  • How-to Screencasts
      • http://www.jeffnoyes.com/content/how-use-acquia-search-apache-solr
      • http://www.jeffnoyes.com/content/enabling-acquia-search-and-apache-solr
    Here’s a quick look at the admin interface:
  • New modules to enhance the experience © 2008 Acquia, Inc.
  • Searching with Views 3
      • Define filters, sorts in Views like normal
      • Solr, not MySQL, gets the query
      • Views is responsible for rendering results:
          – you configure the visible fields
          – grid views
          – carousels
          – search results in blocks
      • Faceting works like in current Apache
  • Javascript interface
      • http://drupal.org/project/solrjs
  • Autocomplete: http://drupal.org/project/apachesolr_autocomplete
  • Stats for administrators: http://drupal.org/project/apachesolr_stats
  • DRUPAL-6--2 Branch
      • Researching more substantial architectural changes to query building.
      • Looking at ways to support multi-site and better support multi-lingual content.
  • Learn and Contribute
      • Find us at the code sprint (Saturday) if you have questions about the code or roadmap.
      • Come to the Acquia table any time to learn more about Acquia Search.
    Thank you! jacob.singh@acquia.com peter.wolanin@acquia.com © 2009, Acquia, Inc.