Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares

See conference video - http://www.lucidimagination.com/devzone/events/conferences/revolution/2011

Published in: Technology

  1. Building SaaS solutions with Apache Solr
     Alberto Mijares, Canoo Engineering AG
     alberto.mijares@canoo.com, 26/05/2011
     Twitter: @lemaiol
  2. Bullet point time!
  3. What I Will Cover
     • Practical applications of Apache Solr and Apache Lucene: how to increase the time a user spends on a website and do website “cross-selling”.
     • Use case: how Canoo helped Axel Springer Switzerland increase page impressions, user time on site and traffic in their online financial newspapers.
     • Key concepts:
       • How to achieve this using Lucene & Solr
       • How to profit from a SaaS business model
  4. Who I am
     • Alberto Mijares
     • Canoo Engineering AG
     • Background in web applications and standards:
       • Participated in W3C Semantic Web interest group (SWEO)
       • Led development of web standards compliance tools in the past (Web Accessibility and Mobile Web)
       • Led enterprise information retrieval projects in the recent past
       • Currently coaching Google Web Toolkit development projects
  5. Who is Canoo
     • People:
       • Dierk Koenig: Groovy founder
       • Andres Almiray: Griffon project lead and Java Champion
       • Hamlet D’Arcy: Groovy committer and enthusiast
       • … almost 40 more top software engineers
     • Products:
       • WebTest: framework for web functional testing
       • RIA Suite (aka ULC): Java-based RIA framework
       • FindIT: information retrieval and search tools
       • WMTrans: language analysis tools
  6. Canoo FindIT
     • http://www.canoo.com/videos/FindIT.html
  7. Stop “bullet-pointing”!
  8. The facts
     • Axel Springer group is a market leader
     • Bilanz, Handelszeitung and Stocks
     • In Switzerland financials are important!
     • Financial language is German
     • Online media is the future
  10. The gap
     • Make the online versions more profitable
     • Make all newspapers “market leaders”
  12. The how
     • Workshop
     • “Related articles”
     • “Cross-selling”
  14. The analysis
     • Find a funding model
     • Use Lucene’s “More like this”
     • Integrate the suggestions back into the sites
     • Implement a selection mechanism
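
To make the “More like this” and selection steps above concrete, here is a minimal sketch of how a suggestion request to Solr might be assembled and how the returned articles might be ordered for “cross-selling”. It assumes a MoreLikeThisHandler mapped to `/mlt` in `solrconfig.xml`; the host, core name, field names (`title`, `body`, `url`, `site`) and the selection rule are hypothetical and not taken from the talk.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration only: host, core name and field
# names are assumptions, not details from the talk.
SOLR_BASE = "http://localhost:8983/solr"


def build_mlt_url(core, article_id, fields=("title", "body"), rows=5):
    """Build a request URL for a MoreLikeThisHandler assumed to live at /mlt."""
    params = {
        "q": "id:%s" % article_id,      # the article to find neighbours for
        "mlt.fl": ",".join(fields),     # fields used to compute similarity
        "mlt.mintf": 2,                 # skip terms occurring fewer times in the source doc
        "mlt.mindf": 2,                 # skip terms occurring in fewer documents
        "mlt.match.include": "false",   # do not return the source article itself
        "fl": "id,title,url,site",      # stored fields to return (assumed schema)
        "rows": rows,
        "wt": "json",
    }
    return "%s/%s/mlt?%s" % (SOLR_BASE, core, urlencode(params))


def select_suggestions(docs, current_site, limit=3):
    """Toy selection rule: list articles from the other sites first."""
    other = [d for d in docs if d.get("site") != current_site]
    same = [d for d in docs if d.get("site") == current_site]
    return (other + same)[:limit]
```

A page on one newspaper would call `build_mlt_url` for the article being read, fetch the JSON response, and pass the returned documents through `select_suggestions` before rendering the links.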
  16. The issues
     • “More like this” was “experimental”
     • Works out-of-the-box only in English
     • Without “semantics”, the suggestions do not always make sense
     • Indexing full pages produces noise
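
The language issue above can be illustrated with a toy term ranking. “More like this” style features pick the most characteristic terms of a document, roughly by term frequency times inverse document frequency; without language-aware analysis, German function words can crowd out the meaningful ones. The sketch below is not Lucene code: the stopword list, the tokenizer and the sample sentences are invented for illustration.

```python
import math
import re
from collections import Counter

# Tiny hand-picked German stopword list; illustrative, not a real analyzer.
GERMAN_STOPWORDS = frozenset(
    {"der", "die", "das", "und", "in", "im", "den", "von", "mit", "ist", "ein", "eine"}
)


def tokenize(text, stopwords=frozenset()):
    """Lowercase, keep runs of letters, and drop stopwords."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    return [w for w in words if w not in stopwords]


def interesting_terms(doc, corpus, stopwords=frozenset(), top_n=3):
    """Rank the terms of `doc` by tf * idf and return the top_n terms."""
    tf = Counter(tokenize(doc, stopwords))
    n_docs = len(corpus)
    scores = {}
    for term, freq in tf.items():
        df = sum(1 for other in corpus if term in tokenize(other, stopwords))
        idf = math.log(n_docs / (1 + df)) + 1.0
        scores[term] = freq * idf
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


corpus = [
    "Die Aktie der Bank steigt und die Aktie der Versicherung fällt",
    "Die Börse in Zürich ist ruhig",
    "Der Kurs der Bank ist stabil",
]
without_filter = interesting_terms(corpus[0], corpus)
with_filter = interesting_terms(corpus[0], corpus, GERMAN_STOPWORDS)
```

Here the unfiltered ranking still contains articles such as “der” and “die”, while the filtered one keeps only content words. A real setup would rely on a German analysis chain in the index rather than a hand-written list.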
  18. The key
  20. The functional requirements
     • Discover and index articles
     • Extract only content
     • Simple and flexible query service
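
One possible reading of the “extract only content” requirement is to index just the article body and drop navigation, footers and scripts. The sketch below does this with Python's standard `html.parser`; the `article-body` class name and the sample markup are hypothetical, and the talk does not say how extraction was actually implemented.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ArticleTextExtractor(HTMLParser):
    """Collect text only from inside the element carrying a given class."""

    def __init__(self, css_class="article-body"):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # > 0 while inside the article container
        self.skip = 0       # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1
            if tag in ("script", "style"):
                self.skip += 1
        elif self.css_class in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        if tag in ("script", "style") and self.skip > 0:
            self.skip -= 1
        self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and self.skip == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_article_text(html, css_class="article-body"):
    """Return the visible text of the article container as one string."""
    parser = ArticleTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Feeding only this extracted text to the indexer keeps menus and boilerplate out of the similarity computation, which addresses the noise produced by indexing full pages.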
  22. The funding model
  23. The business model
     • SaaS
  24. The “other” requirements
     • Lucene-based analysis pipeline
     • Web-oriented platform
     • Multi-application platform
     • Reliable, fast and scalable
     • Plan B?
  26. The search
     • Wraps Lucene in a nice way
     • It is mature and Open Source
     • Supports scheduling, REST API, DIH, …
     • Scalability out-of-the-box
     • Well documented and has professional support
  28. The plan
     • From POC to PROD in “80 days”
  30. The results
     • Google Analytics
  32. The conclusions
  33. The Q&A
     Thanks!
  34. Sources
     • Links
       • http://people.canoo.com/share
       • http://www.canoo.com
       • http://www.canoo.net
       • http://www.leo.org
       • http://www.bilanz.ch
       • http://www.handelszeitung.ch
       • http://www.stocks.ch
  35. Contact
     • Alberto Mijares
       • [email_address]
       • Twitter: @lemaiol
  36. Architecture
     • Platform: Apache Solr 1.4.1
     • Architecture:
       [Diagram. Solr container: Springer Solr, Customer 2 Solr, Customer 3 Solr. Web container: Springer WebApp, Customer 2 WebApp, Customer 3 WebApp. Labels: Extern access, Intern access, Requests.]
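
The per-customer layout on this slide can be sketched as a small routing table in which each customer's web application talks only to its own Solr index over the internal network. The slide does not say whether each customer has a separate Solr core or a separately deployed Solr web application; the sketch below assumes one core per customer, and the host name, port and tenant keys are hypothetical.

```python
# Hypothetical tenant registry: names, host and port are illustrative only.
SOLR_INTERNAL_BASE = "http://solr.internal:8983/solr"

TENANT_CORES = {
    "springer": "springer",
    "customer2": "customer2",
    "customer3": "customer3",
}


def core_url(tenant, handler="select"):
    """Resolve the internal Solr URL for a tenant; unknown tenants are rejected."""
    try:
        core = TENANT_CORES[tenant]
    except KeyError:
        raise ValueError("unknown tenant: %r" % tenant)
    return "%s/%s/%s" % (SOLR_INTERNAL_BASE, core, handler)
```

Keeping the mapping in one place means external requests never name a Solr URL directly: the web application receives the request, looks up its tenant, and queries only that tenant's index.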
