Building a relevance platform with Couchbase and Elasticsearch

These slides are from my Goto Amsterdam presentation. In the talk I went into detail about how we are building a high-performance relevance platform at Hippo with Couchbase and Elasticsearch. I also covered why we chose Couchbase for storage and how Elasticsearch can be used for search and analytics, and I shared how we integrated and leverage both products full-circle from within our Hippo CMS product.

  1. OneHippo @ Goto follow the Hippo trail Building a relevance platform with Couchbase and Elasticsearch @jreijn | Hippo #gotoams, June 18
  2. About me
     • Architect @ Hippo
     • DevOps guy
     • Blogger @ http://blog.jeroenreijn.com
  3. About Hippo
  4. Relevance?
  5. “The capability of a search engine or function to retrieve data appropriate to a user's needs.” http://www.thefreedictionary.com/relevance
  6. (slide without text)
  7. How we deliver relevant content @Hippo
  8. Registration
     • Visitor - entity making HTTP requests
     • Collector - records data about a visitor or his behavior. Example: location collector (GeoIPCollector)
     • Targeting Data - all data about a specific visitor. Example: IP address is located in Amsterdam
  9. Matching
     • Characteristic - a type of fact about visitors. Example: "comes from a city", "experiences a type of weather"
     • Target Group - the specification of a Characteristic. Example: "comes from a European city", "comes from Amsterdam"
     • Persona - one or more target groups that describe a certain type of visitor. Example: "Jim, the European urban consumer", "Alice, the Pet owner"
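
The vocabulary of the Registration and Matching slides (collector, targeting data, target group, persona) can be sketched as a small data model. This is only an illustration: every class and function name below is hypothetical, the GeoIP lookup is a hard-coded stand-in, and the scoring rule (fraction of matched target groups) is an assumption, since the slides do not say how scores are computed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout; they only mirror the slide vocabulary.
TargetingData = Dict[str, dict]  # collector id -> data recorded for one visitor


@dataclass
class TargetGroup:
    """A Characteristic narrowed to concrete values, e.g. 'comes from Amsterdam'."""
    name: str
    characteristic: str  # e.g. "city"
    matches: Callable[[TargetingData], bool]


@dataclass
class Persona:
    """One or more target groups that together describe a type of visitor."""
    name: str
    target_groups: List[TargetGroup] = field(default_factory=list)

    def score(self, data: TargetingData) -> float:
        """Assumed rule: fraction of this persona's target groups the visitor matches."""
        if not self.target_groups:
            return 0.0
        hits = sum(1 for group in self.target_groups if group.matches(data))
        return hits / len(self.target_groups)


def geo_collector(ip_address: str) -> dict:
    """Stand-in for a GeoIP lookup; a real collector would query a GeoIP database."""
    fake_geo_db = {"192.0.2.10": {"city": "Amsterdam", "continent": "Europe"}}
    return fake_geo_db.get(ip_address, {"city": "", "continent": ""})


from_europe = TargetGroup(
    "comes from a European city", "city",
    lambda data: data.get("geo", {}).get("continent") == "Europe")

jim = Persona("Jim, the European urban consumer", [from_europe])

# Targeting data is keyed by collector id, as in the visitor document.
known_visitor = {"geo": geo_collector("192.0.2.10")}
unknown_visitor = {"geo": geo_collector("198.51.100.1")}
print(jim.name, jim.score(known_visitor), jim.score(unknown_visitor))
```

Keeping the match logic inside the target group means a persona stays a plain list of groups, which is one simple way to let editors recombine groups without touching collector code.
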
  10. What do we store?
      • Request log
      • Targeting data
      • Statistics (averages, e.g. how many visitors became which persona)
  11. Real-time analysis
  12. Architecture
  13. Diagram: RDBMS, Hippo Repository, Hippo Delivery Tier, App server; output formats XML, JSON, (X)HTML
  14. Delivery Tier (diagram): Request, URL Matching, Fetch content, Compose output, Response
  15. Delivery Tier (diagram): Request, URL Matching, Targeting Data Collection, Scoring, Fetch content, Compose output, Response
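
The delivery-tier steps named on the two diagrams above can be read as a chain of stages that each enrich a request context. The sketch below assumes that reading and assumes the stage order shown (the transcript does not preserve the arrows); all function names and the toy logic inside each stage are hypothetical.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Stage = Callable[[Context], Context]


def url_matching(ctx: Context) -> Context:
    # Map the request path to a page; a real matcher would use site configuration.
    ctx["page"] = "news" if ctx["path"] == "/news" else "home"
    return ctx


def targeting_data_collection(ctx: Context) -> Context:
    # A real collector would inspect the request (IP address, referer, ...).
    ctx["targeting_data"] = {"geo": {"city": "Amsterdam"}}
    return ctx


def scoring(ctx: Context) -> Context:
    # Toy rule: any known city counts as the "urban" persona.
    city = ctx["targeting_data"]["geo"]["city"]
    ctx["persona"] = "urban" if city else "default"
    return ctx


def fetch_content(ctx: Context) -> Context:
    ctx["content"] = f"{ctx['page']} content for persona '{ctx['persona']}'"
    return ctx


def compose_output(ctx: Context) -> Context:
    ctx["response"] = f"<html><body>{ctx['content']}</body></html>"
    return ctx


# Assumed order; targeting and scoring run before content is fetched.
PIPELINE: List[Stage] = [
    url_matching,
    targeting_data_collection,
    scoring,
    fetch_content,
    compose_output,
]


def handle(path: str) -> str:
    """Run one request through every stage and return the composed response."""
    ctx: Context = {"path": path}
    for stage in PIPELINE:
        ctx = stage(ctx)
    return ctx["response"]


print(handle("/news"))
```
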
  16. Scaling
  17. Scaling out (diagram): RDBMS; two app servers, each with a Hippo Delivery Tier and a Hippo Repository
  18. Scaling out (diagram): RDBMS; two app servers, each with a Delivery Tier and a Repository; Targeting Datastore
  19. What kind of ‘storage’?
  20. Distributed Cache?
  21. We have a winner!
  22. Requirements change!
  23. NoSQL to the rescue
  24. Suitable types
      • Key-value store
      • Document database
  25. Assessment Criteria: Maturity, Data model, Consistency model, Performance, Replication, Caching model, Query model, Monitoring, Scalability, Reliability, Support
  26. Selection Criteria
      • Performance!
      • Scalability
      • Schema flexibility
      • Simplicity
      • Monitoring
      • Support
  27. Performance!!
  28. Scalability
  29. Schema flexibility
  30. Request log document
      {
        "visitorId": "7a1c7e75-8539-40",
        "pageUrl": "http://localhost:8080/site/news",
        "pathInfo": "/news",
        "remoteAddr": "127.0.0.1",
        "referer": "http://localhost:8080/site/",
        "timestamp": 1371419505909,
        "collectorData": {
          "geo": {
            "country": "",
            "city": "",
            "latitude": 0,
            "longitude": 0
          },
          "returningvisitor": false,
          "channel": "English Website"
        },
        "personaIdScores": [],
        "globalPersonaIdScores": []
      }
  31. Visitor document
      {
        "geo": {
          "collectorId": "geo",
          "city": "",
          "country": "",
          "latitude": 0,
          "longitude": 0
        },
        "channel": {
          "collectorId": "channel",
          "channels": [
            "English Website"
          ],
          "lastVisitedChannel": "English Website"
        }
      }
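
To make the schema-flexibility point concrete, here is a minimal sketch of building request log documents shaped like the one above and storing them as JSON under a key. The field names come from the slide; the `build_request_log` helper, the key scheme, the `weather` collector entry, and the in-memory `store` dictionary are assumptions for illustration and not the Hippo or Couchbase API.

```python
import json
import time
import uuid
from typing import Dict, Optional


def build_request_log(path: str, remote_addr: str, collector_data: dict,
                      visitor_id: Optional[str] = None) -> dict:
    """Assemble a request log document with the field names from the slide."""
    return {
        "visitorId": visitor_id or str(uuid.uuid4()),
        "pathInfo": path,
        "remoteAddr": remote_addr,
        "timestamp": int(time.time() * 1000),  # epoch milliseconds, as on the slide
        "collectorData": collector_data,
        "personaIdScores": [],
        "globalPersonaIdScores": [],
    }


# In-memory stand-in for a document store: key -> JSON string.
store: Dict[str, str] = {}


def save(key: str, doc: dict) -> None:
    store[key] = json.dumps(doc)


def load(key: str) -> dict:
    return json.loads(store[key])


# Two documents with different collector data live side by side; no schema
# migration is needed when a new (hypothetical) "weather" collector is added.
save("requestlog::1", build_request_log("/news", "127.0.0.1",
                                        {"returningvisitor": False}))
save("requestlog::2", build_request_log("/news", "127.0.0.1",
                                        {"returningvisitor": True,
                                         "weather": {"type": "rain"}}))

for key in sorted(store):
    doc = load(key)
    # Readers tolerate missing fields instead of relying on a fixed schema.
    weather = doc["collectorData"].get("weather", {}).get("type", "unknown")
    print(key, doc["pathInfo"], weather)
```
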
  32. Simplicity
  33. Monitoring
  34. Support
  35. Couchbase
  36. Why Couchbase?
      • Drop-in replacement for memcached
      • Read/Write-through cache
      • High throughput
      • Easy scalability
      • Schema flexibility
      • Low latency
  37. Couchbase
      • Open Source
      • Document-oriented
      • Easily scalable
      • Consistent high performance
  38. Performance
      • Object-managed cache
      • Write queue to disk
      • Avoids cold cache
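
The three performance bullets above describe a general pattern: serve reads and writes from memory, persist writes through a queue, and reload from disk after a restart so the cache does not start cold. The toy class below illustrates that pattern only. It is not Couchbase's implementation, the class and method names are invented, and a plain dictionary stands in for the disk.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class QueuedWriteCache:
    """Toy cache: memory first, with writes queued for later persistence."""

    def __init__(self, disk: Optional[Dict[str, str]] = None) -> None:
        self.disk: Dict[str, str] = disk if disk is not None else {}
        self.memory: Dict[str, str] = {}
        self.write_queue: Deque[Tuple[str, str]] = deque()

    def set(self, key: str, value: str) -> None:
        # The caller returns as soon as memory is updated; disk I/O is deferred.
        self.memory[key] = value
        self.write_queue.append((key, value))

    def get(self, key: str) -> Optional[str]:
        # Serve from memory; on a miss, read through to disk and cache the result.
        if key in self.memory:
            return self.memory[key]
        value = self.disk.get(key)
        if value is not None:
            self.memory[key] = value
        return value

    def flush(self) -> int:
        """Drain the write queue to disk and return the number of writes applied."""
        written = 0
        while self.write_queue:
            key, value = self.write_queue.popleft()
            self.disk[key] = value
            written += 1
        return written

    def warm_up(self) -> None:
        """Load persisted items into memory so a restarted node is not cold."""
        self.memory.update(self.disk)


cache = QueuedWriteCache()
cache.set("visitor::1", '{"geo": {"city": "Amsterdam"}}')
print(cache.get("visitor::1"), len(cache.write_queue))
```

In this sketch a crash before `flush()` would lose queued writes; a real system has to bound that window, which is outside what the slides cover.
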
  39. Easily scalable
      • Auto sharding
      • Cross cluster replication (XDCR)
      • Master-master replication
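
Auto sharding can be pictured as two lookups: a key hashes to one of a fixed number of partitions (Couchbase calls them vBuckets), and a separate map assigns each partition to a node. The sketch below is a simplified illustration of that idea. The CRC32-modulo hash, the partition count, the round-robin assignment, and the node names are all assumptions, not the exact algorithm Couchbase uses.

```python
import zlib
from typing import List

NUM_VBUCKETS = 1024  # partition count assumed for this sketch


def vbucket_for(key: str) -> int:
    """Hash a document key to a partition id; depends only on the key."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_vbucket_map(nodes: List[str]) -> List[str]:
    """Assign partitions to nodes round-robin (a deliberately naive layout)."""
    return [nodes[i % len(nodes)] for i in range(NUM_VBUCKETS)]


def node_for(key: str, vbucket_map: List[str]) -> str:
    """Resolve the node that owns a key via its partition."""
    return vbucket_map[vbucket_for(key)]


two_nodes = build_vbucket_map(["node-a", "node-b"])
three_nodes = build_vbucket_map(["node-a", "node-b", "node-c"])

key = "visitor::7a1c7e75-8539-40"
# The partition for a key never changes; only the partition-to-node map does.
print(vbucket_for(key), node_for(key, two_nodes), node_for(key, three_nodes))
```

Because clients only need the partition map to find a node, adding a node means moving partitions and publishing a new map, without rehashing every key.
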
  40. Flexible data model
      • Native JSON support
      • Incremental Map Reduce
      • Gives power to the developer
  41. How we run Couchbase @Hippo
  42. Diagram: Load Balancer, Hippo Delivery Tier, Database cluster, Couchbase cluster. The Couchbase cluster holds:
      • Request log data
      • Targeting data
      • Statistics data
  43. Query capabilities
      • Querying via views
      • Secondary indexes via views
      • Views based on Map - Reduce
      • Lacks some advanced query capabilities
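
A view based on map-reduce pairs a map function, which emits key/value rows per document, with a reduce function that aggregates the values for each key. Couchbase views are written in JavaScript and maintained incrementally by the server; the Python below is only an in-memory simulation of the same idea, recomputed from scratch on every call. The function names, the sample documents, and the "Dutch Website" channel are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_page_views(doc: dict) -> Iterator[Tuple[str, int]]:
    """Map step: emit (channel, 1) for every request log that has a channel."""
    channel = doc.get("collectorData", {}).get("channel")
    if channel:
        yield channel, 1


def reduce_sum(values: List[int]) -> int:
    """Reduce step: total the emitted values for one key."""
    return sum(values)


def run_view(docs: Iterable[dict],
             map_fn: Callable[[dict], Iterator[Tuple[str, int]]],
             reduce_fn: Callable[[List[int]], int]) -> Dict[str, int]:
    """Group emitted rows by key, then reduce each group."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


request_logs = [
    {"pathInfo": "/news", "collectorData": {"channel": "English Website"}},
    {"pathInfo": "/", "collectorData": {"channel": "English Website"}},
    {"pathInfo": "/nieuws", "collectorData": {"channel": "Dutch Website"}},
    {"pathInfo": "/ping", "collectorData": {}},  # no channel, so nothing is emitted
]

views_per_channel = run_view(request_logs, map_page_views, reduce_sum)
print(views_per_channel)
```

The emitted key doubles as a secondary index: rows are grouped by channel even though the documents themselves are stored under unrelated keys.
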
  44. Elasticsearch
      • Apache Lucene
      • Designed to be distributed
      • Schema free
      • Apache 2 licensed
      • RESTful API
  45. Added value of ES
      • Full text search
      • Faceted search
      • Geo spatial search
      • All in (near) real-time
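
As a rough illustration of how the three search types above can combine in a single request, the sketch below builds an Elasticsearch search body as a plain Python dictionary: a `match` query for full text, a `geo_distance` filter for geo search, and a `terms` aggregation for facet-style counts. It uses the aggregations syntax of later Elasticsearch releases, which replaced the facets API available at the time of the talk. The field `collectorData.geo.location` is hypothetical (it assumes a `geo_point` mapping, which the request log document on the slides does not have), and the channel field is assumed to be indexed as an exact, non-analyzed value.

```python
import json


def build_search_body(text: str, lat: float, lon: float,
                      distance: str = "25km") -> dict:
    """Build a search body combining full text, geo filtering, and facet counts."""
    return {
        "query": {
            "bool": {
                # Full text search on the requested URL.
                "must": [{"match": {"pageUrl": text}}],
                # Geo search: keep only requests made near the given point.
                "filter": [{
                    "geo_distance": {
                        "distance": distance,
                        # Hypothetical geo_point field, see the note above.
                        "collectorData.geo.location": {"lat": lat, "lon": lon},
                    }
                }],
            }
        },
        # Facet-style counts of matching requests per channel.
        "aggs": {
            "visits_per_channel": {
                "terms": {"field": "collectorData.channel"}
            }
        },
        "size": 10,
    }


body = build_search_body("news", lat=52.37, lon=4.89)
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent to an index's `_search` endpoint over the RESTful API; the index name and client are left out because the slides do not specify them.
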
  46. Replicating to ES (diagram): Hippo Delivery Tier (Java API, Write, Read); Couchbase Server Cluster; XDCR via the Couchbase ES Transport plugin; Elasticsearch Server Cluster
  47. Demo time!
  48. What’s Next?
  49. Advanced analytics
  50. Thank you! Questions? j.reijn@onehippo.com @jreijn ps. We’re hiring!
