Real-time visitor analysis with Couchbase and Elasticsearch

Jeroen Reijn | @jreijn | #nosql13
follow the Hippo trail
NoSQL Matters 2013

About me
Jeroen Reijn
Software engineer
Hippo
@jreijn
http://blog.jeroenreijn.com

About Hippo

Visitor Analysis
OneHippo @ Goto

Journey based Targeting

How we analyse visitors @ Hippo

Registration
Visitor - entity making HTTP requests
Collector - records data about a visitor or his beh...

Matching
Characteristic - a type of fact about visitors
Example: "comes from a city", "experiences a t...

What do we store?
Request log

Targeting data

Statistics
Averages, e.g. how many visitors became ...

Real-time analysis

How about YOU?
• Do you analyse your visitors?
• Do you do it ‘realtime’?

Architecture

JSON

XML (X)HTML
App server

Hippo Delivery Tier
Hippo Repository

RDBMS

Request
Delivery Tier
URL Matching

Fetch content

Compose output

Response

Request

Delivery Tier
URL Matching
Collect data
Scoring
Fetch content
Compose output

Response
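
The request flow on this slide (URL Matching, Collect data, Scoring, Fetch content, Compose output) can be read as an ordered pipeline. The sketch below is only an illustration of that ordering in Python; `RequestContext`, the step functions, and the placeholder scoring rule are all hypothetical and are not taken from the Hippo delivery tier.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RequestContext:
    """Hypothetical per-request state handed from one step to the next."""
    url: str
    ip: str
    targeting_data: Dict[str, str] = field(default_factory=dict)
    scores: Dict[str, float] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)
    content: str = ""
    output: str = ""


def url_matching(ctx: RequestContext) -> None:
    ctx.trace.append("URL Matching")


def collect_data(ctx: RequestContext) -> None:
    # Assumption: a collector simply records the client IP as targeting data.
    ctx.targeting_data["ip"] = ctx.ip
    ctx.trace.append("Collect data")


def scoring(ctx: RequestContext) -> None:
    # Placeholder rule: score 1.0 as soon as any targeting data is present.
    ctx.scores["example-persona"] = 1.0 if ctx.targeting_data else 0.0
    ctx.trace.append("Scoring")


def fetch_content(ctx: RequestContext) -> None:
    ctx.content = f"content for {ctx.url}"
    ctx.trace.append("Fetch content")


def compose_output(ctx: RequestContext) -> None:
    ctx.output = f"<html>{ctx.content}</html>"
    ctx.trace.append("Compose output")


# Steps in the order listed on the slide.
PIPELINE: List[Callable[[RequestContext], None]] = [
    url_matching, collect_data, scoring, fetch_content, compose_output,
]


def handle(ctx: RequestContext) -> RequestContext:
    """Run every step in order and return the populated context."""
    for step in PIPELINE:
        step(ctx)
    return ctx
```

Calling `handle(RequestContext(url="/news", ip="203.0.113.7"))` runs the steps in slide order; data collection and scoring sit before content is fetched, which is the difference from the previous slide.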

Scaling

Scaling out
App server

App server

Hippo Delivery Tier

Hippo Delivery Tier

Hippo Repository

Hippo ...

Scaling out
App server
Delivery Tier

App server
Targeting
Datastore

Repository

Delivery Tier
Reposi...

What kind of storage?

Typical Data Access Pattern

Several reads

Single write

Writer
Datastore

Analytics Data Access Pattern
Several writes

Single read

Datastore

CMS user

Writers

Targeting Data Access Pattern
Several writes

Single read

Several reads

Datastore

CMS user

Visitor...

Distributed Cache

Requirements change!

NoSQL ?

Suitable types
• Key-value store
• Document database
• Column oriented store

Assessment Criteria
Maturity

Data model

Scalability

Replication

Performance

Reliability

Caching ...

Selection Criteria
• Performance
• Scalability
• Schema flexibility
• Simplicity

Couchbase

Why Couchbase?
• Drop-in replacement for memcached
• Read/Write-through cache
• High throughput
• Easi...

Couchbase
• Open Source
• Document-oriented
• Easily scalable
• Consistent High Performance
• Apache lic...

Performance
• Object managed cache
• Write queue to disk

Easily scalable
• Auto sharding
• Cross cluster replication (XDCR)
• Master - Master replication

Flexible data model
• Native JSON support
• Incremental Map Reduce
• Gives power to the developer

How we run Couchbase @ Hippo

Load Balancer

Hippo Delivery Tier

Database cluster

Couchbase cluster

• Request log data
• Targetin...

Analysis capabilities
• Querying via views	

• Secondary indexes via views	

• Views based on Map - Re...
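
As a rough illustration of how a map-reduce view produces a statistic such as "how many visitors became which persona", the sketch below emulates the map and reduce steps in plain Python. It does not use Couchbase itself (real Couchbase views are defined in JavaScript on the server), and the document shape with `type` and `persona` fields is an assumption made for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Document = Dict[str, str]


def map_persona(doc: Document) -> Iterator[Tuple[str, int]]:
    """Map step: emit (persona, 1) for each visitor document with a persona."""
    if doc.get("type") == "visitor" and "persona" in doc:
        yield doc["persona"], 1


def reduce_count(values: List[int]) -> int:
    """Reduce step: add up the emitted values for one key."""
    return sum(values)


def run_view(
    docs: Iterable[Document],
    map_fn: Callable[[Document], Iterator[Tuple[str, int]]],
    reduce_fn: Callable[[List[int]], int],
) -> Dict[str, int]:
    """Group emitted values by key, then reduce each group."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


# Hypothetical documents, as they might sit in a bucket.
DOCS = [
    {"type": "visitor", "persona": "Jim"},
    {"type": "visitor", "persona": "Alice"},
    {"type": "visitor", "persona": "Jim"},
    {"type": "request", "url": "/news"},
]
```

Because the grouping key is fixed when the view is defined, a question that needs a different key needs a different view, which is one way to read the "limited ad-hoc query capabilities" remark in the talk.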

Elasticsearch
• Apache Lucene
• Designed to be distributed
• Schema free
• Apache license
• RESTful AP...

Added value
• Unstructured search
• Structured search
• Faceted search
• Geo spatial search
• Combinat...
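
To make the idea of combining search types concrete, the sketch below builds one Elasticsearch request body that mixes full-text, structured, geo, and faceted search. It only constructs the JSON; nothing is sent to a cluster. The field names (`page_title`, `persona`, `location`, `city`) are hypothetical, and the body uses the `aggs` syntax of later Elasticsearch releases rather than the facets API that was current in 2013.

```python
import json
from typing import Any, Dict


def build_visitor_query(
    text: str, persona: str, lat: float, lon: float, distance: str = "25km"
) -> Dict[str, Any]:
    """Build a search body combining several query types in one request."""
    return {
        "query": {
            "bool": {
                # Unstructured (full-text) search on a hypothetical text field.
                "must": [{"match": {"page_title": text}}],
                "filter": [
                    # Structured search: exact match on a keyword field.
                    {"term": {"persona": persona}},
                    # Geo search: visitors within `distance` of a point.
                    {
                        "geo_distance": {
                            "distance": distance,
                            "location": {"lat": lat, "lon": lon},
                        }
                    },
                ],
            }
        },
        # Faceted search: bucket the matching visitors by city.
        "aggs": {"by_city": {"terms": {"field": "city"}}},
    }


if __name__ == "__main__":
    # Roughly Amsterdam; the coordinates are only an example.
    print(json.dumps(build_visitor_query("pets", "Alice", 52.37, 4.90), indent=2))
```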

Replication

Read

Write

Read/Query

Java API

Hippo Delivery Tier

Couchbase Server Cluster

E...

What’s Next?

Advanced analytics

{ Demo }

Thanks!
j.reijn@onehippo.com
@jreijn
www.onehippo.com

Real-time visitor analysis with Couchbase and Elasticsearch

These slides are from my NoSQL Matters Barcelona 2013 presentation. During the presentation I went into detail about the architecture behind our high-performance real-time visitor analysis platform. The talk also covered why we chose Couchbase for storage and how Elasticsearch can be used for advanced search and analytics. I shared how we integrated and leveraged both products full-circle from within our Hippo CMS product.

  1. Real-time visitor analysis with Couchbase and Elasticsearch Jeroen Reijn | @jreijn | #nosql13 follow the Hippo trail
  2. NoSQL Matters 2013 About me Jeroen Reijn Software engineer Hippo @jreijn http://blog.jeroenreijn.com
  3. About Hippo
  4. Visitor Analysis OneHippo @ Goto
  5. OneHippo @ Goto
  6. OneHippo @ Goto
  7. Journey based Targeting
  8. How we analyse visitors @ Hippo
  9. Registration. Visitor - entity making HTTP requests. Collector - records data about a visitor or his behaviour. Example: location collector (GeoIPCollector). Targeting Data - all data about a specific visitor. Example: IP address is located in Amsterdam.
  10. Matching. Characteristic - a type of fact about visitors. Example: "comes from a city", "experiences a type of weather". Target Group - the specification of a Characteristic. Example: "comes from a European city", "comes from Amsterdam". Persona - one or more target groups that describe a certain type of visitor. Example: "Jim, the European urban consumer", "Alice, the Pet owner".
  11. What do we store? Request log. Targeting data. Statistics: averages, e.g. how many visitors became which persona.
  12. Real-time analysis
  13. How about YOU? • Do you analyse your visitors? • Do you do it ‘realtime’?
  14. Architecture
  15. JSON, XML, (X)HTML, App server, Hippo Delivery Tier, Hippo Repository, RDBMS
  16. Delivery Tier: Request, URL Matching, Fetch content, Compose output, Response
  17. Delivery Tier: Request, URL Matching, Collect data, Scoring, Fetch content, Compose output, Response
  18. Scaling
  19. Scaling out: App server, App server, Hippo Delivery Tier, Hippo Delivery Tier, Hippo Repository, Hippo Repository, RDBMS
  20. Scaling out: App server, Delivery Tier, App server, Targeting Datastore, Repository, Delivery Tier, Repository, RDBMS
  21. What kind of storage?
  22. Typical Data Access Pattern: Several reads, Single write, Writer, Datastore
  23. Analytics Data Access Pattern: Several writes, Single read, Datastore, CMS user, Writers
  24. Targeting Data Access Pattern: Several writes, Single read, Several reads, Datastore, CMS user, Visitors
  25. Distributed Cache
  26. Requirements change!
  27. NoSQL ?
  28. Suitable types • Key-value store • Document database • Column oriented store
  29. Assessment Criteria: Maturity, Data model, Scalability, Replication, Performance, Reliability, Caching model, Query model, Consistency model, Support, Monitoring
  30. Selection Criteria • Performance • Scalability • Schema flexibility • Simplicity
  31. Couchbase
  32. Why Couchbase? • Drop-in replacement for memcached • Read/Write-through cache • High throughput • Easily scalable • Schema flexibility • Low latency
  33. Couchbase • Open Source • Document-oriented • Easily scalable • Consistent High Performance • Apache licensed
  34. Performance • Object managed cache • Write queue to disk
  35. Easily scalable • Auto sharding • Cross cluster replication (XDCR) • Master - Master replication
  36. Flexible data model • Native JSON support • Incremental Map Reduce • Gives power to the developer
  37. How we run Couchbase @ Hippo
  38. Load Balancer, Hippo Delivery Tier, Database cluster, Couchbase cluster • Request log data • Targeting data • Statistics data
  39. Analysis capabilities • Querying via views • Secondary indexes via views • Views based on Map - Reduce • Limited ad-hoc query capabilities
  40. Elasticsearch • Apache Lucene • Designed to be distributed • Schema free • Apache license • RESTful API
  41. Added value • Unstructured search • Structured search • Faceted search • Geo spatial search • Combine all • All in (near) real-time
  42. Replication: Read, Write, Read/Query, Java API, Hippo Delivery Tier, Couchbase Server Cluster, Elasticsearch Server Cluster, XDCR, Couchbase Transport plugin
  43. What’s Next?
  44. Advanced analytics
  45. { Demo }
  46. Thanks! j.reijn@onehippo.com @jreijn www.onehippo.com
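
The registration and matching vocabulary from slides 9 and 10 (collector, targeting data, characteristic, target group, persona) can be sketched as a small data model. This is an illustrative reading of the slide definitions, not Hippo's implementation: the lookup table standing in for a GeoIP database, the class names, and the rule that a persona matches only when all of its target groups match are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

TargetingData = Dict[str, str]

# Hypothetical stand-ins for a real GeoIP database and a city classification.
GEO_TABLE = {"203.0.113.7": "Amsterdam", "198.51.100.4": "Boston"}
EUROPEAN_CITIES = {"Amsterdam", "Barcelona"}


def geo_ip_collector(ip: str) -> TargetingData:
    """Collector: records data about a visitor (here, the city for an IP)."""
    return {"city": GEO_TABLE.get(ip, "unknown")}


@dataclass
class TargetGroup:
    """A specification of a characteristic, e.g. 'comes from a European city'."""
    name: str
    characteristic: str  # key in the targeting data, e.g. "city"
    matches: Callable[[str], bool]


@dataclass
class Persona:
    """One or more target groups that describe a certain type of visitor."""
    name: str
    target_groups: List[TargetGroup]


def matching_personas(data: TargetingData, personas: List[Persona]) -> List[str]:
    """Return the names of personas whose target groups all match the data.

    Assumption: every target group must match; the slides do not say how
    several target groups are combined.
    """
    return [
        persona.name
        for persona in personas
        if all(
            group.characteristic in data and group.matches(data[group.characteristic])
            for group in persona.target_groups
        )
    ]


european_city = TargetGroup(
    name="comes from a European city",
    characteristic="city",
    matches=lambda city: city in EUROPEAN_CITIES,
)
jim = Persona(name="Jim, the European urban consumer", target_groups=[european_city])
```

With this model, `matching_personas(geo_ip_collector("203.0.113.7"), [jim])` resolves the IP to Amsterdam and matches Jim, while an IP that resolves to Boston matches no persona.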