IGT Cloud Summit 13 - Couchbase - Matt Ingenthron
  • Usually when people talk about Big Data, they mean capturing huge amounts of data and analyzing it, and that is certainly a big trend. But Big Data also affects operational databases in a big way, for a different set of reasons. Two aspects of Big Data are pushing people toward NoSQL technologies. The first is that the vast majority of the increase in data is unstructured or semi-structured: user-generated content such as consumer recommendations, and machine-generated data such as log files and website click data. Relational databases aren’t well suited to storing this type of data, while NoSQL technologies such as document-oriented databases are ideally suited to it. The second is that application developers keep finding new types of data they want to store. It might be new information in a user’s account profile, new logging information, and so on. Both what developers want to store and how much of it they want to store are changing very rapidly, so developers want a flexible data model that they can evolve quickly. Relational databases have fixed schemas that often take weeks or months to change. NoSQL databases, by contrast, are schema-less, so you can far more easily add new types of data and iterate quickly on your application.
  • Big Users is an equally big trend driving developers to use NoSQL databases. Most new applications are made available over the internet so people can easily access them. This has caused the number of simultaneous users for many applications to explode. More than 2 billion people are connected to the internet, and that number is growing rapidly. The number of hours the average user spends on the internet is growing too, which further increases the number of simultaneous users. And with the proliferation of smartphones, people use their applications more and more frequently, increasing it again. All these simultaneous users lead to a rapidly growing number of database operations and the need for a far easier way to scale your database to meet these demands.
  • Finally, the move to cloud computing and SaaS business models is also driving developers to consider NoSQL databases. Fifteen years ago, most applications were developed with a client/server architecture and a packaged software business model that supported the needs of users on a company-by-company basis. Today, applications are increasingly developed using a 3-tier internet architecture, are cloud-based, and use a Software-as-a-Service business model that needs to support the collective needs of thousands of customers. This approach increasingly requires a horizontally scalable architecture that scales easily with the number of users and the amount of data your application has.
  • To summarize, the technology implications of the Big Data, Big Users, and Cloud Computing megatrends are causing people to seriously rethink which database they use for their applications, and they are increasingly concluding that NoSQL databases are a better fit than relational databases.
  • The first companies to run into the scaling, performance, and data model flexibility problems I described were companies like Google, Amazon, Facebook, and LinkedIn. Since no solutions were available and they had deep technical expertise, they built their own databases to solve these problems. Google published a paper on Bigtable in 2006; Amazon published a paper on Dynamo in 2007; Facebook and LinkedIn likewise published papers on their technologies. This was the symbolic beginning of the NoSQL industry. Since then, many other open source projects have been started that were heavily influenced by these technologies. Shortly thereafter, companies formed around the open source projects. And now we have a thriving, rapidly growing industry that is disrupting the relational database industry.
  • In this simple example we have 18 documents that have been spread across 3 commodity servers. The application running on the app server uses a client library to access the database. A cluster map shows the app which server to go to in order to read or write a particular document. Since commodity servers are expected to fail from time to time, a replica of each document is also stored. Each server stores both active and replica copies of documents.
  • Now let’s say the number of users has increased significantly and you need to add 2 more servers to your cluster to support the increased number of database reads and writes. First, you start up two servers and load the software stack on them. Next, you simply click one button to join the new servers into the cluster. Active and replica documents are then automatically rebalanced across the larger cluster. The cluster map is then automatically updated to reflect where the active documents are now stored. Database reads and writes are now distributed over the larger cluster and performance is maintained.
  • Node failure is inevitable and expected. If a node fails, the cluster automatically detects this and fails the node over. Replica documents on other servers are then promoted to “active” status so the application can access them on working servers. A rebalance across the smaller number of nodes will typically follow. Keep in mind that all of this happens without anything needing to change in the application. With the few minutes we have left, let me give you a real-life example of how NoSQL performs.
  • I’m sure the Couchbase guys like the next two slides. You can see a snapshot of our comparison in this slide. Both scale, are document stores, and are searchable. I think the most important thing is the high throughput on reads and, even more important, on writes and updates. Cross-data-center replication was also an important factor for us, given our architecture.
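
The cluster map, rebalance, and failover notes above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Couchbase's implementation: the partition count, the CRC32 hash, the round-robin placement, and every function name here are hypothetical choices made for illustration. In particular, rebuilding the map round-robin does not attempt the "minimum doc movement" the slides describe.

```python
import zlib

NUM_PARTITIONS = 64  # illustrative assumption, not a figure from the talk


def partition_for(key):
    """Hash a document key to a fixed partition number."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_cluster_map(servers):
    """Give each partition one active server and one replica on a different
    server (needs at least two servers). Calling this again with a longer
    server list stands in for a rebalance."""
    n = len(servers)
    return {
        p: {"active": servers[p % n], "replica": servers[(p + 1) % n]}
        for p in range(NUM_PARTITIONS)
    }


def server_for(cluster_map, key):
    """What a client library would do: find the active server for a key."""
    return cluster_map[partition_for(key)]["active"]


def fail_over(cluster_map, failed):
    """Return a new map with the failed server removed. Partitions whose
    active copy lived there get their replica promoted to active."""
    new_map = {}
    for p, entry in cluster_map.items():
        active, replica = entry["active"], entry["replica"]
        if active == failed:
            active, replica = replica, None  # promote the replica
        elif replica == failed:
            replica = None  # replica lost until the next rebalance
        new_map[p] = {"active": active, "replica": replica}
    return new_map


if __name__ == "__main__":
    three = build_cluster_map(["server1", "server2", "server3"])
    print("doc5 is active on", server_for(three, "doc5"))

    # Add two servers, then lose one; the application code is unchanged.
    five = build_cluster_map(
        ["server1", "server2", "server3", "server4", "server5"]
    )
    degraded = fail_over(five, "server3")
    print("doc5 is now active on", server_for(degraded, "doc5"))
```

The application only ever calls `server_for`, which is why adding nodes or failing one over needs no application change: only the map it consults is replaced.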

Presentation Transcript

  • 1. Trends in NoSQL Database Technology. Matt Ingenthron, Director, Developer Solutions
  • 2. Big Data: High Data Variety and Velocity. [Chart: trillions of gigabytes (zettabytes), 2000 to 2011, comparing structured data with unstructured and semi-structured data (text, log files, click streams, blogs, tweets, audio, video, etc.).] Source: IDC 2011 Digital Universe Study (http://www.emc.com/collateral/demos/microsites/emc-digital-universe-2011/index.htm). More Flexible Data Model Required
  • 3. Big Users: More Users, All the Time. 2+ billion global online population; 35 billion hours spent online; 1+ billion smartphone users. Sources: http://www.go-gulf.com/blog/online-time and http://business.time.com/2012/02/14/one-billion-smartphones-by-2016-here-comes-the-mobile-arms-race/. More Easily Scalable Database Required
  • 4. SaaS/Cloud Computing: New Apps Use 3-Tier Internet Architecture. [Diagram labels. Old client/server architecture: client (user interface, business and data processing logic), LAN, database server (data). New 3-tier internet architecture: client (user interface), internet, public/private/hybrid cloud, web/app server tier, database server tier.] Horizontally Scalable Database Fits Cloud Computing Well
  • 5. SaaS/Cloud Computing: Business Model Also Driving Move to Cloud Computing. [Diagram labels. Old client/server architecture: client (user interface, business and data processing logic), LAN, database server (data). New Software-as-a-Service 3-tier internet architecture: client (user interface), internet, SaaS cloud (public/private/hybrid), web/app server tier (business and data processing logic), database server tier (data).]
  • 6. Interactive Apps/Cloud Computing + More Users + More Data: NoSQL for Scalability, Performance, Ease of Development
  • 7. Lacking Solutions, Users Forced to Invent: Bigtable (November 2006), Dynamo (October 2007), Cassandra (August 2008), Voldemort (February 2009). Few Organizations Can Build & Maintain Database Technology
  • 8. How a NoSQL Database Works
  • 9. Basic Operation. [Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, send read/write/update requests to a Couchbase Server cluster of Servers 1 to 3; each server holds active docs and replica docs; user-configured replica count = 1.] Docs distributed evenly across servers. Each server stores both active and replica docs; only one server active at a time. Client library provides app with simple interface to database. Cluster map provides map to which server doc is on; app never needs to know. App reads, writes, updates docs. Multiple app servers can access same document at same time.
  • 10. Add Nodes to Cluster. [Diagram: the same two app servers and a Couchbase Server cluster grown to Servers 1 to 5, each holding active and replica docs; user-configured replica count = 1.] Two servers added in a one-click operation. Docs automatically rebalanced across cluster: even distribution of docs, minimum doc movement. Cluster map updated. App database calls now distributed over larger number of servers.
  • 11. Fail Over Node. [Diagram: the same two app servers and the Couchbase Server cluster of Servers 1 to 5, each holding active and replica docs; user-configured replica count = 1.] App servers accessing docs. Requests to Server 3 fail. Cluster detects server failed, promotes replicas of docs to active, and updates cluster map. Requests for docs now go to appropriate server. Typically rebalance would follow.
  • 12. NoSQL Use Cases
  • 13. Common Use Cases. Social Gaming: Couchbase stores player and game data; example customers include Tencent, Tapjoy, Ubisoft. Mobile Apps: Couchbase stores user info and app content; example customers include Kobo, Playtika. Ad Targeting: Couchbase stores user information for fast access; example customers include AOL, Mediamind, Convertro. Session Store: Couchbase Server as a key-value store; example customers include PayPal, LivePerson, Sabre. User Profile Store: Couchbase Server as a key-value store; example customers include Concur. High Availability Cache: Couchbase Server used as a cache tier replacement; example customers include Orbitz. Content & Metadata Store: Couchbase document store with Elasticsearch; example customers include McGraw Hill. 3rd Party Data Aggregation: Couchbase stores social media and data feeds; example customers include PayPal, AOL.
  • 15. Use Case: Session Store. Data Stored in Couchbase: session values or cookies (stored as key-value pairs); examples include items in a shopping cart, flights selected, search results, etc. Application Requirements: extremely fast access to session data using unique session ID; easy scalability to handle fast-growing number of users and user-generated data; always-on functionality for global user base. Why NoSQL and Couchbase: sub-millisecond latency with consistently high read/write throughput for session data via the built-in object-level cache; linear throughput scalability to grow the database as user and data volume grow; always-on operations with high availability using Couchbase replication and failover; intra-cluster and cross-cluster (XDCR) replication for globally distributed active-active platform.
  • 16. LivePerson: Creating a Real Time Large Scale Analytics Platform with Couchbase
  • 17. Optimize Customer Acquisition & Reduce Bounce Rate. Live engagement for lingering customer. Rich multimedia to drive sales closure.
  • 18. Data at LivePerson. Volume: 13 TB per month; 20 M engagements per month; 1.8 B visits per month.
  • 19. Started with this stack. [Diagram labels: monitoring, chat/voice system, Apache Kafka, batch track, real-time track, perpetual store, complex event processing, Storm, analytical DB, RT repository, business intelligence, Cassandra, LiveEngage, dashboard.]
  • 20. The story: once upon a time. [Diagram labels: agents console (Java app), web tier, visitor’s events, visitors.]
  • 21. The story continues. [Diagram labels: web, agent, data center 1, Kafka & Storm (event bus), data center 2.]
  • 22. Current LivePerson Architecture. [Diagram labels: REST API, application server (Tomcat) with Couchbase Java SDK, two clusters linked by XDCR, M/R views on each cluster, a Storm topology with Couchbase Java SDK per cluster.]
  • 23. LivePerson Data Flow. [Diagram components: Agent, Visitor, Visitor Monitoring Service, Kafka, Storm Topology, Couchbase, Visitor Feed, Visitor Feed API.] (1) Visitor browsing. (2) Visitor events. (3) Analyze relevant events (Storm Topology). (4) Write event to visitor document. (5) Get visitors list every 3 sec. (6) Return relevant visitors. (7) Return relevant visitors.
  • 24. Couchbase in Production
  • 25. The new LivePerson stack: data stack now with Couchbase. [Diagram labels: monitoring, chat/voice system, Apache Kafka, batch track, real-time track, perpetual store, complex event processing, Storm, analytical DB, RT repository, business intelligence, Cassandra, LiveEngage, dashboard.]
  • 26. Couchbase at LivePerson: A Strategic Choice. Visitor session state; cross-session state; fast read/write persistent store; generic caching layer (Memcached style).
  • 27. Thanks! Matt Ingenthron, @ingenthr, matt@couchbase.com. Israel partner: Zaponet, Eli Alafi, eli.alafi@zaponet.com
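
The session store use case on slide 15 (session values stored as key-value pairs and fetched by a unique session ID) can be illustrated with a minimal in-memory sketch. It is a stand-in under stated assumptions, not the Couchbase SDK: the `SessionStore` class, its method names, and the expiry (TTL) behavior are all hypothetical and are not described in the talk.

```python
import time


class SessionStore:
    """Toy key-value session store keyed by session ID.

    A plain dict stands in for the database. The time-to-live is an
    assumption added for illustration, not a detail from the slides.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._data = {}      # session_id -> (expires_at, value)

    def put(self, session_id, value):
        """Store or overwrite the session value and restart its TTL."""
        self._data[session_id] = (self._clock() + self._ttl, value)

    def get(self, session_id):
        """Return the session value, or None if missing or expired."""
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[session_id]  # drop the expired session
            return None
        return value


if __name__ == "__main__":
    store = SessionStore(ttl_seconds=1800)
    store.put("session-123", {"cart": ["sku-1", "sku-2"]})
    print(store.get("session-123"))
```

Every access is a single lookup by key, which is the access pattern that lets a key-value store keep session reads and writes fast as the number of users grows.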