PayPal: Creating a Central Data Backbone: Couchbase to Couchbase to Kafka to Hadoop and Back: Couchbase Connect 2015

Running a site like PayPal requires both huge scale and heavy processing to support a complex, growing business. As the business has grown, managing the user information required to run PayPal services has become a challenge. Using Couchbase, the PayPal Data Service team provides fast access to user information at scale while streaming data into Hadoop. The solution processes millions of updates a minute and relies on Kafka's high throughput to move this user data into the Hadoop cluster.

Published in: Technology

  1. CREATING A CENTRAL DATA BACKBONE AT PAYPAL: COUCHBASE TO KAFKA TO HADOOP AND BACK. Shibi Sudhakaran | PayPal. Justin Michaels | Couchbase.
  2. ©2015 Couchbase Inc. Agenda • Define Problem Domain: Justin Michaels | Solution Architect, Couchbase • Use case at PayPal & Demo: Shibi Sudhakaran | Engineer, PayPal • Q&A
  3. Couchbase at PayPal. Footprint Overview • Seven use cases (more going live at a later date) • Each cluster is 10 to 20 nodes • Three data center locations per use case. Global Cookie Service • Three clusters (two handle traffic, one for DR) • Bi-Directional Replication • Billions of Documents • TB of Data (Maximum of 10 over time). Challenge • Data Analytics
  4. How do you analyze Couchbase data? • Couchbase Views • Sqoop • ElasticSearch • Stream Data
  5. Couchbase at PayPal. Couchbase Solution • Couchbase Server deployed to capture and serve global cookies • Integrates with Hadoop to pass data for additional offline analytics via Kafka. Results • Consistent low latency • SLA 10ms application • SLA 1ms Couchbase • High availability enabled by distributed cache and data center replication • Kafka integration for analytics within Hadoop cluster
  6. Couchbase < 3.0.3. Diagram labels: Couchbase Cluster; Query Service; View (Incremental Map Reduce); Data Service; node1; node8. Homogeneous Scaling • Each node gets a slice of the workload • Simple to do... but... • Workloads compete and interfere with each other • Can't fine-tune each workload • Core data operations are partitionable, so they work well with wider fan-out • Indexing and query are not always partitionable, so they work worse with wider fan-out
  7. Couchbase 4.0. Diagram labels: Couchbase Cluster; Index Service; Global Secondary Indexes; Query Service; Data Service; Views and GeoViews; node1; node8. Multi-Dimensional Scalability • Independent services for Query, Index and Data • Independent scalability for capacity per Service • Data access provided by distributed cache
  8. Couchbase 4.0. Diagram labels: Couchbase Cluster; node1; node8; node9; Data Service; Index Service; Query Service. • Heavier indexing (index more fields): add compute to index service nodes • Increased query load: linearly scale query service • More data: linearly scale data service
  9. Innovative leader in Payment • 165 Million Active Customers • > 100 payment currencies • 203 Available markets • 57 countries
  10. 2014 was a year of significant growth. • $235 Billion Net Total Payment Volume (up 26% YoY) • $8 Billion Revenue (up 19% YoY) • 19 Million New Active Digital Wallets • $168 Billion Merchant Services Payment Volume (up 34% YoY)
  11. Cookie • Size limitations • Consumers & Merchants • Overuse • Plain/encrypted/session/persistent
  12. © 2015 PayPal Inc. All rights reserved. Confidential and proprietary. The Fix: Cluster-aware Cookie
  13. Requirements for Database. Data volume/Scalability • Online system; >1B documents • 4-10k size; 5-10TB total storage • Linearly scalable. Availability • Multi data center, DR • Availability requirement of 99.99%. Data Structure • Flexible & schemaless; document based. Performance • 50% read/50% write • Low latency < 5-10 msec
  14. Couchbase Core Principles
  15. Functional View. Diagram labels: Cookie Application; Front Tier; Customers; Applications (C++, Node, Java); Cookie Libraries; Mid Tier; Data Tier; Couchbase Client; CB Kafka Adapter
  16. Deployment Model. Diagram labels: Cookie App (three instances); XDCR; Active; Write; Read; Bi-directional; Uni-directional; Active; Passive
  17. Cluster Overview
  18. Analyze Cookie data
  19. Couchbase TAP • Snapshot entire database • Export future mutations • TAP observes data changes in the memcached server. Kafka • A high-throughput distributed messaging system. Couchbase Kafka Adapter • Based on Couchbase TAP & Kafka Producer. Kafka Producer • Fast • Scalable • Durable • Distributed. https://github.com/paypal/couchbasekafka
  20. Stream data out of database. https://github.com/paypal/couchbasekafka Diagram labels: Camus, MR Jobs; TAP Stream; Couchbase Kafka Adapter {TAPClient + Kafka Producer}; steps [1] [2] [3] [4] [5] [6] [7]
  21. Data Partitions: Map Couchbase Partitions to Kafka Partitions for Total Ordering
  22. Demo … We will supply a link separately
  23. Monitoring
  24. THANK YOU. Shibi Sudhakaran • Twitter: @s007 • https://linkedin.com/in/shibisudhakaran. Justin Michaels • Twitter: @justindmichaels • https://linkedin.com/in/justindmichaels
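
The Data Partitions slide calls for mapping Couchbase partitions to Kafka partitions to preserve ordering. The slides do not show how the adapter does this, so the following is only a minimal sketch of one possible mapping, not the code in the couchbasekafka repository. It assumes each mutation arrives tagged with its Couchbase partition (vBucket) id and that a simple modulo assignment is acceptable; the constants, function names and tuple layout are all hypothetical. It relies on the fact that Kafka guarantees message order only within a single partition.

```python
from collections import defaultdict

# Assumption: 1024 vBuckets (the usual Couchbase default) and a
# hypothetical Kafka topic with 16 partitions.
NUM_VBUCKETS = 1024
NUM_KAFKA_PARTITIONS = 16


def kafka_partition_for(vbucket_id, num_kafka_partitions=NUM_KAFKA_PARTITIONS):
    """Pin a Couchbase vBucket to exactly one Kafka partition.

    Because the mapping is a pure function of the vBucket id, every
    mutation from one vBucket lands in the same Kafka partition, so the
    order in which that vBucket emitted its mutations is kept.
    """
    if not 0 <= vbucket_id < NUM_VBUCKETS:
        raise ValueError(f"vbucket_id out of range: {vbucket_id}")
    return vbucket_id % num_kafka_partitions


def route_mutations(mutations, num_kafka_partitions=NUM_KAFKA_PARTITIONS):
    """Group (vbucket_id, sequence, key) tuples by target Kafka partition.

    Stands in for the producer's send step: appending in arrival order
    mirrors how a single producer writes to a partition log.
    """
    partitions = defaultdict(list)
    for vbucket_id, sequence, key in mutations:
        target = kafka_partition_for(vbucket_id, num_kafka_partitions)
        partitions[target].append((vbucket_id, sequence, key))
    return dict(partitions)


if __name__ == "__main__":
    stream = [(17, 1, "cookie::a"), (33, 1, "cookie::b"), (17, 2, "cookie::a")]
    for partition, batch in sorted(route_mutations(stream).items()):
        print(partition, batch)
```

With this scheme, several vBuckets share a Kafka partition, and a consumer reading that partition sees each vBucket's mutations in their original sequence. Ordering across different Kafka partitions is not guaranteed by Kafka itself.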
