Rolling Out Apache HBase for Mobile Offerings at Visa

Partha Saha and CW Chung (Visa)

Visa has embarked on an ambitious, multi-year redesign of the entire data platform that powers its business. As part of this plan, the Apache Hadoop ecosystem, including HBase, is becoming a staple in many of its solutions. Here we describe our journey in rolling out a high-availability NoSQL solution based on HBase behind some of our prominent mobile offerings.



  1. Rolling Out Apache HBase for Mobile Offerings at Visa (HBaseCon 2016, May 24, 2016). Partha Saha (pasaha@visa.com), CW Chung (cchung@visa.com)
  2. What this talk is about – a choice of NoSQL at Visa
     • Scale: over 100 billion rows, as history from most recent
     • Speed: millisecond response times for writes and reads
     • Real-time: data loaded in real time
  3. An example of a mobile offering: add a card to the wallet, pay for a purchase, and see your transaction right away along with recent history. That last step is where NoSQL is needed.
  4. We chose HBase as our NoSQL solution, built a scalable, real-time Transaction History Service on it, and migrated prominent mobile wallet offerings to the service. This talk is about our learnings over the last year.
  5. This talk…
     1. We assume some knowledge of and familiarity with HBase.
     2. We used HBase 1.0.0 with Cloudera Distribution CDH 5.4.3, so our observations are based on that version of HBase.
     3. We cover the important learning events along the way of adopting HBase at Visa:
        – These can help new teams adopting HBase avoid the same pitfalls.
        – Our learning continues as we take on more interesting and challenging opportunities.
  6. Is YCSB a good way to compare NoSQL options?
  7. It is actually not…
     • Unless you know how to configure each NoSQL option for optimal performance, you may be driven to another solution because its performance seems “smoother” and easier to explain with rudimentary knowledge.
     • It is, however, a great tool for observing how system configuration changes performance, and for exploring the configuration space for various workloads.
     [Two throughput-over-time charts omitted]
  8. Our YCSB experience…
     • Very easy to set up!
     • Got a baseline of HBase performance for the cluster; rerun after significant configuration and application code changes.
     • Key parameters used:
        – # of client threads
        – # of operations
        – # of records in the data set
        – Workload mix of read/update/insert (we added 100% insert and 100% update workloads)
        – A bash driver script to test various combinations of parameters
     • Latency measurement type can be histogram or timeseries; both were useful.
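As an illustration of the kind of commands such a bash driver script loops over, here is a minimal YCSB load/run pair against the HBase 1.0 binding. The workload file, table name, column family, record counts, and thread count are hypothetical examples, not Visa's actual settings.

```
# Load phase, then run phase, against the hbase10 binding (HBase 1.0 client).
bin/ycsb load hbase10 -P workloads/workloada -threads 64 \
    -p table=usertable -p columnfamily=cf -p recordcount=100000000
bin/ycsb run hbase10 -P workloads/workloada -threads 64 \
    -p operationcount=10000000 -p measurementtype=timeseries
```

Switching `measurementtype` between `histogram` and `timeseries` gives the two latency views mentioned above.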
  9. Should you design yourself out of major compactions?
  10. Not worth the trouble when you are starting…
     • An argument can be made that if we need an “N”-day rolling look-back, we can create daily tables ahead of the window and delete them once they fall out of it, and then reason about how to compact each daily file. Would that make the system operate better?
     • Write amplification is a well-known problem and gets a lot of attention, but worrying about it during the early design stages seemed like premature optimization.
     • We thought we could always optimize later, through rolling compactions and the diurnal patterns of traffic, once the patterns of reads and writes were fully understood.
  11. Does your design need transactional support?
  12. We analyzed our secondary- and primary-key reads and writes.

      Fact table (query keys for facts)    Associations table (register associations)
      Primary key                          Secondary key
      pk1                                  sk1 → {pk1}
      pk2                                  sk2 → {pk1, pk2}

     • By tracing reads and failures through updates, we concluded that inconsistencies were short-lived.
     • We would have used a transaction-support library otherwise.
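To make the two-table read path concrete, here is a minimal in-memory sketch (plain dictionaries standing in for the two HBase tables; all names and values are hypothetical, and this is not the HBase API). It shows why the observable inconsistency is a temporarily missing row rather than a wrong one: an association that points at a not-yet-written fact is simply skipped.

```python
# Hypothetical in-memory stand-ins for the Associations and Fact tables.
associations = {"sk1": {"pk1"}, "sk2": {"pk1", "pk2"}}
facts = {"pk1": {"amount": 100}, "pk2": {"amount": 250}}

def query_by_secondary_key(sk):
    """Resolve a secondary key to fact rows. A dangling primary key (an
    association registered before its fact lands) is skipped, so a reader
    sees a short-lived gap, never a wrong row."""
    rows = []
    for pk in sorted(associations.get(sk, ())):
        fact = facts.get(pk)
        if fact is not None:   # fact not written yet: short-lived inconsistency
            rows.append((pk, fact))
    return rows

rows = query_by_secondary_key("sk2")   # both facts already present
```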
  13. How do you learn HBase hands-on without going into production?
  14. We built a continuous integration and learning environment: a Bamboo plan on the build server checks code out of git/Stash, builds it, uploads artifacts to Artifactory, deploys to clients via Chef, and runs tests against a test server. [Architecture diagram omitted]
  15. How do you get Operations ready for HBase in production?
  16. We allocated one developer for one day per week to monitor production problems…
     1. We shadowed the real production system.
     2. Any production problem was given priority by the whole team.
     3. We used two sites (Foster City, CA, USA and Bangalore, India) for 24x7 eyes.
     4. We added alerting and monitoring dashboards.
     5. We launched only when we met certain metrics.
  17. Loading data in real time as it is read
  18. We used a micro-batch approach (250 ms micro-batches). A Loader Master (one per stream) tails each stream and hands work over IPC to 1..N Loader Workers, whose batch processors (LLF Reader or Stream Reader) load HBase; in parallel, a Notification Master and its Workers query HBase registrations and send notifications. Control and state files record the reads and writes, and we had to build an approach to remember and retry from any point in each stream. [Architecture diagram omitted]
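The remember-and-retry idea behind the micro-batches can be sketched as follows. This is a minimal in-memory model with hypothetical names, not Visa's loader: each cycle drains up to a batch of records from the committed stream position, flushes them to the sink in one bulk write, and advances the committed offset only after the flush, so a restart retries from the last committed point.

```python
class MicroBatchLoader:
    def __init__(self, sink, max_batch=1000):
        self.sink = sink              # stand-in for a bulk HBase writer
        self.max_batch = max_batch
        self.committed_offset = 0     # kept in a durable state file in a real system

    def run_once(self, stream):
        """Process one micro-batch (a driver loop would call this every ~250 ms)."""
        batch = stream[self.committed_offset:self.committed_offset + self.max_batch]
        if batch:
            self.sink.extend(batch)               # one bulk write per micro-batch
            self.committed_offset += len(batch)   # commit only after the flush succeeds
        return len(batch)

sink = []
loader = MicroBatchLoader(sink, max_batch=2)
stream = ["txn1", "txn2", "txn3"]
loader.run_once(stream)   # flushes txn1, txn2
loader.run_once(stream)   # flushes txn3
```

If the process dies between the flush and the offset commit, the batch is replayed, so writes must be idempotent, which HBase puts keyed by transaction are.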
  19. Reading via web servers
  20. The web-services front end: a gateway and load balancer routes requests to web servers whose REST controllers (request/response transforms, domain objects) call business components and data services; cross-cutting pieces include access authorization, encryption, audit (audit listener, audit DB, MQ), a config service, caching, load-distribution plugins, and subscription and failover services; the data service reaches the HBase cluster through an HBase plugin over the HBase API. [Architecture diagram omitted]
  21. Availability
  22. We used two data centers for availability, replicating non-native streams between them. When one data center is down, the other writes on its behalf to shadow tables, and drains the shadow tables so the recovered data center can catch up.
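A minimal in-memory model of that shadow-table pattern (all names hypothetical; real tables would live in HBase): each site writes locally and to its peer, buffers the peer's writes in a local shadow table while the peer is down, and replays the shadow when the peer recovers.

```python
class Site:
    def __init__(self, name):
        self.name = name
        self.table = {}    # the site's own data
        self.shadow = {}   # writes destined for a peer that is down
        self.up = True

def replicated_write(local, peer, key, value):
    """Write locally; mirror to the peer, or buffer in the shadow if it is down."""
    local.table[key] = value
    if peer.up:
        peer.table[key] = value
    else:
        local.shadow[key] = value   # held for later catch-up

def drain_shadow(local, peer):
    """When the peer comes back, replay buffered writes and clear the shadow."""
    peer.table.update(local.shadow)
    local.shadow.clear()

dc1, dc2 = Site("dc1"), Site("dc2")
dc2.up = False
replicated_write(dc1, dc2, "txn1", {"amt": 42})   # dc2 down: goes to dc1's shadow
dc2.up = True
drain_shadow(dc1, dc2)                            # dc2 catches up
```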
  23. Learning your data center clock
  24. HBase is sensitive to clock skew…
     • Kerberos services do not tolerate more than a few minutes of clock skew.
     • Warnings are generated for small skews; large skews kill region servers.
  25. Client retries
  26. Client retries & IOExceptions
     • Default HBase timeout/retry settings can take tens of minutes to time out:
        – hbase.rpc.timeout: 60 sec
        – hbase.client.retries.number: 35
        – hbase.client.pause: 100 msec (grows to 10 sec quickly after back-off)
        – Longer when you factor in potential retries by ZooKeeper!
        – See the blogs by Lars Hofhansl: “HBase Client timeouts” and “HBase client response times”.
     • We chose a fail-fast strategy, as the end-user device will do an end-to-end retry.
     • Our timeout/retry settings: 1 sec timeout, 3 total tries.
        – Works well within the same data center, as well as across data centers.
     • However, once in a while, clients see IOExceptions!
        – Caused by the region server (busy in GC, major/minor compaction, …?)
        – Or the network?
        – Or the client itself?
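A fail-fast configuration like the one above would be expressed in the client-side hbase-site.xml roughly as follows. This is a sketch: the values come from the slide, but whether "3 total tries" means a retry count of 3 or 2 depends on how your client version counts retries, so verify against your cluster.

```xml
<!-- Client-side hbase-site.xml fragment: fail fast, let the device retry end-to-end -->
<property>
  <name>hbase.rpc.timeout</name>
  <value>1000</value>   <!-- 1 sec per RPC -->
</property>
<property>
  <name>hbase.client.retries.number</name>
  <value>3</value>
</property>
<property>
  <name>hbase.client.pause</name>
  <value>100</value>    <!-- msec between tries, before back-off -->
</property>
```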
  27. Correlating client exceptions
  28. Correlating client exceptions
     • Client side:
        – Turn on HBase client debugging:
          • log4j.logger.org.apache.hadoop.hbase.client=DEBUG
          • log4j.logger.org.apache.hadoop.hbase.ipc=DEBUG
        – Catch the exceptions to print the specific region server name:
          • IOException, RetriesExhaustedWithDetailsException
     • Server side:
        – Then look into that specific region server's log.
          • Works well when you know the specific server causing the IOExceptions. What if you don't?
  29. Correlating client exceptions
     • Build root-cause-analysis software to:
        – Collect the relevant logs from the sources:
          • Client: application logs, HBase client logs, GC logs
          • Hadoop servers: HBase, HDFS, ZooKeeper server and GC logs
          • Cluster events: Cloudera Manager API
          • Other logs: KDC logs, Kerberos canary, network latency monitoring
        – Parse the logs (single-line and multi-line text, JSON, XML) into CSV files.
        – Normalize date and time formats; apply date and time range filtering.
        – Apply text filtering and text reduction on verbose lines.
        – Output: an events CSV, sorted by time and server, suitable for grep/awk/sort or Hive/SQL.
     • Quickly get a total view of the sequence of events across the various services.
     • Sometimes it can identify the smoking gun (e.g., an exception caused by GC).
     • Still useful in the few cases when no smoking gun can be found: trouble-shooting is also a process of elimination.
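The core correlation step can be sketched in a few lines (the log lines and field layout here are hypothetical, not Visa's formats): parse each source's lines into (timestamp, server, source, message) events with one normalized time format, then sort so the sequence across services reads top to bottom.

```python
from datetime import datetime

def parse_line(source, server, line, fmt="%Y-%m-%dT%H:%M:%S"):
    """Split one log line into a normalized event tuple; each source would
    supply its own timestamp format via `fmt`."""
    ts_str, _, msg = line.partition(" ")
    ts = datetime.strptime(ts_str, fmt)
    return (ts.isoformat(), server, source, msg)

events = [
    parse_line("client", "app1", "2016-05-24T10:00:03 IOException on get"),
    parse_line("gc",     "rs1",  "2016-05-24T10:00:02 pause 450ms"),
    parse_line("hbase",  "rs1",  "2016-05-24T10:00:01 compaction start"),
]
events.sort()   # one time-ordered view across all services
```

In the sorted view the region server's compaction and GC pause line up just before the client's IOException, which is exactly the kind of smoking gun the slide describes.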
  30. Kerberos gotchas
  31. Kerberos gotchas – what we have learned
     • Use FQDNs (fully qualified domain names, like server123.abc.com) for hostnames.
     • Use TCP rather than UDP (set udp_preference_limit = 1 in krb5.conf).
     • KDC (MIT Kerberos) server:
        – Configure it to start several kdc processes to handle bursty traffic (use the -w option).
        – Set up a backup KDC for higher availability.
     • Debugging tips:
        – $ export KRB5_TRACE=/dev/stderr (or to a file)
        – JVM flag: -Dsun.security.krb5.debug=true
     • Kerberos support is built into the Java JRE, using internal classes:
        – Oracle JDK: com.sun classes; on IBM AIX: com.ibm.
        – Hadoop is built and tested against the Oracle JDK (mileage on the AIX JDK varies).
     • Good references (besides the usual Kerberos documents and the HBase user mailing list):
        – Steve Loughran: Hadoop and Kerberos: The Madness Beyond the Gate.
        – HBase and Hadoop Common source code: UserGroupInformation.java.
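The TCP-over-UDP setting mentioned above lives in krb5.conf; a minimal fragment:

```
# /etc/krb5.conf fragment: prefer TCP for all KDC traffic
[libdefaults]
    udp_preference_limit = 1
```

Setting the limit to 1 forces TCP for any message larger than one byte, i.e. effectively always.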
  32. Kerberos gotchas – what we learned – renewing a TGT (ticket-granting ticket)
     • After a successful kinit, the application principal gets a Kerberos TGT.
     • By default, the TGT is good for 10 hours.
     • For long-running applications, 10 hours is obviously not enough: the TGT must be renewed.
     • We initially used a process/thread to do a kinit once every few hours.
        – We still ran into some IOExceptions at the time of TGT renewal.
        – Not the recommended way for long-running applications.
     • We now use the UGI API (UserGroupInformation): loginUserFromKeytab().
        – Does not require a separate process/thread to do TGT renewal.
        – The Hadoop/HBase client library will catch the exception caused by TGT expiration and do a reloginFromKeytab() to renew the TGT automatically.
        – We are also considering spawning a thread to proactively invoke checkTGTAndReloginFromKeytab().
        – Ongoing investigation: clients occasionally still experience a momentary IOException around the time of ticket renewal.
     • Referral tickets: when one realm is set up to trust another, be aware of the additional KDC calls that result when the kinit principal is from the trusted realm.
  33. Garbage collection
  34. Garbage collection
     • Use G1 on Oracle JDK 1.8.
     • We basically use the settings recommended by Eric Kaczmarek, Yanping Wang, and Liqi Yi at HBaseCon 2015.
     • Set the target GC pause to 100 msec and the young generation to ~1 GB.
     • Our observations are consistent with their published results; observed GC times in production:
        – 100 msec or less: 67%
        – 400 msec or less: 99.98%
     • It is important to track actual production GC times, as the production and test clusters show somewhat different distributions.
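The settings described above correspond roughly to these region-server JVM flags. This is a sketch: the pause target and young-gen size are from the slide, and note that pinning the young generation with -Xmn constrains G1's ability to adapt toward the pause target, so measure before adopting.

```
-XX:+UseG1GC
-XX:MaxGCPauseMillis=100
-Xmn1g
```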
  35. GC duration comparison: production vs. perf cluster
  36. GC: how good is MaxGCPauseMillis as a target? (MaxGCPauseMillis = 100)

      Metric                                       Production cluster (GC in msec)   Test cluster (GC in msec)
      # of GC events                               165,192                           199,883
      Avg / std dev / max                          87.1 / 64.9 / 1530                81.9 / 37.2 / 1370
      50th percentile (median)                     80                                90
      95th / 99th / 99.9th / 99.99th percentile    210 / 270 / 450 / 660             120 / 140 / 510 / 780
      % of events ≤ 100 / 200 / 300 / 400 msec     67% / 95% / 99.4% / 99.8%         85% / 99.4% / 99.6% / 99.8%
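Percentiles like those in the table above can be tracked from parsed GC-log pause durations with a few lines; the sample values below are made up for illustration, and the point is simply to compare observed tail pauses against the MaxGCPauseMillis target.

```python
def percentile(values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of
    the samples at or below it."""
    s = sorted(values)
    rank = max(1, min(len(s), round(p / 100.0 * len(s))))
    return s[rank - 1]

pauses_ms = [80, 85, 88, 90, 92, 95, 100, 120, 210, 450]  # hypothetical GC-log sample
p50 = percentile(pauses_ms, 50)   # median pause
p95 = percentile(pauses_ms, 95)   # tail pause to compare with the 100 ms target
```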
  37. In conclusion…
  38. Adopting an open-source product is a journey…
     • Learning from previous adoption successes is crucial: if your use case has not been tried, analyzed, or written about before, chances are you will have to pay for the learning, and having alternate choices is a good idea.
     • Making only one major technology change at a time is always a good idea.
     • Setting appropriate expectations with team members and through agile processes is important.
     • Going to production early as a shadow, and learning through frequent releases, is helpful.
     • We believe extra capacity for peak workloads was very helpful.
     • Having the source code is very useful for learning and trouble-shooting.
  39. It Takes a Village! Thank you! Alexandr Peyko, Amit Sharma, Anthony Chu, Arindam Chakraborty, Artem Savinov, Aviral Agarwal, Bala Saravanan Kannan, Ben Crane, Carl Duque, Chetan Talanki, Debasis Mullick, Deepankar Palit, Hong Zhu, Igor Karpenko, Igor Peller, Igor Ulianitski, Jay Gardner, Jim Gordon, Karthikeyan Manickavasagan, Liang Gao, Murali Reddy, Nandakumar Jayakumar, Nimish Shah, Peter Meigs, Pradyot Sikdar, Praveen Rudraraju, Rajat Raj, Raj Merchia, Ralph Blore, Ranjan Dutta, Ricardo De Ocampo Domingo, Robert Walsh, Sabu Peter, Sam Hamilton, Sandeep Reddy, Satyaban Nandi, Soumya Das, Srijoy Aditya, Srinivas Reddy Surasani, Suchismita Nayak, Suresh Pulikara, Ujjwal Kumar, Vikash Talanki, Vinay Sarda, Waqar Hasan, Winnie Chau, Xuepeng (Hans) Li, Yanyan Hao, Yusuf Rahaman, Amandeep Khurana, Jeongho Park, Jugoslav Djajic, Justin Hayes, Michael Stack
