ZaloPay Merchant Platform on K8S on-premise

•Download as PPT, PDF•

15 likes•2,454 views

ZaloPay Merchant Platform deployed Kubernetes on-premise for its microservices. It faced issues with node scaling breaking production, multi-interface configuration, and node joining. Two nodes failed causing downtime and pressure to shutdown. Lessons learned include practicing more, doing chaos engineering, and getting business support for new technologies. Next steps are upgrading components, improving monitoring, and applying ingress and persistence solutions.

Internet

ZaloPay Merchant Platform on K8S on-premise
Incident & Lesson Learned
ThanhCNN
Head of MEP

• About me
• Why k8s on premise?
• MEP K8S stack
• Issues
• Lesson learned
• Next step
Contents

- M.Sc Uni Duisburg, Germany
- Tech Lead:
- ZingMe
- CSMBoot, CSMPlay, CSM
- GBC
- IoT Lab
- ZaloPay Merchant Platform
(MEP)
- K8S newbie
Me

Why K8S
• Trend: micro service,
container
• Orchestrating across hosts
• Easy to manage and scale
app
• Save cost !!!
Source: https://x-team.com/blog/introduction-kubernetes-architecture

Why on premise
• Fintech -> secure data
• Save cost !!!

MEP K8S Stack
• Deploy Architect
• Load balancing
• Access internet
• Storage
• DB
• File
• CI/CD
• Log Collect
• Tracing
• Monitor & Alert

Load balancing
• Why do we need proxy ?
• Proxy mode
• User space proxy mode (from v1.0)
• IPTables proxy mode (from v1.1)
• IPVS proxy mode (from v1.2)

Load balancing
• Service type:
• ClusterIP
• NodePort
• LoadBalancer => our choice
• Using MetalLB layer 2 (ARP)

Access Internet
• Using HTTP(S) Proxy installed in HAProxy node
• In K8s

Storage
• DB
• Separate VLAN
• SQL: TiDB
• NoSQL: Redis, Cassandra
• Queueing: Kaffka, ActiveMQ
• Search: Elastic Search
• File
• Minio

Storage
https://upload.wikimedia.org/wikipedia/commons/1/1f/TiDB_Architecture.jpg

CI/CD v1
• Gitlab hook when commit with comment “BUILD <ENV>”
• Why don’t we use GitLab Webhook?
• Call Jenkins Pipeline
• Build the code
• Test the code
• Deploy to K8s
• Manual config HAProxy

Log collector
https://medium.com/@carlosedp/log-aggregation-with-elasticsearch-fluentd-and-kibana-stack-on-arm64-kubernetes-cluster-

Tracing
https://medium.com/velotio-perspectives/a-comprehensive-tutorial-to-implementing-opentracing-with-jaeger-
a01752e1a8ce

Monitor & Alert
• Monitor dashboard from K8S
• Monitor HAProxy log
• Parse ERR 5XX
• Alert by SMS to tech leader

Issues 1
• Scale node break the production farm
• Multi interface because of sticky node
• Kubespray choose default route when no IP in inventory file

Issues 2
• Cannot join node which joined before
• Kubeadm installed
• Kubelet cannot start
• How to fix ?

Issues 3
• 2 node die
• Product has problem
• Biz pressure: request to shutdown product because of tech
incidents
• Try to fix => more problem
• How to fix ?

Lesson learned
• Try to make DEV ~ PROD
• Try to understand the root causes
• Practice & Practice & Practice
• Chaos engineering is VERY IMPORTANT for production
• Need supporting from Biz Owner to apply new technology

Next steps
• Upgrade to latest version k8s, tidb
• Consolidate monitor tools
• Make Alert system smarter
• Apply Ingress controller : Nginx Ingress Controller
• Try Persistence Volume: OpenEBS
• Redis cluster solution
• Fully automation CI/CD

ZaloPay Merchant Platform on K8S on-premise

What's hot

Introduction to MicroservicesAmazon Web Services

Introduction to Event-Driven Architecture Solace

Effective API DesignBansilal Haudakari

Event Driven ArchitectureStefan Norberg

Fitbit Agile and Waterfall techniquespranaysadarangani

How to Define Platform KPIs by King Product DirectorProduct School

Toi uu hoa he thong 30 trieu nguoi dungIT Expert Club

What is an API Gateway?LunchBadger

Lyft presentationBilly Irwin

Open Loyalty - Open Source for Loyalty Programs - Product TourDivante

Bizweb Microservices ArchitectureKhôi Nguyễn Minh

Event Sourcing & CQRS, Kafka, Rabbit MQAraf Karsh Hamid

Cao Nha Dinh - Young Marketer Contest 9ssuserf08d02

Guide to an API-first StrategyKellton Tech Solutions Ltd

Young Marketers Elite 3 Graduation Presentation - Ôi, Ra Rồi!Giang Nguyễn

YOUNG MARKETERS 5+1 - FINALE - PHAN MINH CƯỜNGthienvan94

Marketing Arena _ Cánh Cú _ Round 1 _ Lâm Tiên KhảiKhải Tiên

Application Performance MonitoringOlivier Gérardin

Anatomy of a Spring Boot App with Clean Architecture - Spring I/O 2023Steve Pember

Service Mesh @Lara Camp Myanmar - 02 Sep,2023Hello Cloud

What's hot (20)

Introduction to Microservices

Introduction to Event-Driven Architecture

Effective API Design

Event Driven Architecture

Fitbit Agile and Waterfall techniques

How to Define Platform KPIs by King Product Director

Toi uu hoa he thong 30 trieu nguoi dung

What is an API Gateway?

Lyft presentation

Open Loyalty - Open Source for Loyalty Programs - Product Tour

Bizweb Microservices Architecture

Event Sourcing & CQRS, Kafka, Rabbit MQ

Cao Nha Dinh - Young Marketer Contest 9

Guide to an API-first Strategy

Young Marketers Elite 3 Graduation Presentation - Ôi, Ra Rồi!

YOUNG MARKETERS 5+1 - FINALE - PHAN MINH CƯỜNG

Marketing Arena _ Cánh Cú _ Round 1 _ Lâm Tiên Khải

Application Performance Monitoring

Anatomy of a Spring Boot App with Clean Architecture - Spring I/O 2023

Service Mesh @Lara Camp Myanmar - 02 Sep,2023

Similar to ZaloPay Merchant Platform on K8S on-premise

ZaloPay Merchant Platform on K8S on-premiseChau Thanh

Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...Grokking VN

DNUG46 - Build your own private Cloud environmentpanagenda

Build your own private Cloud environmentNico Meisenzahl

JDO 2019: What you should be aware of before setting up kubernetes on premise...PROIDEA

Realtime traffic analyserAlex Moskvin

Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)Tibo Beijen

Building a CI/CD driven infrastructure for managing kubernetes clusters on ba...TEC Campus

OSDC 2018 | Three years running containers with Kubernetes in Production by T...NETWAYS

Why real integration developers ride CamelsChristian Posta

CMake: Improving Software Quality and ProcessMarcus Hanwell

Kubernetes on the Edge / 在邊緣的K8SYi-Fu Ciou

Migrating Java EE applications to IBM Bluemix Platform-as-a-ServiceDavid Currie

Microservices on a budget meetupMatthew Reynolds

Migrating Java EE applications to IBM Bluemix platform as-a-service (CloudFou...Jack-Junjie Cai

The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...Josef Adersberger

Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...QAware GmbH

Adding Support for Networking and Web Technologies to an Embedded SystemJohn Efstathiades

Adopting OpenTelemetryVincent Behar

Kubernetes – An open platform for container orchestrationinovex GmbH

Similar to ZaloPay Merchant Platform on K8S on-premise (20)

ZaloPay Merchant Platform on K8S on-premise

Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...

DNUG46 - Build your own private Cloud environment

Build your own private Cloud environment

JDO 2019: What you should be aware of before setting up kubernetes on premise...

Realtime traffic analyser

Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)

Building a CI/CD driven infrastructure for managing kubernetes clusters on ba...

OSDC 2018 | Three years running containers with Kubernetes in Production by T...

Why real integration developers ride Camels

CMake: Improving Software Quality and Process

Kubernetes on the Edge / 在邊緣的K8S

Migrating Java EE applications to IBM Bluemix Platform-as-a-Service

Microservices on a budget meetup

Migrating Java EE applications to IBM Bluemix platform as-a-service (CloudFou...

The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...

Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...

Adding Support for Networking and Web Technologies to an Embedded System

Adopting OpenTelemetry

Kubernetes – An open platform for container orchestration

Recently uploaded

How to Use Contact Form 7 Like a Pro.pptxGal Baras

The Best AI Powered Software - Intellivid AI StudioVikram Brahma (Online-Digital-Entrepreneur)

Pvtaan Social media marketing proposal.pdfPvtaan

The Use of AI in Indonesia Election 2024: A Case StudyDamar Juniarto

The+Prospects+of+E-Commerce+in+China.pptxlaozhuseo02

History+of+E-commerce+Development+in+China-www.cfye-commerce.shoplaozhuseo02

Article writing on excessive use of internet.pptxabhinandnam9997

How Do I Begin the Linksys Velop Setup Process?Linksys Velop Login

Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesSanjeev Rampal

ER(Entity Relationship) Diagram for online shopping - TAEHimani415946

一比一原版UTS毕业证悉尼科技大学毕业证成绩单如何办理aagad

The AI Powered Organization-Intro to AI-LAN.pdfSiskaFitrianingrum

Recently uploaded (12)

How to Use Contact Form 7 Like a Pro.pptx

The Best AI Powered Software - Intellivid AI Studio

Pvtaan Social media marketing proposal.pdf

The Use of AI in Indonesia Election 2024: A Case Study

The+Prospects+of+E-Commerce+in+China.pptx

History+of+E-commerce+Development+in+China-www.cfye-commerce.shop

Article writing on excessive use of internet.pptx

How Do I Begin the Linksys Velop Setup Process?

Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines

ER(Entity Relationship) Diagram for online shopping - TAE

一比一原版UTS毕业证悉尼科技大学毕业证成绩单如何办理

The AI Powered Organization-Intro to AI-LAN.pdf

ZaloPay Merchant Platform on K8S on-premise

1. ZaloPay Merchant Platform on K8S on-premise Incident & Lesson Learned ThanhCNN Head of MEP

2. • About me • Why k8s on premise? • MEP K8S stack • Issues • Lesson learned • Next step Contents

3. - M.Sc Uni Duisburg, Germany - Tech Lead: - ZingMe - CSMBoot, CSMPlay, CSM - GBC - IoT Lab - ZaloPay Merchant Platform (MEP) - K8S newbie Me

4. Why K8S • Trend: micro service, container • Orchestrating across hosts • Easy to manage and scale app • Save cost !!! Source: https://x-team.com/blog/introduction-kubernetes-architecture

5. Why on premise • Fintech -> secure data • Save cost !!!

6. MEP K8S Stack • Deploy Architect • Load balancing • Access internet • Storage • DB • File • CI/CD • Log Collect • Tracing • Monitor & Alert

7. MEP K8S Stack

8. Load balancing • Why do we need proxy ? • Proxy mode • User space proxy mode (from v1.0) • IPTables proxy mode (from v1.1) • IPVS proxy mode (from v1.2)

9. Load balancing • Service type: • ClusterIP • NodePort • LoadBalancer => our choice • Using MetalLB layer 2 (ARP)

10. Access Internet • Using HTTP(S) Proxy installed in HAProxy node • In K8s

11. Storage • DB • Separate VLAN • SQL: TiDB • NoSQL: Redis, Cassandra • Queueing: Kaffka, ActiveMQ • Search: Elastic Search • File • Minio

12. Storage https://upload.wikimedia.org/wikipedia/commons/1/1f/TiDB_Architecture.jpg

13. CI/CD v1 • Gitlab hook when commit with comment “BUILD <ENV>” • Why don’t we use GitLab Webhook? • Call Jenkins Pipeline • Build the code • Test the code • Deploy to K8s • Manual config HAProxy

14. CI/CD v2

15. Log collector https://medium.com/@carlosedp/log-aggregation-with-elasticsearch-fluentd-and-kibana-stack-on-arm64-kubernetes-cluster-

16. Tracing https://medium.com/velotio-perspectives/a-comprehensive-tutorial-to-implementing-opentracing-with-jaeger- a01752e1a8ce

17. Tracing

18. Monitor & Alert

19. Monitor & Alert

20. Monitor & Alert

21. Monitor & Alert • Monitor dashboard from K8S • Monitor HAProxy log • Parse ERR 5XX • Alert by SMS to tech leader

22. Issues 1 • Scale node break the production farm • Multi interface because of sticky node • Kubespray choose default route when no IP in inventory file

23. Issues 2 • Cannot join node which joined before • Kubeadm installed • Kubelet cannot start • How to fix ?

24. Issues 3 • 2 node die • Product has problem • Biz pressure: request to shutdown product because of tech incidents • Try to fix => more problem • How to fix ?

25. Lesson learned • Try to make DEV ~ PROD • Try to understand the root causes • Practice & Practice & Practice • Chaos engineering is VERY IMPORTANT for production • Need supporting from Biz Owner to apply new technology

26. Next steps • Upgrade to latest version k8s, tidb • Consolidate monitor tools • Make Alert system smarter • Apply Ingress controller : Nginx Ingress Controller • Try Persistence Volume: OpenEBS • Redis cluster solution • Fully automation CI/CD

ZaloPay Merchant Platform on K8S on-premise

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to ZaloPay Merchant Platform on K8S on-premise

Similar to ZaloPay Merchant Platform on K8S on-premise (20)

More from Chau Thanh

More from Chau Thanh (8)

Recently uploaded

Recently uploaded (12)

ZaloPay Merchant Platform on K8S on-premise