Technical Challenges in
MMAU microservices
Ronald (hothero)
About me
● Tech Lead (SSEII) - Carousell@TW
● Senior SE - Uber@AMS
● SSE - Carousell@TW/SG
● CTO & Co-Founder @ Backer-Founder
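● Father of a daughter
● Article: "The backend architecture I saw in a year at Carousell: challenges and life" (在旋轉拍賣 carousell 一年看到的後端架構 -挑戰與生活, 11.5K claps)
● Article: "2017 backend engineer interviews and preparation experience" (2017 後端工程師面試以及準備經驗, 1.7K claps)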
Snap, list, sell
List any item for sale in 30 seconds
In-app chat
Chat directly with sellers without
revealing personal information
Photo-centric
Core focus of the app on photos
Social
Share listings on social media channels
and join groups of people with similar
interests
In 8 years, we have grown to become
South East Asia’s #1 classifieds marketplace
Timeline (2012 to 2019):
● Founded Carousell in SG
● Raised USD800K Series Seed
● Raised USD6M Series A
● Launched in MY, ID, TW
● Launched web platform
● Launched HK + PH
● Raised USD35M Series B
● Acquired Caarly, launched Autos + Coins & Bumps + Property
● Launched Spotlight + Carousell Protection
● Raised USD85M Series C
● Acquired OLX PH, Naspers invests, received funding from Telenor, merged with 701 Search entities, Mudah, Cho Tot and OneKyat
Phases:
● Phase I: Foundation
● Phase II: Internationalization
● Phase III: Verticalization & monetization
2020: Carousell announces US$80M investment from Naver, Mirae Asset-Naver Asia Growth Fund and NH Investment & Securities
Carousell Group
● 8 markets
● 250 million user listings
● USD 850 million valuation
● 750+ teammates
● 19 nationalities
● 8 offices
Resources
Infrastructure in 2018
Schematic diagram of architecture in 2021
● 90+ services
● Shown in the diagram: gateway, listing, recom, search, third-parties, logistics, order, offer, ads, payment, homescreen
Sharing our improvement first
● At the beginning of 2020, one of our systems (payment & shipping, 7+
services tied to each other) had an error rate of around 1.8% -> bad UX
● Users saw `Something went wrong` if any one call in the chain failed.
With more and more services, what are the challenges?
● If each of five chained services offers 99%, the end-to-end SLA is about 95% (0.99^5 ≈ 0.951), meaning about 2160 minutes of downtime per 30-day month
● An overall 99% needs roughly three nines (99.9%) on each service
● The reliability of every single request really matters
● Reasons for failure
○ Network issues
○ Capacity (DB/CPU/Memory/Concurrency/etc.)
○ Performance/Latency
○ Stability of connections between services
■ Infra-level (e.g. HAProxy, Envoy)
■ Application-level
○ Wrong logic/Bad code
Diagram: gateway, service1, service2, service3 and service4 in one call chain, each at .99
.99 * .99 * .99 * .99 * .99
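The arithmetic behind these numbers can be checked with a minimal sketch. It assumes the services fail independently and sit in series on every request; the function names are illustrative and do not come from the talk.

```go
package main

import (
	"fmt"
	"math"
)

// compositeAvailability returns the end-to-end availability of a request
// that must pass through n services in series, each with availability a.
// Assumption: failures are independent.
func compositeAvailability(a float64, n int) float64 {
	return math.Pow(a, float64(n))
}

// requiredPerService returns the availability each of n chained services
// needs so that the whole chain reaches the target availability.
func requiredPerService(target float64, n int) float64 {
	return math.Pow(target, 1/float64(n))
}

// downtimeMinutes converts an availability into the expected downtime,
// in minutes, over the given number of days.
func downtimeMinutes(a float64, days int) float64 {
	return (1 - a) * float64(days) * 24 * 60
}

func main() {
	chain := compositeAvailability(0.99, 5)
	fmt.Printf("5 hops at 99%% each: %.2f%% end to end\n", chain*100)
	fmt.Printf("95%% over 30 days: %.0f minutes of downtime\n", downtimeMinutes(0.95, 30))
	fmt.Printf("per-hop target for 99%% overall: %.3f%%\n", requiredPerService(0.99, 5)*100)
}
```

The per-hop target comes out slightly below 99.9%, which is why "three nines on each" is a safe rule of thumb for a five-hop chain.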
Application-level - the request made
Request path: Hystrix -> CallerContext -> Load Balancer -> Server -> CalleeContext
Hystrix caps requests at 2 seconds, but New Relic shows average latency
of up to 5 seconds (capped at the infra level)
How Golang projects handle timeouts
Common mistakes - timeout handling
Request path: Hystrix -> CallerContext -> Load Balancer -> Server -> CalleeContext
How Golang projects handle timeouts
● Passing context.Background() into goroutine jobs, which then block
with no deadline
● Not configuring Hystrix to cap long requests
● grpc.WithTimeout(timeout) only sets the timeout for dialing;
it is not a general timeout for requests
https://play.golang.org/p/Nrhu5nhvaa3
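The first mistake can be reproduced with the standard library alone. The sketch below is not the code behind the playground link; `slowCall`, `detached`, `propagated` and `runWithDeadline` are hypothetical names, and `slowCall` stands in for a downstream RPC.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// slowCall stands in for a downstream RPC: it takes d to finish unless the
// context is cancelled first, in which case it returns the context's error.
func slowCall(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// detached reproduces the mistake: the caller's ctx is dropped and
// context.Background() is used, so the deadline never reaches slowCall.
func detached(ctx context.Context, d time.Duration) error {
	return slowCall(context.Background(), d)
}

// propagated passes the caller's ctx through, so its deadline applies.
func propagated(ctx context.Context, d time.Duration) error {
	return slowCall(ctx, d)
}

// runWithDeadline invokes call for a 100ms job under a 20ms deadline and
// reports how long the call actually took.
func runWithDeadline(call func(context.Context, time.Duration) error) (time.Duration, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	start := time.Now()
	err := call(ctx, 100*time.Millisecond)
	return time.Since(start), err
}

func main() {
	elapsed, err := runWithDeadline(detached)
	fmt.Println("detached:  ", elapsed.Round(10*time.Millisecond), err)

	elapsed, err = runWithDeadline(propagated)
	fmt.Println("propagated:", elapsed.Round(10*time.Millisecond), err)
}
```

The detached variant runs for the full duration of the job and reports no error, which is how a 2-second cap at the caller can coexist with much longer latencies further down the chain.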
POC: standardizing gRPC dialing
POC: a standard wrapper for timeout handling
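The POC code itself is not reproduced in this transcript. As a rough idea of what a timeout wrapper can look like, here is a standard-library sketch; `WithCap` and `capDemo` are hypothetical names, and a real wrapper would likely also integrate a circuit breaker such as Hystrix.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// WithCap runs fn with a context whose deadline is at most limit from now.
// If the caller's ctx already has an earlier deadline, that one wins,
// because context.WithTimeout never extends a parent's deadline.
// fn runs in its own goroutine so that WithCap returns when the deadline
// passes even if fn ignores cancellation.
func WithCap(ctx context.Context, limit time.Duration, fn func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(ctx, limit)
	defer cancel()

	// Buffered so that a late fn can still send its result and exit.
	done := make(chan error, 1)
	go func() { done <- fn(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// capDemo wraps work that takes workMs milliseconds and deliberately
// ignores its context, under a cap of limitMs milliseconds.
func capDemo(limitMs, workMs int) error {
	limit := time.Duration(limitMs) * time.Millisecond
	return WithCap(context.Background(), limit, func(ctx context.Context) error {
		time.Sleep(time.Duration(workMs) * time.Millisecond)
		return nil
	})
}

func main() {
	err := capDemo(20, 200)
	fmt.Println("slow work capped:", errors.Is(err, context.DeadlineExceeded))
	fmt.Println("fast work error:", capDemo(200, 5))
}
```

One trade-off in this design: when fn ignores its context, the goroutine keeps running until fn returns, so the wrapper bounds the caller's latency but not the work itself. Well-behaved callees should still watch `ctx.Done()`.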
Topics
● Alternatives on Retry
● Configuration Control
Outcome of the improvement
Observability + Workflow: maintain the level of
reliability
Diagram: gateway, service1 and service2, each with Hystrix on its outgoing calls
Distributed tracing
Error reporting
Main alerting (Programmability)
Monitoring (alert channels)
Workflow - Support Rotation/Weekly Sync
Case study: the importance of context
passing and tracing
● High error rate alert on one gateway endpoint
● Same error spike on service 1 but not on service 4
● No proper egress metrics on service 1 and 2 (e.g.
Hystrix), but an error rate was visible on the gRPC server
● With good enough evidence in hand, we checked with the
corresponding team and confirmed it was related to the outage.
● If we had had tracing and appropriate context passing,
we would have found the root cause on Jaeger very
quickly.
● Demo
Diagram: gateway, service1, service2, service3, service4, service5
The type of business and product of Carousell
determines the focus
● Core and synchronous flows in Carousell
○ Homefeed
○ Search
○ Make an offer
○ Chat
○ Make an order
○ ...
● Many other configurations and components could affect overall
reliability, so why focus mostly on wrappers and communication?
○ Read volume is larger than write traffic
○ Stability is more important than performance for now
○ Browsing makes up the majority of user behavior
Introducing service-mesh with multi-tenancy
● Technical challenges as the number of microservices and teams grows
○ The staging environment is very unstable
○ Productivity of all types of testing needs to improve
○ The need for built-in routing is increasing
Reference: https://eng.uber.com/multitenancy-microservice-architecture/
Metadata of gRPC in Golang
● HTTP Headers/gRPC Metadata -> incoming context -> outgoing context -> ...
https://github.com/grpc/grpc-go/blob/master/Documentation/grpc-metadata.md,
https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
https://github.com/carousell/Orion/blob/master/interceptors/interceptors.go#L214-L229
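The hop-by-hop flow above can be sketched with the standard library, using plain HTTP headers in place of gRPC metadata. In grpc-go the equivalent steps use `metadata.FromIncomingContext` and `metadata.NewOutgoingContext`, as described in the linked documentation. The header name `X-Tenant` and all function names below are assumptions for illustration; this is not the Orion interceptor.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// tenantKey is an unexported context key type, which avoids collisions
// with keys defined in other packages.
type tenantKey struct{}

// tenantHeader is a hypothetical header name used only in this sketch.
const tenantHeader = "X-Tenant"

// fromIncoming copies the tenant header of an incoming request into ctx.
func fromIncoming(ctx context.Context, r *http.Request) context.Context {
	if t := r.Header.Get(tenantHeader); t != "" {
		return context.WithValue(ctx, tenantKey{}, t)
	}
	return ctx
}

// toOutgoing copies the tenant stored in ctx onto an outgoing request.
func toOutgoing(ctx context.Context, out *http.Request) {
	if t, ok := ctx.Value(tenantKey{}).(string); ok {
		out.Header.Set(tenantHeader, t)
	}
}

// hop simulates one service: it receives a request carrying tenant and
// returns the tenant header it would send to the next service.
func hop(tenant string) string {
	in := httptest.NewRequest(http.MethodGet, "/", nil)
	if tenant != "" {
		in.Header.Set(tenantHeader, tenant)
	}
	ctx := fromIncoming(in.Context(), in)

	out, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://service2.invalid/", nil)
	if err != nil {
		return ""
	}
	toOutgoing(ctx, out)
	return out.Header.Get(tenantHeader)
}

func main() {
	fmt.Printf("forwarded tenant: %q\n", hop("staging-pr-42"))
	fmt.Printf("without tenant:   %q\n", hop(""))
}
```

If any service in the chain builds its outgoing request from a fresh context instead of the incoming one, the value stops there, which is why tenancy routing and tracing both depend on consistent context passing.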
Conclusion
● Context is the critical component in Golang projects
○ Metadata propagation
○ Distributed tracing
○ Service Mesh with tenancy
○ Timeout handling
○ …
● From the Go documentation (package context)
○ Incoming requests to a server should create a Context, and outgoing calls to servers should
accept a Context. The chain of function calls between them must propagate the
Context, optionally replacing it with a derived Context created using WithCancel,
WithDeadline ... Do not store Contexts inside a struct type; instead, pass a Context
explicitly to each function that needs it. The Context should be the first parameter,
typically named ctx
● Service-mesh plays a role in micro-service architecture.
● We’re hiring, join us if you’re also interested in the journey :)
Current openings
● Transactional GC Team: Sr. Software Engineer, Backend
● Buyer Experience Team: Sr. Software Engineer, Frontend

202104 technical challenging and our solutions - golang taipei

  • 1. Technical Challenging in MMAU microservices Ronald (hothero)
  • 2. About me ● Tech Lead (SSEII) - Carousell@TW ● Senior SE - Uber@AMS ● SSE - Carousell@TW/SG ● CTO & Co-Founder @ Backer-Founder ● A girl’s father ● 在旋轉拍賣 carousell 一年看到的後端架構 -挑戰與生活 (11.5K claps) ● 2017 後端工程師面試以及準備經驗 (1.7K claps)
  • 3. Snap, list, sell List any item for sale in 30 seconds In-app chat Chat directly with sellers without revealing personal information Photo-centric Core focus of the app on photos Social Share listings on social media channels and join groups of people with similar interests
  • 4. In 8 years, we have grown to become South East Asia’s #1 classifieds marketplace 2012 2013 2014 2015 2016 2017 2018 2019 Founded Carousell in SG Raised USD800K Series Seed Raised USD6M Series A Launched in MY, ID, TW Launched web platform Launched HK + PH Raised USD35M Series B Acquired Caarly, launched Autos + Coins & Bumps + Property Launched Spotlight + Carousell Protection Acquired OLX PH, Naspers invests, received funding from Telenor, merged with 701 Search entities, Mudah, Cho Tot and OneKyat Phase III Verticalization & monetization Phase I Foundation Raised USD85M Series C Phase II Internationalization 2020 Carousell announces US$80M investment from Naver, Mirae Asset-Naver Asia Growth Fund and NH Investment & Securities
  • 10. schematic diagram of architecture in 2021 gateway listing recom search third-parties Logistics order offer ads 90+ services Payment homescreen
  • 11. Share our improvement first ● Around 1.8% error rate in one of our systems (Payment & shipping, 7+ services tied to each other) at the beginning of 2020 -> Bad UX ● Users saw `Something went wrong` if any call in the chain failed.
  • 12. What gets harder as we add more and more services ● With five services in a call chain at 99% each, the end-to-end SLA would be about 95% (.99 * .99 * .99 * .99 * .99), meaning roughly 2160 minutes of downtime / month ● An overall 99% needs about three nines (99.9%) on each service ● The reliability of every request made is really important ● Reasons for failure ○ Network issues ○ Capacity (DB/CPU/Memory/Concurrency/etc.) ○ Performance/Latency ○ Services connection sustainability ■ Infra-level (e.g. HAProxy, Envoy) ■ Application-level ○ Wrong logic/Bad code ● Diagram: gateway, service1, service2, service3, service4, each at .99
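The arithmetic on this slide can be checked with a short Go sketch. It assumes the five hops fail independently and that a month has 30 days (43,200 minutes); the function names are illustrative and not from the talk. The slide rounds 0.99^5 to 95%, which is where the 2160 minutes come from (5% of 43,200).

```go
package main

import (
	"fmt"
	"math"
)

// chainAvailability returns the availability of a synchronous call chain in
// which all n hops must succeed, assuming failures are independent.
func chainAvailability(perService float64, n int) float64 {
	return math.Pow(perService, float64(n))
}

// downtimeMinutes converts an availability into expected downtime over a
// 30-day month (43,200 minutes).
func downtimeMinutes(availability float64) float64 {
	const minutesPerMonth = 30 * 24 * 60
	return (1 - availability) * minutesPerMonth
}

func main() {
	// Compare two nines and three nines per service across five hops.
	for _, per := range []float64{0.99, 0.999} {
		overall := chainAvailability(per, 5)
		fmt.Printf("per-service %.3f -> overall %.4f, about %.0f min down/month\n",
			per, overall, downtimeMinutes(overall))
	}
}
```

With three nines per service the five-hop chain stays above 99% overall, which matches the slide's second bullet.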
  • 13. Application-level - the request made ● How Golang projects handle timeouts (diagram labels: Hystrix, CallerContext, Load Balancer, Server, CalleeContext) ● Hystrix caps requests at 2 seconds, but New Relic shows average latency can reach up to 5 seconds (capped at infra level)
  • 14. Common mistakes - timeout handling ● How Golang projects handle timeouts (diagram labels: Hystrix, CallerContext, Load Balancer, Server, CalleeContext) ● Passing context.Background() into goroutine jobs, which then block with no deadline ● Not configuring Hystrix to cap long requests ● grpc.WithTimeout(timeout) only sets the dial timeout; it is not a general timeout for requests https://play.golang.org/p/Nrhu5nhvaa3
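A minimal sketch of the per-request alternative, using only the standard library: derive a deadline from the caller's context with context.WithTimeout and pass that context down, rather than starting from context.Background() inside the job. slowCall is a hypothetical stand-in for a downstream RPC and the helper names are assumptions; this is not the code behind the playground link.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowCall stands in for a downstream RPC (hypothetical). It honours ctx and
// returns as soon as the deadline passes or the caller cancels.
func slowCall(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// callWithTimeout derives a per-request deadline from the parent context.
// Passing context.Background() to slowCall here instead would silently drop
// any deadline or cancellation set by the caller.
func callWithTimeout(parent context.Context, timeout, work time.Duration) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return slowCall(ctx, work)
}

// exceedsDeadline reports whether a call taking workMs milliseconds is cut
// off by a timeout of timeoutMs milliseconds.
func exceedsDeadline(timeoutMs, workMs int) bool {
	err := callWithTimeout(context.Background(),
		time.Duration(timeoutMs)*time.Millisecond,
		time.Duration(workMs)*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("slow call cut off:", exceedsDeadline(50, 500))
	fmt.Println("fast call cut off:", exceedsDeadline(500, 10))
}
```

With grpc-go the same derived ctx would be the first argument of the generated client method, so the deadline applies to the request itself rather than to dialing.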
  • 15. POC of standardizing gRPC dialing
  • 16. POC of a standardized wrapper for timeout handling ● Topics ○ Alternatives on Retry ○ Configuration Control
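The slide does not show the wrapper itself, so the following is only a sketch of what a timeout-and-retry wrapper driven by configuration might look like. CallConfig, Do, and runFlaky are hypothetical names, the fixed backoff is an assumption, and retrying is only safe when the wrapped call is idempotent.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// CallConfig is a hypothetical per-endpoint configuration.
type CallConfig struct {
	Timeout     time.Duration // budget for each attempt
	MaxAttempts int           // total attempts, including the first
	Backoff     time.Duration // fixed pause between attempts
}

// Do runs fn up to cfg.MaxAttempts times. Each attempt gets its own deadline
// derived from ctx, and retries stop early if ctx itself is done.
func Do(ctx context.Context, cfg CallConfig, fn func(context.Context) error) error {
	attempts := cfg.MaxAttempts
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 1; i <= attempts; i++ {
		attemptCtx, cancel := context.WithTimeout(ctx, cfg.Timeout)
		err = fn(attemptCtx)
		cancel()
		if err == nil {
			return nil
		}
		if i == attempts {
			break
		}
		select {
		case <-time.After(cfg.Backoff):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

// runFlaky calls a function that fails its first failFirst invocations and
// reports how many times it was invoked and the final error.
func runFlaky(failFirst, maxAttempts int) (calls int, err error) {
	cfg := CallConfig{Timeout: 100 * time.Millisecond, MaxAttempts: maxAttempts, Backoff: time.Millisecond}
	err = Do(context.Background(), cfg, func(ctx context.Context) error {
		calls++
		if calls <= failFirst {
			return errors.New("transient failure")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := runFlaky(2, 3)
	fmt.Println("calls:", calls, "err:", err)
}
```

Keeping Timeout, MaxAttempts, and Backoff in one struct is one way to read "Configuration Control": the values can be set per endpoint without touching call sites.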
  • 17. Outcome of the improvement
  • 18. Observability + Workflow: maintain the level of reliability gateway service2 service1 Hystrix Hystrix Hystrix Distributed tracing Error reporting Main alerting (Programmability) Monitoring (alert channels) Workflow - Support Rotation/Weekly Sync
  • 19. Case study about the importance of context passing and tracing ● High error rate alert on one gateway endpoint ● Same error spike on service 1 but not on service 4 ● No proper egress metrics on service 1 and 2 (e.g. hystrix), but the error rate was visible on the gRPC server ● With good enough evidence, we checked with the corresponding team and confirmed it was related to their outage. ● If we had had tracing and appropriate context passing, we would have noticed the root cause on Jaeger very quickly. ● Demo gateway service1 service2 service3 service4 service5
  • 20. The type of business and product of Carousell determines the focus ● Core and synchronous flows in Carousell ○ Homefeed ○ Search ○ Make an offer ○ Chat ○ Make an order ○ ... ● Many other configurations/components can still impact overall reliability, so why focus mostly on the wrapper and communications? ○ Read volume is larger than write traffic ○ Stability is more important than performance for now ○ Browsing is the majority of user behavior
  • 21. Introducing service-mesh with multi-tenancy ● Technical challenges as the number of microservices and teams grows ○ Staging environment is very unstable ○ Productivity of all types of testing needs to improve ○ The need for built-in routing is increasing Reference: https://eng.uber.com/multitenancy-microservice-architecture/
  • 22. Metadata of gRPC in Golang ● HTTP Headers/gRPC Metadata -> incoming context -> outgoing context -> ... https://github.com/grpc/grpc-go/blob/master/Documentation/grpc-metadata.md, https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md https://github.com/carousell/Orion/blob/master/interceptors/interceptors.go#L214-L229
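The propagation chain on this slide (headers -> incoming context -> outgoing context) can be sketched with the standard library alone. In grpc-go the corresponding calls are metadata.FromIncomingContext and metadata.NewOutgoingContext from google.golang.org/grpc/metadata; the version below is an HTTP-header analogue so that it runs without external modules. The header names and helper functions are hypothetical, and this is not the Orion interceptor linked above.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// mdKey is the private context key under which propagated values are stored.
type mdKey struct{}

// propagated lists the headers carried across hops (hypothetical names).
var propagated = []string{"X-Request-Id", "X-Tenant"}

// fromIncoming copies the listed headers of an inbound request into a context.
func fromIncoming(ctx context.Context, in http.Header) context.Context {
	md := make(map[string]string)
	for _, k := range propagated {
		if v := in.Get(k); v != "" {
			md[k] = v
		}
	}
	return context.WithValue(ctx, mdKey{}, md)
}

// toOutgoing writes the values stored in ctx onto an outbound request's headers.
func toOutgoing(ctx context.Context, out http.Header) {
	md, _ := ctx.Value(mdKey{}).(map[string]string)
	for k, v := range md {
		out.Set(k, v)
	}
}

// relay simulates one hop: read inbound headers, carry them in the context,
// and attach them to the next outbound call.
func relay(requestID, tenant string) (string, string) {
	in := http.Header{}
	in.Set("X-Request-Id", requestID)
	in.Set("X-Tenant", tenant)
	in.Set("Cookie", "not-propagated") // not in the list, so it is dropped

	ctx := fromIncoming(context.Background(), in)

	out := http.Header{}
	toOutgoing(ctx, out)
	return out.Get("X-Request-Id"), out.Get("X-Tenant")
}

func main() {
	id, tenant := relay("req-123", "staging-alice")
	fmt.Println(id, tenant)
}
```

A tenant value carried this way is the kind of signal a service mesh could route on, which is why the chain breaks if any hop replaces the context instead of passing it along.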
  • 23. Conclusion ● Context is the critical component in Golang projects ○ Metadata propagation ○ Distributed tracing ○ Service Mesh with tenancy ○ Timeout handling ○ … ● From Golang document ○ Incoming requests to a server should create a Context, and outgoing calls to servers should accept a Context. The chain of function calls between them must propagate the Context, optionally replacing it with a derived Context created using WithCancel, WithDeadline …... Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx ● Service-mesh plays a role in micro-service architecture. ● We’re hiring, join us if you’re also interested in the journey :)
  • 24. Current Positions ● Transactional GC Team: Sr. Software Engineer, Backend ● Buyer Experience Team: Sr. Software Engineer, Frontend