SlideShare a Scribd company logo
1 of 23
Download to read offline
High Concurrency
Architecture
09-2019. TIKI.VN
Contributors
Bùi Anh Dũng
Principal Engineer
Nguyễn Hoàng Bách
Senior Principal
Engineer
Phan Công Huân
Senior Software
Engineer
Lê Minh Nghĩa
Senior Architect
Trần Nguyên Bản
Principal Engineer
Agenda
- Principles
- Pegasus - Highest throughput API
- Arcturus - High concurrency inventory API
- Conclusion
Principles
- Use local memory to handle high concurrency transaction
- Non blocking architecture
- Trade-offs consistency vs eventual consistency
- Reliable replication
- Authority: each service owns its problems.
Pegasus - Highest Throughput API at TIKI
Problems:
- Handle the most of traffic to fetch product information
- Has to handle at least 10k request/s, tp95 < 5ms
Pegasus - Architecture
Pegasus - Architecture
Practises:
- Cache product data in local memory
- Subscribe product change event to
invalid cache
- Non-Blocking http web server
- Compress to reduce payload size
Technology:
- Java
- Guava - In Memory Cache
- Gridgo - Async IO & Event driven
framework
Pegasus - Compression
Two cases:
- Get single product by id
- Get multiple product by a list id
Solutions:
- Using gzip to compress product data to store in
cache. Reduce from 200kb text to 3Kb
- Handle single product request: return
compressed gzip bytes to client directly
- Handle multiple product request: store plain
product data and merge them to build a list of
products, then use Snappy format to compress
data and return to client
- Gzip format compress better than Snappy and
is supported natively by most http client, but
Snappy compress faster and use less CPU
- Output cache can be enabled in hot-deal
situation
Pegasus - Technology
- Java
- Gridgo
- Guava
- Kafka
Pegasus - Benchmark
Benchmark use WRK tools, 4VM,
L4 LB, GCP:
- 90%<6.6ms
- 96k request/s
Production
- 200k request/minutes (all
platforms)
- <2ms
Cache hit: 85%. Increased to 95% with Consistent Hashing With Bounded Load.
Average payload: Single (3KB), Multi (60KB)
Pegasus - Benchmark/Replication
Arcturus - High concurrency Inventory API
Problems:
- Handle inventory transaction when customers place orders
- Handle high concurrency transaction for extreme hot deals with very cheap
and low quantity. Exp: Only ten 1đ iPhone XS, only in ten minutes from 9AM
to 9h10 AM.
- Guarantee eventual consistency of inventory data between many systems
Arcturus - Context Diagram
Arcturus - Architecture
Arcturus - Problems and Solutions
Problems:
- Many customers may place order at the same time, especially for hot skus.
- System has to be deterministic and can be recovered after crashing
Solutions:
- All write requests are published to Kafka with only one partition to guarantee
the ordering of message and recover if system crashes.
- Non Blocking In Memory Cache for both read and write
- All changes are flushed to database asynchronously
- Recovery based on the offset of write command
Arcturus - In Memory Cache Data Structure
Problem:
- There will be a race condition when many
customer place the same skus at the
same time
- Building an in memory cache structure
that can buffer data changes and flush to
database asynchronously
Approach:
- Each record have a key (string, bytes…)
and a value (8 bytes long number).
- A transaction is a set of record updates.
- Transactions must be ordered (e.g. key_1
must +3 before can be -1).
- It can be millions of keys and process upto
10k TPS.
- Using ring buffer data structure to
guarantee the ordering of changes
* The hardest thing is to keep
everything ordered when make it
fast enough.
Arcturus - Old solution
- Multi threading application
- DB transaction, row locking
Pros
- Strong consistency
- Simple implementation
Cons
- Slow
- Deadlock/timeout implicit risk
Arcturus - LMax Architecture
- Non Blocking architecture using single
thread business logic processor
- Use Ring Buffer data structure to
communicate between processors
- Handle million transaction per seconds
Arcturus - Batching marshaller
vBatching: use multi thread marshaller
● Pros
- Fast (400k-600k ops/s*)
- Avoid multi key locking
- Write only most updated value
● Cons
- Inconsistent transaction offset
→ use when you just want to make your code as fast
as possible.
* Macbook pro 2016 15’’ 2.7GHz Core i7, 16G ram. Without I/O, single update per
transaction
*** Test without I/O: [Single ops vbatch] Total time running for 1000000 transactions: 1085 ms; at rate: 921,658.99 ops/s
Arcturus - Batching marshaller
hBatching: use single marshaller thread
● Pros
- Ensure transaction offset
- Ordered db updating
● Cons
- A little bit slower than vBatching.
→ use when you need to re-create application state
after fail.
*** Test without I/O: [Single ops hbatch] Total time running for 1000000 transactions: 1056 ms; at rate: 946,969.70 ops/s
Arcturus - Recovery/Replay
Arcturus - Technology
- Java
- Kafka
- ZeroMQ
- Gridgo
- MySQL
- LMax/Ring Buffer
Conclusion
- Has to trade-off between consistency and eventual consistency
- In Memory is a great way to improve the performance
- Reliable replication is the key to split and scale system
- Non Blocking architecture is a great way to utilize hardware resources
efficiently.

More Related Content

What's hot

Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles Thinking
Grokking VN
 
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks DeltaEnd-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
Databricks
 
A year with event sourcing and CQRS
A year with event sourcing and CQRSA year with event sourcing and CQRS
A year with event sourcing and CQRS
Steve Pember
 

What's hot (20)

Grokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous CommunicationsGrokking TechTalk #31: Asynchronous Communications
Grokking TechTalk #31: Asynchronous Communications
 
Grokking Techtalk #37: Data intensive problem
 Grokking Techtalk #37: Data intensive problem Grokking Techtalk #37: Data intensive problem
Grokking Techtalk #37: Data intensive problem
 
Toi uu hoa he thong 30 trieu nguoi dung
Toi uu hoa he thong 30 trieu nguoi dungToi uu hoa he thong 30 trieu nguoi dung
Toi uu hoa he thong 30 trieu nguoi dung
 
Kinh nghiệm triển khai Microservices tại Sapo.vn
Kinh nghiệm triển khai Microservices tại Sapo.vnKinh nghiệm triển khai Microservices tại Sapo.vn
Kinh nghiệm triển khai Microservices tại Sapo.vn
 
Grokking Techtalk: Problem solving for sw engineers
Grokking Techtalk: Problem solving for sw engineersGrokking Techtalk: Problem solving for sw engineers
Grokking Techtalk: Problem solving for sw engineers
 
Grokking Techtalk #37: Software design and refactoring
 Grokking Techtalk #37: Software design and refactoring Grokking Techtalk #37: Software design and refactoring
Grokking Techtalk #37: Software design and refactoring
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
Grokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles ThinkingGrokking Techtalk #45: First Principles Thinking
Grokking Techtalk #45: First Principles Thinking
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applications
 
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks DeltaEnd-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
 
A year with event sourcing and CQRS
A year with event sourcing and CQRSA year with event sourcing and CQRS
A year with event sourcing and CQRS
 
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 Grokking Techtalk #39: How to build an event driven architecture with Kafka ... Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
Grokking Techtalk #39: How to build an event driven architecture with Kafka ...
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
MongoDB World 2015 - A Technical Introduction to WiredTiger
MongoDB World 2015 - A Technical Introduction to WiredTigerMongoDB World 2015 - A Technical Introduction to WiredTiger
MongoDB World 2015 - A Technical Introduction to WiredTiger
 
LMAX Disruptor as real-life example
LMAX Disruptor as real-life exampleLMAX Disruptor as real-life example
LMAX Disruptor as real-life example
 
High Performance and Scalability Database Design
High Performance and Scalability Database DesignHigh Performance and Scalability Database Design
High Performance and Scalability Database Design
 
Software architecture for high traffic website
Software architecture for high traffic websiteSoftware architecture for high traffic website
Software architecture for high traffic website
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
 

Similar to Grokking TechTalk #33: High Concurrency Architecture at TIKI

Storage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailStorage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, Whiptail
Internet World
 
Deploying flash storage for Ceph without compromising performance
Deploying flash storage for Ceph without compromising performance Deploying flash storage for Ceph without compromising performance
Deploying flash storage for Ceph without compromising performance
Ceph Community
 
OSS Presentation Metro Cluster by Andy Bennett & Roel De Frene
OSS Presentation Metro Cluster by Andy Bennett & Roel De FreneOSS Presentation Metro Cluster by Andy Bennett & Roel De Frene
OSS Presentation Metro Cluster by Andy Bennett & Roel De Frene
OpenStorageSummit
 

Similar to Grokking TechTalk #33: High Concurrency Architecture at TIKI (20)

Slow things down to make them go faster [FOSDEM 2022]
Slow things down to make them go faster [FOSDEM 2022]Slow things down to make them go faster [FOSDEM 2022]
Slow things down to make them go faster [FOSDEM 2022]
 
Building big data pipelines with Kafka and Kubernetes
Building big data pipelines with Kafka and KubernetesBuilding big data pipelines with Kafka and Kubernetes
Building big data pipelines with Kafka and Kubernetes
 
Mahti quick-start guide
Mahti quick-start guide Mahti quick-start guide
Mahti quick-start guide
 
DPDK Summit 2015 - Aspera - Charles Shiflett
DPDK Summit 2015 - Aspera - Charles ShiflettDPDK Summit 2015 - Aspera - Charles Shiflett
DPDK Summit 2015 - Aspera - Charles Shiflett
 
IBM Power9 Features and Specifications
IBM Power9 Features and SpecificationsIBM Power9 Features and Specifications
IBM Power9 Features and Specifications
 
Extending Piwik At R7.com
Extending Piwik At R7.comExtending Piwik At R7.com
Extending Piwik At R7.com
 
Kafka vs kinesis
Kafka vs kinesisKafka vs kinesis
Kafka vs kinesis
 
Redpanda and ClickHouse
Redpanda and ClickHouseRedpanda and ClickHouse
Redpanda and ClickHouse
 
High perf-networking
High perf-networkingHigh perf-networking
High perf-networking
 
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...
 
100 M pps on PC.
100 M pps on PC.100 M pps on PC.
100 M pps on PC.
 
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...
 
Storage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailStorage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, Whiptail
 
Deploying flash storage for Ceph without compromising performance
Deploying flash storage for Ceph without compromising performance Deploying flash storage for Ceph without compromising performance
Deploying flash storage for Ceph without compromising performance
 
OpenDS_Jazoon2010
OpenDS_Jazoon2010OpenDS_Jazoon2010
OpenDS_Jazoon2010
 
Using a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming AggregationsUsing a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming Aggregations
 
Boyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experienceBoyan Krosnov - Building a software-defined cloud - our experience
Boyan Krosnov - Building a software-defined cloud - our experience
 
OSS Presentation Metro Cluster by Andy Bennett & Roel De Frene
OSS Presentation Metro Cluster by Andy Bennett & Roel De FreneOSS Presentation Metro Cluster by Andy Bennett & Roel De Frene
OSS Presentation Metro Cluster by Andy Bennett & Roel De Frene
 
Data center network reference architecture with hpe flex fabric
Data center network reference architecture with hpe flex fabricData center network reference architecture with hpe flex fabric
Data center network reference architecture with hpe flex fabric
 
Rust at Ather
Rust at AtherRust at Ather
Rust at Ather
 

More from Grokking VN

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking VN
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystified
Grokking VN
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking VN
 

More from Grokking VN (20)

Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banksGrokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
Grokking Techtalk #46: Lessons from years hacking and defending Vietnamese banks
 
Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...Grokking Techtalk #42: Engineering challenges on building data platform for M...
Grokking Techtalk #42: Engineering challenges on building data platform for M...
 
Grokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystifiedGrokking Techtalk #43: Payment gateway demystified
Grokking Techtalk #43: Payment gateway demystified
 
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platformGrokking Techtalk #40: AWS’s philosophy on designing MLOps platform
Grokking Techtalk #40: AWS’s philosophy on designing MLOps platform
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compiler
 
Grokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellcheckingGrokking TechTalk #35: Efficient spellchecking
Grokking TechTalk #35: Efficient spellchecking
 
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer... Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
Grokking Techtalk #34: K8S On-premise: Incident & Lesson Learned ZaloPay Mer...
 
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
Grokking TechTalk #33: Architecture of AI-First Systems - Engineering for Big...
 
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at ScaleGrokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
Grokking TechTalk #30: From App to Ecosystem: Lessons Learned at Scale
 
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedInGrokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
Grokking TechTalk #29: Building Realtime Metrics Platform at LinkedIn
 
Grokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search TreeGrokking TechTalk #27: Optimal Binary Search Tree
Grokking TechTalk #27: Optimal Binary Search Tree
 
Grokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the MagicGrokking TechTalk #26: Kotlin, Understand the Magic
Grokking TechTalk #26: Kotlin, Understand the Magic
 
Grokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platformGrokking TechTalk #26: Compare ios and android platform
Grokking TechTalk #26: Compare ios and android platform
 
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
Grokking TechTalk #24: Thiết kế hệ thống Background Job Queue bằng Ruby & Pos...
 
Grokking TechTalk #24: Kafka's principles and protocols
Grokking TechTalk #24: Kafka's principles and protocolsGrokking TechTalk #24: Kafka's principles and protocols
Grokking TechTalk #24: Kafka's principles and protocols
 
Grokking TechTalk #21: Deep Learning in Computer Vision
Grokking TechTalk #21: Deep Learning in Computer VisionGrokking TechTalk #21: Deep Learning in Computer Vision
Grokking TechTalk #21: Deep Learning in Computer Vision
 
Grokking TechTalk #20: PostgreSQL Internals 101
Grokking TechTalk #20: PostgreSQL Internals 101Grokking TechTalk #20: PostgreSQL Internals 101
Grokking TechTalk #20: PostgreSQL Internals 101
 
Grokking TechTalk #19: Software Development Cycle In The International Moneta...
Grokking TechTalk #19: Software Development Cycle In The International Moneta...Grokking TechTalk #19: Software Development Cycle In The International Moneta...
Grokking TechTalk #19: Software Development Cycle In The International Moneta...
 
Grokking TechTalk #18B: Giới thiệu về Viễn thông Di động
Grokking TechTalk #18B:  Giới thiệu về Viễn thông Di độngGrokking TechTalk #18B:  Giới thiệu về Viễn thông Di động
Grokking TechTalk #18B: Giới thiệu về Viễn thông Di động
 
Grokking TechTalk #18B: VoIP Architecture For Telecommunications
Grokking TechTalk #18B: VoIP Architecture For TelecommunicationsGrokking TechTalk #18B: VoIP Architecture For Telecommunications
Grokking TechTalk #18B: VoIP Architecture For Telecommunications
 

Recently uploaded

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 

Grokking TechTalk #33: High Concurrency Architecture at TIKI

  • 2. Contributors Bùi Anh Dũng Principal Engineer Nguyễn Hoàng Bách Senior Principal Engineer Phan Công Huân Senior Software Engineer Lê Minh Nghĩa Senior Architect Trần Nguyên Bản Principal Engineer
  • 3. Agenda - Principles - Pegasus - Highest throughput API - Arcturus - High concurrency inventory API - Conclusion
  • 4. Principles - Use local memory to handle high concurrency transaction - Non blocking architecture - Trade-offs consistency vs eventual consistency - Reliable replication - Authority: each service owns its problems.
  • 5. Pegasus - Highest Throughput API at TIKI Problems: - Handle the most of traffic to fetch product information - Has to handle at least 10k request/s, tp95 < 5ms
  • 7. Pegasus - Architecture Practises: - Cache product data in local memory - Subscribe product change event to invalid cache - Non-Blocking http web server - Compress to reduce payload size Technology: - Java - Guava - In Memory Cache - Gridgo - Async IO & Event driven framework
  • 8. Pegasus - Compression Two cases: - Get single product by id - Get multiple product by a list id Solutions: - Using gzip to compress product data to store in cache. Reduce from 200kb text to 3Kb - Handle single product request: return compressed gzip bytes to client directly - Handle multiple product request: store plain product data and merge them to build a list of products, then use Snappy format to compress data and return to client - Gzip format compress better than Snappy and is supported natively by most http client, but Snappy compress faster and use less CPU - Output cache can be enabled in hot-deal situation
  • 9. Pegasus - Technology - Java - Gridgo - Guava - Kafka
  • 10. Pegasus - Benchmark Benchmark use WRK tools, 4VM, L4 LB, GCP: - 90%<6.6ms - 96k request/s Production - 200k request/minutes (all platforms) - <2ms Cache hit: 85%. Increased to 95% with Consistent Hashing With Bounded Load. Average payload: Single (3KB), Multi (60KB)
  • 12. Arcturus - High concurrency Inventory API Problems: - Handle inventory transaction when customers place orders - Handle high concurrency transaction for extreme hot deals with very cheap and low quantity. Exp: Only ten 1đ iPhone XS, only in ten minutes from 9AM to 9h10 AM. - Guarantee eventual consistency of inventory data between many systems
  • 15. Arcturus - Problems and Solutions Problems: - Many customers may place order at the same time, especially for hot skus. - System has to be deterministic and can be recovered after crashing Solutions: - All write requests are published to Kafka with only one partition to guarantee the ordering of message and recover if system crashes. - Non Blocking In Memory Cache for both read and write - All changes are flushed to database asynchronously - Recovery based on the offset of write command
  • 16. Arcturus - In Memory Cache Data Structure Problem: - There will be a race condition when many customer place the same skus at the same time - Building an in memory cache structure that can buffer data changes and flush to database asynchronously Approach: - Each record have a key (string, bytes…) and a value (8 bytes long number). - A transaction is a set of record updates. - Transactions must be ordered (e.g. key_1 must +3 before can be -1). - It can be millions of keys and process upto 10k TPS. - Using ring buffer data structure to guarantee the ordering of changes * The hardest thing is to keep everything ordered when make it fast enough.
  • 17. Arcturus - Old solution - Multi threading application - DB transaction, row locking Pros - Strong consistency - Simple implementation Cons - Slow - Deadlock/timeout implicit risk
  • 18. Arcturus - LMax Architecture - Non Blocking architecture using single thread business logic processor - Use Ring Buffer data structure to communicate between processors - Handle million transaction per seconds
  • 19. Arcturus - Batching marshaller vBatching: use multi thread marshaller ● Pros - Fast (400k-600k ops/s*) - Avoid multi key locking - Write only most updated value ● Cons - Inconsistent transaction offset → use when you just want to make your code as fast as possible. * Macbook pro 2016 15’’ 2.7GHz Core i7, 16G ram. Without I/O, single update per transaction *** Test without I/O: [Single ops vbatch] Total time running for 1000000 transactions: 1085 ms; at rate: 921,658.99 ops/s
  • 20. Arcturus - Batching marshaller hBatching: use single marshaller thread ● Pros - Ensure transaction offset - Ordered db updating ● Cons - A little bit slower than vBatching. → use when you need to re-create application state after fail. *** Test without I/O: [Single ops hbatch] Total time running for 1000000 transactions: 1056 ms; at rate: 946,969.70 ops/s
  • 22. Arcturus - Technology - Java - Kafka - ZeroMQ - Gridgo - MySQL - LMax/Ring Buffer
  • 23. Conclusion - Has to trade-off between consistency and eventual consistency - In Memory is a great way to improve the performance - Reliable replication is the key to split and scale system - Non Blocking architecture is a great way to utilize hardware resources efficiently.