
Architecture of Falcon, a new chat messaging backend system built on Scala

Architecture of Falcon, ChatWork's new Scala backend

Published in: Engineering

  1. Architecture of Falcon, a new chat messaging backend system built on Scala. Yusuke Yasuda, ChatWork, 2017/02/26
  2. Goal of Architecture (2017/02/27 © ChatWork All rights reserved.)
     • Scalability:
         • throughput increases linearly as nodes are added
         • latency stays stable and low
     • High performance:
         • achieve 100 times the throughput of the current load without further architectural changes
     • Resiliency:
         • avoid chain reactions of failures
         • recover quickly from partial failures
     • Low cost:
         • keep the cluster as small as possible
         • absorb temporary load spikes without additional resources
         • high performance/resource ratio
     • Legacy system integration:
         • keep consistency without transactions
  3. Architecture Overview
  4. Architecture Overview
     • “Write API” exposes an asynchronous API. It persists an event and immediately returns `202 Accepted`. It performs no queries and no mutations. Its storage is Kafka.
     • “Read API” can only query the read model; it performs no mutations. Both query by key and query by key range are supported. Its storage is HBase.
     • ReadModelUpdater is a Kafka consumer that builds the read model queried by Read API from the events generated by Write API.
     • PostProcessorForwarder is a Kafka consumer that notifies the legacy PHP system to execute the remaining transactions, e.g. push notifications.
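
The command path described above can be sketched as follows. This is a minimal illustration, not Falcon's code: `MessagePosted`, `EventLog` and `postMessage` are hypothetical names, and an in-memory buffer stands in for the Kafka topic.

```scala
import scala.collection.mutable.ArrayBuffer

object WriteApiSketch {
  // Hypothetical event type; the slides do not show Falcon's real schema.
  final case class MessagePosted(roomId: Long, messageId: Long, body: String)

  // In-memory stand-in for a Kafka topic: an append-only sequence of events.
  final class EventLog {
    private val events = ArrayBuffer.empty[MessagePosted]

    // Appends an event and returns its offset in the log.
    def append(e: MessagePosted): Long = {
      events += e
      events.size - 1L
    }

    // Returns all events from the given offset onward.
    def readFrom(offset: Int): Vector[MessagePosted] =
      events.drop(offset).toVector
  }

  // The command side only appends; it never queries or mutates a read model.
  def postMessage(log: EventLog, e: MessagePosted): Int = {
    log.append(e)
    202 // HTTP 202 Accepted: the event is persisted, processing happens later
  }
}
```

Returning `202` rather than `200` reflects that the caller only learns the event was accepted; the read model is updated later by a separate consumer.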
  5. CQRS: Command Query Responsibility Segregation
     • Command and query responsibilities are segregated at the system level.
     • Specialized responsibilities make each system simple.
     • Each system uses a different model:
         • “Write API” uses immutable events to represent the history of user actions.
         • “Read API” uses read models optimized for queries.
     • The command system and the query system each use dedicated storage.
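
To make the two models concrete, here is a small sketch of a query-side projection. It assumes a read model keyed by `(roomId, messageId)` in sorted order, which loosely mimics a row-key layout that supports both lookup by key and range scans; the key design and all names are assumptions, not taken from the slides.

```scala
import scala.collection.immutable.SortedMap

object CqrsSketch {
  // Command-side model: an immutable event (hypothetical shape).
  final case class MessagePosted(roomId: Long, messageId: Long, body: String)

  // Query-side model: sorted by (roomId, messageId) so range queries are cheap.
  type ReadModel = SortedMap[(Long, Long), String]

  val empty: ReadModel = SortedMap.empty[(Long, Long), String]

  // Applies one event to the read model.
  def applyEvent(model: ReadModel, e: MessagePosted): ReadModel =
    model.updated((e.roomId, e.messageId), e.body)

  // Builds the read model by folding over the event history.
  def project(events: Seq[MessagePosted]): ReadModel =
    events.foldLeft(empty)(applyEvent)

  // Query by key.
  def get(model: ReadModel, roomId: Long, messageId: Long): Option[String] =
    model.get((roomId, messageId))

  // Query by key range within one room: fromId inclusive, untilId exclusive.
  def range(model: ReadModel, roomId: Long, fromId: Long, untilId: Long): Vector[String] =
    model.range((roomId, fromId), (roomId, untilId)).values.toVector
}
```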
  6. Convergent Evolution of technology
     “Convergent evolution is the independent evolution of similar features in species of different lineages.” https://en.wikipedia.org/wiki/Convergent_evolution
     • DDD: fighting against the complexity of domain models with Event Sourcing
     • Big Data: fighting against the complexity of big data with Log Processing
     https://www.infoq.com/news/2016/05/event-sourcing-stream-processing
     Two communities invented similar features independently. Falcon is influenced by the knowledge of both communities.
  7. Inter-system Synchronization
     • Falcon subsystems and the PHP system are so-called “microservices”.
     • Microservices do not share persistent storage.
     • Event Sourcing is used to synchronize the systems, with these properties:
         • No events are lost (within the retention period).
         • The order of message events is preserved within a chat room.
         • Events are processed in an at-least-once manner.
         • Processing the same event twice has no additional effect (idempotent).
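
At-least-once delivery means a consumer may see the same event again after a retry or rebalance, so the handler has to tolerate duplicates. The slides do not say how Falcon achieves idempotency; the sketch below assumes one common approach, remembering the ids of events already applied, with hypothetical names throughout.

```scala
object IdempotentConsumerSketch {
  // Hypothetical event carrying a unique id.
  final case class Event(eventId: Long, roomId: Long)

  // State: unread count per room plus the ids of events already applied.
  final case class State(unread: Map[Long, Int], seen: Set[Long])

  val initial: State = State(Map.empty, Set.empty)

  // Applies an event once; a redelivered event leaves the state unchanged.
  def handle(state: State, e: Event): State =
    if (state.seen.contains(e.eventId)) state
    else State(
      state.unread.updated(e.roomId, state.unread.getOrElse(e.roomId, 0) + 1),
      state.seen + e.eventId
    )
}
```

In this sketch the `seen` set grows without bound; a real consumer would need to cap it, or use writes that are naturally idempotent, such as a put keyed by message id.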
  8. Kafka features helpful for Event Sourcing
     • Auto-sharding:
         • Events are partitioned so that they can be processed in parallel.
     • Strong consistency:
         • Each partition is processed by a single consumer.
         • A consumer can hold internal state.
     • Resiliency:
         • A partition assigned to a crashed consumer is automatically rebalanced to another consumer.
     • Easy to connect services:
         • Events can be forwarded to the next topic (diagram: topic 1, topic 2, topic 3, topic 4).
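
The link between partitioning and per-room ordering can be illustrated with a toy partitioner. It assumes events are keyed by chat room id, which the slides imply but do not state, and it uses a plain hash modulo rather than Kafka's actual default partitioner.

```scala
object PartitionSketch {
  // Hypothetical event: seq is the order in which it was produced within its room.
  final case class Event(roomId: Long, seq: Int)

  // Simplified stand-in for a key-based partitioner: same room, same partition.
  def partitionFor(roomId: Long, numPartitions: Int): Int =
    java.lang.Math.floorMod(java.lang.Long.hashCode(roomId), numPartitions)

  // Groups events by partition; groupBy keeps the original order inside each group.
  def assign(events: Seq[Event], numPartitions: Int): Map[Int, Seq[Event]] =
    events.groupBy(e => partitionFor(e.roomId, numPartitions))
}
```

Because every event of a room lands in one partition and a partition has a single consumer, that consumer sees the room's events in produced order, while different rooms can still be processed in parallel.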
  9. • A subsystem may temporarily perform poorly because of:
         • load spikes
         • compaction in HBase
         • failure of the legacy PHP system caused by process saturation
         • AWS component failure
     • Using Kafka as the command-side storage helps protect the subsystems:
         • Kafka can easily absorb events produced at 40 times the normal throughput without scaling out.
         • A Kafka consumer can consume events at a stable throughput, so each subsystem deals with a predictable load.
         • Throttling in Kafka enforces an upper limit on throughput.
  10. 1. SQL query latency increased at Amazon Aurora of the PHP system.
      2. PostProcessorForwarder timed out when calling the PHP system.
      3. Events stacked up in the queue in Kafka.
      4. Throughput of event processing decreased. Once the subsystem recovered from the failure, the throughput increased to consume the stacked events but never exceeded the upper limit, due to throttling.
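
The incident above can be replayed as a small simulation: events keep arriving, the downstream capacity drops to zero during the failure, and after recovery the backlog drains at a rate capped by the throttle. The tick model and every number here are made up for illustration; they are not measurements from Falcon.

```scala
object ThrottleSketch {
  // One simulation step: events arriving, and how many the subsystem could handle.
  final case class Tick(produced: Int, capacity: Int)

  // Returns (consumed, backlog) after each tick; limit is the throttle's upper bound.
  def simulate(ticks: Seq[Tick], limit: Int): Vector[(Int, Int)] = {
    val (_, out) = ticks.foldLeft((0, Vector.empty[(Int, Int)])) {
      case ((backlog, acc), t) =>
        val available = backlog + t.produced
        // Consumption is bounded by what is queued, by capacity, and by the throttle.
        val consumed = math.min(available, math.min(t.capacity, limit))
        val remaining = available - consumed
        (remaining, acc :+ ((consumed, remaining)))
    }
    out
  }
}
```

With a steady arrival of 10 events per tick, two ticks of zero capacity, and a limit of 25, the backlog grows to 20 during the outage and then drains at no more than 25 events per tick.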
  11. ACID semantics of Falcon
      • Atomicity: There is no atomicity between posting a message and its associated operations, e.g. unread count calculation. Intermediate states can be observed.
      • Consistency: Eventual consistency. See “Consistency Model”.
      • Isolation: There is no concurrent mutation of the same record. Events are processed sequentially, so no isolation is needed.
      • Durability: Yes. No messages are lost.
      • Visibility: No guarantee. There is a short period during which a posted message cannot be observed. We try to keep that period as short as possible.
      ACID does not provide high availability and scalability. Falcon does not have ACID properties. http://people.eecs.berkeley.edu/~brewer/cs262b/TACC.pdf
  12. Consistency Model
      Choose C or A based on the CAP theorem. (Diagram: each of the four subsystems is labeled with a choice between C and A.)
  13. Consistency Model
      • Availability for the user-facing subsystems, Write API and Read API:
          • Ensure the service is always writable and readable. Losing availability means the service is down.
      • Consistency for the background subsystems, ReadModelUpdater and PostProcessorForwarder:
          • Ensure internal state consistency. Losing availability is not obvious to users.
  14. Recovery from human errors
      • Falcon can recover from data corruption without stopping the service.
      • The only subsystem that could damage data if it malfunctioned is ReadModelUpdater.
      • “Write API”, “Read API” and “PostProcessorForwarder” cannot mutate data.
      • Since input events are preserved in Kafka, the output can be recalculated by resetting the offsets of the Kafka consumer.
      • Stopping ReadModelUpdater does not affect the availability of the service.
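
Recovery by offset reset amounts to replaying the retained events through a corrected projection. The sketch below uses an in-memory `Vector` in place of the Kafka topic and a deliberately broken projection to stand in for a malfunctioning ReadModelUpdater; all names and the bug itself are hypothetical.

```scala
object ReplaySketch {
  // Hypothetical event shape.
  final case class MessagePosted(roomId: Long, messageId: Long, body: String)

  type ReadModel = Map[(Long, Long), String]

  // Rebuilds the read model from the retained events, starting at the given offset.
  def rebuild(
      log: Vector[MessagePosted],
      offset: Int,
      project: MessagePosted => String
  ): ReadModel =
    log.drop(offset).foldLeft(Map.empty[(Long, Long), String]) { (m, e) =>
      m.updated((e.roomId, e.messageId), project(e))
    }

  // Hypothetical bug: the projection drops the message body.
  val buggy: MessagePosted => String = _ => ""

  // Corrected projection, deployed before replaying from offset 0.
  val fixed: MessagePosted => String = e => e.body
}
```

Because the event log is never mutated, calling `rebuild(log, 0, fixed)` overwrites the corrupted entries with correct ones, and the write path keeps accepting events while the replay runs.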
