Worldwide Scalable and Resilient Messaging Services by CQRS and Event Sourcing using Akka, Kafka Streams and HBase


ChatWork is one of the major business communication platforms in Japan. We have kept growing for more than five years since the service's inception. Today we serve 110k+ customer organizations, including large enterprises such as telecom companies, and the service is used across 200+ countries and regions.

We have recently faced a drastic increase in message traffic, but our conventional backend was based on a traditional LAMP architecture. Transforming it into a highly available, scalable and resilient backend became imperative.

To achieve this, we have applied "Command Query Responsibility Segregation (CQRS) and Event Sourcing" as the heart of the architecture. The simple idea of segregation gives us independent command-side and query-side components, which in turn lets us build highly available, scalable and resilient systems. This is a desirable property for messaging services: even if the command side is down, users can keep reading messages as long as the query side is up. Event Sourcing is another key technique that enables us to build systems optimized for heterogeneous write/read load, because we can choose an optimized storage platform for each side. Moreover, the event data is a rich source for real-time analysis of users' communication behavior. We have chosen Kafka as the command-side event store, HBase as the query-side store, and Kafka Streams as the core library that provides eventual consistency between the two sides. In the application layer, Akka has been chosen as the core framework; it is a good abstraction layer for building highly concurrent, distributed, resilient and message-driven applications effectively. Backpressure, introduced by Akka Streams, is important for preventing data flows from overflowing our backend, and it contributes greatly to system stability.

In this session, we talk about how the above architecture works, how we reached these architectural decisions among many trade-offs, what this architecture achieved, what the pain points were (e.g. how to guarantee eventual consistency, how to migrate systems in a real project), and several tips we learned while realizing our highly distributed and resilient messaging system.

ChatWork is a business communication platform for global teams. Our four main features are enterprise-grade group chat, file sharing, task management and video chat. NTT DATA is one of the biggest solution providers in Japan and provides technical support for open source software and distributed computing. The project has been conducted in cooperation between ChatWork and NTT DATA.



  1. 1. Worldwide Scalable and Resilient Messaging Services by CQRS and Event Sourcing using Akka, Kafka Streams and HBase • Shingo Omura, ChatWork Co., Ltd. • Masaru Dobashi, NTT DATA Corporation © ChatWork and NTT DATA Corporation. 1
  2. 2. Agenda • Introduction of Us and Our Service "ChatWork" • Technical Debts That Blocked Our Growth • Our Approach: CQRS + Event Sourcing with Akka, Kafka, HBase • Technical Considerations to Build the Architecture • Several Technical Tips © ChatWork and NTT DATA Corporation. 2
  3. 3. Who am I? Shingo Omura • Senior Software Engineer at ChatWork Co., Ltd. • Specialized in distributed and concurrent computing About ChatWork Co., Ltd. • Founded in 2004, in Japan • 79 employees in total • Raised $15M in funding so far • 3 office locations: Japan, Taipei, U.S. (California) © ChatWork and NTT DATA Corporation. 3
  4. 4. Who am I? Masaru Dobashi • Senior Software Engineer and Architect of IT Platforms • Specialized in distributed computing, open source software and infrastructure About NTT DATA Corporation • Common Stock: ¥142,520 million (as of March 31, 2016) • Business Areas: system integration, networking system services, and other business activities related to the above © ChatWork and NTT DATA Corporation. 4
  5. 5. ChatWork and NTT DATA • ChatWork is the project owner. • NTT DATA provides technical support for the messaging systems and data stores in this project. © ChatWork and NTT DATA Corporation. 5
  6. 6. ChatWork (http://chatwork.com) We Change World Works • ChatWork is an enterprise-grade global team collaboration platform • Group chat, file sharing, task management and video conferencing, all in one place • All devices supported (PC, Android, iOS); 6 languages supported © ChatWork and NTT DATA Corporation. 6
  7. 7. Demo © ChatWork and NTT DATA Corporation. 7
  8. 8. ChatWork (http://chatwork.com) Easy Cross-Organizational Communication • The chat room and user namespaces are shared across the whole platform • You don't need to sign in to multiple organizations • You can add anyone, even from another organization, to chat rooms • Stats • 60% of users use it for both internal and external communication • 10% of users use it only for external communication • Typical use cases = business collaboration with partners • Publishers and writers • Franchise/branch operations • Consulting firms and their clients (accounting, law, etc.) © ChatWork and NTT DATA Corporation. 8
  9. 9. ChatWork Grows Rapidly 138,000 companies in 205 countries and regions © ChatWork and NTT DATA Corporation. 9
  10. 10. ChatWork Grows Rapidly 2 Billion Messages sent globally! The number of messages sent on ChatWork has increased along with user growth. © ChatWork and NTT DATA Corporation. 10
  11. 11. Characteristics of Our Workload • 95% of message requests are "read" • A large portion of reads are for "recent" messages • But users sometimes jump to very old messages via message links • Every task and file has an associated message link © ChatWork and NTT DATA Corporation. 11
  12. 12. Technical Debts That Blocked Our Growth Cannot Scale Up Anymore • We already use the biggest instance type (db.r3.8xlarge) • We should be able to scale out ACID doesn't scale • ACID makes performance tuning hard • We realized we could accept a weaker consistency model Monoliths are hard • to deploy, to maintain, to optimize © ChatWork and NTT DATA Corporation. 12
  13. 13. What We Want To Get on the New Messaging Backend Different Scalability and Resiliency Levels • Stateless Servers (API servers) • Can be fully and automatically elastic, with high throughput and low latency • Fault-tolerant and self-healing • Stateful Servers (Storage) • No need to be automatically elastic, just scalable when we need it • Expected to be fault-tolerant and somewhat resilient • Durability and predictability are important Acceptable consistency level • Eventual consistency can be accepted with a reasonable/tunable delay • Every member of a chat room should see message events in the same order © ChatWork and NTT DATA Corporation. 13
  14. 14. Our Approach: CQRS (Command and Query Responsibility Segregation) Build the read side and the write side independently Pros: easy to optimize and stay flexible • Data structure • A de-normalized data model can be used for read models • Database middleware • Can focus on either read-heavy or write-heavy workloads • System capacity • Can control the capacity of each side independently Cons: • Complexity (confined to the data transformation) • Operational overhead © ChatWork and NTT DATA Corporation. 14
  15. 15. Our Approach: CQRS + Event Sourcing Event Source • A history of every change in application state • Events are stored in sequence The write-model database can be append-only • An event is a fact: it has already been validated and authorized • Facts are never updated, by nature © ChatWork and NTT DATA Corporation. 15
  16. 16. Our Approach: CQRS + Event Sourcing Easy to build/rebuild the read model eventually • We apply each event to the read model iteratively • This can be seen as incrementally pre-computing query results • The process can be replayed to rebuild the read model whenever needed, e.g. after an incident. © ChatWork and NTT DATA Corporation. 16
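
As a rough sketch of how a read model can be derived from an event history (the event types and the ChatRoomView below are hypothetical, not our actual model), replaying is just a fold over the validated facts in order:

    sealed trait MessageEvent { def chatRoomId: String }
    final case class MessagePosted(chatRoomId: String, messageId: Long, body: String) extends MessageEvent
    final case class MessageDeleted(chatRoomId: String, messageId: Long) extends MessageEvent

    // A denormalized, query-friendly view of one chat room (hypothetical read model).
    final case class ChatRoomView(messages: Map[Long, String] = Map.empty) {
      def applyEvent(e: MessageEvent): ChatRoomView = e match {
        case MessagePosted(_, id, body) => copy(messages = messages + (id -> body))
        case MessageDeleted(_, id)      => copy(messages = messages - id)
      }
    }

    // Rebuilding the read model is replaying the event history in order.
    def rebuild(history: Seq[MessageEvent]): ChatRoomView =
      history.foldLeft(ChatRoomView())(_.applyEvent(_))

Applying events incrementally as they arrive, instead of over the whole history, gives the same result, which is why the read model can also be rebuilt from scratch after an incident.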
  17. 17. Overall Architecture © ChatWork and NTT DATA Corporation. 17
  18. 18. What is Akka? • A toolkit to easily build powerful, concurrent, distributed applications based on actor-model programming • Asynchronous and Distributed by Design: • Non-blocking, message-driven processing comes naturally with Akka actors • High Performance: • Up to 50 million msg/sec on a single machine • Small memory footprint; ~2.5 million actors per GB of heap • Resilient by Design: • Error Kernel and Let It Crash patterns via the actor hierarchy © ChatWork and NTT DATA Corporation. 18 ref: http://akka.io/
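
For readers new to Akka, a minimal toy actor (not ChatWork code) illustrates the asynchronous, message-driven style the slide refers to:

    import akka.actor.{Actor, ActorSystem, Props}

    // An actor processes one message at a time from its mailbox; senders never block.
    class Greeter extends Actor {
      def receive: Receive = {
        case name: String => println(s"Hello, $name")
      }
    }

    object GreeterApp extends App {
      val system  = ActorSystem("demo")
      val greeter = system.actorOf(Props[Greeter], "greeter")
      greeter ! "ChatWork"   // fire-and-forget: the send returns immediately
    }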
  19. 19. Akka's Wonderful Features for Us Resilient by design • Customizable and flexible resiliency with Kafka consumers/streams • With a supervisor, we can restart a Kafka Streams instance safely and cleanly without stopping the JVM process, and implement flexible, graceful restart policies. © ChatWork and NTT DATA Corporation. 19
  20. 20. Akka's Wonderful Features for Us Non-Blocking • A large blocking iteration can be converted to an Akka Stream (e.g. HBase's scan operation) • We implemented HBaseScanStage, which transforms a scanner (iterator) into a stream that emits scanned HBase rows asynchronously • This achieves higher throughput and shares threads fairly among multiple scan requests • Attach an isolated thread pool for blocking calls • We attach an isolated thread pool to the scan stage to avoid starving Akka's other threads: Source.fromGraph(new HBaseScanStage(connection, "message", scan)) .withAttributes( ActorAttributes.dispatcher("hbase-blocking-dispatcher") ) © ChatWork and NTT DATA Corporation. 20
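
HBaseScanStage itself is not shown in the slides; as a hedged alternative sketch of the same idea, Akka Streams' built-in Source.unfoldResource can wrap a blocking HBase ResultScanner. The table name and dispatcher name follow the snippet above; everything else is illustrative:

    import akka.NotUsed
    import akka.stream.ActorAttributes
    import akka.stream.scaladsl.Source
    import org.apache.hadoop.hbase.TableName
    import org.apache.hadoop.hbase.client.{Connection, Result, ResultScanner, Scan, Table}

    // Emit one HBase row per stream element, keeping the blocking calls off the default dispatcher.
    def scanSource(connection: Connection, scan: Scan): Source[Result, NotUsed] =
      Source
        .unfoldResource[Result, (Table, ResultScanner)](
          () => {
            val table = connection.getTable(TableName.valueOf("message"))
            (table, table.getScanner(scan))
          },
          { case (_, scanner) => Option(scanner.next()) },            // next() returns null when exhausted
          { case (table, scanner) => scanner.close(); table.close() }
        )
        .withAttributes(ActorAttributes.dispatcher("hbase-blocking-dispatcher"))

A dedicated GraphStage, as described on the slide, allows finer control, but the dispatcher isolation shown here is the part that keeps blocking calls from starving the rest of the system.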
  21. 21. What is Kafka? • Apache Kafka is a messaging system used to construct data-processing pipelines • In our use case, Kafka is the central log for Event Sourcing: it stores the events generated by the write-api servers. • A similar idea to our platform: • https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/ © ChatWork and NTT DATA Corporation. 21
  22. 22. Kafka's Wonderful Features for Us • Kafka's pub/sub model gives us flexibility in developing services; we can add and improve functions step by step. • Kafka offers both scalability and a reasonable guarantee of message ordering, which fits our service design. © ChatWork and NTT DATA Corporation. 22
  23. 23. What is HBase? • Apache HBase is a database for massive read/write workloads that effectively leverages Hadoop HDFS. • In our use case, HBase stores the read-model data, the master data of the communication service. © ChatWork and NTT DATA Corporation. 23
  24. 24. HBase's Wonderful Features for Us • HBase is built on a stable architecture leveraging Hadoop HDFS (and we are used to Hadoop) • HBase processes our read/write workload effectively. • The write requests of this service are random access; fortunately, HBase converts them to sequential access before writing data to disk. • The read requests of this service tend to concentrate on recent data; fortunately, HBase's read cache mechanism handles such requests efficiently. © ChatWork and NTT DATA Corporation. 24
  25. 25. What is Kafka Streams? • Kafka Streams is part of Apache Kafka and provides stream processing in a simple way. • Wonderful features for us • In our architecture, Kafka is the hub of the data pipeline, so Kafka Streams is already included in the environment. • Even though Kafka Streams was a young component, its basic design seemed simple and reasonable to us. We first tried it for a stateless application. © ChatWork and NTT DATA Corporation. 25
  26. 26. Summary of Actual Performance • Write API • Throughput (in stress tests): 40x the current peak with only 2 write-api pods (4 cores & 5 GB mem/pod on m4.2xlarge instances) • Latency (in production): 200 ms → 80 ms • Produce time to Kafka brokers = 20 ms (in production) © ChatWork and NTT DATA Corporation. 26
  27. 27. Summary of Actual Performance • Read API • Throughput (in stress tests): handles the current peak with 4 read-api pods (4 cores & 5 GB mem/pod on m4.2xlarge instances) • Latency (in production): 70 ms → 70 ms • HBase's block cache hit rate = 99%!!! (in production) © ChatWork and NTT DATA Corporation. 27
  28. 28. Summary of Actual Performance • Read Model Updater • Time lag until the read model is updated: 80 ms (in production) • Resilient enough • The Akka supervisor safely restarts Kafka Streams without stopping pods • The Kafka consumer group itself is also resilient • Partition reassignment happens automatically even when some consumer pods are down (e.g. during a rolling update), so event mutation keeps being processed © ChatWork and NTT DATA Corporation. 28
  29. 29. Technical Consideration Topics • Guaranteeing the Order of Events in Kafka • Integrating Message Events with Other Microservices • Architecture Design to Realize Reasonable Fault Tolerance and Durability • Kafka as a Cushioning Layer • Heterogeneous design of the data stores • Error handling in each layer © ChatWork and NTT DATA Corporation. 29
  30. 30. Guaranteeing the Order of Events in Kafka • The partition is the unit of event-ordering guarantees in Kafka • We use the "chatroom id" as the partition key so that all events of a chat room go to one specific partition © ChatWork and NTT DATA Corporation. 30
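
Concretely, with the default partitioner all that is needed is to key each record by the chatroom id; a minimal sketch (topic name, serializers and broker address are illustrative):

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
    import org.apache.kafka.common.serialization.{ByteArraySerializer, StringSerializer}

    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")   // placeholder address

    val producer = new KafkaProducer[String, Array[Byte]](props, new StringSerializer, new ByteArraySerializer)

    // The default partitioner hashes the key, so every event of a chat room lands in one
    // partition, and Kafka preserves the order of events within that partition.
    def publish(chatRoomId: String, eventBytes: Array[Byte]): Unit =
      producer.send(new ProducerRecord[String, Array[Byte]]("message-events", chatRoomId, eventBytes))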
  31. 31. Guaranteeing the Order of Events in Kafka • Things you should take care of • Keep the partitioner simple • The partitioner's computational cost directly affects producer throughput • The default partitioner is recommended • Changing the key→partition mapping is dangerous • If you use the default partitioner, the number of partitions must not be changed • This produces operational difficulties… • We operate 1,000 partitions for the message event topic to get high concurrency in the read model updater • Kafka doesn't support automated partition rebalancing… • We have to edit a huge JSON object to move partitions to new brokers... © ChatWork and NTT DATA Corporation. 31
  32. 32. Integrating Message Events with Other Microservices • Kafka is very useful for integrating with other services • We currently have one event forwarder which integrates with multiple existing services • We are now adding an event forwarder for the outgoing webhook service • Important: integrated services should be "idempotent" • The event forwarder guarantees only "at-least-once" delivery • An integrated service might receive the same event multiple times © ChatWork and NTT DATA Corporation. 32
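
One common way to make a downstream service idempotent under at-least-once delivery is to deduplicate on an event id before applying the side effect; a minimal sketch with a hypothetical envelope type (a real implementation would keep the seen-set in a durable store):

    import scala.collection.mutable

    // Hypothetical event envelope delivered by the forwarder.
    final case class Envelope(eventId: String, payload: Array[Byte])

    final class IdempotentHandler(applyOnce: Envelope => Unit) {
      private val seen = mutable.Set.empty[String]   // durable storage in production, in-memory here

      // Receiving the same envelope twice has the same effect as receiving it once.
      def handle(envelope: Envelope): Unit =
        if (seen.add(envelope.eventId)) applyOnce(envelope)
    }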
  33. 33. Reasonable Fault Tolerance and Durability • Important: define your own required level of fault tolerance and durability • Unnecessarily high fault tolerance and durability tends to introduce terrible complexity into the internal architecture. © ChatWork and NTT DATA Corporation. 33
  34. 34. Key Demands and Constraints for Us Durability against faults of each node Because the distributed system operates a large number of nodes, the probability of finding errors on some nodes is not so small. Best-effort durability against the failure of a whole data center We aimed to keep historical data readable, as much as possible, even when a data center goes down. Scalability for EACH layer The heterogeneous write/read workload forces us to plan scale-out individually for each layer. © ChatWork and NTT DATA Corporation. 34
  35. 35. Architecture Design to Realize Reasonable Fault Tolerance and Durability Heterogeneous Design of Data Stores • The write-side • Data type: only recent data • Requirements other than FT and durability: small footprint and efficiency • The read-side • Data type: long-term master data • Requirements other than FT and durability: stability and predictability © ChatWork and NTT DATA Corporation. 35
  36. 36. Architecture Design to Realize Reasonable Fault Tolerance and Durability Cushioning Layer • In our design, Kafka plays the role of a "cushioning layer" as well as the hub of the pipeline. • We can reprocess old messages both automatically and manually when we find errors. This is achieved by storing several generations of offsets in the output data store and controlling the offsets in the applications. © ChatWork and NTT DATA Corporation. 36
  37. 37. Architecture Design to Realize Reasonable Fault Tolerance and Durability Error Handling in Each Layer (1/2) • In reality, it is difficult to handle errors perfectly within a single layer. • For example, Kafka Streams didn't provide fine-grained error handling at the time we started this project. Some errors during record processing could cause application failures. © ChatWork and NTT DATA Corporation. 37
  38. 38. Architecture Design to Realize Reasonable Fault Tolerance and Durability Error Handling in Each Layer (2/2) • Fortunately, since Kafka Streams can be run as a single application, you can wrap it in an Akka supervisor. This enables us to handle errors simply using an UncaughtExceptionHandler. © ChatWork and NTT DATA Corporation. 38 public void setUncaughtExceptionHandler(final Thread.UncaughtExceptionHandler eh) streams.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler { override def uncaughtException(t: Thread, e: Throwable): Unit = { self ! UncaughtExceptionInStream(e) } }) e.g. the actor sends a message to itself to trigger the back-off function of the supervisor.
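
As a hedged sketch of the restart policy described above (not our production code), the actor that owns the KafkaStreams instance can be wrapped in Akka's BackoffSupervisor so that every failure triggers a delayed, jittered restart; KafkaStreamsRunner below is a hypothetical placeholder:

    import scala.concurrent.duration._
    import akka.actor.{Actor, ActorSystem, Props}
    import akka.pattern.{Backoff, BackoffSupervisor}

    // Hypothetical actor that would build/start KafkaStreams in preStart and close it in postStop.
    class KafkaStreamsRunner extends Actor {
      def receive: Receive = {
        case e: Throwable => throw e   // escalate stream errors so the supervisor backs off and restarts us
      }
    }

    object ReadModelUpdaterMain extends App {
      val supervisorProps = BackoffSupervisor.props(
        Backoff.onFailure(
          Props[KafkaStreamsRunner],
          "kafka-streams-runner",
          3.seconds,    // minimum back-off before the first restart
          1.minute,     // back-off cap
          0.2           // random jitter so restarts do not synchronize across pods
        )
      )
      val system = ActorSystem("read-model-updater")
      system.actorOf(supervisorProps, "kafka-streams-supervisor")
    }

Because the restart happens inside the actor hierarchy, the JVM process and the pod keep running, which matches the "restart without stopping pods" behavior reported on the performance slides.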
  39. 39. Tips © ChatWork and NTT DATA Corporation. 39
  40. 40. Manual Offset Management of Kafka Streams • Question • How do we handle both writing the result and updating the offset information in one sequence (or transaction)? • Answer • Manage the "offset information" in the output data stores • How • In the case of Kafka Streams, we've implemented our own consumer which writes the result and updates the offset information in the output data store. © ChatWork and NTT DATA Corporation. 40
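
A hedged sketch of keeping offset information next to the read model in HBase (the table name follows earlier slides; the row/column layout is illustrative, not our actual schema): the consumer saves the last processed offset after each successful read-model write, and reads it back on startup to decide where to resume.

    import org.apache.hadoop.hbase.TableName
    import org.apache.hadoop.hbase.client.{Connection, Get, Put}
    import org.apache.hadoop.hbase.util.Bytes

    // Illustrative schema: a single "offsets" row in the read-model table, one cell per partition.
    final class HBaseOffsetStore(connection: Connection) {
      private val table  = connection.getTable(TableName.valueOf("message"))
      private val row    = Bytes.toBytes("offsets")
      private val family = Bytes.toBytes("o")

      // Persist the offset only after the corresponding read-model mutation has been written.
      def save(partition: Int, offset: Long): Unit =
        table.put(new Put(row).addColumn(family, Bytes.toBytes(partition), Bytes.toBytes(offset)))

      // On (re)start, resume the consumer from here instead of Kafka's committed offset.
      def load(partition: Int): Option[Long] = {
        val result = table.get(new Get(row).addColumn(family, Bytes.toBytes(partition)))
        Option(result.getValue(family, Bytes.toBytes(partition))).map(Bytes.toLong)
      }
    }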
  41. 41. KafkaClientSupplier • You can use a KafkaClientSupplier implementation to provide a custom consumer & producer to a KafkaStreams instance. • However, Kafka Streams is not really designed for such use cases, so it may be painful for you. You also need to be careful about updating the offset information too frequently. © ChatWork and NTT DATA Corporation. 41 public KafkaStreams(final TopologyBuilder builder, final StreamsConfig config, final KafkaClientSupplier clientSupplier) { public interface KafkaClientSupplier { Producer<byte[], byte[]> getProducer(final Map<String, Object> config); Consumer<byte[], byte[]> getConsumer(final Map<String, Object> config); Consumer<byte[], byte[]> getRestoreConsumer(final Map<String, Object> config); }
  42. 42. Parallelism and Ordering • As we noted on the "Restriction about ordering of messages" page, guaranteeing the order of events in each chat room is important for us. • The parallelism configuration of each component is important for achieving both high throughput and the ordering guarantee. • For example, the Kafka producer can resend messages due to errors, and this may cause reordering when you send data in parallel. To prevent it, you can set max.in.flight.requests.per.connection to 1. © ChatWork and NTT DATA Corporation. 42
  43. 43. max.in.flight.requests.per.connection • This parameter configures the maximum number of in-flight (unacknowledged) requests per connection, i.e. the parallelism of sending requests. © ChatWork and NTT DATA Corporation. 43 Conditions: queue == null || queue.isEmpty() || (queue.peekFirst().send.completed() && queue.size() < this.maxInFlightRequestsPerConnection);
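
Concretely, the setting discussed on these two slides is just a producer property; a minimal sketch (broker address is a placeholder):

    import java.util.Properties
    import org.apache.kafka.clients.producer.ProducerConfig

    val props = new Properties()
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092")   // placeholder address
    // At most one unacknowledged request per connection: a retried batch can never overtake
    // a later one, so per-partition ordering survives retries (at some cost to throughput).
    props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1")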
  44. 44. Summary • Why and how we built our messaging backend with CQRS + Event Sourcing using Akka, Kafka, and HBase • How Akka, Kafka, and HBase fit the architecture and our use case • Technical consideration topics for building the architecture • Several technical tips © ChatWork and NTT DATA Corporation. 44
  45. 45. Thank you! Any Questions? © ChatWork and NTT DATA Corporation. 45
