Digital integration hub: Why, what and how?

  1. Digital Integration Hub: why, what and how? Kafka Meetup, December 2021. Andrea Gioia, CTO at Quantyca and Co-Founder at Blindata.
  2. Legacy systems: some truths to face. Legacy systems are growing in size and number. They are here to stay! If your architecture does not manage legacy systems, legacy systems will sooner or later manage your architecture.
  3. Who am I? Not an easy question to answer, but keeping it simple: Andrea Gioia, andrea.gioia@quantyca.it, CTO at Quantyca and Co-Founder at Blindata. Quantyca is a privately owned technology consulting firm based in Italy, specialized in data and metadata management (quantyca.it). Blindata is a SaaS platform that leverages data governance and compliance to empower your data management projects (blindata.io).
  4. What is legacy modernization, and why it matters. Digital transformation continuously pushes toward the development of new touchpoints in an omnichannel logic (systems of engagement) and of new analytical and AI-based services (systems of insight). These new applications are usually integrated with back-office legacy systems (the systems of record) in a point-to-point logic, producing "spaghetti" integration between the application layer and the legacy systems. This way of integrating the new with the legacy does not scale in the long term. Because legacy systems cannot simply be thrown away, a better integration architecture is needed to modernize them in place.
  5. Legacy modernization: key business drivers.
     ○ Time-to-market and business agility improvement: go beyond the limits imposed by legacy systems to improve business agility.
     ○ Costs and risks reduction: rationalize integrations to reduce development and maintenance costs and to avoid uncontrolled access to data.
     ○ Resilience and performance improvement: ensure the uptime of legacy systems even in the face of significant workload increases.
  6. Integration architecture #1: legacy systems take it all. All new functionalities are implemented directly by extending the legacy system or by buying complementary products offered by the same vendor of the legacy system. The integration layer, if present, is limited to an API gateway that decouples the legacy backend from the frontend applications of the systems of engagement and insight.
  7. Integration architecture #2: integration rationalization through composite services. Integrations are rationalized through different layers of reusable and composable services, exposed behind an API gateway on a request-based integration layer: sourcing services wrap the legacy systems, process services orchestrate business processes, and application services provide a backend for frontend applications.
  8. Integration architecture #2: integration rationalization through data virtualization. Integrations are rationalized through different layers of views served by a data virtualization application (a virtual DWH): the physical layer wraps the legacy systems, the business layer exposes the business model, and the application layer provides projections designed to facilitate consumption.
  9. Integration architecture #2: integration rationalization. Composite services and data virtualization can be used together in the same hybrid integration platform: the former is preferred to back systems of engagement, the latter to back systems of insight. Both solutions simplify integrations, but neither reduces the workload on the backend systems.
  10. Integration architecture #3: data offloading. Data offloaded from the legacy systems is aggregated into a low-latency, high-performance data store accessible via APIs, events or batch, and consumed by microservices, with metadata management alongside. The data store synchronizes with the back ends via event-driven integration patterns.
  11. Digital Integration Hub: key building blocks.
      ○ Legacy systems: where the data is stored.
      ○ Connectors: keep the legacy systems and the high-performance data store in sync, offloading all modifications to relevant data in real time.
      ○ Event store: transforms the technical events coming from the connectors into domain and business events that can be consumed downstream by the high-performance data store or by other consumers (event-driven integration).
      ○ High-performance data store: stores domain-specific data, exposing a single consolidated view of entities; supports fast ingestion to reduce the eventual-consistency window; can also support analytical queries.
      ○ Applications and services: where the data is used. They connect to the high-performance data store for read queries and execute writes on the legacy systems by means of command events pushed onto the event store (command query responsibility segregation).
  12. Connectors: data acquisition patterns.
      ○ Trigger (push mode): good for neo-legacies, problematic for old-school legacies.
      ○ Active polling (pull mode): it is difficult to find a trade-off that satisfies both the load constraints of the legacy system and the real-time needs of the applications.
      ○ Decorating collaborator (frontend interception): largely cited in the literature; good in theory, problematic in practice.
      ○ Change data capture (backend interception): the best option, but CDC connectors can be quite expensive.
      Interesting source connectors for legacy modernization available for Kafka are:
      ○ JDBC connectors, for active polling
      ○ the Debezium connector, for CDC from MySQL, Postgres, …
      ○ Salesforce connectors, for CDC from Salesforce
      ○ the Oracle connector, for CDC from Oracle
      ○ partner connectors, for CDC from other legacies such as SAP and mainframes (e.g. the Qlik Replicate connector)
      A minimal sketch of the CDC option follows this list.
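To make the CDC option concrete, here is a minimal sketch in Java using Debezium's embedded engine. The connection details, table names and server name are hypothetical, and a production deployment would typically run the same connector inside Kafka Connect rather than embedding it:

```java
import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LegacyCdcOffloader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("name", "legacy-offloader");
        props.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");
        // Hypothetical connection details for the legacy database.
        props.setProperty("database.hostname", "legacy-db.internal");
        props.setProperty("database.port", "5432");
        props.setProperty("database.user", "cdc_user");
        props.setProperty("database.password", "secret");
        props.setProperty("database.dbname", "erp");
        props.setProperty("database.server.name", "legacy.erp");
        props.setProperty("table.include.list", "public.orders,public.order_lines");
        // Where the engine keeps track of how far it has read the change log.
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
        props.setProperty("offset.flush.interval.ms", "10000");

        // Each captured row-level change arrives as a JSON technical event;
        // a real deployment would forward it to the streaming platform
        // instead of printing it.
        DebeziumEngine<ChangeEvent<String, String>> engine =
                DebeziumEngine.create(Json.class)
                        .using(props)
                        .notifying(event -> System.out.println(event.value()))
                        .build();

        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(engine);
    }
}
```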
  13. Event store: event-driven integration. On the streaming platform, changes flow from the legacy system through three kinds of events: technical events (speed and fidelity), then domain events (trusted views), then business events (ease of consumption), which feed the high-performance data store.
  14. Event store offloading patterns: one table per topic. Changes to each table are mapped to distinct topics, one topic per table, as technical events. Stream joins are used to create domain events from the technical events spread across the different topics. Preserving transactional coherence within aggregates can be complex when the aggregate is spread among multiple tables updated by long-running transactions. A minimal Kafka Streams sketch follows.
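A minimal Kafka Streams sketch of the stream-join idea, assuming hypothetical topic names, plain string payloads instead of real domain types, and an orders topic already keyed by customer id:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Joined;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Properties;

public class OrderDomainEvents {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // One topic per offloaded table (topic names are illustrative).
        // Assumes the orders topic is keyed by customer id; otherwise
        // re-key it first with selectKey().
        KTable<String, String> customers =
                builder.table("legacy.crm.customers",
                        Consumed.with(Serdes.String(), Serdes.String()));
        KStream<String, String> orders =
                builder.stream("legacy.erp.orders",
                        Consumed.with(Serdes.String(), Serdes.String()));

        // Join the row-level technical events into a coarser domain event.
        // Payloads are plain strings for brevity; real code would use
        // Avro/JSON serdes and proper domain types.
        orders.join(customers,
                        (order, customer) -> order + " | " + customer,
                        Joined.with(Serdes.String(), Serdes.String(), Serdes.String()))
              .to("domain.order-placed",
                        Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-domain-events");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        new KafkaStreams(builder.build(), props).start();
    }
}
```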
  15. Event store offloading patterns: one aggregate per topic. All changes to tables that are part of the same aggregate are mapped to the same topic, and the identifier of the aggregate is used to partition the topic. This makes it easier to create domain events from technical events while preserving transactional coherence, even with complex aggregates or unpredictable transactional patterns.
  16. Event store offloading patterns: transactional outbox. The legacy system is modified so that it inserts messages/events into an outbox table as part of the local transaction. The modification can be performed at code level or at database level (e.g. triggers or materialized views). The connector that offloads data to the streaming platform is driven by the outbox table. A minimal sketch of the code-level variant follows.
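A minimal JDBC sketch of the code-level variant of the outbox pattern; the orders and outbox tables and their columns are hypothetical:

```java
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class OrderService {
    private final DataSource dataSource;

    public OrderService(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Business write and event insert share one local transaction,
    // so the outbox row exists if and only if the order does.
    public void placeOrder(String orderId, String payloadJson) throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            try (PreparedStatement order = conn.prepareStatement(
                         "INSERT INTO orders (id, payload) VALUES (?, ?)");
                 PreparedStatement outbox = conn.prepareStatement(
                         "INSERT INTO outbox (aggregate_id, event_type, payload) VALUES (?, ?, ?)")) {
                order.setString(1, orderId);
                order.setString(2, payloadJson);
                order.executeUpdate();

                outbox.setString(1, orderId);
                outbox.setString(2, "OrderPlaced");
                outbox.setString(3, payloadJson);
                outbox.executeUpdate();

                conn.commit(); // the CDC connector picks up the outbox insert from here
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }
}
```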
  17. Event store offloading patterns: triggered publisher. All changes to tables that are part of the same aggregate are mapped to the same topic as technical events that carry only the aggregate id and the transaction id as payload. For every transaction id, a stream processor queries the legacy database, extracts the modified aggregate by filtering on its id, and publishes it as the payload of a new domain event. To reduce the workload on the legacy system, the stream processor can query a read replica. Transactional coherence within the aggregate is guaranteed by the upstream database. A sketch follows.
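A sketch of a triggered publisher using plain Kafka clients, assuming hypothetical topic names and a hypothetical consolidated order_view on a Postgres read replica (the JDBC driver is expected on the classpath):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class TriggeredPublisher {
    public static void main(String[] args) throws Exception {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "triggered-publisher");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
             // Query the read replica, not the primary, to keep load off the legacy system.
             Connection replica = DriverManager.getConnection(
                     "jdbc:postgresql://legacy-replica.internal:5432/erp", "reader", "secret")) {

            consumer.subscribe(List.of("legacy.erp.order-changed"));
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofMillis(500))) {
                    String aggregateId = rec.key(); // the technical event carries only ids
                    try (PreparedStatement ps = replica.prepareStatement(
                            "SELECT payload FROM order_view WHERE order_id = ?")) {
                        ps.setString(1, aggregateId);
                        try (ResultSet rs = ps.executeQuery()) {
                            if (rs.next()) {
                                // Publish the full, transactionally consistent aggregate.
                                producer.send(new ProducerRecord<>(
                                        "domain.orders", aggregateId, rs.getString("payload")));
                            }
                        }
                    }
                }
            }
        }
    }
}
```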
  18. High-performance data store options, with pros and cons: ksqlDB.
      Pros: does not require external components; low latency; can handle very high throughput; moving from events to state is simple and requires a small integration effort; stored data can also be consumed directly by stream processors.
      Cons: not SQL compliant; serving external consumers has some limitations that must be managed directly by the consumers; not a good fit for complex analytical workloads; TCO may not be optimal for huge data volumes.
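As an illustration of the "from events to state" point above, this sketch uses the ksqlDB Java client to materialize the latest state of each entity from an event stream; the server address, the CUSTOMER_EVENTS stream and its columns are hypothetical:

```java
import io.confluent.ksql.api.client.Client;
import io.confluent.ksql.api.client.ClientOptions;

public class KsqlMaterializer {
    public static void main(String[] args) throws Exception {
        // Connect to a ksqlDB server (host and port are assumptions).
        ClientOptions options = ClientOptions.create()
                .setHost("localhost")
                .setPort(8088);
        Client client = Client.create(options);

        // Materialize the latest state of each customer from a stream of
        // domain events; CUSTOMER_EVENTS and its columns are hypothetical.
        client.executeStatement(
                "CREATE TABLE CUSTOMER_STATE AS "
              + "SELECT ID, "
              + "       LATEST_BY_OFFSET(NAME) AS NAME, "
              + "       LATEST_BY_OFFSET(EMAIL) AS EMAIL "
              + "FROM CUSTOMER_EVENTS "
              + "GROUP BY ID;").get();

        client.close();
    }
}
```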
  19. High-performance data store options, with pros and cons: document DB.
      Pros: does not require format transformations along the whole flow from the streaming platform to the services; largely used by service developers and probably already present in the architecture; a good fit to expose a single read view of domain entities consolidated from different sources; schema changes are quite easy to handle.
      Cons: not SQL compliant; not a good fit for complex analytical workloads; not a good fit to expose business entities whose access patterns from services are not predictable; can have some performance issues at very high throughput.
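A sketch of the document DB option as a consolidated read view, using the MongoDB Java driver to upsert one document per domain entity; the database, collection and the hard-coded event are hypothetical stand-ins for a real Kafka consumer loop:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.ReplaceOptions;
import org.bson.Document;

public class CustomerViewSink {
    public static void main(String[] args) {
        try (MongoClient mongo = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> view =
                    mongo.getDatabase("dih").getCollection("customer_view");

            // In a real sink this would run inside a Kafka consumer loop;
            // here a single hard-coded event stands in for the stream.
            String customerId = "42";
            Document event = Document.parse(
                    "{\"name\": \"Jane Doe\", \"email\": \"jane@example.com\"}");

            // Upsert: the document DB keeps exactly one consolidated
            // read view per domain entity, keyed by its identifier.
            view.replaceOne(
                    Filters.eq("_id", customerId),
                    event.append("_id", customerId),
                    new ReplaceOptions().upsert(true));
        }
    }
}
```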
  20. High-performance data store options, with pros and cons: HTAP DB.
      Pros: SQL compliant (some of them, not all); can handle very high throughput; can handle complex analytical queries; a good fit to expose read views of domain events and business events alike; TCO can be optimized by selecting the right strategy for distributing stored data between RAM and disk.
      Cons: requires format transformations, from document to relational when moving data in from the streaming platform and back to document when serving the services; schema changes performed upstream must be actively managed.
  21. High-performance data store options, with pros and cons: in-memory DB.
      Pros: can handle very high throughput.
      Cons: not SQL compliant (in most cases, not all); not a good fit for complex analytical workloads; can require format transformations when data is read from the streaming platform and again when it is consumed by services; TCO may not be optimal for huge data volumes.
  22. Closing the loop with CQRS: from the services back to the legacy systems. Reads are served by the high-performance data store, fed through technical events (speed and fidelity), domain events (trusted views) and business events (ease of consumption). Writes flow from the micro/mini services back to the legacy system as command events pushed onto the streaming platform. A sketch of the write side follows.
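A minimal sketch of the write side of the loop: the service publishes a command event instead of writing to the legacy database. The topic name and payload are hypothetical, and a legacy-side adapter (not shown) would consume and apply the command:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class ChangeAddressCommandPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The service never writes to the legacy database directly:
            // it publishes a command event that a legacy-side adapter
            // consumes and applies, closing the CQRS loop.
            producer.send(new ProducerRecord<>(
                    "commands.customer.change-address",
                    "customer-42",
                    "{\"street\": \"Via Roma 1\", \"city\": \"Milano\"}"));
        }
    }
}
```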
  23. The legacy modernization journey: offloading, isolation and refactoring. (1) Legacy offloading: the digital integration hub offloads data from the legacy system and serves the applications. (2) Legacy isolation: an anti-corruption layer and a bubble context isolate the legacy system behind the hub. (3) Legacy refactoring: the legacy system is progressively refactored within the bubble context while the hub keeps serving the applications.
  24. Takeaways. A digital integration hub can be seen as a way of decoupling systems using data as an anti-corruption layer. Data offloaded into the integration platform becomes a first-class citizen of the new data-centric architecture.
      Benefits:
      ○ Responsive user experience
      ○ Offloads legacy systems from expensive workloads generated by front-end services
      ○ Supports legacy refactoring
      ○ Aligns services to business domains
      ○ Enables real-time analytics
      ○ Fosters a data-centric approach to integration
      Challenges:
      ○ Adapting the conceptual architecture to your specific context
      ○ Assembling different technology components, possibly from different vendors
      ○ Operating a complex, distributed and loosely coupled architecture
      ○ Supporting bidirectional synchronization
      ○ Designing the domain data models for the business entities
      ○ Developing services that can tolerate eventual consistency
      ○ Managing organizational politics related to data ownership
  25. Questions? Feel free to ask: andrea.gioia@quantyca.it
