MODERNISING THE DIGITAL INTEGRATION HUB
Dan Toomey
@daniel2me
Who am I?
• Senior Integration Specialist, Deloitte
• Microsoft Azure MVP
• MCSE, MCT, MCPD, MCTS BizTalk & Azure
• Pluralsight Author
• www.mindovermessaging.com
• @daniel2me
AGENDA
• APIs, Data, & Integration
• Data Platform Concepts
• What is a Digital Integration Hub?
• Building a Digital Integration Hub with Microsoft Cloud
• Case Studies
5
ACKNOWLEDGEMENTS
Alok Mishra – Principal, Cloud & Engineering
Gavin Carmont – Senior Manager, AI & Data
Michelle Brkic – Senior Consultant, AI & Data
Deloitte Australia - Engineering, AI & Data
6
APIs
MAKING THE DIGITAL WORLD GO AROUND
7
8
"APIs MAKE DIGITAL SOCIETY
AND DIGITAL BUSINESS
WORK; THEY ARE THE BASIS
OF EVERY DIGITAL STRATEGY."
"The 10 Things CIOs Need to Know About APIs and the API Economy" –
Gartner (G00318859, Jan 2017)
9
"APIs MAKE DIGITAL SOCIETY
AND DIGITAL BUSINESS WORK;
THEY ARE THE BASIS OF EVERY
DIGITAL AND AI STRATEGY."
"The 10 Things CIOs Need to Know About APIs and the API Economy" –
Gartner (G00318859, Jan 2017)
10
APIs:
• Simplify Integration
• Enable Scalability
• Facilitate Development
• Enhance Security
• Increase Flexibility
• Promote Innovation
BENEFITS OF APIS
• Better integration
• More scalable solutions
• Streamlined development
• Controlled access to data & services
• Support for diverse platforms &
devices
• Create new applications or services
DATA
THE HEART OF THE MATTER
11
"DATA IS THE LIFEBLOOD OF YOUR BUSINESS"
12
"Data is the Lifeblood of Your Business – That's Why You Need a Data Strategy at its Heart" – The Drum (Feb 2020)
13
"By 2021, enterprises using a cohesive strategy incorporating data hubs, lakes and warehouses will support 30% more use cases than competitors."
"Data Hubs, Data Lakes & Data Warehouses: How They Are Different and Why They Are Better Together" – Gartner (G00465401, Feb 2020)
QUESTION: Did this turn out to be true?
14
15
16
Data Lake • Data Warehouse • Data Hub
17
DATA HUB
Key capabilities:
• Data management
• Data ingestion
• Data authorization
Data Warehouse • Data Lake
19
Data Warehouse
Characteristics:
• Well-defined, structured data
• Consistent semantics
• Fixed processing strategy (e.g. SQL)
• ETL
Use Cases:
• Business Intelligence (BI)
• High performance requirements
• Concurrent access

Data Lake
Characteristics:
• Raw, unrefined data
• Highly variable semantics
• Diverse sources
• ELT
Use Cases:
• Data preparation
• Exploratory analysis
• Data science activities (AI / ML training)
20
WHICH TO CHOOSE?
Data Hub • Data Warehouse • Data Lake
21
USE THEM ALL!!
Data Hub • Data Warehouse • Data Lake
DATA LAKEHOUSE
• Hybrid architecture
• Combines benefits of Data Lakes and Data Warehouses
• Scalability
• Schemas (read & write)
• Fast query performance
• Balanced storage complexity & cost
• Examples: Databricks, Snowflake, Apache Hudi
22
https://solutionsreview.com/data-management/gartner-da-summit-2023-the-gartner-view-of-the-data-lakehouse/
23
DATA LAKEHOUSE
Image © 2023 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates.
24
MEDALLION LAKE ARCHITECTURE
https://learn.microsoft.com/en-us/azure/databricks/lakehouse/medallion
25
https://learn.microsoft.com/en-us/azure/databricks/lakehouse/medallion
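The linked Databricks documentation describes the bronze/silver/gold flow in terms of Spark tables; the sketch below compresses the same idea into plain Python with made-up records, purely to show what each layer is responsible for: bronze ingests raw, silver validates and deduplicates, gold aggregates for the business.

```python
# Minimal medallion-flow sketch in plain Python (a stand-in for the
# Spark/Delta pipelines the linked docs describe). Data is illustrative.

bronze = [  # bronze layer: raw ingest, no cleanup or validation
    {"customer": "alice", "spend": "120.50"},
    {"customer": "alice", "spend": "120.50"},   # duplicate
    {"customer": "bob",   "spend": None},        # invalid record
]

def to_silver(rows):
    """Validate, drop nulls, and deduplicate (the silver layer's job)."""
    seen, silver = set(), []
    for r in rows:
        if r["spend"] is None:
            continue                      # quarantine invalid records
        key = (r["customer"], r["spend"])
        if key not in seen:               # deduplicate
            seen.add(key)
            silver.append({"customer": r["customer"], "spend": float(r["spend"])})
    return silver

def to_gold(rows):
    """Aggregate into business-ready tables, e.g. spend per customer."""
    totals = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["spend"]
    return totals

print(to_gold(to_silver(bronze)))   # {'alice': 120.5}
```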
INTEGRATION
THE PLUMBING YOU CAN'T LIVE WITHOUT
26
27
28
29
INTEGRATION HUB
Key capabilities:
• Real-time command and queries
• Bulk data publish and subscribe
• Events publish/subscribe for internal and external systems (sketched below)
• API management
• Message transformation
• Orchestration
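To make the publish/subscribe capability concrete, here is a minimal in-memory sketch. The topic name and handlers are illustrative, and a real hub would of course use a durable broker (a message queue or event streaming service) rather than a Python dict.

```python
# Minimal in-memory publish/subscribe sketch of the hub's event
# capability. Topic and handler names are illustrative only.
from collections import defaultdict

class EventHub:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # fan out to internal and external consumers alike
        for handler in self._subscribers[topic]:
            handler(event)

hub = EventHub()
hub.subscribe("order.created", lambda e: print("billing saw", e))
hub.subscribe("order.created", lambda e: print("analytics saw", e))
hub.publish("order.created", {"order_id": 42, "amount": 99.0})
```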
30
"Integration is fundamental to executing a data hub strategy. Data and analytics leaders must determine the role of both data integration and application integration technology for connecting endpoint systems to their data hubs."
"Infuse your data hub strategy with Data & Application Integration" – Gartner (G00343327, Dec 2017)
31
"We are seeing integration teams work closer with data teams to enable them to accelerate the value delivered through their data platform, such as a Data Hub, through really good integration platform services built on iPaaS."
"Reviving the Data Hub Strategy: Why Integration matters in Data Management" – Alok Mishra, Deloitte, Jan 2023
32
BRINGING IT ALL TOGETHER: Data • Integration • APIs
Integration provides the link, but…
• APIs are still coupled to the data sources (even if indirectly)
• Availability of backend systems is a dependency
• Complexity & cost of APIs designed to aggregate data sources
33
34
BRINGING IT ALL TOGETHER: Data • Integration • APIs
35
BRINGING IT ALL TOGETHER: Data • Integration • APIs
DIGITAL INTEGRATION HUB
"An advanced application
architecture that
aggregates multiple back-
end system of record data
sources into a low-latency
and scale-out, high-
performance data store."
"Turbocharge Your API Platform with a Digital Integration Hub – Gartner (G00360082, Jul 2018)
Digital Integration Hub
36
• Consolidates and aggregates data from multiple sources
• Transforms the data into non-proprietary semantics, using entity views (see the sketch below)
• Allows for the access and manipulation of data without impacting the core business systems
• Provides advanced search capabilities
• Analytical systems & AI
• Potential components: Data Warehouse, Data Lake, Data Lakehouse, Master Data Management (MDM)
The heart of the DIH
HIGH-PERFORMANCE DATA STORE
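The "entity views" bullet is the key idea: the store holds a canonical shape that no single source system owns. A minimal sketch, with hypothetical CRM/ERP field names, of folding two proprietary records into one non-proprietary entity:

```python
# Sketch: consolidating two proprietary source records into one
# canonical "Customer" entity view. All field names are hypothetical.
from dataclasses import dataclass

crm_record = {"CUST_NO": "C-17", "NM": "Ada Lovelace", "PH": "555-0100"}
erp_record = {"customer_key": "C-17", "credit_limit_cents": 500000}

@dataclass
class CustomerView:             # non-proprietary semantics for consumers
    customer_id: str
    name: str
    phone: str
    credit_limit: float

def build_view(crm, erp):
    # translate each system's proprietary shape into the shared entity
    return CustomerView(
        customer_id=crm["CUST_NO"],
        name=crm["NM"],
        phone=crm["PH"],
        credit_limit=erp["credit_limit_cents"] / 100,
    )

print(build_view(crm_record, erp_record))
```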
37
• Captures data at the source (preferably as it happens)
• Accommodates multiple integration patterns & styles:
• Event brokering / messaging
• Extract, Transform & Load (ETL)
• Change Data Capture (CDC) – see the sketch below
• Integration platforms (ESB, iPaaS)
• Stream processing (Spark, Flink)
Gathering your data
EVENT-BASED INTEGRATION LAYER
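Of the capture styles listed, Change Data Capture is perhaps the least obvious, so here is a deliberately simplified sketch. Real CDC tools read the database transaction log; this stand-in polls a version column instead, using only the Python standard library, and the table is illustrative.

```python
# Minimal Change Data Capture sketch using polling over a version
# column (real CDC tools read the transaction log instead). Standard
# library only; table and column names are illustrative.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
db.execute("INSERT INTO patients VALUES (1, 'Ada', 1), (2, 'Grace', 1)")

last_seen = 0

def poll_changes():
    """Emit every row changed since the last poll, then advance the cursor."""
    global last_seen
    rows = db.execute(
        "SELECT id, name, version FROM patients WHERE version > ?", (last_seen,)
    ).fetchall()
    if rows:
        last_seen = max(r[2] for r in rows)
    return rows

print(poll_changes())   # initial load: both rows
db.execute("UPDATE patients SET name = 'Ada L.', version = 2 WHERE id = 1")
print(poll_changes())   # only the changed row is captured
```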
38
• Microservices access the data at the entity level
• Can be hierarchical and inter-connected
• Generally do not access the Systems of Record directly (see the sketch below)
• API Gateway provides the management and access layer:
• Discovery
• Security
• Abstraction
• Monetization
• Analytics
Surfacing/exposing your data
FRONT-END API SERVICES
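A minimal sketch of the pattern these bullets describe, using only the Python standard library: the handler answers entirely from the hub's replicated store, so channel traffic never reaches the system of record. The route and sample data are hypothetical, and a production front end would sit behind the API gateway rather than serve requests directly.

```python
# Minimal front-end API sketch: the handler reads from the hub's
# replicated store, never from the systems of record. Standard library
# only; the route and sample data are hypothetical.
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

HIGH_PERF_STORE = {"C-17": {"name": "Ada Lovelace", "status": "active"}}

class CustomerAPI(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /customers/C-17 -> served from the replica, so the
        # system of record is shielded from channel traffic
        customer_id = self.path.rsplit("/", 1)[-1]
        entity = HIGH_PERF_STORE.get(customer_id)
        self.send_response(200 if entity else 404)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(entity or {"error": "not found"}).encode())

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), CustomerAPI).serve_forever()
```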
39
Digital Integration Hub is a specific type of Data Integration Hub.

Digital Integration Hub:
• Focuses on real-time data access & application integration
• Supports transactional & operational systems
• Primarily real-time processing with low latency

Data Integration Hub:
• Covers broad data management & integration capabilities
• Supports a wide range of purposes, including analytics, BI, and data warehousing
• Real-time and batch processing
BENEFITS
• Responsive user experience
• "Defending" systems of record
• 24/7 support
• Decoupling the front-end layer from the system of record applications
• Supporting legacy systems replacement
• Normalising the APIs for a certain application domain
• Providing real-time business insight
CHALLENGES
• Complexity of rolling out a high-performance data management technology (e.g. NoSQL DBMS or in-memory data grid)
• Designing a canonical data model for the DIH business entities that supports multiple channels
• Supporting bidirectional, event-driven synchronization between the high-performance data store and system of record applications (see the sketch after this list)
• Implementing appropriate metadata management to support discovery and introspection of data entities and relationships across multiple data sources
• Designing, building, and managing the complex distributed architecture of a Digital Integration Hub
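To illustrate why the bidirectional-synchronization challenge is hard, here is the simplest possible reconciliation policy, last-writer-wins on timestamps, sketched in Python with illustrative records. Real implementations need per-entity conflict policies, ordering guarantees, and idempotent replays; none of that is shown here.

```python
# Sketch of the bidirectional-sync challenge: reconciling the hub's
# store with a system of record via last-writer-wins on timestamps.
# Everything here is illustrative.

def sync(hub_row, sor_row):
    """Return (winner, direction) for one entity held on both sides."""
    if hub_row["updated_at"] > sor_row["updated_at"]:
        return hub_row, "hub -> system of record"
    return sor_row, "system of record -> hub"

hub = {"id": "C-17", "phone": "555-0199", "updated_at": 1700000200}
sor = {"id": "C-17", "phone": "555-0100", "updated_at": 1700000100}

winner, direction = sync(hub, sor)
print(direction, winner["phone"])   # hub -> system of record 555-0199
```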
43
RECOMMENDATIONS
• Determine if your organisation needs a DIH and if it has the skills to support it
• Know your data and consumer requirements
• Understand the integration patterns that will be required
• Identify appropriate metadata and ensure you employ a suitable catalog and search service
• Consider using a technology partner for implementation
44
BUILDING A DIGITAL INTEGRATION HUB WITH MICROSOFT CLOUD
45
46
Azure PaaS reference architecture (diagram; layers and services as shown):
• API Gateway: API Management
• Microservices: Function Apps, Container Apps, AKS, Service Fabric, Web Apps, Logic Apps
• High-Performance Data Store: PostgreSQL, Cosmos DB, SQL DB, Data Lake, Data WH
• Integration Layer: Event Hubs, Data Factory, Function Apps, Logic Apps, Stream Analytics, Event Grid, Service Bus
• Analytics: ML, Synapse Analytics, Databricks, Data Explorer, AI Search, Purview
• Systems of Record / Applications & Data
47
AZURE DIH ACCELERATOR
• Template on GitHub
• Jump-starts development by setting up the engineering system for your application
• Allows you to concentrate on the business logic
• Consists of:
• A pre-configured development environment
• An application
• Build and deployment automation
https://github.com/Azure-Samples/digital-integration-hub
48
AZURE DIH ACCELERATOR
https://learn.microsoft.com/en-us/fabric/fundamentals/microsoft-fabric-overview 49
MICROSOFT FABRIC – ALL IN ONE DATA PLATFORM
SaaS offering includes:
• Single unified platform for the whole organisation
• End-to-end integrated analytics
• Unified data lakehouse storage preserving data in its original location
• Consistent, user-friendly experiences
• AI-enhanced stack to accelerate the data journey
• Centralized administration and governance
MICROSOFT FABRIC – ALL IN ONE DATA PLATFORM
Component: Description

Data Factory: Ingest, prepare, and transform data from a rich set of data sources. Use Power Query and more than 200 native connectors to connect to data sources on-premises and in the cloud.

Data Engineering: Spark platform to create, manage, and optimize infrastructures for collecting, storing, processing, and analyzing vast data volumes. Integration with Data Factory allows scheduling and orchestrating notebooks and Spark jobs.

Data Science: Build, deploy, and operationalize machine learning models from Fabric. Integrates with Azure Machine Learning to provide built-in experiment tracking and model registry. Data scientists can enrich organizational data with predictions, and business analysts can integrate those predictions into their BI reports, allowing a shift from descriptive to predictive insights.

Data Warehousing: Provides industry-leading SQL performance and scale. Separates compute from storage, enabling independent scaling of both components. Natively stores data in the open Delta Lake format.

Real-Time Intelligence: End-to-end solution for event-driven scenarios, streaming data, and data logs. Enables extraction of insights, visualization, and action on data in motion by handling data ingestion, transformation, storage, modeling, analytics, visualization, tracking, AI, and real-time actions. Real-Time hub provides a wide variety of no-code connectors, converging into a catalog of organizational data that is protected, governed, and integrated across Fabric.

Power BI: Easily connect to data sources for visualisation and discovery.

Databases: Using the mirroring capability, bring data from various systems together into OneLake. Continuously replicate existing data estate directly into Fabric's OneLake, including data from Azure SQL Database, Azure Cosmos DB, Azure Databricks, Snowflake, and Fabric SQL database.

OneLake: The data lake that is the foundation for all Fabric workloads. OneLake is the single store for all organizational data, preventing data silos by offering one unified storage system that makes data discovery, sharing, and consistent policy enforcement easy.
https://learn.microsoft.com/en-us/fabric/fundamentals/microsoft-fabric-overview
https://learn.microsoft.com/en-us/fabric/fundamentals/microsoft-fabric-overview 51
ONELAKE INGESTS DATA FROM MULTIPLE SOURCES
• Every developer and business unit can have their own workspace
• Can ingest data into lakehouses and start processing, analysing and collaborating on the data
• Shortcuts feature allows import from various sources (including other clouds) without moving the data
52
https://learn.microsoft.com/en-us/azure/architecture/industries/automotive/automotive-telemetry-analytics 53
https://learn.microsoft.com/en-us/azure/architecture/industries/automotive/automotive-telemetry-analytics 54
55
(Repeats the Azure PaaS reference architecture from slide 46.)
56
Azure + Microsoft Fabric reference architecture (diagram; layers and services as shown):
• API Gateway: API Management
• Microservices: Function Apps, Container Apps, AKS, Service Fabric, Web Apps, Logic Apps
• High-Performance Data Store & Analytics: Microsoft Fabric (OneLake, Data Factory, Data Engineering, Data Warehouse, Data Science, Real-Time Intelligence, Power BI, Databases)
• Integration Layer: Event Hubs, Function Apps, Logic Apps, Stream Analytics, Event Grid, Service Bus
• Systems of Record / Applications & Data
57
PaaS vs SaaS
Source: https://learn.microsoft.com/en-us/fabric/real-time-intelligence/real-time-intelligence-compare
REAL WORLD IMPLEMENTATIONS
HOW DELOITTE HAS LEVERAGED DIGITAL INTEGRATION HUBS WITH OUR CLIENTS
58
GOALS:
• Establish an integration platform
• Integrate core clinical systems using HL7 while enabling integration of modern digital consumers over FHIR APIs
• Enable the interoperability of internal and external client systems hosted in a fully controlled on-premises infrastructure or in a secure public cloud
Pathology Services Provider
CASE STUDY #1
59
Pathology Services Provider
CASE STUDY #1 – LOGICAL ARCHITECTURE
(Diagram: FHIR APIs and Integration Services sit at the centre, exchanging Patients, Encounters, Orders, Results, Charge Items, Practitioners, Notifications, Billing Records and a Specimen Catalogue between clinical and billing systems (LIS, PAS, EPR, Billing Engine, Billing System) on one side and consumers (Patient App, Shipping Manifest App, Billing Mgmt App, 3rd Party Portal, National Infrastructure, IoT hubs, misc apps) on the other. Supporting tooling: Confluence for knowledge mgmt., Jira for task mgmt. and tracking, Bamboo for CI/CD, Azure AD and Symantec VIP for authentication.)
BENEFITS REALISED:
• New integration platform connected HL7 core systems to modern FHIR consumers
• Accelerated delivery of new initiatives, e.g. COVID SMS notification system in 2 weeks
• Established a library of API-led microservices to build new integrations and applications
Pathology Services Provider
CASE STUDY #1
62
GOALS:
• Establish a digital core to enable:
• seamless interoperability of key systems
• integration of datasets
• modern analytic capabilities
• Modernize technology foundations, processes, governance, and data & AI strategy
Major Energy Supplier
CASE STUDY #2
63
SOLUTION:
• Implemented a Digital Services Platform (DSP) using Microsoft Azure iPaaS services and Databricks, including:
• Automated Data Ingestion and Advanced Data Management
• Analytics workbench
• Data Governance using Unity Catalog
Major Energy Supplier
CASE STUDY #2
64
Major Energy Supplier
CASE STUDY #2 – LOGICAL ARCHITECTURE
65
Major Energy Supplier
CASE STUDY #2 – LOGICAL ARCHITECTURE
66
Major Energy Supplier
CASE STUDY #2 – LOGICAL ARCHITECTURE
67
SUMMARY
• A Digital Integration Hub enhances performance from APIs while protecting your backend systems
• It enables increased scalability, greater flexibility, and better insights from your organizational data, especially with a Data Lakehouse
• Microsoft presents multiple options for building out your high-performance data store, including both PaaS and SaaS offerings
70
REFERENCES
"Turbocharge Your API Platform with a Digital Integration Hub – Gartner (G00360082, Jul 2018)
"The 10 Things CIOs Need to Know About APIs and the API Economy" – Gartner (G00318859, Jan 2017)
"Data is the Lifeblood of Your Business – That's Why You Need a Data Strategy at its Heart" – The Drum (Feb 2020)
"Data Hubs, Data Lakes & Data Warehouses: How They Are Different and Why They Are Better Together" –
Gartner (G00465401, Feb 2020)
“Data Lake, Lakehouse, Warehouse: How to Choose?” – Gartner (March 2025)
"Five Ways to Make the Most of Your Data" – Forbes (Jul 2021)
"How to Justify Strategic Investments in Integration Technology" – Gartner (G00385596, May 2019 )
"Infuse your data hub strategy with Data & Application Integration" – Gartner (G00343327, Dec 2017)
https://www.integrationdownunder.com/
Bill Chesnut • Wagner Silveira • Mick Badran • Martin Abbott • Dan Toomey
STICKERS!!
LET’S CONNECT!
dtoomey@deloitte.com.au
@daniel2me
linkedin.com/in/danieltoomey
mindovermessaging.com

Editor's Notes

  • #7 APIs make the digital world go round
• #8 APIs can solve many problems: enable systems & applications to interact; disparate data sources & access protocols; disparate consumer channels. But there are challenges: security; performance (consumer & system impact); avoiding numerous point-to-point solutions.
• #9 APIs are essential for AI to work
• #10
Interoperability: APIs facilitate seamless communication between disparate systems, allowing them to share data and functionality regardless of underlying technologies, which enhances system compatibility.
Scalability: By enabling modular system design, APIs allow developers to build scalable applications where different components can be updated or expanded independently without affecting the entire system.
Efficiency: APIs streamline development by allowing developers to leverage existing functionalities rather than building from scratch, saving time and resources. This reuse of code leads to faster deployment and innovation.
Security: APIs provide controlled access to data and services through authentication and authorization mechanisms, ensuring secure interactions between systems and protecting sensitive information.
Flexibility: They support diverse platforms and devices, enabling the development of cross-platform applications. This flexibility is crucial for reaching a broader audience and adapting to new technological trends.
Innovation: APIs empower developers to create new applications or services by integrating with third-party solutions, fostering innovation and opening up new business opportunities through enhanced capabilities.
  • #11 APIs are only as helpful as the data that they can access
• #12 APIs are only as good as the data they can access. Large organisations rely on multiple systems of record, and information needs to be shared on multiple channels: other internal systems, partner systems, client systems. Strategy is important.
• #13 What do Gartner & Forrester have to say about data?
  • #16 Not mentioned: Data mesh
• #17 It is a place where data is ingested, curated, managed and served. In contrast to the traditional paradigm of serving data from the source systems, it enables the seamless flow and governance of data, connecting producers and consumers. NOT SO MUCH ABOUT STORING DATA LONG TERM, AS FACILITATING ITS FLOW.
Centralised Data Access: A data hub provides a unified platform for accessing diverse data sources, ensuring consistent and streamlined access to information across the organisation.
Data Integration: It facilitates seamless integration of structured and unstructured data from different systems, allowing for comprehensive data analysis and decision-making.
Data Governance: With robust governance features, a data hub ensures data quality, consistency, and compliance with regulatory requirements, providing a trusted data environment.
Scalability: A data hub can handle large volumes of data, supporting organisational growth and the increasing demand for data-driven insights without performance degradation.
Real-Time Data Processing: It supports real-time data ingestion and processing, enabling timely analysis and response to changing business conditions.
Interoperability: By offering interoperability with various data formats and systems, a data hub enhances collaboration and data sharing across departments and external partners.
Security and Privacy: It includes advanced security measures to protect sensitive data, ensuring privacy and safeguarding against breaches.
• #18 Data warehouses and data lakes are structures supporting analytic workloads. They are repositories. Data hubs are different — their main focus is enabling data sharing and governance.
• #19 Same notes as #18.
• #21 Combined benefits of: Operational Data Store; broad analytical capabilities with AI/ML.
• #22 Best for enterprises needing both AI/ML capabilities and structured analytics. Examples: Databricks, Snowflake, and Apache Hudi.
• #24
Ingest raw data in the bronze layer: minimal data cleanup and validation (variants, string, binary).
Validate and deduplicate data in the silver layer: build from the bronze (or silver) tables; enforce data quality; start modelling data.
Power analytics with the gold layer: create aggregated data tailored for analytics and reporting; align with business logic and requirements; optimise for performance in queries and dashboards.
Bronze layer (ops.bronze): ingests raw data from cloud storage, Kafka, and Salesforce. No data cleanup or validation is performed here.
Silver layer (ops.silver): data cleanup and validation are performed in this layer. Data about customers and transactions is cleaned by dropping nulls and quarantining invalid records. These datasets are joined into a new dataset called customer_transactions. Data scientists can use this dataset for predictive analytics. Similarly, accounts and opportunity datasets from Salesforce are joined to create account_opportunities, which is enhanced with account information. The leads_raw data is cleaned in a dataset called leads_cleaned.
Gold layer (ops.gold): this layer is designed for business users and contains fewer datasets than silver and bronze. customer_spending: average and total spend for each customer. account_performance: daily performance for each account. sales_pipeline_summary: information about the end-to-end sales pipeline. business_summary: highly aggregated information for the executive staff.
• #25 Same notes as #24.
  • #27 A Data Hub can be part of an integration hub
  • #28 Integration Hub has broader responsibilities
• #33
High cost: a huge number of API calls potentially implies a massive, often low-value workload which ultimately hits the system of record applications and the integration layer.
Complexity: the integration macroservices could be extremely complex to develop, deploy, run and manage, especially if the back-end data is highly fragmented across multiple applications and data sources.
Availability: dependency on backend systems being available.
Tight coupling: changes in the back-end systems imply re-engineering of the integration macroservices or their complete redesign, for example, if a back-end system is replaced with a different solution.
  • #35 Explain how a DIH can act like a “transmission” to decouple the gears.
  • #36 "By storing an aggregated replica of the system of record data needed by the channel applications, the DIH protects the former from excessive workloads while optimizing the data access latency and responsiveness for the latter.” ODS – Operational Data Store
• #37 Metadata management – enables metadata-driven development & introspection, e.g. to support domain-driven design. Discover, capture, & synchronise metadata models. Analytics – technically not part of the DIH, but often a driver for its implementation.
  • #38 ETL – preferred for initial loads or periodic refreshes
• #39 Some exceptions to direct access: CQRS (Command Query Responsibility Segregation) pattern; updates to SoR. Would typically use data virtualization, federated views.
• #42
Responsive user experience: while enabling users access to a consolidated, yet real-time, view of data scattered across multiple system of record back-ends.
"Defending" systems of record: from the potentially excessive, often low-value, workloads generated by the channels.
24/7 support: always-on access to the front-end API services, even in situations where the back-end systems must be put offline for maintenance, upgrade or other reasons.
Decoupling the front-end layer from the system of record applications.
Supporting legacy system of record applications modernization.
Normalizing the APIs for a certain application domain: so that with a single set of APIs the channel applications can access data held in multiple systems of record.
Providing real-time business insight: gathering data on user behavior or offering them additional services (including search, predictive analytics).
  • #44 DRIVER: low-latency, real-time, consolidated and accurate access to data is critical for business success.
• #46 Polyglot data stores; Azure PaaS solution.
  • #47 This repository contains a template of an application built for the Azure Application Platform. This template is built to make it easy to dive straight into implementing business logic without having to spend time on setting up an engineering system for your application. The templates give you a starting point, while providing the option to change and extend any of the pre-configured components to suit your needs.
• #48 The front-end processing layer is implemented by Azure API Management, providing discoverability and gateways for all your APIs. The microservices implementation leverages serverless functions which provide a simple CRUD-based API. The data layer is implemented using PostgreSQL. For analytics and a single-pane-of-glass view over your disparate data sources, you could optionally add Azure Synapse Analytics to extend this solution (not included in the sample). The integration layer uses Azure Event Grid for the event-driven programming model, to ensure the system responds to all API events in real time, and Logic Apps to react upon these events and process the incoming data. The implementation of the logic app is left blank, leaving you to leverage the 450+ connector ecosystem to implement the syncing functionality between the data layer and the backend systems of record you choose to integrate the Digital Integration Hub with. Monitoring is provided by the Application Insights and Log Analytics features of Azure Monitor. The application is comprised of an Azure Function API written in TypeScript, a PostgreSQL 'items' database, and an integration layer implemented by Logic Apps and Event Grid. It uses an Object-Relational Mapper (Sequelize) and implements a single object, items, to get you started.
  • #49 Fabric unifies data movement, data processing, ingestion, transformation, real-time event routing, and report building. It supports these capabilities with integrated services like Data Engineering, Data Factory, Data Science, Real-Time Intelligence, Data Warehouse, and Databases.
  • #50 Fabric unifies data movement, data processing, ingestion, transformation, real-time event routing, and report building. It supports these capabilities with integrated services like Data Engineering, Data Factory, Data Science, Real-Time Intelligence, Data Warehouse, and Databases. Pricing is by Capacity Units and storage
• #53 Eventhouse: handles real-time data streams efficiently, which lets organizations ingest, process, and analyze data in near real-time. These aspects make eventhouses useful for scenarios where timely insights are crucial. The guidance in this article is for telemetry scenarios and batch test-drive data ingestion scenarios. This architecture focuses on the data platform that processes diagnostic data and the connectors for data visualization and data reporting.
The data capture device is connected to the vehicle networks and collects high-resolution vehicle signal data and video. (1a) The device publishes real-time telemetry messages or (1b) requests the upload of recorded data files to the Azure Event Grid MQTT broker functionality by using an MQTT client. This functionality uses a Claim-Check pattern. (2a) Event Grid routes live vehicle signal data to an Azure Functions app. This app decodes the vehicle signals to the JavaScript Object Notation (JSON) format and posts them to an eventstream. (2b) Event Grid coordinates the file upload from the device client to the lakehouse. A completed file upload triggers a pipeline that decodes the data and writes the decoded file to OneLake in a format that's suitable for ingestion, such as Parquet or CSV. (3a) The eventstream routes the decoded JSON vehicle signals for ingestion in the Eventhouse. (3b) A data pipeline triggers the ingestion of decoded files from the lakehouse.
The Eventhouse uses update policies to enrich the data and to expand the JSON data into a suitable row format; for example, location data might be clustered to align with geospatial analytics. Every time a new row is ingested, the real-time analytics engine invokes an associated Update() function.
Data engineers and data scientists use Kusto Query Language (KQL) to build analytics use cases. Users store frequently used cases as shareable user-defined functions. The engineers use built-in KQL functions such as aggregation, time-series analysis, geospatial clustering, windowing, and machine learning plugins with Copilot support.
R&D engineers and data scientists use notebooks to analyze data and build test and validation use cases. R&D engineers use KQL query sets and Copilot for Real-Time Intelligence to perform interactive data analysis. Data engineers and data scientists use notebooks to store and share their analysis processes. With notebooks, engineers can use Azure Spark to run analytics and use Git to manage the notebook code. Users can take advantage of Copilot for Data Science and Data Engineering to support their workflow with contextual code suggestions.
R&D engineers and data scientists can use Power BI with dynamic queries or real-time analytics dashboards to create visualizations to share with business users. These visualizations invoke user-defined functions for ease of maintenance. Engineers can also connect more tools to Microsoft Fabric. For instance, they can connect Azure Managed Grafana to the Eventhouse or create a web application that queries the Eventhouse directly.
Data engineers and R&D engineers use Data Activator to create reflex items to monitor conditions and trigger actions, such as triggering Power Automate flows for business integration. For example, Data Activator can notify a Teams channel if the health of a device degrades.
The data collector configuration enables engineers to change the data collection policies of the data capture device. Azure API Management abstracts and secures the partner configuration API and provides observability.
• #54 Same notes as #53.
• #55 Polyglot data stores; Azure PaaS solution.
• #56 Polyglot data stores; Azure PaaS solution.
• #61 Heavy lifting was done by integration services – e.g. patient matching algorithm, 'ready to bill' rules implementation. Acronyms: PAS = Patient Administration System; LIS = Laboratory Information System; EPR = Electronic Patient Record. Technologies used: MuleSoft, Azure Service Bus, Azure Cosmos DB, Azure API Gateway, AKS.