The document discusses various solutions and blueprints for integrating data between an Oracle relational database management system (RDBMS) and the Apache Kafka streaming platform. It begins with an introduction to microservices architecture and the need for integrating traditional and modern applications. It then outlines five blueprints for moving data from an Oracle RDBMS to Kafka, including polling database tables/views, change data capture, polling APIs, producing directly to Kafka, and using queues. Finally, it briefly discusses blueprints for moving data from Kafka to an Oracle RDBMS.
3. BASEL | BERN | BRUGG | BUKAREST | DÜSSELDORF | FRANKFURT A.M. | FREIBURG I.BR. | GENF
HAMBURG | KOPENHAGEN | LAUSANNE | MANNHEIM | MÜNCHEN | STUTTGART | WIEN | ZÜRICH
Guido
Working at Trivadis for more than 22 years
Consultant, Trainer, Platform Architect for Java,
Oracle, SOA and Big Data / Fast Data
Oracle Groundbreaker Ambassador & Oracle ACE
Director
@gschmutz guidoschmutz.wordpress.com
175th
edition
5. Microservices / Modern Applications
• Highly decoupled
• Independently deployable
• Bounded Context/Aggregate (DDD)
• Responsible for their data
• Favour asynchronous, event-driven
interaction over synchronous
• Smart Endpoints and Dump Pipes
• Use Anti-Corruption Layer (ACL) if no
fit! M3M2
ACL
Event
Hub
M1
6. Microservices / Modern Applications
Integrate with Traditional System
M3M2
ACL
Event
Hub
M1
ACL
• Highly decoupled
• Independently deployable
• Bounded Context/Aggregate (DDD)
• Responsible for their data
• Favour asynchronous, event-driven
interaction over synchronous
• Smart Endpoints and Dump Pipes
• Use Anti-Corruption Layer (ACL) if no
fit!
Traditional
App
7. Apache Kafka – A Streaming Platform
Kafka Cluster
Consumer 1 Consume 2r
Broker 1 Broker 2 Broker 3
Zookeeper
Ensemble
ZK 1 ZK 2ZK 3
Schema
Registry
Service 1
Management
Control Center
Kafka Manager
KAdmin
Producer 1 Producer 2
kafkacat
Data Retention:
• Never
• Time (TTL) or Size-based
• Log-Compacted based
Producer3Producer3
ConsumerConsumer 3
8. Use Case
Customer Microservice
{ }
Customer API CustomerCustomer Logic
Order Processing System
{ }
Order API OrderOrder Logic
REST
REST
Event Hub
Customer
Mat View
Order
Customer
(compacted)
Notification Microservice
Notification Logic
“Modern Apps”Traditional Apps (Legacy)
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
9. MessageMessageMessageMessage
MessageMessage
Properties - Message
Message Message
A1 A2 A3
Message
B1 B2 B3 B4
A1 A2 A3 B
B1 B2 B3 B4
Table A
A1
A2
A3
Table B
B1
B2
B3
B4
FlatDB Model Aggregate
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
10. Properties - Latency
Traditional System Event
Hub
Data
Flow
RDBMS
latency
latency
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
11. Properties – Anti-Corruption Layer (ACL)
Traditional System Event
Hub
Data
Flow
RDBMS
Traditional System Event
Hub
Data
Flow
RDBMS
ACL
ACL
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
Examples:
• View Layer
• Storage Procedure
• JSON Support in DB
• …
Examples:
• StreamSets
• Kafka Connect
• Kafka Streams / KSQL
• …
Database Dataflow
12. Properties – License
• Cost Free
• either part of Oracle RDMBS license
• or part of Kafka license (if Confluent Enterprise is used)
• or a free software (mostly open source) component
• Commercial
• an additional component involving license costs
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
14. Blueprints Oracle RDBMS => Apache Kafka (DB-K)
Customer Microservice
{ }
Customer API CustomerCustomer Logic
Order Processing System
{ }
Order API OrderOrder Logic
REST
REST
Event Hub
Customer
Mat View
Order
(compacted)
Customer
(compacted)
Notification Microservice
Notification Logic
Schema
Registry
DB-K_1: Polling of RDBMS table/view
DB-K_2: Change Data Capture (CDC) on RDBMS
DB-K_3: Polling of RDBMS API
DB-K_4: Produce to Event Hub from RDBMS
DB-K_5: RDBMS Queue with bridge to Event Hub
15. DB-K_1: Polling of RDBMS table/view
Event
Hub
Stream Data
Integration
API
Applications / Data Sources
Data FlowRDBMS
Application
Logic
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
16. Event
Hub
Stream Data
Integration
API
Applications / Data Sources
Data FlowRDBMS
Application
Logic
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
DB-K_1: Polling of RDBMS table/view
Kafka Connect with JDBC Source Connector
JDBC Connector part of Confluent Open Source Platform
17. Kafka Connect & JDBC Connector
• Many connectors available
• Single Message Transforms (SMT)
• declarative style, simple data flows
• framework is part of Apache Kafka
https://www.confluent.io/hub
19. DB-K_2: Change Data Capture (CDC) on RDBMS
Stream Data
Integration & Analytics
Stream
Analytics
Event
Hub
Stream Data
Integration
API
Data Flow
Application / Data Sources
Data Flow
Application
Logic
RDBMS
Redo Log
REST to
Event Hub
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
20. Stream Data
Integration & Analytics
Stream
Analytics
Event
Hub
Stream Data
Integration
API
Data Flow
Application / Data Sources
Data Flow
Application
Logic
RDBMS
Redo Log
REST to
Event Hub
Rest Proxy
DB-K_2: Change Data Capture (CDC) on RDBMS
Using Oracle GoldenGate
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
Alternatives:
StreamSets Data Collector
Attunity
Debezium
…
21. DB-K_3: Polling of RDBMS API
Event
Hub
Stream Data
Integration
API
Applications / Data Sources
Data Flow
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
22. Event
Hub
Stream Data
Integration
API
Applications / Data Sources
Data Flow
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
DB-K_3: Polling of RDBMS API
StreamSets invokes Oracle Rest Data Service
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense Oracle Rest Data Service is part of Oracle RDBMS, StreamSets is Open Source
23. Oracle REST Data Services (ORDS)
• makes it easy to develop modern REST interfaces for relational data in the Oracle
Database and the Oracle Database 18c JSON Document Store
• ORDS maps HTTP(S) verbs (GET, POST, PUT, DELETE, etc.) to database transactions and
returns any results formatted using JSON
• Java middle tier application on WebLogic, Tomcat, Docker, Standalone (for
development)
https://www.oracle.com/database/technologies/appdev/rest.html
25. DB-K_3 – Setup ORDS (II)
ORDS.DEFINE_HANDLER(
p_module_name => 'order_processing',
p_pattern => 'changes/:offset',
p_method => 'GET',
p_source_type => 'resource/lob',
p_items_per_page => 25,
p_source => q'[
'SELECT 'application/json', json_object('orderId' VALUE po.id,
'orderDate' VALUE po.order_date,
'orderMode' VALUE po.order_mode,
'customer' VALUE
json_object('firstName' VALUE cu.first_name,
'lastName' VALUE cu.last_name
'emailAddress' VALUE cu.email),
'lineItems' VALUE (SELECT json_arrayagg(
json_object('ItemNumber' VALUE li.id,
'Product' VALUE
json_object('id' VALUE li.product_id,
'name' VALUE li.product_name,
'unitPrice' VALUE li.unit_price),
'quantity' VALUE li.quantity))
FROM order_item_t li WHERE po.id = li.order_id),
'offset' VALUE TO_CHAR(po.modified_at, 'YYYYMMDDHH24MISS'))
FROM order_t po LEFT JOIN customer_t cu ON (po.customer_id = cu.id)
WHERE po.modified_at > TO_DATE(:offset, 'YYYYMMDDHH24MISS')]'
26. StreamSets Data Collector
• GUI-based, drag-and
drop Data Flow Pipelines
• Both stream and batch
processing
• custom sources, sinks,
processors
• Monitoring and Error
Detection
https://streamsets.com/products/sdc
27. DB-K_4: Produce to Event Hub from RDBMS
Event
Hub
Stream Data
Integration
API
Applications / Data Sources
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
REST to
Event Hub
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
28. Event
Hub
Stream Data
Integration
API
Applications / Data Sources
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
REST to
Event Hub
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
DB-K_4: Produce to Event Hub from RDBMS
Native Kafka Producer using Java in DB
Does not feel right! Java in DB
as well as “dual write” problem
29. Event
Hub
Stream Data
Integration
API
Applications / Data Sources
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
REST to
Event Hub
Rest Proxy
?
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
DB-K_4: Produce to Event Hub from RDBMS
Invoke REST Proxy from PL/SQL
Invoking a REST Service from DB not well-
supported and ”dual write” problem
Confluent REST Proxy is part of Confluent Open Source Platform
30. Event
Hub
Stream Data
Integration
API
Applications / Data Sources
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
REST to
Event Hub
Oracle Big Data SQL
Coming soon …
DB-K_4: Produce to Event Hub from RDBMS
Oracle Big Data SQL integrates with Kafka
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
31. DB-K_5: RDBMS Queue with bridge to Event Hub
Stream Data
Integration & Analytics
Stream
Analytics
Event
Hub
Stream Data
Integration
API
Data Flow
Application / Data Sources
Data Flow
Application
Logic
RDBMS
Queue
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
32. DB-K_5: RDBMS Queue with bridge to Event Hub
Oracle Advanced Queuing & Kafka Connect JMS
Stream Data
Integration & Analytics
Stream
Analytics
Event
Hub
Stream Data
Integration
API
Data Flow
Application / Data Sources
Data Flow
Application
Logic
RDBMS
QueueAQ
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense Oracle Advanced Queuing is part of Oracle RDBMS
35. DB-K_5: RDBMS Queue with bridge to Event Hub
Oracle AQ with Kafka API & MirrorMaker
Stream Data
Integration & Analytics
Stream
Analytics
Event
Hub
Stream Data
Integration
API
Data Flow
Application / Data Sources
Data Flow
Application
Logic
RDBMS
Queue
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
AQ (Kafka API)
Oracle is working on a Kafka API
for Advanced Queuing
37. Blueprints Apache Kafka => Oracle RDBMS (K-DB)
Customer Microservice
{ }
Customer API CustomerCustomer Logic
Order Processing System
{ }
Order API OrderOrder Logic
REST
REST
Event Hub
Customer
Mat View
Order
(compacted)
Customer
(compacted)
Notification Microservice
Notification Logic
Schema
Registry
K-DB_1: Write to RDBMS table/view
K-DB_2: Write over RDBMS API
K-DB_3: Consume from Event Hub
K-DB_4: Event Hub with bridge to RDBMS Queue
38. K-DB_1: Write to RDBMS table/view
Event
Hub
Stream Data
Integration
API
Applications / Data Sources
Data FlowRDBMS
Application
Logic
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
39. Event
Hub
Stream Data
Integration
API
Applications / Data Sources
Data FlowRDBMS
Application
Logic
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
K-DB_1: Write to RDBMS table/view
Kafka Connect and JDBC Sink Connector
40. K-DB_2: Write over RDBMS API
Event
Hub
Stream Data
Integration
API
Applications / Data Sources
Data Flow
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
41. Event
Hub
Stream Data
Integration
API
Applications / Data Sources
Data Flow
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
K-DB_2: Write over RDBMS API
Kafka Connect invokes Oracle Rest Data Service
Oracle Rest Data Service is part of Oracle RDBMS, REST Connector is Open Source
43. K-DB_3: Consume from Event Hub
Event
Hub
Stream Data
Integration
API
Applications / Data Sources
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
REST to
Event Hub
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
44. Event
Hub
Stream Data
Integration
API
Applications / Data Sources
RDBMS
Application
Logic
API
Stream Data
Integration & Analytics
Stream
Analytics
Data Flow
REST to
Event Hub
Oracle Big Data SQL
Coming soon…
K-DB_3: Consume from Event Hub
Oracle Big Data SQL exposes topic as table
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
45. K-DB_4: Event Hub with bridge to RDBMS queue
Stream Data
Integration & Analytics
Stream
Analytics
Event
Hub
Stream Data
Integration
API
Data Flow
Application / Data Sources
Data Flow
Application
Logic
RDBMS
Queue
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
46. Stream Data
Integration & Analytics
Stream
Analytics
Event
Hub
Stream Data
Integration
API
Data Flow
Application / Data Sources
Data Flow
Application
Logic
RDBMS
QueueAQ
K-DB_4: Event Hub with bridge to RDBMS queue
Oracle Advanced Queuing & Kafka Connect JMS
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense Oracle Advanced Queuing is part of Oracle RDBMS
47. Stream Data
Integration & Analytics
Stream
Analytics
Event
Hub
Stream Data
Integration
API
Data Flow
Application / Data Sources
Data Flow
Application
Logic
RDBMS
QueueAQ (Kafka API)
Oracle is working on a Kafka API
for Advanced Queuing
K-DB_4: Event Hub with bridge to RDBMS queue
Oracle AQ with Kafka API & MirrorMaker
Flat Aggregate
Low Latency High Latency
DB Dataflow
Message
Latency
ACL
Cost Free CommercialLicense
49. Summary
Customer Microservice
{ }
Customer API CustomerCustomer Logic
Order Processing System
{ }
Order API OrderOrder Logic
REST
REST
Event Hub
Customer
Mat View
Order
(compacted)
Customer
(compacted)
Notification Microservice
Notification Logic
Schema
Registry
K-DB_1: Write to RDBMS table/view
K-DB_2: Write over RDBMS API
K-DB_3: Consume from Event Hub
K-DB_4: Event Hub with bridge to RDBMS Queue
https://github.com/gschmutz/various-demos/tree/master/bidirectional-integration-oracle-kafka
DB-K_1: Polling of RDBMS table/view
DB-K_2: Change Data Capture (CDC) on RDBMS
DB-K_3: Polling of RDBMS API
DB-K_4: Produce to Event Hub from RDBMS
DB-K_5: RDBMS Queue with bridge to Event Hub
50. Bulk Source
Ref Architecture
Data Platform
Service
Event
Stream
Bulk
Data
Flow
Event Source
Location
DB
Extract
File
Weather
DB
IoT
Data
Mobile
Apps
Social
File Import / SQL Import
Consumer
BI Apps
Data Science
Workbench
Enterprise
App
Enterprise Data
Warehouse
SQL / Search
SQL
“Native” Raw
RDBMS
“SQL” / Search
Service
Event
Hub
Hadoop ClusterdHadoop ClusterBig Data Platform
SQL
Export
Storage
Storage
Raw
Refined/
UsageOpt
Microservice Cluster
Stream Processing Cluster
Stream
Processor
Model /
State
Edge Node
Rules
Event Hub
Storage
Governance
Data Catalog
Rules
Engine
Parallel
Processing
Query
Engine
Microservice Data
{ }
API
Event
Stream
Event Stream
Modern Data Platform
Event Stream
51. Bulk Source
Ref Architecture
Data Platform
Service
Event
Stream
Bulk
Data
Flow
Event Source
Location
DB
Extract
File
Weather
DB
IoT
Data
Mobile
Apps
Social
File Import / SQL Import
Consumer
BI Apps
Data Science
Workbench
Enterprise
App
Enterprise Data
Warehouse
SQL / Search
SQL
“Native” Raw
RDBMS
“SQL” / Search
Service
sEvent
Hub
Hadoop ClusterdHadoop ClusterBig Data Platform
SQL
Export
Storage
Storage
Raw
Refined/
UsageOpt
Microservice Cluster
Stream Processing Cluster
Stream
Processor
Model /
State
Edge Node
Rules
Event Hub
Storage
Governance
Data Catalog
Rules
Engine
Parallel
Processing
Query
Engine
Microservice Data
{ }
API
Event
Stream
Event Stream
Modern Data Platform
Event Stream