Traditional systems were designed in an era that predates large-scale distributed computing. They often cannot scale to meet the needs of the modern data-driven organisation. Compounding this, the accumulation of technologies and the explosion of data can result in complex point-to-point integrations where data becomes siloed across the enterprise.
The demand for fast results and decision making has driven financial institutions to adopt real-time event streaming and processing in order to stay on the competitive edge. Apache Kafka and the Confluent Platform are designed to solve the problems associated with traditional systems, providing a modern distributed architecture and real-time data streaming capability. These technologies also open up a range of use cases for financial services organisations, many of which are explored in this talk.
2. Every Industry is Moving from Batch/Manual to Software-Defined

Industry                        | Software-using                      | Software-defined
Auto / Transport                | Spreadsheet-driven driver schedules | Real-time ETA
Banking                         | Nightly credit-card fraud checks    | Real-time credit-card fraud prevention
Retail                          | Batch inventory updates             | Real-time inventory management
Healthcare                      | Batch claims processing             | Real-time claims processing
Oil and Gas                     | Batch analytics                     | Real-time analytics
Manufacturing                   | Scheduled equipment maintenance     | Automated, predictive maintenance
Defense (U.S. Defense Agencies) | Reactive cyber-security forensics   | Automated SIEM and anomaly detection
3. Gartner: Becoming Software-Defined is a Competitive Requirement

“By 2020, event-sourced, real-time situational awareness will be a required characteristic for 80% of digital business solutions. And 80% of new business ecosystems will require support for event processing.”
4. Data Platform Requirements for Becoming Software-Defined

Moving from software-using to software-defined means the data platform must be:

1. Built for real-time events, not just historical data
   ● State vs. change
   ● Historical analysis vs. real-time operations
2. Scalable for ALL data, not just transactional data
   ● Non-transactional data is 10x transactional data
   ● IoT, logs, security events...
3. Persistent + durable, not transient
   ● Mission-critical apps require zero data loss
   ● Mission-critical systems require replay
4. Enriched data, not raw data
   ● Stream processing (SQL on real-time events)
   ● Context & situational awareness (e.g. ETA)
5. Kafka across Financial Services

● Cyber Security: high-volume log ingest & anomaly detection
● Wealth Management and Capital Markets: trade data capture, next-gen pricing applications, clearing & settlement, OATS, next-gen advisor workstations
● Retail and Corporate Banking: fraud detection, credit analytics, next-generation payment hubs, open banking
● Market & Credit Risk: consolidating data across dozens of disparate risk systems
● IT Modernization: mainframe offload, “bridge to the cloud”, microservices
● Customer Experience: tailored offerings and alerts, omni-channel, increased digital engagement within retail & corporate banking and wealth management
6. Some Highlights

Payments:
• Fraud
• Payment Processing
• Settlements
Markets and Trade Data:
• Market Data
• Trade Data Distribution
Regulatory:
• MiFID, OATS, CAT, etc.
• Risk, Liquidity Management
Technology Modernisation:
• Data Warehouse Modernisation
  • Netezza, Teradata, etc. to Cloud Data Services
• Messaging
  • MQ and TIBCO offload
• Mainframe
  • Data offload
7. How does event streaming solve data challenges?

● Aggregate events from everywhere
● Store events in a persistent, highly available, and scalable platform
● Transform, filter, and enrich data in flight with stream processing (a minimal sketch follows below)
● Integrate and share events with your apps, on prem and in the cloud

All in real-time.

[Diagram: sources such as mainframes, databases, and microservices feed Kafka via connectors; KStreams / KSQL process events in flight; connectors and microservices consume the results on prem and in the cloud]
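To make the in-flight step concrete, here is a minimal Kafka Streams sketch in Java. It is illustrative only: the topic names `payments-raw` and `payments-enriched`, the broker address, and the appended `processedAt` field are assumptions, not part of any architecture described in this deck.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class InFlightEnrichment {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "in-flight-enrichment");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Consume raw events, drop empty records, tag each surviving event
        // with a processing timestamp, and publish the result for downstream
        // apps on prem or in the cloud to consume.
        KStream<String, String> raw = builder.stream("payments-raw");
        raw.filter((key, value) -> value != null && !value.isEmpty())
           .mapValues(value -> value + ",processedAt=" + System.currentTimeMillis())
           .to("payments-enriched");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

Because the topology is just another Kafka client application, scaling out is a matter of starting more instances with the same application.id; Kafka rebalances partitions across them.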
8. Financial Services - Common Pipeline Use Cases

Messaging Modernization
● Single streaming platform to support multiple data sources and application deployments with high availability, security and fault tolerance (zero message loss)
● Segregate data between internal and external systems using Kafka clusters (internal and DMZ), replicating for external systems and customer consumption
● Replace Tibco and other legacy messaging systems for cost and scale reasons

Operational Data Modernization (Legacy / Mainframe)
● Unlock data from legacy databases and mainframe systems
● Make real-time transaction data on mainframes available to external systems (other business systems or customer apps) with CDC, ETL and stream processing
● Deploy microservices to reduce load on mainframe data and reduce CAPEX/OPEX
● Stay compliant with regulatory and business logic while sharing data
● Share with external third-party systems to enable Customer 360 and open banking

Microservices Architecture
● Access legacy systems and data from newer applications and customer services
● Replatform to modern cloud-native architectures to enhance flexibility and SLAs
● Modernize application development and deployment with DevOps
● Completely decoupled microservices deliver a distributed and highly scalable solution
9. Financial Services - Common Pipeline Use Cases

Streaming ETL
● Move from batch to real-time data integrations, with data filtering, enrichment, transformation and normalization (see the sketch after this slide)
● Present data to external systems and processes in the form required for further processing
● Deploy microservices to process event data in real time
● Drive fraud and anomaly detection in real time with interactive processing
● Eliminate complex ETL processes and decrease the cost of legacy ETL tools

Data Warehouse Modernization
● Stream data into a centralized data lake for advanced analytics and reporting
● Modernize data warehouses by moving data into the public cloud
● Integrate with cloud services for data analytics, visualization and AI

Real-Time Analytics
● Clickstream analytics - website / customer experiences
● Real-time recommendations with stream processing
● Real-time visualization, analytics and predictions with stream processing, as well as integration with external systems
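As a rough illustration of streaming ETL, the following Kafka Streams sketch joins a stream of transactions against a CDC-fed reference table to enrich records in flight. The topic names (`transactions`, `customers`, `transactions-enriched`) and the string-concatenation "enrichment" are illustrative assumptions, not a prescribed design.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class StreamingEtl {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streaming-etl");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Reference data (e.g. a customer master fed by CDC) as a changelog table.
        KTable<String, String> customers = builder.table("customers");

        // Enrich each transaction with the matching customer record the moment
        // it arrives, and write the normalized result onward: no nightly batch
        // ETL job required.
        KStream<String, String> transactions = builder.stream("transactions");
        transactions
            .join(customers, (txn, customer) -> txn + "|" + customer)
            .to("transactions-enriched");

        new KafkaStreams(builder.build(), props).start();
    }
}
```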
10. Financial Services - Common Pipeline Use Cases

SIEM Optimization
● Avoid vendor lock-in: Splunk, ArcSight and other SIEM solutions provide value but are expensive
● Open architecture: cost-effective; the right data to the right downstream systems at the right time
● Filter / format event streams and reduce message sizes sent to the SIEM (see the sketch below)
● Shorten the SIEM retention window and save costs
● Leverage a big data platform for batch analytics and as a data lake, while leveraging the SIEM for real-time alerts, dashboards, visualizations, etc.
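One way to realize the filtering described above is a small Kafka Streams topology that forks the raw log stream: the full feed goes to cheap storage for batch analytics, while only trimmed, high-severity events reach the per-volume-priced SIEM. Topic names and the `severity=` field convention are assumptions for illustration.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class SiemFeedOptimizer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "siem-feed-optimizer");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> logs = builder.stream("raw-security-logs");

        // Full-fidelity copy lands in the data lake for batch analytics.
        logs.to("security-data-lake");

        // Only high-signal events, trimmed to a fixed size, reach the SIEM,
        // shrinking its ingest volume and retention costs.
        logs.filter((host, event) -> event.contains("severity=HIGH")
                                  || event.contains("severity=CRITICAL"))
            .mapValues(event -> event.length() > 1024
                                ? event.substring(0, 1024) : event)
            .to("siem-ingest");

        new KafkaStreams(builder.build(), props).start();
    }
}
```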
Hybrid Cloud
● Distributed local and global event streaming across data centers (and clouds)
● Disaster recovery
● Migrating on-premise data to the cloud
● Integrating on-premise legacy systems with cloud-based apps and services
● Security: encryption and tokenization of data moving across multiple sites
● Infrastructure and network monitoring
12. Capital One “Second Look”

Use Cases:
- Customer 360
- Customer Notifications & Alerts
- Fraud Detection

● Customers do not check statements regularly
● Duplicate charges, high tips and increased recurring charges go unnoticed
● Deliver the right level of signal vs. noise for the consumer
● Prevents an average of $150 of fraud per customer per year
13. Major Global Bank and Payments Processor

Use Cases:
- Fraud Detection, with machine learning

● Growth in multi-channel fraud
● Attacks based on multiple LOBs
● LOBs are very siloed, with separate fraud systems
● Goal: get a business-wide view and bring much more data into the fraud management process
● Use machine learning against large historical data sets (a simplified scoring sketch follows below)

[Diagram: payments, credit card transactions, debit card transactions, credit card applications and mortgage applications feed real-time fraud scoring, which in turn feeds fraud case management and ML model enhancement, with back-testing on one year of transactions]
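The deck does not show the bank's actual models, so the following is only a hedged sketch of what the real-time scoring stage might look like: a windowed count per card that flags unusual transaction velocity. In practice an ML model would replace the fixed threshold; the topic names and the threshold of 10 are illustrative assumptions.

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.TimeWindows;

public class VelocityFraudCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "velocity-fraud-check");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Count transactions per card (the record key) over five-minute
        // windows and emit a suspect event when the count crosses a
        // simple threshold.
        builder.<String, String>stream("card-transactions")
            .groupByKey()
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
            .count()
            .toStream()
            .filter((windowedCard, count) -> count > 10)
            .map((windowedCard, count) ->
                KeyValue.pair(windowedCard.key(), "suspect-velocity=" + count))
            .to("fraud-alerts");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

The same topology shape is where a trained model would plug in: scoring each event (or each window aggregate) instead of comparing against a constant.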
14. TOP 5 US GLOBAL BANK

Use Cases:
- Fraud Detection
- Customer 360

● Identify fraud in near real-time using geo-location information coming from their customers' mobile devices
● Real-time credit authorizations and new credit approvals

Challenges
● Complex data pipeline: data movement is a collection of point-to-point integrations
● Data is often delayed (e.g. one day old) and throughput is limited
● Data analysts are operating on stale data, leading to poor outcomes
● Lag time between credit bureau updates and bank systems for credit authorization

Solution
● Built an event cloud to handle real-time fraud and credit authorizations
● Identifies fraud in near real-time using geo-location information coming from customers' mobile devices
● The fraud system uses that data to either approve or decline the transaction
● Handles 600+ transactions per second
15. Use Cases:
-Fraud Detection

● Stream and batch pipelines are delivered off of Kafka messaging and are decoupled from the main transactional system
● A Cloud Dataflow pipeline provides lower latency compared to the old systems
● The new Kafka-based solution is flexible enough to handle continuously evolving fraud threats

Challenges
● Outgrowing dedicated instances running MySQL, which began to impact the ability to do fraud detection, for which query response time is critical

Solution
● A new stream analytics pipeline for fraud detection that answers queries in near real time without affecting the main transactional system

https://cloud.google.com/blog/big-data/2017/08/how-wepay-uses-stream-analytics-for-real-time-fraud-detection-using-gcp-and-apache-kafka
17. Retail Bank - Next Gen Customer 360

Why Confluent?
The bank's legacy queue-based messaging could not handle the volume of event data being created through the bank's transaction systems and its mobile, web, ATM, and call center channels.
An Event Streaming Platform (based on Kafka) includes messaging, storage, and processing in a single platform that can run globally at scale.

[Diagram: channel and transaction sources (plus future sources, e.g. mainframe) pass through Schema Registry and data transformations / microservices / streaming apps (KStreams & KSQL, sketched below) into a data lake & customer graph serving fraud, marketing, fintech, customer support, etc.]
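As a toy illustration of that streaming-apps layer, the sketch below folds events from all channels, keyed by customer id, into one continuously updated profile record. The topic names and the append-only string "profile" are assumptions; a real customer graph would use proper schemas governed by Schema Registry.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class CustomerProfileBuilder {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "customer-profile-builder");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Events from every channel (web, mobile, ATM, call center), keyed by
        // customer id, are folded into one continuously updated profile.
        builder.<String, String>stream("channel-events")
            .groupByKey()
            .aggregate(() -> "",
                       (customerId, event, profile) ->
                           profile.isEmpty() ? event : profile + ";" + event)
            .toStream()
            .to("customer-profiles");

        new KafkaStreams(builder.build(), props).start();
    }
}
```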
18. Private Bank's Enterprise Streaming Platform

Key Features
1. Wide range of ingest options, with the ability to enforce schema and data quality
2. Change Data Capture via Oracle GoldenGate or the Confluent Connect framework
3. All ingest events are captured and persisted in order (replay, audit, compliance; replay is sketched below)
4. Optional real-time filtering, enrichment and transformation on ingest or via streaming apps

[Diagram: ingest & user interfaces (data analytics, API & connectors, workflow & chat, change data capture from Oracle) feed Schema Registry and data transformations / microservices & streaming apps (KStreams & KSQL), with destinations including NoSQL stores]
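Feature 3 relies on Kafka's ordered, persistent log: any consumer can rewind and re-read history for audit or reprocessing. A minimal sketch, assuming a hypothetical `ingest-events` topic:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class AuditReplay {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "audit-replay");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("ingest-events"));
            consumer.poll(Duration.ofSeconds(1));            // join group, get assignment (demo shortcut)
            consumer.seekToBeginning(consumer.assignment()); // rewind: replay the full retained history
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(1))) {
                    System.out.printf("replayed offset=%d key=%s%n", rec.offset(), rec.key());
                }
            }
        }
    }
}
```

Using `offsetsForTimes` instead of `seekToBeginning` would replay from a specific point in time, which is often what an audit query needs.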
20. Third-largest corporation in the Nordic region and one of the top 10 financial services companies in Europe by market capitalisation, with a presence in 20 countries, including its four Nordic home markets – Denmark, Finland, Norway and Sweden – offering services across retail, investment, trading and other financial services.

Use Cases
-Regulatory Compliance
-Messaging Modernization

Challenges
● MiFID II compliance requirements
● Breaking down silos in the bank to get an aggregated view of all equity trades
● Legacy systems support only limited throughput and ETL

Solution
● Centralized Confluent deployment to free data stuck in legacy systems
● JMS clients offload data from legacy systems into Kafka
● Dropped analytics turnaround time from weeks to instant reporting
● Provided analysts access to trade data in real time
● Reduced overall platform costs by 73%

[Diagram: Mainframe → JMS → Apache Kafka (Confluent)]
21. TOP 5 US GLOBAL BANK

Top 5 global bank delivering a wide range of financial services including retail, investment, trading, market data, Forex, mortgages, etc.

Use Cases
-Regulatory Compliance
-Messaging Modernization
-Data Warehouse Modernization

Challenges
● Needed a mechanism to migrate OATS reporting off of legacy (Netezza) systems
● Handle very high throughput from trading systems, landing the data in HDFS to leverage newer, more scalable approaches to OATS reporting
● OATS reporting is eventually being deprecated in favor of CAT reporting, which will require more intraday and windowed reporting
● Provide reliable delivery of market and trade data to current and future destinations for regulatory reporting (e.g. OATS), and eventually be ready for CAT

Solution
● Real-time ingest of trade data into an enterprise analytics platform
● Integration with Spark Streaming
● Integrate destinations such as Elastic, HDFS and JDBC using Confluent Connect
● Stream processing using KStreams and KSQL for filtering, aggregation and windowing, and Schema Registry for governance
● Stream over 100K messages per second into a scalable platform that effectively leverages fault-tolerant connectors
22. Top 5 US Global Bank: Example Architecture (OATS Reporting)

● Consume data from Solace and TIBCO EMS (legacy FIX messages)
● Deliver high-volume equities trade data to HDFS and HBase, as well as other downstream destinations such as Elastic and RDBMS
● Kafka stores all the trade data in the correct order and replicates it across the cluster to ensure no data is lost
● Deduplicate messages with KStreams (sketched below)
● The Confluent HDFS Sink Connector, with its batching features, avoids sending small files into HDFS; the connector also integrates with the Hive metastore
● Use Spark to create the OATS reports

[Diagram: the Solace environment (hundreds of Solace topics, Solace queues) and the Tibco environment (Tibco EMS) feed JMS connectors (x2) into the Confluent environment: Kafka brokers x3 (increasing to 10 in the test environment), Zookeeper x3, and Kafka Streams (x2) for deduplication. The Kafka HDFS Sink Connector feeds a big data layer (HDFS, HBase, Elastic, Spark) and an OATS reporting layer with a Qlik HTML front-end. Reporting runs every 5-15 minutes, with OATS generated at end of day; volumes start at 500M msg/day, eventually scaling to 1Bn msg/day.]
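A common way to implement the KStreams deduplication called out above is a transformer backed by a state store that remembers which ids have been seen. This is a sketch under assumed topic names (`trades-raw`, `trades-deduped`) and the assumption that the record key is a unique trade/message id; production code would also expire old ids (e.g. with a windowed store) rather than retain them forever.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

public class TradeDeduplicator {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "trade-dedup");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Fault-tolerant store of already-seen message ids.
        StoreBuilder<KeyValueStore<String, Long>> seenStore =
            Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("seen-ids"),
                Serdes.String(), Serdes.Long());

        StreamsBuilder builder = new StreamsBuilder();
        builder.addStateStore(seenStore);

        builder.<String, String>stream("trades-raw")
            .transform(() -> new Transformer<String, String, KeyValue<String, String>>() {
                private KeyValueStore<String, Long> seen;

                @Override
                @SuppressWarnings("unchecked")
                public void init(ProcessorContext context) {
                    seen = (KeyValueStore<String, Long>) context.getStateStore("seen-ids");
                }

                @Override
                public KeyValue<String, String> transform(String id, String trade) {
                    if (seen.get(id) != null) {
                        return null;                 // duplicate: drop it
                    }
                    seen.put(id, System.currentTimeMillis());
                    return KeyValue.pair(id, trade); // first sighting: forward
                }

                @Override
                public void close() {}
            }, "seen-ids")
            .to("trades-deduped");

        new KafkaStreams(builder.build(), props).start();
    }
}
```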
24. Major Exchange: Market Data in the Cloud

Market Data in the Cloud is a project to bring market data feeds into public clouds in support of the following objectives:
1. Provide a go-to-market strategy to offer customers Exchange Data Feeds in AWS, GCP, and Azure
2. Move to modern architectures that will allow the business to scale to hundreds of customers and quickly add new feeds
3. Create differentiating analytics directly on the exchange data feeds, providing another source of insight
4. Provide a secure, scalable, reliable bridge to propagate data as a stream of events from the exchange data centers to all major public clouds

[Diagram: the exchange's market data crosses a bridge to cloud (secure, filter, enrich) into Confluent Cloud; integration & transformation (data serialization, transformation) feed an analytics layer, apps, data access, and visualization tools]
25. What role does Confluent play in the Market Data as a Service project?

Data Pipeline | Stream Processing
● Provide a real-time market data service in the customer's preferred cloud provider

Integration
● Enable customers to consume market data feeds across public cloud domains
● 90+ connectors for future integrations to cloud-native services (object stores, analytic warehouses, etc.)

Scale up and down
● Confluent allows the exchange to easily add and remove data feeds, or react to changes in data volumes, without extensive planning, hardware provisioning or complicated procurement efforts after Phase 1
● Efficient, on-demand, elastic infrastructure reduces costs

New Analytics
● Provide new analytics to the exchange's customers to compete with data aggregators (e.g. Bloomberg, Reuters)
● Easily join, filter, enrich, window and transform data using the native stream processing engine to produce analytics (sketched below)
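As one illustration of "join, filter, enrich, window and transform", the sketch below computes a per-symbol, per-minute high price directly on a feed. The `market-data` topic, the comma-separated tick format with the price in the second field, and the output topic are all assumptions for illustration.

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.TimeWindows;

public class PerMinuteHighs {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "per-minute-highs");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Ticks arrive keyed by symbol; assume "tradeId,price,..." values.
        builder.<String, String>stream("market-data")
            .mapValues(tick -> Double.parseDouble(tick.split(",")[1]))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
            .aggregate(() -> 0.0,
                       (symbol, price, high) -> Math.max(high, price),
                       Materialized.with(Serdes.String(), Serdes.Double()))
            .toStream()
            .map((window, high) -> KeyValue.pair(
                window.key(),
                window.window().startTime() + " high=" + high))
            .to("per-minute-highs");

        new KafkaStreams(builder.build(), props).start();
    }
}
```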
26. First pan-European exchange, Euronext operates regulated securities and derivatives markets in Amsterdam, Brussels, Lisbon and Paris, as well as a regulated securities market in Ireland and the UK.

Use Cases
-Market Data & Trade Platform
-Messaging Modernization

Challenges
● Develop a new trading platform for markets across multiple EU countries
● Support high-volume, high-speed trading and provide clients with access to real-time data
● The mission-critical trading platform will support the market capitalization of six European countries

Solution
● Developed a new event-driven trading platform, Optiq®, with a tenfold increase in capacity
● Average performance latency of 15ms for order round trips as well as for market data
● Billions of messages per day with millisecond latencies
● Reliable 24/5 operations with dedicated enterprise support
● Developed applications that use the Kafka Streams API library to perform data enrichment in real time

“We have been very satisfied with Confluent Platform as the backbone of our persistence engine. The platform has been super reliable. We have stringent requirements for real-time performance and reliability, and we have confirmed, from proof-of-concept to deployment of a cutting-edge production trading platform, that we made the right decision.”
- Alain Courbebaisse, Chief Information Officer, Euronext
28. Modernize your apps

Make your applications more valuable with real-time insights enabled by a next-gen architecture.

DATA INTEGRATION: database changes, log events, IoT events, web events, and cloud sources such as Amazon Kinesis and Amazon S3 feed apps driven by real-time data: connected car, fraud detection, Customer 360, personalized promotions, quality assurance, SIEM/SOC, inventory management, proactive patient care, sentiment analysis, and capital management.
29. TOP 5 US GLOBAL BANK - Hybrid Cloud with Analytics

Top 5 global bank wants to bring data assets into the public cloud (GCP) to drive advanced analytics and cloud services:
● Company goal of reducing opex by multiple billions of dollars by 2020
● Moving to modern analytics, taking advantage of native cloud services
● Better understand and serve customers through insights gained by combining data assets across lines of business
● Secure, scalable, reliable bridge to propagate data in real time from data centers to the public cloud (GCP)

[Diagram: on-premises data crosses a bridge to cloud (secure, catalog, filter, enrich) into the hybrid cloud; integration & transformation (data validation, transformation) feed cloud services, an analytics layer, apps, data access, and visualization tools]
30. IT Modernization - Mainframe Modernization (Offloading)

Mainframe Pain Points:
High Cost of Operations
- Mainframe backing user interactions
- Cost incurred for every user click
- All transactions written to the mainframe
Delayed Reporting
- Read transactions from the mainframe
- Read descriptions from other systems
- Reconcile overnight to serve user queries

Confluent Solution: Mainframe Offload
CQRS Pattern
- Writes go to the mainframe through Kafka
- Reads are served from Kafka (sketched below)
- The majority of the load (i.e. cost) moves off the mainframe
Real-Time Reporting
- Read transactions from a Kafka topic
- Read descriptions from another Kafka topic
- A simple microservice reconciles transactions and descriptions to be served from search
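A hedged sketch of the read side of this CQRS pattern: materialize the transaction topic into a local, queryable state store so reads are served from Kafka rather than the mainframe. The topic, store and key names are illustrative assumptions.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class TransactionReadView {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "txn-read-view");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Latest state per key, materialized from the write-side topic.
        builder.<String, String>table("mainframe-transactions",
                                      Materialized.as("txn-store"));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Thread.sleep(5000); // crude wait for the store to come up (demo only)

        // Serve a read without touching the mainframe.
        ReadOnlyKeyValueStore<String, String> store = streams.store(
            StoreQueryParameters.fromNameAndType("txn-store",
                QueryableStoreTypes.keyValueStore()));
        System.out.println("account-42 -> " + store.get("account-42"));
    }
}
```

In a real deployment the query would sit behind a REST endpoint, but the design point is the same: every read answered here is a read (and a cost) that no longer hits the mainframe.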
31. Modernizing how you access data and build apps

● Easily unlock data stored on legacy systems
● Unleash developer velocity by moving to a completely decoupled microservices architecture
● Enable seamless hybrid and multi-cloud data access and app development

[Diagram: the mainframe connects to Kafka via CDC, MQ and REST Proxy connectors; microservices exchange trigger, event and acknowledge messages through Kafka]
33. Mainframe Offloading - Major UK FS Org and Mortgage Provider

UK retail finance organisation that is a major provider of mortgages and other financial services.

Use Cases:
-Mainframe Offloading
-Legacy-to-modern application integrations
-Data analytics and reporting

Challenges
● Competition from digital-first banks driving disruption and modernization
● Digital disruption efforts, including open banking, regulatory requirements, and exposing data through APIs
● High and unpredictable data volumes, with 24x7 SLA and availability requirements
● Protect core Systems of Record (SORs) from external loads / applications
● Drive cloud adoption and the IT modernization strategy

Solution
● Developed an event-based real-time data platform on Confluent called the “Speed Layer”
● The Speed Layer is the preferred source of data for high-volume read-only data requests and event sourcing
● Delivered secure, near real-time customer, account and transaction information from back-end systems to front-end systems with speed and resilience
● Microservices architectures to onboard new use cases quickly and easily
● Maintained service availability despite unprecedented demand, with agility and autonomy for digital development teams
36. Mainframe Offloading - TOP 5 ASIAN BANK

Top 5 Asian commercial banking group in terms of assets and deposits, with corporate and consumer products across retail banking, commercial lending, mortgages, etc.

Use Cases:
-Mainframe Offloading
-Legacy-to-modern application integrations
-Data analytics and reporting

Challenges
● Share real-time transaction data with external systems, including customer-facing systems
● Trigger downstream events and activities based on incoming data, and help drive visualization and real-time reporting to promptly respond to customer and business needs
● Meet specific performance criteria for the time to replicate transactions from the back-end systems and make the data available to external systems

Solution
● The customer's back-end systems run on mainframes (Z series) with a DB2 backend hosting the transaction data
● The transaction data is replicated within a small time window to an event streaming platform
● Stream processing transforms the data, which is sent to various consumer applications
● Specific subsets of data are exposed to specific external analytics and customer-facing applications
● Bank vision: expand visibility and integration of real-time transaction data across the various banking applications
37. Solution Architecture Details

Mainframe offloading:
● Replicated transaction data from DB2 and Oracle DB backends to Confluent Platform
● IBM InfoSphere used as the CDC layer
● Stream processing to transform data
● Validated transaction replication time and resilience, with reporting, administration, monitoring and security considerations

Data analytics, visualization and reporting:
● Streaming data from Confluent provided specific topics that were subscribed to by MongoDB (using the sink connector)
● Data was transformed to the specific schema requirements within MongoDB
● MongoDB Charts were used to visualize the data, providing real-time reporting and visualization
40. Common Patterns and Themes

Regardless of whether it's retail banking, global markets, or commercial banking, there are common demands and architectural patterns emerging across the board:
● Be more agile, and respond to changing customer and regulatory requirements
● Differentiate through technology
● Reduce reliance on, and where possible decommission, legacy systems
● Build new applications that can be deployed in the cloud
● Respond in real time, not batch

Many of these are driving the adoption of microservices and event streaming.