Evolving Data Governance for the
Real-time Streaming and AI Era
Andrew Foo
Customer Solutions @ Confluent
Would you blindly cross the street with
traffic information that is 5 minutes old?
Generative AI is a revolutionary tool… and it’s
only getting better.
/imagine prompt:Street style photo of a woman shot on Kodak
July 2022 July 2023
Source: https://twitter.com/nickfloats/status/1676279157620199424?s=46&t=plcKoQYXnokFvxs3ieVg3Q
Recency, quality,
trustworthiness and
instant applicability
of data are as
important as the
models themselves.
Source: https://au.pcmag.com/ai/103906/air-canada-must-honor-a-fake-refund-policy-created-by-its-chatbot-court-says
Without context,
trustworthiness
or real-time data
applicability,
LLMs can’t drive
meaningful value
What is the status of
my flight to New York?
It is currently delayed by 2 hours and
expected to depart at 5 pm GMT.
Is there another flight available
to the same city that will depart
and arrive sooner? What are the
seating options and cost?
The next available flight to New York
with United departs later but will
arrive faster than your current flight.
The only available seats on this flight
are first-class window seats and
cost $1,500.
Can your GenAI
assistant remember
data from an earlier
conversation?
What is the source of
this information? Is this
trustworthy? Is it fresh
and accurate?
How do you securely augment
customer data with real-time
data and process them on the fly
to provide meaningful insights?
“Our latest research estimates
that generative AI could add
the equivalent of $2.6 trillion
to $4.4 trillion annually across
the 63 use cases we analyzed.”
Source: Economic Potential of Generative AI, McKinsey
What we’ll
talk about
● The data architecture challenge
● Unifying the operational and analytical worlds
● Connecting governed data streams to power AI
● Benefits of a modern data streaming platform
Traditional enterprise data architecture
is a GenAI innovation bottleneck
Historic Public Data
Generative
AI Model
Intelligent
Business-Specific
Co-Pilot
User Interaction
??
Enterprise data architecture
In-context learning &
prompt-time assembly
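A minimal sketch of prompt-time assembly, assuming context arrives as timestamped events from governed streams. The `ContextEvent` shape, the `assemble_prompt` helper and the five-minute freshness window are illustrative assumptions, not part of any Confluent or LLM API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ContextEvent:
    """One fact pulled from a data stream (hypothetical shape)."""
    source: str            # name of the stream the fact came from
    text: str              # the fact, already rendered as text
    observed_at: datetime  # when the fact was observed


def assemble_prompt(question, events, now, max_age=timedelta(minutes=5)):
    """Build a prompt from the question plus only fresh, attributed context."""
    # Drop anything older than the freshness window (assumed: 5 minutes).
    fresh = [e for e in events if now - e.observed_at <= max_age]
    # Newest facts first, so the most recent state leads the context.
    fresh.sort(key=lambda e: e.observed_at, reverse=True)

    lines = ["Answer using only the context below.", "Context:"]
    for e in fresh:
        lines.append(f"- [{e.source} @ {e.observed_at.isoformat()}] {e.text}")
    if not fresh:
        lines.append("- (no fresh context available)")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    events = [
        ContextEvent("flight-status", "Flight delayed by 2 hours", now - timedelta(minutes=1)),
        ContextEvent("flight-status", "Flight on time", now - timedelta(minutes=30)),
    ]
    print(assemble_prompt("What is the status of my flight to New York?", events, now))
```

Each context line carries its source and timestamp, so the questions "What is the source of this information?" and "Is it fresh?" can be answered from the prompt itself.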
DATA MESS = DEVELOPER PAIN
DATA MESS DATA PRODUCTS
Point-to-Point
Data Extracted by Consumer
Multi-Subscriber
Producer Presented
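A rough way to see the difference is to count integrations. The sketch below assumes the worst case, in which every consumer needs data from every producer; the function names are illustrative only.

```python
def point_to_point_links(producers: int, consumers: int) -> int:
    """Each consumer extracts from each producer directly."""
    return producers * consumers


def multi_subscriber_links(producers: int, consumers: int) -> int:
    """Each producer publishes once; each consumer subscribes once."""
    return producers + consumers


if __name__ == "__main__":
    for p, c in [(5, 8), (20, 50)]:
        print(p, c, point_to_point_links(p, c), multi_subscriber_links(p, c))
```

Under that assumption, integration effort grows with the product of systems in the point-to-point model but only with their sum in the multi-subscriber model.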
So, what’s stopping us?
ANALYTICAL ESTATE
OPERATIONAL ESTATE
DATA PRODUCTS
What if we could unite them?
Serve the needs of
applications to transact with
customers in real-time
Support after-the-fact business
analysis and reporting for various
stakeholders
OPERATIONAL ESTATE ANALYTICAL ESTATE
OPERATIONAL ESTATE
Kafka is the Open Standard for the Operational Estate
OPERATIONAL ESTATE
ANALYTICAL ESTATE
S3 / GCS / ABS
ANALYTICAL ESTATE
Iceberg is the Open Standard for the Analytical Estate
ANALYTICAL ESTATE
ANALYTICAL ESTATE
OPERATIONAL ESTATE
STREAM
Analytical Product
Universal Data Product
Universal Data Product
Powered by a Streaming Platform
Universal Data Product
Kafka Topic + Schema + Owner
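As a sketch, the "topic + schema + owner" triple can be modeled as a small record. The `DataProduct` class, its field names and the dictionary-based schema below are assumptions for illustration; a real deployment would typically use a schema registry with Avro, Protobuf or JSON Schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical model of a data product: topic + schema + owner."""
    topic: str    # Kafka topic that carries the stream
    schema: dict  # field name -> expected Python type (simplified schema)
    owner: str    # team accountable for the data

    def violations(self, record: dict) -> list:
        """Return a list of ways the record breaks the schema (empty if valid)."""
        problems = []
        for name, expected in self.schema.items():
            if name not in record:
                problems.append(f"missing field: {name}")
            elif not isinstance(record[name], expected):
                problems.append(
                    f"{name}: expected {expected.__name__}, "
                    f"got {type(record[name]).__name__}"
                )
        for name in record:
            if name not in self.schema:
                problems.append(f"unexpected field: {name}")
        return problems


if __name__ == "__main__":
    purchases = DataProduct(
        topic="purchases",
        schema={"customer_id": str, "amount": float},
        owner="payments-team",
    )
    print(purchases.violations({"customer_id": "c-1", "amount": 19.99}))
    print(purchases.violations({"customer_id": 42}))
```

Keeping the owner next to the topic and schema means every consumer can see who is accountable for the stream and what shape to expect from it.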
DEVELOPERS
SECURITY & COMPLIANCE
Shift Your Governance Thinking to the Left
The Cleanest Data is Always Bottled at the Source
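One way to picture "shifting left" is to validate records in the producer, before they reach the stream. The sketch below is an in-memory stand-in, not a Kafka client; `GovernedProducer`, `validate_purchase` and the purchase fields are hypothetical.

```python
def validate_purchase(record: dict) -> list:
    """Return a list of rule violations for a purchase record (empty if clean)."""
    errors = []
    customer_id = record.get("customer_id")
    if not isinstance(customer_id, str) or not customer_id:
        errors.append("customer_id must be a non-empty string")
    amount = record.get("amount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount <= 0:
        errors.append("amount must be a positive number")
    return errors


class GovernedProducer:
    """In-memory stand-in for a producer that enforces rules at the source."""

    def __init__(self, validator):
        self.validator = validator
        self.published = []  # records accepted onto the stream
        self.rejected = []   # (record, errors) pairs kept for the owning team

    def send(self, record: dict) -> bool:
        errors = self.validator(record)
        if errors:
            # Bad data never reaches downstream consumers.
            self.rejected.append((record, errors))
            return False
        self.published.append(record)
        return True


if __name__ == "__main__":
    producer = GovernedProducer(validate_purchase)
    producer.send({"customer_id": "c-1", "amount": 25})
    producer.send({"customer_id": "", "amount": -3})
    print(len(producer.published), len(producer.rejected))
```

Rejecting a bad record once at the source spares every downstream consumer from writing its own cleanup logic.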
POINT-IN-TIME LINEAGE
LINEAGE SEARCH
Stream Lineage
TECHNICAL METADATA
BUSINESS METADATA
Stream Catalog
TECHNICAL METADATA
BUSINESS METADATA
Stream Quality
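Lineage search can be thought of as a graph traversal over "flows into" edges. The sketch below is a generic breadth-first search, not the Stream Lineage API; the stream names and the wiring between them are assumed purely for illustration.

```python
from collections import deque


def upstream_sources(edges, target):
    """Return every stream that feeds `target`, directly or indirectly."""
    # Index the edges as: destination -> set of direct sources.
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)

    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    # Hypothetical wiring, for illustration only.
    edges = [
        ("online_purchase", "purchases"),
        ("in_store_purchase", "purchases"),
        ("purchases", "customer_360"),
        ("customer_detail", "customer_360"),
        ("click_stream", "customer_360"),
        ("customer_360", "gen_ai"),
    ]
    print(sorted(upstream_sources(edges, "gen_ai")))
```

The same traversal run in the opposite direction would show which downstream consumers are affected when a source changes.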
Online Purchase
In Store Purchase
Customer Detail
Purchases
Click Stream
Customer 360 Analytical Reports
Gen AI
Online Apps
From Vicious Cycle to Virtuous Circle
Challenge: Judo Bank needed to replace a series of
point-to-point integrations and its core lending platform with a
new, unified system, and re-architect its foundational IT
infrastructure to drive event-first thinking and event-driven
principles.
Solution: Judo Bank leverages Confluent Cloud for easy,
agile, holistic management of a suite of services and the creation of
a new CRM system and loan originator capabilities.
Results:
● Consistent, unified system
● Better data availability and time to market
● Better support for microservices
“Confluent is a strategic platform for us. With every
project we look at, we now think about how we use
Confluent to move things around and join things
together” — Niko Bielovich, General Manager, Services
Management
Industry: Banking | Geo: APAC | Product: Confluent Cloud
“Everything we do is in real time because batch processing is an old
way of thinking. The longer your data waits, the less value it has. So,
as data comes through, you need to be able to act on it, or enrich it
quickly. Confluent enables this for us.”
— Rajay Rai, Chief Information Officer at Trust Bank
Challenge: Building a secure, digital-only bank to power unique, secure,
and real-time experiences for customers.
Solution: Trust Bank leverages Confluent’s data streaming platform for its
event-driven architecture, enabling different teams to produce, share, and
consume self-service data products in the form of real-time streams, drive
innovation, improve agility, reduce the total cost of ownership, and ensure
the appropriate quality controls and security policies are applied across
the organization.
Results:
● A scalable and resilient platform to power new, real-time experiences
for banking customers
● Built-in governance to meet regulations, gain customer trust, and
break down data silos so teams have self-service access to find, browse,
create, share, and reuse data, wherever and whenever it’s needed
● Lower total cost of ownership (TCO)
● Unscheduled downtime for critical systems does not exceed four hours
within any 12-month period
Industry: Financial Services | Geo: APAC | Product: Confluent Cloud
CONNECT
PROCESS
GOVERN
SHARE
Custom Apps &
Microservices
Data Systems
STREAM
AI/ML Modeling
Inventory Payments
Personalization
Fraud Supply Chain
Recommendations
From Data Mess To Data Products
To Instant Value
Everywhere
DATA MESS = DEVELOPER PAIN
DATA PRODUCTS = DEVELOPER GAIN
