Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with Amazon Bedrock, Rockset and Confluent Cloud

@yourtwitterhandle | developer.confluent.io
What are the best practices to debug client applications
(producers/consumers in general but also Kafka Streams
applications)?
Starting soon…
STARTING SOOOOON..
Starting sooooon ..
Starting soon…
Starting soon…

What are the best practices to debug client applications
(producers/consumers in general but also Kafka Streams
applications)?
Starting soon…
STARTING SOOOOON..
Starting sooooon ..
Starting soon…

Copyright 2021, Confluent, Inc. All rights reserved. This document may not be reproduced in any manner without the express written permission of Confluent, Inc.
Real-time AI
Model
Config
Params,
Features
Vector Store
Object store
AI-powered
Apps
Telemetry
MLOps Pipelines
Training
Data
Output

Goal
Partners Tech Talks are webinars where subject matter experts from a Partner talk about a
specific use case or project. The goal of Tech Talks is to provide best practices and
applications insights, along with inspiration, and help you stay up to date about innovations
in confluent ecosystem.

I will add a slide about the next talk

Starting soon…
STARTING SOOOOON..
Starting sooooon ..

The Rise of Data Streaming for GenAI
Kai Waehner
Field CTO
kai.waehner@confluent.io
linkedin.com/in/kaiwaehner
@KaiWaehner
confluent.io
kai-waehner.de

The Rise of Data Streaming
Real-time Data beats Slow Data.
Logistics
Real-time sensor
diagnostics
Delivery planning
ETA updates
Payment
Fraud detection
Risk systems
Mobile applications /
customer experience
Retail
Real-time inventory
Real-time POS
reporting
Personalization
Sales
Real-time
recommendations
Personalized
coupon feed
Pay by walking out

Data Streaming to Unlock the Value of Data

Universal Data Products
Write Your Data as a Stream or Table, Read It Anywhere

Challenge: Build a conversational chatbot service that
incorporates complex technologies such as fulfillment,
natural-language understanding, and real-time analytics
Solution: Use Confluent to build a fast, super-scalable
event-driven architecture that could handle immense traffic
spikes and also provide other guarantees around delivery
semantics
Results:
● Near-zero downtime even during huge traffic spikes
● Rapid acceleration of new-skill onboarding
● Doubling of NPS rating
Virtual Agent Platform:
(Marc Silbey, VP of Product at Expedia)

Data Products Versioned in git
Schema
Registry
Conﬂuent Cloud
Consumer
Group
LLM API Gateway
LLM Instances
LLM Service
Schema Specs
Terraform
Web Chat Agent
MongoDB
Vector Search
Reasoning Agent
How Conﬂuent Works with Gen AI - Big Picture
Enforce Business Logic and Compliance Requirements with LLM Outputs
Post- Processing
Consumer Group
RAG
Airport - Physical Location
Airline Website
Bag Tracking
Cancelation/Delays
Crew Scheduling
Online Bookings
Special Offers
Loyalty Rewards
Data Governance
Field Level Tags Data Rules
Real-Time Airport, Flight, and
Price Validation
Flink SQL
Valid Answer
User Question
Kafka Topics
Producer
Connector
Connector
Connector
Producer
Connector
Crew
Customer
Flight
Gate

© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved
Amazon Bedrock
The easiest way to build and scale generative AI
applications with foundation models
Solutions Architect
EMEA Data & AI ISV Champion & Worldwide Ambassador
Amazon Web Services (AWS)
Steffen Schneider
st-sch

16
What generative AI customers are asking for
Which model
should I use?
How can I
move quickly?
How can I keep
my data secure
and private?

17
Amazon Bedrock
The easiest way to build and scale
generative AI applications with
foundation models (FMs)
Choice of leading FMs through a single API
Model customization
Retrieval Augmented Generation (RAG)
Agents that execute multistep tasks
Security, privacy, and safety

18
Integration
Choice Customization Security and
governance
Amazon Bedrock
simpliﬁes

19
Summarization,
complex reasoning,
writing, coding
Contextual answers,
summarization,
paraphrasing
High-quality images
and art
Text generation,
search, classiﬁcation
Q&A and reading
comprehension
Text summarization,
generation,
Q&A, search,
image generation
Amazon Titan
Text Premier
Amazon Titan
Text Lite
Amazon Titan
Text Express
Amazon Titan Text
Embeddings
Amazon Titan Text
Embeddings V2
Amazon Titan
Multimodal
Embeddings
Amazon Titan
Image Generator
Claude 3.5 Sonnet
Claude 3 Opus
Claude 3 Sonnet
Claude 3 Haiku
Claude 2.1
Claude 2
Claude Instant
Llama 3 8B
Llama 3 70B
Llama 2 13B
Llama 2 70B
Command
Command Light
Embed English
Embed Multilingual
Command R+
Command R
Stable Diffusion XL1.0
Stable Diffusion
XL 0.8
Jamba-Instruct
Jurassic-2 Ultra
Jurassic-2 Mid
Mistral Small
Mistral Large
Mistral 7B
Mixtral 8x7B
Text summarization,
text classiﬁcation,
text completion,
code generation, Q&A
BROAD CHOICE OF MODELS
Amazon Bedrock

20
Enabling semantic
(vector) search
across our services
Amazon
DocumentDB
Amazon Neptune
Amazon DynamoDB
via zero-ETL
Amazon MemoryDB
for Redis
Amazon
OpenSearch Service
Amazon RDS for PostgreSQL
Amazon
OpenSearch Serverless
Amazon Aurora
PostgreSQL

21
Storing vectors and data together
Avoid additional
licensing and
management
Provide a faster
experience to
end users
Reduce the need
for data sync
and movement
Use familiar tools
that meet your
requirements

22
ENABLE GENERATIVE AI APPLICATIONS TO EXECUTE MULTISTEP TASKS USING COMPANY SYSTEMS AND DATA
SOURCES
Agents for Amazon Bedrock
Breaks down and orchestrates tasks
Securely accesses and retrieves company data for RAG
Takes action by invoking API calls on your behalf
Chain-of-thought trace and ability to modify agent prompts
SELECT YOUR
FOUNDATION MODEL
PROVIDE BASIC
INSTRUCTIONS
SELECT RELEVANT
DATA SOURCES
SPECIFY AVAILABLE
ACTIONS
1 2 3 4

23
Guardrails for
Amazon Bedrock
Configure harmful content filtering
based on your responsible AI policies
Define and disallow denied topics with
short natural language descriptions
Redact or block sensitive information
such as PIIs, and custom Regex
IMPLEMENT SAFEGUARDS CUSTOMIZED TO
YOUR APPLICATION REQUIREMENTS
AND RESPONSIBLE AI POLICIES
Apply guardrails to multiple foundation
models and Agents for Amazon Bedrock

24
None of the customer’s data is used
to train the underlying models
All data is encrypted in transit and at rest;
data used for customization is securely
transferred through customer’s VPC
Support for GDPR, SOC, ISO, CSA
compliance, and HIPAA eligibility
Data remains in the Region where the
API is processed
Amazon Bedrock
HELPS KEEP YOUR DATA
SECURE AND PRIVATE

Partner Integration Designer
PID is an internal tool that helps build customer and partner facing demos faster.
PID has the following features and capabilities:
● Drag and drop UI for designing demos
● Includes many building blocks from the Kafka and the wider ecosystem including but not limited to:
○ Relational/NoSql databases
○ Source and Target Connectors
○ KSQL
○ Producers/consumers
● Allows industry speciﬁc random data to be streamed into your demo
● Allow demos to be shared with others
25

Enables real-time updates and handles
high-dimensional data effectively, while
integrating with other database features for
robust functionality.
Components
1. Documents are published using
connectors or Kafka APIs.
2. Each document is split into chunks for
better granularity and to enable parallel
processing.
3. Embeddings are created for each chunk
using the Bedrock embeddings service.
4. The embeddings/chunks are indexed in a
vector database using sink connectors.
Key Points
● Achieve real-time relevance with the vector
database.
● Utilize a microservice architecture for
scalability.
● Enable each microservice to independently
scale for enhanced performance,
responsiveness, and stability.
● Leverage Bedrock embeddings to enhance
vector quality.
Unstructured Document Indexing

Chatbot use case
Utilizes artiﬁcial intelligence tailored for
genomic data, assisting users in tasks like
data analysis, interpretation, and
exploration within the realm of genomics.
Components
1. Chat interaction serves as the human
interface to the system.
2. An embedding is generated for each
human interaction.
3. Embeddings are utilized to discover similar
documents, aiding the prompt engineering
service.
4. Prompts are crafted from various data
sources (Vector search results, Conversation
history/summary, …)
5. The prompt is forwarded to Bedrock LLM.
6. Conversations can be summarized for
utilization by the prompt engineering
service.
Key Points
● Utilize a microservice architecture for
scalability.
● Enable each microservice to independently
scale for enhanced performance,
responsiveness, and stability.
● Leverage Bedrock LLM

All together

Q&A

Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with Amazon Bedrock, Rockset and Confluent Cloud

More Related Content

Similar to Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with Amazon Bedrock, Rockset and Confluent Cloud

More from confluent

Recently uploaded

Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with Amazon Bedrock, Rockset and Confluent Cloud