Amazon Q and Bedrock,
fully managed vs custom
Alessandra Bilardi
Data & Automation Specialist @ Corley Cloud
>>AI CONF 2025
AI Conf 2025
Amazon Q and Bedrock,
fully managed vs custom
AI Conf 2025
Oltre 500 progetti su AWS
Corley Cloud è una realtà certificata
con innumerevoli riconoscimenti e
un portfolio di centinaia di progetti
AWS sviluppati in diversi ambiti:
cloud native, migrazione, machine
learning & AI, serverless, IoT,
sicurezza e cloudOps.
Advanced Partner AWS
AI Conf 2025
Data & Automation Specialist @ Corley Cloud
alessandra.bilardi@corley.it
corley.it
Alessandra Bilardi
AI Conf 2025
Alessandra Bilardi
Data & Automation Specialist @ Corley Cloud
alessandra.bilardi@corley.it
corley.it
AI Conf 2025
Alessandra Bilardi
Data & Automation Specialist @ Corley Cloud
alessandra.bilardi@corley.it
corley.it
AI Conf 2025
SUMMARY
Machine learning steps and actors
Generative AI with Amazon Q
Generative AI with Amazon Bedrock
Chat bot
AI Conf 2025
Machine learning steps and actors
AI Conf 2025
What are the steps of ML ?
➔ The data may arrive ready for learning,
but often some processing is needed
➔ Model training could be delegated to an
AI system, except for custom steps
➔ Evaluation is a prediction for which we
know the expected values, for which we
can calculate metrics
➔ The prediction works on new data
processed with point 1 with the best
model saved in point 3
Preparation
Training
& Tuning
Testing
& Evaluation
Prediction
(inference)
AI Conf 2025
ML system
AI Conf 2025
ML system
AI Conf 2025
ML system
AI Conf 2025
ML system
AI Conf 2025
ML
AI Conf 2025
ML
AI Conf 2025
Are there other steps or actors in ML ?
➔ Embeddings are objects that contain
information about text, images, videos,
audio or code
➔ The prompt is the text that contains the
behavior that the model must have, the
instructions to follow to respond to the
request posed.
➔ Augmented Generation (AG) techniques
allow us to exploit generalist models by
providing them with instructions (the
prompt), context (an extract of the
embeddings) and a request to obtain a
specific response.
Question
AG
Answer
Embeddings
& Prompt
LM
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Generative AI with Amazon Q
AI Conf 2025
Amazon Q
AI Conf 2025
Amazon Q
AI Conf 2025
Amazon Q
AI Conf 2025
Amazon Q
Business
AI Conf 2025
Amazon Q
Business
1. Embedding (as needed)
AI Conf 2025
Amazon Q
Business
1. Embedding (as needed)
AI Conf 2025
Amazon Q
Business
1. Embedding (as needed)
AI Conf 2025
Amazon Q
Business
1. Embedding (as needed)
2. Request (question)
3. RAG
4. Response (answer)
AI Conf 2025
Amazon Q
Business
1. Embedding (as needed)
2. Request (question)
3. RAG
4. Response (answer)
AI Conf 2025
Amazon Q
Developer
1. Request
AI Conf 2025
Amazon Q
Developer
1. Request
2. Embedding (as needed)
AI Conf 2025
Amazon Q
Developer
1. Request
2. Embedding (as needed)
3. Context
AI Conf 2025
Generative AI with Amazon Bedrock
AI Conf 2025
Amazon Bedrock
1
2 Knowledge base (embedding)
3 Agent
Models
4 Prompt
AI Conf 2025
Models
AI Conf 2025
Models
AI Conf 2025
Models
AI Conf 2025
Models
AI Conf 2025
Models
AI Conf 2025
Models
AI Conf 2025
Models
AI Conf 2025
Knowledge base
AI Conf 2025
Knowledge base
AI Conf 2025
Knowledge base
AI Conf 2025
Knowledge base
AI Conf 2025
Knowledge base
AI Conf 2025
Agent
AI Conf 2025
Agent
AI Conf 2025
Agent
AI Conf 2025
Agent
AI Conf 2025
Agent
AI Conf 2025
Agent
AI Conf 2025
Agent
AI Conf 2025
Agent
AI Conf 2025
Prompt
AI Conf 2025
Prompt
AI Conf 2025
Prompt
AI Conf 2025
Prompt
AI Conf 2025
Prompt
AI Conf 2025
Flows
AI Conf 2025
Flows
AI Conf 2025
Flows
AI Conf 2025
Amazon Bedrock
1
2 Model evaluation
3 Playground
Data automation
4 Prompt routers / caching
AI Conf 2025
Chat bot
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - preparation steps
AI Conf 2025
Use case - Chat bot - production version 1.0
AI Conf 2025
Use case - Chat bot - production version 1.0
AI Conf 2025
Use case - Chat bot - production version 1.0
AI Conf 2025
Use case - Chat bot - production version 1.0
AI Conf 2025
Use case - Chat bot - production version 1.0
AI Conf 2025
Use case - Chat bot - production version 1.1
AI Conf 2025
Use case - Chat bot - production version 2.0
AI Conf 2025
Use case - Chat bot - production version 2.0
AI Conf 2025
Use case - Chat bot - production version 2.0
AI Conf 2025
Use case - Chat bot - production version 2.0
AI Conf 2025
Use case - Chat bot - production version 2.0
AI Conf 2025
Which infrastructure for the
ChatBot ?
AI Conf 2025
Solutions
1. like ChatGPT, max 30s
AI Conf 2025
Solutions
1. like ChatGPT, max 30s
2. extend 30s timeout
AI Conf 2025
Solutions
1. like ChatGPT, max 30s
2. extend 30s timeout
Goals
● ↓ the response time
● ↓ the inference costs
AI Conf 2025
Inference
➔ AWS Lambda
➔ Amazon Fargate
➔ Amazon EC2
AI Conf 2025
Inference
➔ AWS Lambda
➔ Amazon Fargate
➔ Amazon EC2
➔ Amazon SageMaker
AI Conf 2025
Inference
➔ AWS Lambda
➔ Amazon Fargate
➔ Amazon EC2
➔ Amazon SageMaker
➔ Amazon Bedrock / Q
AI Conf 2025
Inference
Needs
➔ GPU
➔ AWS Lambda
➔ Amazon Fargate
➔ Amazon EC2
➔ Amazon SageMaker
➔ Amazon Bedrock / Q
AI Conf 2025
Inference
Needs
➔ GPU
➔ Model loading
✖
✖
➔ AWS Lambda
➔ Amazon Fargate
➔ Amazon EC2
➔ Amazon SageMaker
➔ Amazon Bedrock / Q
✖
AI Conf 2025
Inference
Needs
➔ GPU
➔ Model loading
✖
✖
➔ AWS Lambda
➔ Amazon Fargate
➔ Amazon EC2
➔ Amazon SageMaker
➔ Amazon Bedrock / Q
✖
AI Conf 2025
Comparison of solutions
AI Conf 2025
AWS Services Comparison for a Chatbot
Services Difficulty Embeddings Training $ Inference $
Amazon Q
Business
$0.264 / hour
/ 200MB
$20 / user /
mo
Bedrock
fine tuning $2 / 1000
queries
$0.0079 / 1k
tokens
$30 / hour
Bedrock
on demand
$0.00072 for input / 1k tokens
and for output / 1k tokens
Amazon
SageMaker
$2 / 1000
queries
$0.921 / hour $0.921 / hour
AI Conf 2025
AWS Services Comparison for a Chatbot
Services Difficulty Embeddings Training $ Inference $
Amazon Q
Business
$0.264 / hour
/ 200MB
$20 / user /
mo
Bedrock
fine tuning $2 / 1000
queries
$0.0079 / 1k
tokens
$30 / hour
Bedrock
on demand
$0.00072 for input / 1k tokens
and for output / 1k tokens
Amazon
SageMaker
$2 / 1000
queries
$0.921 / hour $0.921 / hour
AI Conf 2025
AWS Services Comparison for a Chatbot
Services Difficulty Embeddings Training $ Inference $
Amazon Q
Business
$0.264 / hour
/ 200MB
$20 / user /
mo
Bedrock
fine tuning $2 / 1000
queries
$0.0079 / 1k
tokens
$30 / hour
Bedrock
on demand
$0.00072 for input / 1k tokens
and for output / 1k tokens
Amazon
SageMaker
$2 / 1000
queries
$0.921 / hour $0.921 / hour
AI Conf 2025
AWS Services Comparison for a Chatbot
Services Difficulty Embeddings Training $ Inference $
Amazon Q
Business
$0.264 / hour
/ 200MB
$20 / user /
mo
Bedrock
fine tuning $2 / 1000
queries
$0.0079 / 1k
tokens
$30 / hour
Bedrock
on demand
$0.00072 for input / 1k tokens
and for output / 1k tokens
Amazon
SageMaker
$2 / 1000
queries
$0.921 / hour $0.921 / hour
AI Conf 2025
Services Embeddings $ Training $ Inference $
Amazon Q Business 190 20 (user / mo)
Bedrock fine tuning
2
1.5089 22320
Bedrock on demand 1.0714 (per 1k token)
Amazon SageMaker
2 0.0154
3.68 (serverless)
685.22 (provisioned)
AWS Services Comparison for a Chatbot
Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only)
Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
AI Conf 2025
Services Embeddings $ Training $ Inference $
Amazon Q Business 190 20 (user / mo)
Bedrock fine tuning
2
1.5089 22320
Bedrock on demand 1.0714 (per 1k token)
Amazon SageMaker
2 0.0154
3.68 (serverless)
685.22 (provisioned)
AWS Services Comparison for a Chatbot
Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only)
Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
AI Conf 2025
Services Embeddings $ Training $ Inference $
Amazon Q Business 190 20 (user / mo)
Bedrock fine tuning
2
1.5089 22320
Bedrock on demand 1.0714 (per 1k token)
Amazon SageMaker
2 0.0154
3.68 (serverless)
685.22 (provisioned)
AWS Services Comparison for a Chatbot
Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only)
Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
AI Conf 2025
Services Embeddings $ Training $ Inference $
Amazon Q Business 190 20 (user / mo)
Bedrock fine tuning
2
1.5089 22320
Bedrock on demand 1.0714 (per 1k token)
Amazon SageMaker
2 0.0154
3.68 (serverless)
685.22 (provisioned)
AWS Services Comparison for a Chatbot
Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only)
Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
AI Conf 2025
Services Embeddings $ Training $ Inference $
Amazon Q Business 190
Bedrock fine tuning
2
1.5089 22320
Bedrock on demand 1.0714 (per 1k token)
Amazon SageMaker
2 0.0154
3.68 (serverless)
685.22 (provisioned)
AWS Services Comparison for a Chatbot
Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only)
Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
AI Conf 2025
Thanks
for listening!
Thank you!
>>AI CONF 2025
👉 slides & videos: https://www.improove.tech/videos

Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25

  • 1.
    Amazon Q andBedrock, fully managed vs custom Alessandra Bilardi Data & Automation Specialist @ Corley Cloud >>AI CONF 2025
  • 2.
    AI Conf 2025 AmazonQ and Bedrock, fully managed vs custom
  • 3.
    AI Conf 2025 Oltre500 progetti su AWS Corley Cloud è una realtà certificata con innumerevoli riconoscimenti e un portfolio di centinaia di progetti AWS sviluppati in diversi ambiti: cloud native, migrazione, machine learning & AI, serverless, IoT, sicurezza e cloudOps. Advanced Partner AWS
  • 4.
    AI Conf 2025 Data& Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it Alessandra Bilardi
  • 5.
    AI Conf 2025 AlessandraBilardi Data & Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it
  • 6.
    AI Conf 2025 AlessandraBilardi Data & Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it
  • 7.
    AI Conf 2025 SUMMARY Machinelearning steps and actors Generative AI with Amazon Q Generative AI with Amazon Bedrock Chat bot
  • 8.
    AI Conf 2025 Machinelearning steps and actors
  • 9.
    AI Conf 2025 Whatare the steps of ML ? ➔ The data may arrive ready for learning, but often some processing is needed ➔ Model training could be delegated to an AI system, except for custom steps ➔ Evaluation is a prediction for which we know the expected values, for which we can calculate metrics ➔ The prediction works on new data processed with point 1 with the best model saved in point 3 Preparation Training & Tuning Testing & Evaluation Prediction (inference)
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
    AI Conf 2025 Arethere other steps or actors in ML ? ➔ Embeddings are objects that contain information about text, images, videos, audio or code ➔ The prompt is the text that contains the behavior that the model must have, the instructions to follow to respond to the request posed. ➔ Augmented Generation (AG) techniques allow us to exploit generalist models by providing them with instructions (the prompt), context (an extract of the embeddings) and a request to obtain a specific response. Question AG Answer Embeddings & Prompt LM
  • 17.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 18.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 19.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 20.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 21.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 22.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 23.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 24.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 25.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 26.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 27.
    AI Conf 2025 GenerativeAI with Amazon Q
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
    AI Conf 2025 AmazonQ Business 1. Embedding (as needed)
  • 33.
    AI Conf 2025 AmazonQ Business 1. Embedding (as needed)
  • 34.
    AI Conf 2025 AmazonQ Business 1. Embedding (as needed)
  • 35.
    AI Conf 2025 AmazonQ Business 1. Embedding (as needed) 2. Request (question) 3. RAG 4. Response (answer)
  • 36.
    AI Conf 2025 AmazonQ Business 1. Embedding (as needed) 2. Request (question) 3. RAG 4. Response (answer)
  • 37.
    AI Conf 2025 AmazonQ Developer 1. Request
  • 38.
    AI Conf 2025 AmazonQ Developer 1. Request 2. Embedding (as needed)
  • 39.
    AI Conf 2025 AmazonQ Developer 1. Request 2. Embedding (as needed) 3. Context
  • 40.
    AI Conf 2025 GenerativeAI with Amazon Bedrock
  • 41.
    AI Conf 2025 AmazonBedrock 1 2 Knowledge base (embedding) 3 Agent Models 4 Prompt
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54.
  • 55.
  • 56.
  • 57.
  • 58.
  • 59.
  • 60.
  • 61.
  • 62.
  • 63.
  • 64.
  • 65.
  • 66.
  • 67.
  • 68.
  • 69.
  • 70.
    AI Conf 2025 AmazonBedrock 1 2 Model evaluation 3 Playground Data automation 4 Prompt routers / caching
  • 71.
  • 72.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 73.
    AI Conf 2025 Usecase - Chat bot - preparation steps
  • 74.
    AI Conf 2025 Usecase - Chat bot - production version 1.0
  • 75.
    AI Conf 2025 Usecase - Chat bot - production version 1.0
  • 76.
    AI Conf 2025 Usecase - Chat bot - production version 1.0
  • 77.
    AI Conf 2025 Usecase - Chat bot - production version 1.0
  • 78.
    AI Conf 2025 Usecase - Chat bot - production version 1.0
  • 79.
    AI Conf 2025 Usecase - Chat bot - production version 1.1
  • 80.
    AI Conf 2025 Usecase - Chat bot - production version 2.0
  • 81.
    AI Conf 2025 Usecase - Chat bot - production version 2.0
  • 82.
    AI Conf 2025 Usecase - Chat bot - production version 2.0
  • 83.
    AI Conf 2025 Usecase - Chat bot - production version 2.0
  • 84.
    AI Conf 2025 Usecase - Chat bot - production version 2.0
  • 85.
    AI Conf 2025 Whichinfrastructure for the ChatBot ?
  • 86.
    AI Conf 2025 Solutions 1.like ChatGPT, max 30s
  • 87.
    AI Conf 2025 Solutions 1.like ChatGPT, max 30s 2. extend 30s timeout
  • 88.
    AI Conf 2025 Solutions 1.like ChatGPT, max 30s 2. extend 30s timeout Goals ● ↓ the response time ● ↓ the inference costs
  • 89.
    AI Conf 2025 Inference ➔AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2
  • 90.
    AI Conf 2025 Inference ➔AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker
  • 91.
    AI Conf 2025 Inference ➔AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q
  • 92.
    AI Conf 2025 Inference Needs ➔GPU ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q
  • 93.
    AI Conf 2025 Inference Needs ➔GPU ➔ Model loading ✖ ✖ ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q ✖
  • 94.
    AI Conf 2025 Inference Needs ➔GPU ➔ Model loading ✖ ✖ ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q ✖
  • 95.
  • 96.
    AI Conf 2025 AWSServices Comparison for a Chatbot Services Difficulty Embeddings Training $ Inference $ Amazon Q Business $0.264 / hour / 200MB $20 / user / mo Bedrock fine tuning $2 / 1000 queries $0.0079 / 1k tokens $30 / hour Bedrock on demand $0.00072 for input / 1k tokens and for output / 1k tokens Amazon SageMaker $2 / 1000 queries $0.921 / hour $0.921 / hour
  • 97.
    AI Conf 2025 AWSServices Comparison for a Chatbot Services Difficulty Embeddings Training $ Inference $ Amazon Q Business $0.264 / hour / 200MB $20 / user / mo Bedrock fine tuning $2 / 1000 queries $0.0079 / 1k tokens $30 / hour Bedrock on demand $0.00072 for input / 1k tokens and for output / 1k tokens Amazon SageMaker $2 / 1000 queries $0.921 / hour $0.921 / hour
  • 98.
    AI Conf 2025 AWSServices Comparison for a Chatbot Services Difficulty Embeddings Training $ Inference $ Amazon Q Business $0.264 / hour / 200MB $20 / user / mo Bedrock fine tuning $2 / 1000 queries $0.0079 / 1k tokens $30 / hour Bedrock on demand $0.00072 for input / 1k tokens and for output / 1k tokens Amazon SageMaker $2 / 1000 queries $0.921 / hour $0.921 / hour
  • 99.
    AI Conf 2025 AWSServices Comparison for a Chatbot Services Difficulty Embeddings Training $ Inference $ Amazon Q Business $0.264 / hour / 200MB $20 / user / mo Bedrock fine tuning $2 / 1000 queries $0.0079 / 1k tokens $30 / hour Bedrock on demand $0.00072 for input / 1k tokens and for output / 1k tokens Amazon SageMaker $2 / 1000 queries $0.921 / hour $0.921 / hour
  • 100.
    AI Conf 2025 ServicesEmbeddings $ Training $ Inference $ Amazon Q Business 190 20 (user / mo) Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
  • 101.
    AI Conf 2025 ServicesEmbeddings $ Training $ Inference $ Amazon Q Business 190 20 (user / mo) Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
  • 102.
    AI Conf 2025 ServicesEmbeddings $ Training $ Inference $ Amazon Q Business 190 20 (user / mo) Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
  • 103.
    AI Conf 2025 ServicesEmbeddings $ Training $ Inference $ Amazon Q Business 190 20 (user / mo) Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
  • 104.
    AI Conf 2025 ServicesEmbeddings $ Training $ Inference $ Amazon Q Business 190 Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month
  • 105.
  • 106.
    Thank you! >>AI CONF2025 👉 slides & videos: https://www.improove.tech/videos