Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25

1.
Amazon Q andBedrock, fully managed vs custom Alessandra Bilardi Data & Automation Specialist @ Corley Cloud >>AI CONF 2025

2.
AI Conf 2025 AmazonQ and Bedrock, fully managed vs custom

3.
AI Conf 2025 Oltre500 progetti su AWS Corley Cloud è una realtà certiﬁcata con innumerevoli riconoscimenti e un portfolio di centinaia di progetti AWS sviluppati in diversi ambiti: cloud native, migrazione, machine learning & AI, serverless, IoT, sicurezza e cloudOps. Advanced Partner AWS

4.
AI Conf 2025 Data& Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it Alessandra Bilardi

5.
AI Conf 2025 AlessandraBilardi Data & Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it

6.
AI Conf 2025 AlessandraBilardi Data & Automation Specialist @ Corley Cloud alessandra.bilardi@corley.it corley.it

7.
AI Conf 2025 SUMMARY Machinelearning steps and actors Generative AI with Amazon Q Generative AI with Amazon Bedrock Chat bot

8.
AI Conf 2025 Machinelearning steps and actors

9.
AI Conf 2025 Whatare the steps of ML ? ➔ The data may arrive ready for learning, but often some processing is needed ➔ Model training could be delegated to an AI system, except for custom steps ➔ Evaluation is a prediction for which we know the expected values, for which we can calculate metrics ➔ The prediction works on new data processed with point 1 with the best model saved in point 3 Preparation Training & Tuning Testing & Evaluation Prediction (inference)

10.
AI Conf 2025 MLsystem

11.

12.

13.

14.
AI Conf 2025 ML

15.
AI Conf 2025 ML

16.
AI Conf 2025 Arethere other steps or actors in ML ? ➔ Embeddings are objects that contain information about text, images, videos, audio or code ➔ The prompt is the text that contains the behavior that the model must have, the instructions to follow to respond to the request posed. ➔ Augmented Generation (AG) techniques allow us to exploit generalist models by providing them with instructions (the prompt), context (an extract of the embeddings) and a request to obtain a speciﬁc response. Question AG Answer Embeddings & Prompt LM

17.
AI Conf 2025 Usecase - Chat bot - preparation steps

18.

19.

20.

21.

22.

23.

24.

25.

26.

27.
AI Conf 2025 GenerativeAI with Amazon Q

28.
AI Conf 2025 AmazonQ

29.

30.

31.
AI Conf 2025 AmazonQ Business

32.
AI Conf 2025 AmazonQ Business 1. Embedding (as needed)

33.

34.

35.
AI Conf 2025 AmazonQ Business 1. Embedding (as needed) 2. Request (question) 3. RAG 4. Response (answer)

36.
AI Conf 2025 AmazonQ Business 1. Embedding (as needed) 2. Request (question) 3. RAG 4. Response (answer)

37.
AI Conf 2025 AmazonQ Developer 1. Request

38.
AI Conf 2025 AmazonQ Developer 1. Request 2. Embedding (as needed)

39.
AI Conf 2025 AmazonQ Developer 1. Request 2. Embedding (as needed) 3. Context

40.
AI Conf 2025 GenerativeAI with Amazon Bedrock

41.
AI Conf 2025 AmazonBedrock 1 2 Knowledge base (embedding) 3 Agent Models 4 Prompt

42.
AI Conf 2025 Models

43.
AI Conf 2025 Models

44.
AI Conf 2025 Models

45.
AI Conf 2025 Models

46.
AI Conf 2025 Models

47.
AI Conf 2025 Models

48.
AI Conf 2025 Models

49.
AI Conf 2025 Knowledgebase

50.

51.

52.

53.

54.
AI Conf 2025 Agent

55.
AI Conf 2025 Agent

56.
AI Conf 2025 Agent

57.
AI Conf 2025 Agent

58.
AI Conf 2025 Agent

59.
AI Conf 2025 Agent

60.
AI Conf 2025 Agent

61.
AI Conf 2025 Agent

62.
AI Conf 2025 Prompt

63.
AI Conf 2025 Prompt

64.
AI Conf 2025 Prompt

65.
AI Conf 2025 Prompt

66.
AI Conf 2025 Prompt

67.
AI Conf 2025 Flows

68.
AI Conf 2025 Flows

69.
AI Conf 2025 Flows

70.
AI Conf 2025 AmazonBedrock 1 2 Model evaluation 3 Playground Data automation 4 Prompt routers / caching

71.
AI Conf 2025 Chatbot

72.

73.

74.
AI Conf 2025 Usecase - Chat bot - production version 1.0

75.

76.

77.

78.

79.

80.

81.

82.

83.

84.

85.
AI Conf 2025 Whichinfrastructure for the ChatBot ?

86.
AI Conf 2025 Solutions 1.like ChatGPT, max 30s

87.
AI Conf 2025 Solutions 1.like ChatGPT, max 30s 2. extend 30s timeout

88.
AI Conf 2025 Solutions 1.like ChatGPT, max 30s 2. extend 30s timeout Goals ● ↓ the response time ● ↓ the inference costs

89.
AI Conf 2025 Inference ➔AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2

90.
AI Conf 2025 Inference ➔AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker

91.
AI Conf 2025 Inference ➔AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q

92.
AI Conf 2025 Inference Needs ➔GPU ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q

93.
AI Conf 2025 Inference Needs ➔GPU ➔ Model loading ✖ ✖ ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q ✖

94.
AI Conf 2025 Inference Needs ➔GPU ➔ Model loading ✖ ✖ ➔ AWS Lambda ➔ Amazon Fargate ➔ Amazon EC2 ➔ Amazon SageMaker ➔ Amazon Bedrock / Q ✖

95.
AI Conf 2025 Comparisonof solutions

96.
AI Conf 2025 AWSServices Comparison for a Chatbot Services Difficulty Embeddings Training $ Inference $ Amazon Q Business $0.264 / hour / 200MB $20 / user / mo Bedrock fine tuning $2 / 1000 queries $0.0079 / 1k tokens $30 / hour Bedrock on demand $0.00072 for input / 1k tokens and for output / 1k tokens Amazon SageMaker $2 / 1000 queries $0.921 / hour $0.921 / hour

97.

98.

99.

100.
AI Conf 2025 ServicesEmbeddings $ Training $ Inference $ Amazon Q Business 190 20 (user / mo) Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month

101.

102.

103.

104.
AI Conf 2025 ServicesEmbeddings $ Training $ Inference $ Amazon Q Business 190 Bedrock fine tuning 2 1.5089 22320 Bedrock on demand 1.0714 (per 1k token) Amazon SageMaker 2 0.0154 3.68 (serverless) 685.22 (provisioned) AWS Services Comparison for a Chatbot Excluded from costs: ML storage, data processing and provisioned concurrency (serverless only) Example: 1 training of 191011 tokens + 1 request of 20s for every hour, every day for a month

105.
AI Conf 2025 Thanks forlistening!

106.
Thank you! >>AI CONF2025 👉 slides & videos: https://www.improove.tech/videos

Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25

More Related Content

Similar to Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25

More from Alessandra Bilardi

Recently uploaded

Amazon Q and Amazon Bedrock, fully managed vs. custom - 2025-06-25