GENERATIVE AI WITH AMAZON BEDROCK
Guido Nebiolo, AWS Ambassador @ Reply
23 November 2023
(WITH REAL EXAMPLES INSIDE 😉)
WHOAMI
• 20 years as a developer
• 10 years as a paid developer
• 8 years as a paid developer on AWS (mainly)
• 3 years paid to teach developing on AWS (and other topics)
aws sts get-caller-identity
THE AI REVOLUTION
WHAT IS GENERATIVE AI?
(Diagram: AI ⊃ ML ⊃ DL ⊃ Gen AI)
Generative AI generates new content for a variety of tasks by leveraging pretrained foundation models that can be customized with small amounts of data.
HOW DO LLMS WORK?
PROBABILITY

An LLM builds text one token at a time: given the text so far, it assigns a probability to each candidate next token and picks one.

I am truly excited → to (61.91%) | about (21.72%) | for (4.87%) | and (2.93%) | that (1.98%)
I am truly excited to → be (25.43%) | announce (9.46%) | share (8.18%) | have (7.49%) | see (5.45%)
I am truly excited to be → a (16.57%) | joining (10.49%) | part (9.70%) | able (7.72%) | working (5.76%)
I am truly excited to be a → part (82.15%) | member (4.32%) | new (1.03%) | guest (0.88%) | partner (0.34%)
I am truly excited to be a part → of (99.66%) | nn (0.05%) | the (0.04%) | this (0.02%) | o (0.02%)
I am truly excited to be a part of → the (46.85%) | this (20.92%) | such (4.73%) | a (4.05%) | an (1.52%)
I am truly excited to be a part of the → team (5.32%) | amazing (0.83%) | new (0.69%) | community (0.61%)

Result: I am truly excited to be a part of the team
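The token-by-token progression above can be sketched in code: at each step the model exposes a probability distribution over candidate tokens, and (greedily, in this sketch) the most likely one is appended. The probability tables are toy stand-ins, not real model output.

```python
# Toy sketch of autoregressive next-token selection.
# The probability tables are illustrative, not from a real model.

def next_token(context, prob_table):
    """Pick the most likely continuation for the current context (greedy decoding)."""
    candidates = prob_table[context]  # {token: probability}
    return max(candidates, key=candidates.get)

prob_table = {
    "I am truly excited": {"to": 0.6191, "about": 0.2172, "for": 0.0487},
    "I am truly excited to": {"be": 0.2543, "announce": 0.0946, "share": 0.0818},
    "I am truly excited to be": {"a": 0.1657, "joining": 0.1049, "part": 0.0970},
}

context = "I am truly excited"
for _ in range(3):
    context += " " + next_token(context, prob_table)

print(context)  # → I am truly excited to be a
```

Real decoders rarely pick greedily; the sampling knobs that control this choice (temperature, Top P, Top K) come up later in the talk.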
FMS ON AWS
• Out-of-the-Box: provides a ready-to-use solution with predefined configurations, requiring minimal setup and customization.
• Managed Model-as-a-Service: build GenAI applications on fully managed models, with a choice of FMs.
• Managed ML Dev Tooling: tune or use publicly available or open-source models as-is on managed infrastructure.
• Proprietary Models: build custom models from scratch.
KEY FEATURES OF BEDROCK
• Accelerate development of generative AI applications using FMs through an API, without managing infrastructure.
• Choose FMs from Amazon, AI21 Labs, Anthropic, Cohere, and Stability AI to find the right FM for your use case.
• Privately customize FMs using your organization’s data.

• JURASSIC (AI21 Labs): multilingual LLMs for text generation in Spanish, French, German, Portuguese, Italian, and Dutch.
• CLAUDE (Anthropic): LLM for thoughtful dialogue, content creation, complex reasoning, creativity, and coding, based on Constitutional AI and harmlessness training.
• LLAMA (Meta): powerful and versatile language models for a wide range of natural language processing tasks, optimized for dialogue use cases.
• SDXL (Stability AI): generation of unique, realistic, high-quality images, art, logos, and designs.
• TITAN (Amazon): text summarization, generation, classification, open-ended Q&A, information extraction, embeddings, and search.
• COMMAND (Cohere): text generation model for business applications and an embeddings model for search, clustering, or classification in 100+ languages.
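A minimal sketch of the API access mentioned above: `invoke_model` on the boto3 `bedrock-runtime` client is the real call, but the request body is provider-specific. The Claude text-completions body and the `anthropic.claude-v2` model ID below reflect the state at the time of this talk; check the current model catalog and provider docs before relying on them.

```python
import json

def build_claude_body(prompt, max_tokens=300, temperature=0.5):
    """Build the provider-specific request body for Claude on Bedrock.
    This is the 2023-era text-completions format; newer Claude models
    use a messages-based format, so verify against current docs."""
    return json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": temperature,
    })

# Actually invoking the model needs AWS credentials and Bedrock model access:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.invoke_model(
#     modelId="anthropic.claude-v2",
#     body=build_claude_body("Summarize Amazon Bedrock in one sentence."),
# )
# print(json.loads(response["body"].read())["completion"])
```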
EMERGING GENERATIVE AI MODEL PATTERNS
Ordered by increasing complexity and time to market:
#1: Contextual prompt engineering (in-context learning using foundation models)
#2: Retrieval augmented generation (RAG) (in-context learning using foundation models)
#3: Model fine-tuning (using foundation models)
#4: Training your own model
PROMPT ENGINEERING
UNDERSTANDING PROMPT ENGINEERING
Summarize the following technical sentence:
Tags: generative ai, security, blogpost
Sentence: Security has been a hot topic since the
birth of Generative AI🔥. From the beginning, AWS
states that security is a shared responsibility
between us and them...
Summary:
UNDERSTANDING PROMPT ENGINEERING
The prompt above breaks down into four components:
• INSTRUCTION: "Summarize the following technical sentence"
• CONTEXT: "Tags: generative ai, security, blogpost"
• INPUT DATA: "Sentence: Security has been a hot topic since the birth of Generative AI🔥. From the beginning, AWS states that…"
• OUTPUT INDICATOR: "Summary:"
UNDERSTANDING PROMPT ENGINEERING
PLEASE
Please summarize the following technical sentence:
Tags: generative ai, security, blogpost
Sentence: Security has been a hot topic since the
birth of Generative AI🔥. From the beginning, AWS
states that security is a shared responsibility
between us and them...
Summary:
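The four components can be assembled programmatically; here is a minimal sketch (the helper name and component ordering are just illustrative):

```python
def build_prompt(instruction, context="", input_data="", output_indicator=""):
    """Assemble a prompt from the four components: instruction, context,
    input data, and output indicator. Empty components are omitted."""
    parts = [instruction + ":", context, input_data, output_indicator]
    return "\n".join(p for p in parts if p)

prompt = build_prompt(
    instruction="Please summarize the following technical sentence",
    context="Tags: generative ai, security, blogpost",
    input_data="Sentence: Security has been a hot topic since the birth of Generative AI...",
    output_indicator="Summary:",
)
print(prompt)
```

Templating the components this way makes it easy to swap in new input data while keeping the instruction and output indicator fixed.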
INFERENCE PARAMETERS

TEMPERATURE: a higher value means more randomness in how the next token is picked.
(Image prompt: "Captures the beauty of a tropical beach on a hot, sunny day. Include palm trees, crystal-clear waters.")

TOP P: the model samples only from the smallest set of tokens whose cumulative probability adds up to the threshold P.
(Image prompt: "Serene winter wonderland, showcasing a snow-covered forest with glistening trees, a frozen lake, and the peaceful, cold atmosphere.")

TOP K: similar to Top P, but instead of a cumulative probability threshold, it limits sampling to a fixed number of the most likely tokens.
(Image prompt: "Cozy mountain cabin surrounded by a snowy, alpine landscape, with smoke rising from the chimney and a sky full of stars on a freezing night.")
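Top K and Top P can be made concrete in a few lines of Python over a toy distribution (the probabilities are illustrative, not real model output):

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens, then renormalize."""
    kept = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches the threshold p (nucleus sampling), then renormalize."""
    kept, cumulative = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {t: q / total for t, q in kept.items()}

probs = {"to": 0.62, "about": 0.22, "for": 0.05, "and": 0.03, "that": 0.02}
print(top_k_filter(probs, 2))    # only "to" and "about" survive
print(top_p_filter(probs, 0.8))  # "to" + "about" reach the 0.8 threshold
```

After filtering, the decoder samples from the renormalized distribution; temperature would be applied to the probabilities before this filtering step.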
PROMPT ENGINEERING TECHNIQUES
ZERO-SHOT LEARNING | ONE-SHOT LEARNING | FEW-SHOT LEARNING
ZERO SHOT DEMO
Prompt Engineering
Even though we didn't provide the model with any example texts alongside their classifications, the LLM already understands "sentiment".
FEW SHOT DEMO
Prompt Engineering
Few-shot prompting can be used as a technique to enable in-context learning, where we provide demonstrations in the prompt to steer the model to better performance.
FEW SHOT DEMO
Prompt Engineering
Sometimes few-shot prompting is not enough to get reliable responses for this type of reasoning problem.
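The few-shot setup from the demo can be reproduced with a small helper that formats labelled demonstrations ahead of the query; the sentiment examples here are illustrative stand-ins for the demo's data.

```python
def few_shot_prompt(examples, query):
    """Build a few-shot classification prompt: labelled demonstrations
    first, then the item the model should classify."""
    lines = [f"Text: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(lines)

examples = [
    ("The demo went flawlessly!", "positive"),
    ("The API kept timing out.", "negative"),
]
print(few_shot_prompt(examples, "Bedrock made this so easy."))
```

Ending the prompt with a bare `Sentiment:` is the output indicator from earlier: it nudges the model to complete with just the label.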
MORE PROMPT ENGINEERING TECHNIQUES
CHAIN-OF-THOUGHT: enables complex reasoning capabilities through intermediate reasoning steps.
GENERATED KNOWLEDGE: generates knowledge to be used as part of the prompt.
… AND MANY OTHERS
RETRIEVAL AUGMENTED GENERATION

WHY RAG?
POINT IN TIME: an FM's knowledge is frozen at the time of model training.
HALLUCINATION: generation of text that is not grounded in accurate or real-world information.
UNDERSTANDING RAG
Retrieval Augmented Generation (RAG) is a machine learning approach that combines elements of both retrieval-based models and generative models to improve the performance of natural language understanding and generation tasks. RAG's internal knowledge can be easily altered or even supplemented on the fly, controlling what RAG knows and doesn't know.
HIGH LEVEL DESIGN
RAG Architecture, at a high level:
1. Document ingestion: Data Sources → Embedder → Vector Storage
2. Document retrieval: Prompt → Embedder → Retrieval Engine → relevant Docs
3. Prompt augmentation: Prompt + Docs → LLM → Output
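The three steps can be sketched end to end. Here a toy bag-of-words embedder stands in for a real embedding model (e.g. Titan Embeddings), purely to make the retrieval step concrete; the documents are made up for the example.

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words 'embedder' standing in for a real embedding model;
    only here to illustrate the retrieval step."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 1. Document ingestion: embed documents into the "vector storage"
documents = [
    "Bedrock offers foundation models through a single API.",
    "Fine tuning creates a custom model from training data.",
    "RAG retrieves documents to ground the model's answers.",
]
store = [(doc, embed(doc)) for doc in documents]

# 2. Document retrieval: embed the prompt, rank stored docs by similarity
prompt = "How does RAG ground answers with documents?"
query_vec = embed(prompt)
best_doc = max(store, key=lambda item: cosine(query_vec, item[1]))[0]

# 3. Prompt augmentation: prepend the retrieved context for the LLM
augmented = f"Context: {best_doc}\n\nQuestion: {prompt}\nAnswer:"
print(augmented)
```

In a real deployment, the embedder would be a managed embeddings model, the store a vector database, and the augmented prompt would be sent to an LLM on Bedrock.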
UNDERSTANDING RAG
Retrieval Augmented Generation
RAG DEMO
Retrieval Augmented Generation
(Demo screenshot: Prompt → Output)
RAG DEMO
Retrieval Augmented Generation
(Demo screenshot: Prompt + Context → Output)
FINE TUNING
INTRODUCTION TO FINE TUNING
FOUNDATION MODEL + TRAINING DATA → CUSTOM MODEL
FINE TUNING WITH BEDROCK
FOUNDATION MODEL + TRAINING DATA → CUSTOM MODEL
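As a sketch of preparing the training data: Bedrock model customization jobs read JSON Lines from S3. The prompt/completion shape below matches the format documented for text models at the time of this talk, but verify it for your chosen base model; the example pairs are made up.

```python
import json

def to_jsonl(pairs):
    """Serialize prompt/completion pairs to the JSON Lines format that
    Bedrock model customization jobs expect for training data
    (one {"prompt": ..., "completion": ...} object per line; formats
    can differ per base model, so check the docs for yours)."""
    return "\n".join(json.dumps({"prompt": p, "completion": c}) for p, c in pairs)

pairs = [
    ("Classify the sentiment: The demo was great.", "positive"),
    ("Classify the sentiment: The API kept failing.", "negative"),
]
jsonl = to_jsonl(pairs)
print(jsonl)

# The JSONL file is uploaded to S3 and referenced when starting the job:
# bedrock = boto3.client("bedrock")
# bedrock.create_model_customization_job(
#     ...,
#     trainingDataConfig={"s3Uri": "s3://my-bucket/train.jsonl"},
# )
```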
TAKE-AWAYS
• To get better results, give as many details as possible to LLMs.
• Use RAG to cut training costs and decrease time to market when delivering a POC or MVP.
• Consider fine-tuning LLMs instead of giving them too many examples to learn from. (How many shots can an LLM handle?)
• Go out and build something: the best learning path is hands-on experience. Be part of the revolution!
Q&A
THANK YOU!
