Build with AI on Google Cloud Session #2

Build with AI
on Google Cloud
Session #2 GenAI Deep Dive
2/5/2025
Seattle | Surrey | Vancouver | Burnaby

GDG Seattle
Margaret Maynard-Reid
Yenchi Lin
Clive Boulton
Vishal Pallerla
I/O Extended 2019
2024 Build with AI
DevFest Seattle 2018
WTM Lightning Talks 2018
Cloud Study Jam 2018
Follow GDG Seattle on LinkedIn

GDG Surrey
Follow GDG Surrey on LinkedIn

GDG Vancouver
Follow GDG Vancouver on LinkedIn
Join our GDG Vancouver Community
Volunteer Interest Form

GDG Burnaby
GDG Burnaby Bevy | LinkedIn

Agenda
● Study series overview
● Talk 1: Imagen 3
● Talk 2: AI Foundations
● Q & A

Study series overview

Topics for this session
● Online study series overview
● Intro to GenAI on Google Cloud
● GenAI beginner path
● …

Study series overview
Follow 5 generative AI paths on Google Cloud Skills Boost:
1. 1/22/25 - Beginner: Intro to GenAI (link)
2. 2/5/25 - Generate Smarter GenAI Outputs (link)
3. 2/19/25 - Build & Modernize Apps with GenAI (link)
4. 3/5/25 - Integrate GenAI into Your DataFlow (link)
5. 3/19/25 - Deploy & Manage GenAI Models (link)
Topics are not limited to the above.
Each session: 2 short talks (by Googlers or experts) + Q&A section.

What is a learning path?
A learning path has multiple courses.
Each course has videos, recommended reading, a quiz & hands-on labs.
You will have at least two weeks to work through the materials.
It’s OK if you don’t finish, and feel free to study ahead.

Access to Cloud Skills Boost
● Sign up here: https://www.cloudskillsboost.google/
● By RSVPing, you get free access for a few months
● Videos are accessible by default, while each lab requires a credit
● You can work on each GenAI path before or after its session
Note: Make sure to sign up on Google Cloud Skills Boost with the same email that you used for the event RSVP.

Imagen 3: Beyond Image Generation
Margaret Maynard-Reid, AI/ML GDE

About me
AI/ML GDE (Google Developer Expert)
3D artist
Fashion Designer
Instructor of MSIS, UW Foster
Ex MS Design Studio, MSR, MS Bing
margaretmz.art

What is Generative AI?
A type of AI that creates new content with generative models:
Text
Image
Video
Audio
Generative AI
Text
Image
Video
Audio

Vision Generative Models
● 2014 Generative Adversarial Networks (GANs)
● 2016 Autoregressive Models
● 2019 Variational autoencoders (VAEs)
● Flow-based models
● 2020 Diffusion models
● 2022 Diffusion Transformer
Source: Lilian Weng blog (link)

Diffusion Models
1. Gradually add Gaussian noise to training data.
2. Learn how to reverse the process to generate images from noise.
Source: Nvidia developer blog (link)
Forward image diffusion
Generative reverse denoise
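
The forward (noising) half of the two steps above can be sketched in a few lines. This is a minimal illustration, assuming the commonly used closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise with a linear beta schedule. The schedule values, the 64-value toy "image", and the function names are illustrative assumptions, not something shown in the talk. The reverse step needs a trained denoising network and is not shown.

```python
import math
import random

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed values)."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t given x_0: shrink the signal and mix in Gaussian noise."""
    keep = math.sqrt(alpha_bar[t])
    spread = math.sqrt(1.0 - alpha_bar[t])
    return [keep * pixel + spread * rng.gauss(0.0, 1.0) for pixel in x0]

alpha_bar = make_alpha_bar()
rng = random.Random(0)
x0 = [rng.uniform(-1.0, 1.0) for _ in range(64)]  # toy stand-in for an 8x8 image

# The later the step, the less of the original image survives.
for t in (0, 250, 999):
    xt = forward_diffuse(x0, t, alpha_bar, rng)
    print(f"t={t:4d}  alpha_bar={alpha_bar[t]:.4f}  first pixel: {x0[0]:+.3f} -> {xt[0]:+.3f}")
```

At the last step alpha_bar is close to zero, so x_t is almost pure Gaussian noise, which is the starting point for generation.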

CLIP: Contrastive Language-Image Pre-training
CLIP is a bridge between NLP and computer vision, connecting text and images.
It has a text encoder and an image encoder, trained on 400 million image-text pairs.
● DALL·E, DALL·E 2
● Stable Diffusion
● Imagen, Imagen 2, Imagen 3
Paper: Learning Transferable Visual Models From Natural Language Supervision

Diffusion Transformer
Paper: Scalable Diffusion Models with Transformers
SoTA models using diffusion transformer:
● PixArt-α
● Sora
● Stable Diffusion 3

Timeline: generative AI in vision
Source: Sora paper
Imagen 3
Veo/VideoFX

What is Imagen 3?
Google’s state-of-the-art text-to-image model
● Generate images
● Edit images: inpainting, outpainting, background
● Customize with references

Imagen 3
Imagen 3 claims the top spot among text-to-image models on the LMSYS arena

How to access Imagen 3?
Google Labs ImageFX
https://labs.google/fx/tools/image-fx
Google Gemini App
https://gemini.google.com/app/
Google Cloud Vertex AI
https://console.cloud.google.com/vertex-ai/studio/vision?
Google Colab

Imagen 3 - image generation
Generate a floral design in watercolor style with Imagen 3
Integrate the print into my 3D fashion design in Clo3D

Imagen 3 - use mask to change images
Original
Removed necklace
Changed earring

Imagen 3 - change background
Original image
“Change the background to a sandy beach by the ocean with blue sky”
“Change the background to a botanical garden”
“Change the background to fashion runway”

Imagen 3 - reference image (product)
Max number of reference images: 4
Prompt = “woman drinking out of a teacup[1][2] wearing a green sweater”

Imagen 3 - reference image (person)
Original image
“A purple floral dress”
“A green dress”
“A blue dress”

Veo 2 / VideoFX
Text to Video:
● Veo 2
Text to Image to Video:
● Imagen 3 and Veo 2
Google blog post: State-of-the-art video and image generation with Veo 2 and Imagen 3

Thank you!
Connect with me to learn more about AI, art & design!
@margaretmz

AI Foundations: From Embeddings to RAG
Annie Wang, Google

About Me
Software Engineer & Career Coach
@Google

You turn on your car and…
You’re greeted with a bunch of scary lights on the dashboard!

Question: “What does this warning light mean on my car?”
LLM (model) to generate response
Answer: “Actually… I'm not sure…”

query... → model + vector DB → Answer
With the latest external knowledge, fewer hallucinations 🤓

Input → Text Embedding Model → Embeddings
[0.1, 0.002, 0.56, 0.98…]

Bug in the garden: Insect, Beetle, Caterpillar
Bug in the code: JSONDecodeError: Expecting ',' delimiter: line 1 column 8 (char 12)
Distance separates the two meanings of “bug” in the embedding space.
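
To make "distance" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The 3-dimensional vectors are hand-made for illustration and are not the output of any real embedding model; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors (an assumption for this example, NOT real model output).
toy_embeddings = {
    "Insect": [0.9, 0.1, 0.0],
    "Beetle": [0.8, 0.2, 0.1],
    "Caterpillar": [0.85, 0.15, 0.05],
    "JSONDecodeError": [0.05, 0.1, 0.95],
}

query = toy_embeddings["Insect"]
for name, vector in toy_embeddings.items():
    score = cosine_similarity(query, vector)
    print(f"{name:16s} similarity to 'Insect' = {score:.3f}")
```

The garden "bugs" score close to 1.0 against each other, while the code "bug" scores much lower, which mirrors the separation shown on the slide.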

Input → Multimodal Embedding Model → Embeddings
[0.1, 0.002, 0.56, 0.98…]
[0.93, 0.133, 0.142, 0.03…]
[0.22, 0.092, 0.391, 0.78…]
?

Retrieval & Similarity Search
Given a query, search a corpus of items for the most relevant candidate item(s).
Retrieved candidate_items should be more similar to the query_item than any other items in the embedding_space.
[Diagram: query → search → rank; a query_item in the embedding_space is matched to candidate_items ranked 1, 2, …, k]

Behind the scenes, embeddings are used
[Diagram: query → search → rank; the query_item and the candidate_items ranked 1, 2, …, k are each represented as vectors in the embedding_space]
[0.1, 0.002, 0.56, 0.98...]
[0.97, 0.003, 0.532, 0.91...]
[0.94, 0.004, 0.553, 0.89...]
[0.1, 0.003, 0.52, 0.89...]
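
A top-k search over vectors like these can be sketched with a brute-force scan. This is an illustration under stated assumptions: the slide's vectors are truncated, so only the four visible values are used; treating the first vector as the query and the item ids `item_a` to `item_c` are hypothetical choices; and Euclidean distance is just one possible similarity measure.

```python
import math

def top_k(query, candidates, k):
    """Exact nearest-neighbor search: score every candidate, keep the k closest."""
    scored = [(math.dist(query, vec), item_id) for item_id, vec in candidates.items()]
    scored.sort()  # smallest distance first
    return [(item_id, dist) for dist, item_id in scored[:k]]

# First four visible values of each slide vector; ids are made up for this sketch.
query_item = [0.1, 0.002, 0.56, 0.98]
candidate_items = {
    "item_a": [0.97, 0.003, 0.532, 0.91],
    "item_b": [0.94, 0.004, 0.553, 0.89],
    "item_c": [0.1, 0.003, 0.52, 0.89],
}

for item_id, dist in top_k(query_item, candidate_items, k=2):
    print(f"{item_id}: distance {dist:.3f}")
```

Scanning every candidate is fine for a handful of items. Vector databases typically switch to approximate nearest neighbor indexes so that the search stays fast over millions of vectors.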

Embedding space
Position of objects within a vector space captures meaning.
This extends to multimodal data.
Joint Embedding Vector Space
Image:
“gray tabby cat laying in front of a Christmas tree”
Text: size color living

01 Chunks retrieved from vector search are fed into the LLM.
02 The LLM generates a response that weaves together retrieved chunks + pretrained knowledge.
03 This augments the existing LLM’s knowledge with information it wasn’t trained on.

Standalone LLM
An individual asks the LLM a question
The LLM generates a response based on pretrained knowledge
The answer is returned to the user
[Diagram: LLM to generate response → Answer]

RAG
Inputs are turned into embeddings
Vector search → multimodal outputs (documents, images)
Outputs sent to LLM
Answer is returned to user
[Diagram: Question / Input (“What does this warning light mean on my car?”) → Query embedding → Vector search (Approx. Nearest Neighbors Search over a vector database of document chunks and images) → Retrieve top-k relevant items → Fetch actual text based on doc ids → LLM to generate response → Answer]
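
The RAG flow above can be sketched end to end with toy parts. Everything here is a stand-in chosen for illustration: `embed` is a bag-of-words counter rather than a real embedding model, the "vector database" is a dictionary searched exactly rather than with approximate nearest neighbors, the manual excerpts are invented, and `generate` is a placeholder where a real LLM call would go.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical car-manual excerpts, invented for this example.
chunks = {
    "doc-1": "The oil pressure warning light means engine oil pressure is low.",
    "doc-2": "Rotate the tires every 10,000 km to even out wear.",
    "doc-3": "A flashing check engine warning light means the engine is misfiring.",
}
vector_db = {doc_id: embed(text) for doc_id, text in chunks.items()}

def retrieve(question, k=2):
    """Embed the question, then return the ids of the k most similar chunks."""
    q = embed(question)
    ranked = sorted(vector_db, key=lambda doc_id: cosine(q, vector_db[doc_id]), reverse=True)
    return ranked[:k]

def build_prompt(question, doc_ids):
    """Fetch the actual text by doc id and place it in front of the question."""
    context = "\n".join(f"- {chunks[d]}" for d in doc_ids)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt):
    """Placeholder for a real LLM call; it only reports the prompt size."""
    return f"[LLM response would be generated from a {len(prompt)}-character prompt]"

question = "What does this warning light mean on my car?"
doc_ids = retrieve(question)
print(doc_ids)
print(generate(build_prompt(question, doc_ids)))
```

Swapping the toy `embed` for a real embedding model and `generate` for a real LLM call keeps the same shape: embed the query, retrieve the top-k chunks, fetch their text, and send question plus context to the model.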

Q&A

Cloud Skills Boost walkthrough

Sign In - Google Cloud Skills Boost

More questions?
Post them on the GDG Surrey Discord server, #gen_ai_gcp

Have fun studying!
Action items:
● Join Discord - post your questions there
● Get access to Cloud Skills Boost credits
● Complete 2nd GenAI path on CSB
● Get started on 3rd GenAI path on CSB
Next session:
● Feb 19, 2025 - Session #3 Gemini (RSVP)
