Character Generation Master
Leader: Mei
Members: Emily, Steven, Zero
Agenda
❖ The Core Goal
❖ Motivation
❖ Architecture Diagram
❖ Implementation
❖ Technique
❖ Prompt
❖ Parameters
❖ Model Evaluation
❖ Story Generation
The Core Goal
❖ Uniqueness
❖ Forward-looking
❖ Interest
❖ Past experience
Motivation
Architecture Diagram
[Diagram: CTkinter UI → Text Prompt; Text Prompt → Stable Diffusion → Image; Text Prompt → Gemini → Story]
Implementation
Hands-on videos
Technique
Technology we use
[Diagram: (1) Stable Diffusion: Text Prompt → Image; (2) Gemini: Text Prompt → Story]
Diffusion Model
● Forward Diffusion
● Reverse Diffusion
Forward Diffusion
[Figure: RANDOM NOISE is added to the image at each of four steps]
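Forward diffusion can be sketched in a few lines of NumPy. The example below is a minimal sketch, not the project's code: it assumes the widely used closed form x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε with a linear beta schedule of 1000 steps. The schedule values and the names `make_alpha_bars` and `forward_diffuse` are assumptions and do not come from the slides.

```python
import numpy as np

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bars, rng):
    """Jump straight to noise level t by mixing the clean image with Gaussian noise."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return xt, noise

rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()
image = rng.uniform(-1.0, 1.0, size=(8, 8, 3))  # stand-in for a real image
slightly_noisy, _ = forward_diffuse(image, 10, alpha_bars, rng)
almost_pure_noise, _ = forward_diffuse(image, 999, alpha_bars, rng)
```

At small `t` the result is still close to the original image, while at the last step almost none of the original signal remains.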
Reverse Diffusion
[Figure: PREDICTED NOISE is removed from the image at each of four steps]
Stable Diffusion
[Diagram: Text Prompt → CLIP Text Encoder → U-Net → VAE Decoder → Image]
Prompt to Embeddings
[Diagram: Text Prompt → CLIP Text Encoder → Text Embeddings]
U-Net
[Diagram: Text Embeddings → U-Net (Denoise) → VAE Decoder → Image]
Reverse Diffusion
[Figure: at each of four steps, the Text Embedding conditions the PREDICTED NOISE]
Denoise
[Diagram: Noisy Image + Text Embeddings → Noise Predictor (U-Net) → Predict Noise → New Noisy Image; the step is repeated]
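The predict-noise-then-denoise loop can be sketched as follows. This is a toy illustration under stated assumptions: the U-Net is replaced by a hypothetical `oracle` that already knows the clean image, and the update rule is a deterministic DDIM-style step, chosen here for simplicity because the slides do not say which sampler is used. The function names and the 50-step linear schedule are likewise assumptions.

```python
import numpy as np

def ddim_denoise(x_t, alpha_bars, predict_noise):
    """Repeatedly predict the noise in x and step to the next, cleaner noise level."""
    x = x_t
    for t in range(len(alpha_bars) - 1, -1, -1):
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0  # 1.0 means "no noise left"
        eps_hat = predict_noise(x, t)                   # the noise predictor's job
        x0_hat = (x - np.sqrt(1.0 - ab_t) * eps_hat) / np.sqrt(ab_t)
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
    return x

rng = np.random.default_rng(0)
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 50))  # assumed schedule
clean = rng.uniform(-1.0, 1.0, size=(8, 8, 3))
noise = rng.standard_normal(clean.shape)
noisy = np.sqrt(alpha_bars[-1]) * clean + np.sqrt(1.0 - alpha_bars[-1]) * noise

def oracle(x, t):
    """Hypothetical perfect predictor; a trained U-Net would estimate this instead."""
    return (x - np.sqrt(alpha_bars[t]) * clean) / np.sqrt(1.0 - alpha_bars[t])

restored = ddim_denoise(noisy, alpha_bars, oracle)
```

With a perfect predictor the loop recovers the clean image; a real model only approximates the noise, which is why the number of steps and the prompt conditioning matter in practice.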
[Diagram: VAE Decoder → Image]
VAE & Latent Space
Pixel Space: 512 x 512 x 3 = 786432
Latent Diffusion Model
Latent Space: 768
[Diagram: Latent Space → VAE Decoder]
[Diagram: Pixel Space → VAE Encoder → Latent Space → VAE Decoder → Pixel Space]
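The pixel-space arithmetic on the slide can be checked directly. The sketch below also compares it with a latent size, assuming the commonly documented Stable Diffusion v1 layout of 8x spatial downsampling and 4 latent channels. That layout is an assumption and not taken from the slides; the slide's own latent-space figure of 768 is left as stated. `tensor_size` is a hypothetical helper.

```python
def tensor_size(height, width, channels):
    """Number of values in a height x width x channels tensor."""
    return height * width * channels

# Pixel space, as on the slide: 512 x 512 RGB image.
pixel_values = tensor_size(512, 512, 3)

# Latent space under the assumed SD v1 layout: 64 x 64 x 4.
latent_values = tensor_size(512 // 8, 512 // 8, 4)

ratio = pixel_values // latent_values
print(pixel_values, latent_values, ratio)
```

Working in the smaller latent space is what makes each denoising step cheaper than diffusing directly over pixels.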
Prompt
The Art of Prompts
"black dog", "cute boy", "park"
"In the park, a cute boy played fetch with a black dog, laughter echoing as they chased each other under the sun."
● Positive Prompt
○ Anything you want to appear in the generated image
● Negative Prompt
Positive Prompt
❏ female
❏ adventurer
❏ ponytail
Quality tags added to the prompt:
❏ masterpiece
❏ best quality
❏ highly detailed
❏ absurdres
[Figure: Before / After]
Second example:
❏ male
❏ adventurer
❏ layered_haircut
With the same quality tags (masterpiece, best quality, highly detailed, absurdres):
[Figure: Before / After]
● Negative Prompt
○ Anything you don't want to appear in the generated image
Negative Prompt
Positive Prompt:
❏ male,
❏ deep blue hair,
❏ …...,
❏ blablablabla ......,
Negative Prompt:
❏ bad anatomy
❏ malformed hands
❏ extra fingers
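The tag lists above are usually flattened into two comma-separated strings before they are sent to the model. A minimal sketch follows; the helper `join_tags` and the `request` dictionary are hypothetical. The key names mirror the `prompt` and `negative_prompt` arguments of the diffusers Stable Diffusion pipeline, which is an assumption about tooling that the slides do not confirm.

```python
def join_tags(tags):
    """Join non-empty tags into one comma-separated prompt string."""
    return ", ".join(tag.strip() for tag in tags if tag.strip())

quality_tags = ["masterpiece", "best quality", "highly detailed", "absurdres"]
subject_tags = ["male", "adventurer", "layered_haircut"]
unwanted_tags = ["bad anatomy", "malformed hands", "extra fingers"]

positive = join_tags(quality_tags + subject_tags)
negative = join_tags(unwanted_tags)

# Hypothetical request payload; adapt the keys to whatever front end is used.
request = {"prompt": positive, "negative_prompt": negative}
print(request)
```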
Prompt Weighting
Positive Prompt:
❏ male,
❏ blue_hair,
❏ …...,
❏ blablablabla ......,
The second tag is then varied across the slides:
❏ red_hair
❏ red_hair and blue_hair
❏ (red_hair)1.05 and blue_hair
❏ (red_hair)1.1 and blue_hair
Prompt Syntax
Not all Stable Diffusion models use the same syntax.
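To make the syntax difference concrete, the sketch below formats the same weighted tag in two styles. The `(tag)weight` form is the one shown on the slides. The `(tag:weight)` form is the style used by some other front ends, such as the AUTOMATIC1111 web UI, and is included here as an assumption for comparison. Both function names are hypothetical.

```python
def weight_suffix_style(tag, weight):
    """Format as (tag)weight, the form shown on the slides."""
    return f"({tag}){weight:g}"

def weight_colon_style(tag, weight):
    """Format as (tag:weight), assumed here for front ends that use a colon."""
    return f"({tag}:{weight:g})"

for w in (1.05, 1.1):
    print(weight_suffix_style("red_hair", w), "and blue_hair")
    print(weight_colon_style("red_hair", w), "and blue_hair")
```

Checking which form a given tool expects matters, because an unrecognized weight is typically read as ordinary prompt text.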
Parameters
Eta Noise Seed Delta
[Diagram: parameters control the denoise process applied to the noisy image]
Default
● Timesteps 50
● CFG Scale 7.5
Timesteps
[Diagram: Noise Predictor (U-Net), Denoise]
Positive Prompt:
❏ male
❏ adventurer
❏ layered_haircut
[Figure: images generated with Timesteps 5 through 34]
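One way to read the Timesteps parameter is as the number of denoising iterations picked from the model's training noise levels. The sketch below assumes a model trained with 1000 noise levels and an evenly spaced selection. Both are common choices but are not stated in the slides, and `inference_timesteps` is a hypothetical helper.

```python
def inference_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Pick evenly spaced noise levels, ordered from noisiest to cleanest."""
    stride = num_train_timesteps // num_inference_steps
    return [i * stride for i in range(num_inference_steps)][::-1]

default_schedule = inference_timesteps(50)   # the default of 50 from the slides
short_schedule = inference_timesteps(5)      # the lowest value in the comparison
print(default_schedule[:3], default_schedule[-1])
print(short_schedule)
```

Fewer timesteps mean larger jumps between noise levels, which is consistent with the step-count comparison in the slides being run over a range of values.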
CFG Scale
Positive Prompt:
❏ concept art
❏ upper body
❏ solo
❏ male
❏ adventurer
❏ layered_haircut
[Figure: images generated with CFG Scale 1.0 through 5.5, in steps of 0.5]
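CFG (classifier-free guidance) scale controls how strongly the prompt steers each denoising step. In the standard formulation, the noise predicted without the prompt is pushed toward the noise predicted with the prompt. The sketch below applies that formula to toy arrays; the function name and the numbers are illustrative assumptions.

```python
import numpy as np

def apply_cfg(noise_uncond, noise_cond, cfg_scale):
    """Combine unconditional and prompt-conditioned noise predictions."""
    return noise_uncond + cfg_scale * (noise_cond - noise_uncond)

noise_uncond = np.array([0.0, 0.0])  # prediction with an empty prompt (toy values)
noise_cond = np.array([1.0, 2.0])    # prediction with the text prompt (toy values)

for scale in (1.0, 3.0, 7.5):
    print(scale, apply_cfg(noise_uncond, noise_cond, scale))
```

At a scale of 1.0 the result equals the prompt-conditioned prediction, and larger values exaggerate the difference the prompt makes.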
Model Evaluation
Testing different models and evaluating the results
● CLIP Score
● Inception Score
(Source: An Efficient Cross-Modal Privacy-Preserving Image–Text Retrieval Scheme)
CLIP Score
CLIP Score is a metric that evaluates the similarity between an image and a text.
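Given an image embedding and a text embedding from CLIP's two encoders, the score reduces to a cosine similarity. The sketch below follows the common convention of scaling the cosine by 100 and clipping negative values to zero. The slides do not state which scaling the team used, and the embeddings here are toy vectors rather than real CLIP outputs.

```python
import numpy as np

def clip_score(image_embedding, text_embedding):
    """Cosine similarity between two embeddings, scaled to 0..100."""
    image = np.asarray(image_embedding, dtype=float)
    text = np.asarray(text_embedding, dtype=float)
    cosine = float(image @ text / (np.linalg.norm(image) * np.linalg.norm(text)))
    return max(100.0 * cosine, 0.0)

# Toy vectors standing in for CLIP image and text embeddings.
print(clip_score([0.2, 0.9, 0.1], [0.25, 0.8, 0.0]))
print(clip_score([1.0, 0.0], [0.0, 1.0]))
```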
Inception Score
Inception Score is a metric that evaluates the quality of images produced by a generative model.
It measures the sharpness and diversity of the generated images.
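The standard definition is IS = exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for one generated image and p(y) is the average over all images. The sketch below computes it from a matrix of class probabilities. In practice those would come from an Inception network; here they are toy values, and the function name is an assumption.

```python
import numpy as np

def inception_score(class_probs, eps=1e-12):
    """exp of the mean KL divergence between per-image and marginal class distributions."""
    p_yx = np.asarray(class_probs, dtype=float)   # shape: (num_images, num_classes)
    p_y = p_yx.mean(axis=0, keepdims=True)        # marginal class distribution
    kl = np.sum(p_yx * (np.log(p_yx + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))

confident_and_diverse = np.eye(4)        # each image clearly a different class
unsure = np.full((4, 4), 0.25)           # every image equally likely to be any class
print(inception_score(confident_and_diverse))
print(inception_score(unsure))
```

Confident per-image predictions correspond to sharpness and a spread-out marginal corresponds to diversity, so the first case scores high and the second scores the minimum of 1.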
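Evaluation Method
● CLIP Score
Set up 10 different prompts, generate images with each of the eight models, and compute the CLIP Score.
● Inception Score
Set up 136 different prompts (produced by Gemini), generate images with each of the eight models, and compute the Inception Score.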
Evaluation Results
[Figures: evaluation results, four slides]
Evaluation Conclusions
1. The CLIP Scores are close, possibly because the models were trained on similar datasets.
2. The Inception Scores are generally good across the models.
Story Generation
Use Gemini to generate the character's background story.
Uses the gemini-1.5-flash API.
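Ask: Based on the items selected in the UI, plus a randomly chosen moral alignment, compose the question. For example:
"Please write a description of a story character. Give the character a random name that fits the description, and display it in Traditional Chinese. The character's settings are as follows: female, middle-age, angel, grey hair, Neutral Good"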
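The question-building step can be sketched as below. This is a minimal sketch, not the project's code: the list of nine alignments is an assumption (the slides only show "Neutral Good"), the English wording is a translation of the original prompt, and the function names are hypothetical. The optional `generate_story` helper assumes the `google-generativeai` package and an API key, and is not called here.

```python
import random

# Assumed alignment list; only "Neutral Good" appears in the slides.
ALIGNMENTS = [
    "Lawful Good", "Neutral Good", "Chaotic Good",
    "Lawful Neutral", "True Neutral", "Chaotic Neutral",
    "Lawful Evil", "Neutral Evil", "Chaotic Evil",
]

def build_story_prompt(selections, rng=random):
    """Combine the UI selections with a random alignment into one question."""
    alignment = rng.choice(ALIGNMENTS)
    traits = ", ".join(list(selections) + [alignment])
    return (
        "Please write a description of a story character. Give the character "
        "a random name that fits the description, and display it in "
        "Traditional Chinese. The character's settings are as follows: " + traits
    )

def generate_story(prompt, api_key):
    """Send the prompt to gemini-1.5-flash (needs google-generativeai installed)."""
    import google.generativeai as genai
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-1.5-flash")
    return model.generate_content(prompt).text

prompt = build_story_prompt(
    ["female", "middle-age", "angel", "grey hair"], random.Random(0)
)
print(prompt)
```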
The magic of AI doesn't stop here.
Want to see more for yourself? Come watch our DEMO.

Character Generation Master 角色生成大師【艾鍗學院】