The document discusses using Stable Diffusion, an open-source AI image generation tool, to augment artists' workflows. It describes testing Stable Diffusion's ability to generate characters, backgrounds, objects, and UI elements for an art project. The tests focused on integrating AI generation at different stages, from ideation to refinement. Pipelines are presented for each asset type that allow controlling outputs through techniques like prompt engineering, style guidance, and blending. The document concludes that the process takes less time than traditional methods and produces a wider variety of consistent assets, but notes limitations: weak adherence to unique art styles and a tendency for outputs to look similar.
3. CONFIDENTIAL | 3
Intro / Motivation and goals
Motivation
The motivation for the AI image generation research was to learn the technology and see whether it can be integrated into artists' workflows, and at which stage of design. The research focused on:
- Characters - humanoid and animal
- Backgrounds - landscapes and background elements
- Objects/items - such as game symbols and game icons
- UI elements - such as fonts, frames and other UI elements
Goals
The goals of the research were the following:
- See how the tech can be integrated into artists' workflow as an additional tool which would complement
and augment artists' style of work
- Assess output consistency in creating art for a given theme, in a given style
- Compare output speed to the traditional approach
Stable Diffusion and Open Source Ecosystem
Incredibly vibrant community of developers and testers
With great power comes great complexity, which needs to be tamed to get predictable results
Tools / Software and hardware used in research
Focus on open-source, customizable and low-cost software solutions
These are software solutions which have been tested:
- Automatic1111 - frontend for Stable Diffusion, with additional features:
- Control networks (ControlNet)
- ESRGAN upscalers
- ComfyUI - node-based UI for custom and templatized image creation
Initial hardware setup which was used for the research:
- AMD Ryzen 3700 CPU (i7 equivalent)
- Nvidia 3090 GPU, 24 GB VRAM
- 32 GB RAM
- M.2 1 TB SSD drive
- Linux OS, Ubuntu 20.04 LTS
Tools / Harnessing control
Common theme for all of the workflow tests
One of the goals for the process itself was to test the ability to control all parts of the process
Uncontrolled: random output, raw ideas, rough quality, undefined subjects, unusable results, uncontrollable process
Controlled: focused design, refined output, high quality, controllable process, reproducible results
Character Pipeline / Building prompts
There is a methodology in prompt building which allows for more control of the output
Although it may seem trivial, prompt engineering is an important part of the generation process. Results are statistical rather than deterministic, so output is not always accurate, but more precise, descriptive prompts yield better results
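Structured prompt building can also be scripted, so that subject, detail, style, and quality tokens stay in a consistent order across a whole asset batch. A minimal sketch, assuming Automatic1111's `(token:weight)` emphasis syntax; the `build_prompt` helper and the example tokens are illustrative, not part of the original workflow:

```python
def build_prompt(subject, details=(), style=(), quality=(), weights=None):
    """Assemble an Automatic1111-style prompt from structured parts.
    Tokens listed in `weights` get the (token:weight) emphasis syntax."""
    weights = weights or {}
    parts = [subject, *details, *style, *quality]
    return ", ".join(
        f"({p}:{weights[p]})" if p in weights else p for p in parts
    )

prompt = build_prompt(
    "portrait of a fox warrior",
    details=["ornate armor", "forest background"],
    style=["watercolor", "soft lighting"],
    quality=["highly detailed"],
    weights={"watercolor": 1.3},
)
print(prompt)
# portrait of a fox warrior, ornate armor, forest background, (watercolor:1.3), soft lighting, highly detailed
```

Keeping the structure fixed and varying only one slot at a time is what makes A/B comparisons between generations meaningful.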
Character Pipeline / Applying additional styles
There are multiple ways in which additional guidance can be applied to the generation
Neural network structures that have been pre-trained on a certain style or subject can be applied in addition to base checkpoint generation and prompt engineering
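The slide likely refers to adapters such as LoRAs and textual-inversion embeddings; in Automatic1111 these are invoked from the prompt itself, via `<lora:filename:weight>` tags and embedding trigger words. A small sketch of scripting those tags; the adapter names (`ink_style`, `studio_style_embed`) are hypothetical:

```python
def with_style_adapters(prompt, loras=None, embeddings=None):
    """Append Automatic1111 LoRA tags and embedding trigger words to a prompt.
    LoRA syntax: <lora:filename:weight>; embeddings fire on their trigger word."""
    parts = [prompt]
    parts += list(embeddings or [])          # e.g. a textual-inversion trigger word
    parts = [", ".join(parts)]
    for name, weight in (loras or {}).items():
        parts.append(f"<lora:{name}:{weight}>")
    return " ".join(parts)

styled = with_style_adapters(
    "knight character concept",
    loras={"ink_style": 0.8},                # hypothetical LoRA filename
    embeddings=["studio_style_embed"],       # hypothetical embedding trigger
)
print(styled)
# knight character concept, studio_style_embed <lora:ink_style:0.8>
```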
Character Pipeline / Infusing style on a subject
Infusing style on a generated subject using image to image generation
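The practical knob here is denoising strength: it decides how much of the original subject survives the img2img pass. A sketch of how strength maps to the number of denoising steps actually run, mirroring the scheduling used by common SD img2img implementations (the helper name is ours):

```python
def img2img_steps(num_inference_steps, strength):
    """How many denoising steps actually run in img2img.
    strength=0.0 returns the input image untouched; strength=1.0
    ignores the input almost entirely (pure txt2img behaviour).
    Mirrors the step scheduling of common SD img2img implementations."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)

# Low strength keeps the subject and infuses only a new style;
# high strength lets SD re-imagine the subject itself.
print(img2img_steps(30, 0.35))  # 10
print(img2img_steps(30, 0.75))  # 22
```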
Character Pipeline / Blending between subjects
Blending between subjects in latent space using image to image
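Blending in latent space is typically done with spherical interpolation (slerp) rather than a straight average, because Gaussian latents concentrate near a hypersphere and linear mixes lose the norm statistics the model expects. A self-contained numpy sketch; the 4×64×64 shape matches SD latents for 512×512 images:

```python
import numpy as np

def slerp(t, a, b):
    """Spherical interpolation between two latent tensors.
    Preferred over linear interpolation for SD latents because it
    roughly preserves the norm the denoiser was trained on."""
    a_flat, b_flat = a.ravel(), b.ravel()
    omega = np.arccos(np.clip(
        np.dot(a_flat / np.linalg.norm(a_flat),
               b_flat / np.linalg.norm(b_flat)), -1.0, 1.0))
    so = np.sin(omega)
    if so < 1e-8:                      # nearly parallel: fall back to lerp
        return (1.0 - t) * a + t * b
    return (np.sin((1.0 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b

rng = np.random.default_rng(0)
lat_a = rng.standard_normal((4, 64, 64))   # latent for subject A
lat_b = rng.standard_normal((4, 64, 64))   # latent for subject B
blend = slerp(0.5, lat_a, lat_b)           # halfway between the two subjects
```

Sweeping `t` from 0 to 1 yields the in-between designs shown on this slide.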
Character Pipeline / Greater control
Further control can be applied to various aspects of generation
We can control composition, poses, colors, styles, etc. (with varying degrees of usability at the moment)
Inferring a skeletal guide from reference image
Character Pipeline / Greater control
One of the key aspects of usability is control over elements of output
In this particular example, multiple designs were generated using the same pose as a basis.
This method allows for precise control of composition elements
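In Automatic1111 this pose-guided generation can also be driven headlessly through the web UI's HTTP API (the `/sdapi/v1/txt2img` endpoint, available when the server runs with `--api`) with the ControlNet extension enabled. A plausible request payload; field names follow the extension's API, while the prompt, values, and placeholders are illustrative:

```json
{
  "prompt": "fantasy ranger, full body, concept art",
  "negative_prompt": "blurry, low quality",
  "steps": 30,
  "alwayson_scripts": {
    "controlnet": {
      "args": [
        {
          "input_image": "<base64-encoded pose reference>",
          "module": "openpose",
          "model": "<installed openpose ControlNet model name>",
          "weight": 1.0
        }
      ]
    }
  }
}
```

The `openpose` preprocessor infers the skeletal guide from the reference image; re-sending the same payload with different prompts reproduces the "many designs, one pose" result above.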
Character Pipeline / Variety within control
We are free to choose which aspects of image generation we control
We can choose to control one aspect and vary all other aspects of image creation, including mood, subject, design, colors, etc.
This way we can achieve fairly precise control of composition elements
Character Pipeline / Refinement
Refining a design involves artist’s knowledge and experience
The first step is refining the selected design(s) in Photoshop and guiding them towards a more specific idea
Stable Diffusion → Photoshop
Character Pipeline / Design variants
The design process involves going back and forth between the artist and SD
The next step is bringing the design back into A1111 for more iterations and refinement with img2img and inpainting
Character Pipeline / Inpainting elements
SD inpainting makes it very easy to add new elements to the design
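Inpainting takes the original image plus a binary mask marking the region to regenerate while the rest is preserved. A minimal numpy sketch of building such a mask; the box coordinates and the shoulder-pad example are illustrative:

```python
import numpy as np

def make_inpaint_mask(height, width, box):
    """Build a binary inpainting mask: white (255) pixels are regenerated,
    black (0) pixels are kept from the original image.
    `box` is (top, left, bottom, right) in pixel coordinates."""
    mask = np.zeros((height, width), dtype=np.uint8)
    top, left, bottom, right = box
    mask[top:bottom, left:right] = 255
    return mask

# Mask a 128x128 region where a new element (e.g. a shoulder pad)
# should be painted in, leaving the rest of the character untouched.
mask = make_inpaint_mask(512, 512, (64, 320, 192, 448))
print(mask.sum() // 255)   # number of masked pixels: 128 * 128 = 16384
```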
Background Pipeline / Basic prompt building
Initial background design works similarly to characters (prompt building)
Backgrounds are easy to generate in large variety, but harder to refine than characters
Initial prompt + prompt engineering
Background Pipeline / Depth controlled composition
Control networks can be used to keep the composition in place
Backgrounds can be iterated on while keeping the composition intact with depth maps (SD- or artist-generated)
Depth map basis
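An artist-authored depth map can be as simple as a grayscale gradient. A tiny numpy sketch of a hand-made map (ground plane fading from a near foreground to a distant horizon, empty sky above), assuming the bright-is-near convention used by common depth preprocessors; the function and its parameters are illustrative:

```python
import numpy as np

def horizon_depth_map(height, width, horizon=0.4):
    """Synthesize a simple depth map: bright (near) at the bottom of the
    frame, fading to dark (far) at the horizon line, black sky above.
    Assumes the bright-is-near convention of common depth preprocessors."""
    depth = np.zeros((height, width), dtype=np.float32)
    h_line = int(height * horizon)
    ramp = np.linspace(0.0, 1.0, height - h_line, dtype=np.float32)
    depth[h_line:, :] = ramp[:, None]      # ground plane: far -> near
    return (depth * 255).astype(np.uint8)

depth = horizon_depth_map(512, 512)
```

Fed to a depth control network, one such map can anchor the composition while prompts swap the theme, as on the next slide.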
Background Pipeline / Variety of compositions
We use depth map for initial composition guidance, and steer generation results with custom prompts
Depth map to various themes sharing same composition
Object pipeline / From concept to refinement
Initial ideation is similar to the character pipeline
Design follows the same process of converging on a solution, with input and filtering by the artist
Stable Diffusion → Photoshop → Stable Diffusion → Photoshop
Object pipeline / Line art design to render
SD needs surprisingly little to be able to infer form
We can provide an initial design and a prompt, and guide SD to give us rendered versions of our designs
Using an artist-provided initial design to guide output
UI elements pipeline / Font elements
Design iterations are possible with control networks and flat layouts
For UI elements, a designer can provide flat design solutions, and SD can iterate on variants and finalized solutions
UI elements pipeline / Frame elements
Design iterations are possible with control networks and flat layouts
UI elements on the cabinet follow the same iteration process as the other UI elements
Conclusions / Pros and Cons
Successes:
- The whole process takes 30% to 50% less time than traditional methods.
- The end result is a wide range of graphic assets, UI and splash art consistent with a given theme.
- The process can be used in all stages of the artistic process - from references and ideas, through design
iterations, to refinement. However, IT IS ONLY USEFUL IN THE HANDS OF A SKILLED ARTIST.
- The number of produced asset variants is much larger than with traditional methods and can be used to
converge on design ideas and direction faster.
- Artists' response after initial experimentation and usage testing is positive.
Limitations:
- Adherence to a very specific art style is currently not optimal without custom training.
- Less presence in dataset means less options for generation (characters>backgrounds>animals>objects>UI).
- Tendency of output to look ‘samey’ or to represent the common denominator of training data.