The State of Skinning
… Or How To Maintain Your Physique
Welcome!
Tervetuloa!
Rulon Raymond
Sr. Engine Programmer
Introduction
1) Review
2) Evolution of techniques on console HW
3) The new hotness (hint: it’s a Clifford Algebra)
4) Extensions
DISCLAIMER: All screenshots and techniques presented
are not associated with any specific title, project, or
organization, unless otherwise stated.
Outline
What is Skinning?
What is Skinning?
I Was Skinning
Long Before 3D
Animated Models
Were All The Rage
• Step 1: Generate a cool animated pose.
• Step 2: ???
• Step 3: Use fancy lighting and shaders to draw an animated model on-screen (i.e. profit)
What is Skinning?
• Step 2: Skinning!
What is Skinning?
Skinned Model, ready for drawing
Model
Vertices
Bone
Weights
Bone
Transforms
What is Skinning?
𝑣′ = 𝑆𝑘𝑖𝑛(𝑊, 𝑇, 𝑣)
What is Skinning?
𝑣: The initial vertex transform
𝑊: Array of bone weighting values
𝑇: Array of bone transforms
𝑣′: The final vertex transform
Skinning on Consoles
• Sony Playstation (1995)
• Geometry Transform Engine (GTE)
Skinning on Consoles
• Sony Playstation2 (2000)
• Vector Unit 0 (VU0)
Skinning on Consoles
• Microsoft Xbox (2001)
• NVIDIA GPU (DirectX 8.x)
Skinning on Consoles
• Microsoft Xbox 360 (2005)
• PowerPC CPU
Skinning Implementation
• Sony PS3 (2006)
• Synergistic Processing Units (SPUs)
Skinning on Consoles
Why not use the GPU for skinning on Xbox 360 and PS3?
The CPUs/SPUs are actually quite fast.
Skinning Implementation
3X @ 3.2 GHz
6X @ 3.2 GHz (with many restrictions…)
Why not use the GPU for skinning on Xbox 360 and PS3?
Split Vertex Streams
Skinning on Consoles
Vertex
Stream 0: Position, Tangent Space
• Skinned
Stream 1: Colors, UVs, etc.
• Constant, sent straight to GPU
Why not use the GPU for skinning on Xbox 360 and PS3?
Unified Memory Architecture
Skinning on Consoles
// Just skinned a vertex. Now write it out as
// three 16-byte vectors
__stvx( skinnedVertexData0, vertsOutBuffer, 0 );
__stvx( skinnedVertexData1, vertsOutBuffer, 16 );
__stvx( skinnedVertexData2, vertsOutBuffer, 32 );
// Gah – why’d that take so long?
// ~20% faster!
// (F*&^% write-combine memory)
__stvx( skinnedVertexData0, vertsOutBuffer, 0 );
_WriteBarrier();
__stvx( skinnedVertexData1, vertsOutBuffer, 16 );
_WriteBarrier();
__stvx( skinnedVertexData2, vertsOutBuffer, 32 );
Why not use the GPU for skinning on Xbox 360 and PS3?
So you can use the GPU for other things.
Skinning on Consoles
• Microsoft Xbox One (2013)
• Sony PS4 (2013)
• AMD GCN GPU
Skinning on Consoles
Skinning on Consoles
[Diagram: GPU frame timeline for each GCN compute unit: Draw Calls, IDLE, Draw Calls, Post FX, IDLE]
Async Compute Skinning
Skinning on Consoles
[Diagram: GPU frame timeline for each GPU compute unit: Draw Calls, Skinning, Draw Calls, Post FX, Skinning]
• Generate Draw List (frame N)
Visible Models
• Async Compute Dispatch Thread
Model Skinning Workloads
• GPU rendering (frame N-1)
Skinned Model (frame N)
Skinning on Consoles
Async Compute Skinning
Skinning on Consoles
MATH WARNING!
The standard approach to real-time skinning, used in almost every modern 3D game.
Linear Matrix Blend Skinning
Suffers from some well-documented problems...
The “candy wrapper” effect
Linear Matrix Blend Skinning
Mesh Volume Preservation
Example: “flat ass syndrome”
Linear Matrix Blend Skinning
Q: Why do these problems exist?
A: Let’s take a closer look at the underlying math…
Linear Matrix Blend Skinning
$v' = \sum_{i=1}^{n} w_i M_{j_i} v$
Linear Matrix Blend Skinning
• Apply the property of distributivity:
$v' = \left( \sum_{i=1}^{n} w_i M_{j_i} \right) v$
Linear Matrix Blend Skinning
• To keep it simple: Let $M_{j_i}$ represent a rigid transform.
• No scale, shear, …
• Most common scenario for skinning in games.
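As a minimal sketch of the blended-matrix form above, the following C function skins a single position. The `Mat34` layout (3x4, row-major, translation in the last column) and the name `lbs_skin_position` are assumptions made for illustration; they are not taken from the talk.

```c
#include <stddef.h>

/* Assumed layout for illustration: 3x4 row-major affine matrix,
   rotation in the left 3x3 block, translation in the last column. */
typedef struct { float m[3][4]; } Mat34;

/* v' = (sum_i w[i] * bones[idx[i]]) * v, with v treated as (x, y, z, 1). */
static void lbs_skin_position(const Mat34 *bones, const int *idx,
                              const float *w, size_t n,
                              const float v[3], float out[3])
{
    Mat34 blended = {{{0.0f}}};

    /* Blend the bone matrices first (the distributive form). */
    for (size_t i = 0; i < n; ++i)
        for (int r = 0; r < 3; ++r)
            for (int c = 0; c < 4; ++c)
                blended.m[r][c] += w[i] * bones[idx[i]].m[r][c];

    /* Then apply the single blended matrix to the vertex. */
    for (int r = 0; r < 3; ++r)
        out[r] = blended.m[r][0] * v[0] + blended.m[r][1] * v[1]
               + blended.m[r][2] * v[2] + blended.m[r][3];
}
```

Blending the matrices once and transforming once is what makes this form attractive when a vertex also carries a normal or tangent: the same blended matrix can be reused for each attribute.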
• A linear combination of rigid transforms DOES NOT yield a rigid transform!
• Orthonormal matrices aren’t closed under addition.
• Scaling values can creep into the final vertex transforms.
• Extreme cases can result in rank-deficient matrices.
Linear Matrix Blend Skinning
Example: The “candy wrapper” artifact
[Diagram: $v$, $M_{j_1} v$, $M_{j_2} v$, and the blended result $v'$]
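The rank-deficient case can be reproduced with a small worked example. This sketch (the `Mat3` type and all function names are hypothetical) blends the identity with a 180-degree twist about the x axis at equal weights. The blended matrix comes out as diag(1, 0, 0), so its determinant is zero and every vertex is flattened onto the twist axis.

```c
/* Hypothetical 3x3 rotation-only matrix type, for illustration. */
typedef struct { float m[3][3]; } Mat3;

/* Linear blend: (1 - t) * a + t * b, element by element. */
static Mat3 mat3_blend(Mat3 a, Mat3 b, float t)
{
    Mat3 r;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r.m[i][j] = (1.0f - t) * a.m[i][j] + t * b.m[i][j];
    return r;
}

/* Determinant by cofactor expansion along the first row. */
static float mat3_det(Mat3 a)
{
    return a.m[0][0] * (a.m[1][1] * a.m[2][2] - a.m[1][2] * a.m[2][1])
         - a.m[0][1] * (a.m[1][0] * a.m[2][2] - a.m[1][2] * a.m[2][0])
         + a.m[0][2] * (a.m[1][0] * a.m[2][1] - a.m[1][1] * a.m[2][0]);
}

/* Equal-weight blend of "no twist" and "180-degree twist about x". */
static Mat3 candy_wrapper_midpoint(void)
{
    const Mat3 identity = {{{1, 0, 0}, {0,  1, 0}, {0, 0,  1}}};
    const Mat3 twist180 = {{{1, 0, 0}, {0, -1, 0}, {0, 0, -1}}};
    return mat3_blend(identity, twist180, 0.5f);
}
```

Both inputs are orthonormal with determinant 1, yet `mat3_det(candy_wrapper_midpoint())` is 0: the y and z rows cancel exactly.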
• The most common workaround to these issues is the addition of new bones.
• Hand-animated or procedural.
• Split the rotation of a joint, relative to its parent, into even increments, for a single axis only.
• Example: Arm Twist Bone
• Parented to the shoulder and consistently represents exactly half its twist (roll) motion.
Linear Matrix Blend Skinning
Adding these bones is not free!
• Memory and processing overhead.
• Exact amount depends on actual implementation.
Linear Matrix Blend Skinning
• Dual Quaternions to the rescue!
• But what exactly are they?
• Let’s start with a quick review of the vanilla variety of quaternions…
Linear Matrix Blend Skinning
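• $q = [a, b, c, d]$
• $q = a + bi + cj + dk$
• $i^2 = j^2 = k^2 = ijk = -1$
• $q = (r, \vec{v}),\ r \in \mathbb{R},\ \vec{v} \in \mathbb{R}^3$
• $q = \left(\cos\frac{\theta}{2},\ \vec{s}\,\sin\frac{\theta}{2}\right),\ \theta \in \mathbb{R},\ \vec{s} \in \mathbb{R}^3$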
Quaternions
Hamilton - 1843
A 4D extension of complex numbers
For our purposes all we care about is unit quaternions.
• Conveniently represent rotations.
• Conjugate: $q^* = a - bi - cj - dk$
Quaternions
$q^* = q^{-1},\ \|q\| = 1$
One important quaternion equation to note:
$v' = q v q^*,\ v = (0, x, y, z)$
Applies a rotation to a 3D point
Quaternions
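A minimal C sketch of the rotation formula $v' = q v q^*$ follows. The `Quat` struct (stored as w, x, y, z) and the function names are assumptions for illustration; the product is the standard Hamilton product.

```c
/* Assumed storage order for illustration: scalar first, then vector. */
typedef struct { float w, x, y, z; } Quat;

/* Hamilton product p * q. */
static Quat quat_mul(Quat p, Quat q)
{
    Quat r;
    r.w = p.w * q.w - p.x * q.x - p.y * q.y - p.z * q.z;
    r.x = p.w * q.x + p.x * q.w + p.y * q.z - p.z * q.y;
    r.y = p.w * q.y - p.x * q.z + p.y * q.w + p.z * q.x;
    r.z = p.w * q.z + p.x * q.y - p.y * q.x + p.z * q.w;
    return r;
}

/* Conjugate: negate the vector part. Equals the inverse for unit q. */
static Quat quat_conj(Quat q)
{
    Quat r = { q.w, -q.x, -q.y, -q.z };
    return r;
}

/* Rotate v by the unit quaternion q: embed v as (0, x, y, z),
   form q * v * q^*, and read back the vector part. */
static void quat_rotate(Quat q, const float v[3], float out[3])
{
    Quat p = { 0.0f, v[0], v[1], v[2] };
    Quat r = quat_mul(quat_mul(q, p), quat_conj(q));
    out[0] = r.x;
    out[1] = r.y;
    out[2] = r.z;
}
```

For example, with `q = (cos 45°, 0, 0, sin 45°)`, a 90-degree rotation about z, the point (1, 0, 0) maps to (0, 1, 0).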
$d = a + b\varepsilon,\ \varepsilon^2 = 0$
Similar in form to complex numbers
Stored as: $d = [a,\ b]$
Dual Numbers
Conjugate
$d^* = a - b\varepsilon$
Multiplication
$d_0 d_1 = (a_0 + b_0\varepsilon)(a_1 + b_1\varepsilon) = a_0 a_1 + (a_0 b_1 + b_0 a_1)\varepsilon$
Dual Numbers
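The two dual-number rules above fit in a few lines of C. The `Dual` struct and function names are hypothetical; the only thing the code relies on is $\varepsilon^2 = 0$, which is why the product has no `b * b` term.

```c
/* a + b*eps, with eps*eps = 0. Names are assumptions for illustration. */
typedef struct { float a, b; } Dual;

/* (a0 + b0 eps)(a1 + b1 eps) = a0 a1 + (a0 b1 + b0 a1) eps */
static Dual dual_mul(Dual p, Dual q)
{
    Dual r = { p.a * q.a, p.a * q.b + p.b * q.a };
    return r;
}

/* Conjugate: negate the dual part. */
static Dual dual_conj(Dual d)
{
    Dual r = { d.a, -d.b };
    return r;
}
```

As a worked case, $(2 + 3\varepsilon)(4 + 5\varepsilon) = 8 + 22\varepsilon$, and $\varepsilon \cdot \varepsilon = 0$.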
Basically a quaternion whose elements are dual numbers
• $\hat{q} = \hat{w} + i\hat{x} + j\hat{y} + k\hat{z}$ (quaternion form)
• $\hat{w}$ is the scalar part (dual number)
• $(\hat{x}, \hat{y}, \hat{z})$ is the vector part (dual vector)
• $\hat{q} = q_a + q_b\varepsilon$ (dual number form)
• $q_a$ : “non-dual part”
• $q_b$ : “dual part”
• Most useful for skinning.
Dual Quaternions
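• Multiplication:
• $\hat{p}\hat{q} = p_a q_a + (p_b q_a + p_a q_b)\varepsilon$
• Quaternion Conjugate:
• $\hat{q}^* = q_a^* + q_b^*\varepsilon$
• Dual Conjugate:
• $\overline{\hat{q}} = q_a - q_b\varepsilon$
• Quaternion & Dual Conjugate:
• $\overline{\hat{q}}^* = q_a^* - q_b^*\varepsilon = (q_a - q_b\varepsilon)^*$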
Dual Quaternions
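The multiplication rule and the three conjugates can be sketched in C as below. The `Quat` and `DualQuat` structs and all function names are assumptions for illustration; a dual quaternion is stored in its dual number form as a pair of ordinary quaternions (non-dual part `a`, dual part `b`).

```c
typedef struct { float w, x, y, z; } Quat;
typedef struct { Quat a, b; } DualQuat;   /* a + b*eps */

static Quat quat_mul(Quat p, Quat q)
{
    Quat r;
    r.w = p.w * q.w - p.x * q.x - p.y * q.y - p.z * q.z;
    r.x = p.w * q.x + p.x * q.w + p.y * q.z - p.z * q.y;
    r.y = p.w * q.y - p.x * q.z + p.y * q.w + p.z * q.x;
    r.z = p.w * q.z + p.x * q.y - p.y * q.x + p.z * q.w;
    return r;
}

static Quat quat_add(Quat p, Quat q)
{
    Quat r = { p.w + q.w, p.x + q.x, p.y + q.y, p.z + q.z };
    return r;
}

static Quat quat_conj(Quat q)
{
    Quat r = { q.w, -q.x, -q.y, -q.z };
    return r;
}

static Quat quat_neg(Quat q)
{
    Quat r = { -q.w, -q.x, -q.y, -q.z };
    return r;
}

/* p q = pa qa + (pb qa + pa qb) eps */
static DualQuat dq_mul(DualQuat p, DualQuat q)
{
    DualQuat r;
    r.a = quat_mul(p.a, q.a);
    r.b = quat_add(quat_mul(p.b, q.a), quat_mul(p.a, q.b));
    return r;
}

/* Quaternion conjugate: conjugate both parts. */
static DualQuat dq_conj_quat(DualQuat q)
{
    DualQuat r = { quat_conj(q.a), quat_conj(q.b) };
    return r;
}

/* Dual conjugate: negate the dual part. */
static DualQuat dq_conj_dual(DualQuat q)
{
    DualQuat r = { q.a, quat_neg(q.b) };
    return r;
}

/* Quaternion & dual conjugate: apply both. */
static DualQuat dq_conj_both(DualQuat q)
{
    return dq_conj_dual(dq_conj_quat(q));
}
```

One quick sanity check: multiplying the pure translation $1 + \frac{t}{2}\varepsilon$ by itself yields $1 + t\varepsilon$, i.e. a translation by $2t$, because the $\varepsilon^2$ term vanishes.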
$\mathrm{Norm}(\hat{q}) = \|q_a\| + \frac{\langle q_a, q_b \rangle}{\|q_a\|}\varepsilon$
Dual Quaternions
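$\hat{q}^* = \hat{q}^{-1},\ \|\hat{q}\| = 1$
Rigid Transforms:
• $\hat{q}_{rotation} = q_a + 0\varepsilon$
• $\hat{q}_{translation} = (1,0,0,0) + \frac{(0, t_x, t_y, t_z)}{2}\varepsilon$
• $\hat{q}_{rigid} = \hat{q}_{translation}\,\hat{q}_{rotation} = q_a + \frac{(0, t_x, t_y, t_z)}{2} q_a \varepsilon$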
Dual Quaternions
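Transforming a 3D point
$\hat{v}' = \hat{q}_{rigid}\,\hat{v}\,\overline{\hat{q}_{rigid}}^{*},\ \hat{v} = (1,0,0,0) + (0, v_x, v_y, v_z)\varepsilon$
(uses the quaternion & dual conjugate of the unit dual quaternion; the quaternion conjugate alone would cancel the translation)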
Dual Quaternions
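The point transform can be spelled out as a literal sandwich product. The sketch below is an unoptimized illustration, with hypothetical type and function names: it builds $\hat{q}_{rigid}$ from a rotation and a translation, then multiplies with the combined quaternion-and-dual conjugate on the right, which is what lets the translation survive the product.

```c
typedef struct { float w, x, y, z; } Quat;
typedef struct { Quat a, b; } DualQuat;   /* a + b*eps */

static Quat quat_mul(Quat p, Quat q)
{
    Quat r;
    r.w = p.w * q.w - p.x * q.x - p.y * q.y - p.z * q.z;
    r.x = p.w * q.x + p.x * q.w + p.y * q.z - p.z * q.y;
    r.y = p.w * q.y - p.x * q.z + p.y * q.w + p.z * q.x;
    r.z = p.w * q.z + p.x * q.y - p.y * q.x + p.z * q.w;
    return r;
}

static Quat quat_add(Quat p, Quat q)
{
    Quat r = { p.w + q.w, p.x + q.x, p.y + q.y, p.z + q.z };
    return r;
}

static DualQuat dq_mul(DualQuat p, DualQuat q)
{
    DualQuat r;
    r.a = quat_mul(p.a, q.a);
    r.b = quat_add(quat_mul(p.b, q.a), quat_mul(p.a, q.b));
    return r;
}

/* Quaternion & dual conjugate: qa^* - qb^* eps. */
static DualQuat dq_conj_both(DualQuat q)
{
    DualQuat r = { {  q.a.w, -q.a.x, -q.a.y, -q.a.z },
                   { -q.b.w,  q.b.x,  q.b.y,  q.b.z } };
    return r;
}

/* q_rigid = rot + ((0, t) / 2) * rot * eps */
static DualQuat dq_from_rot_trans(Quat rot, const float t[3])
{
    Quat half_t = { 0.0f, 0.5f * t[0], 0.5f * t[1], 0.5f * t[2] };
    DualQuat r;
    r.a = rot;
    r.b = quat_mul(half_t, rot);
    return r;
}

/* Embed v as 1 + (0, v) eps, apply the sandwich product, and read the
   transformed point back from the vector part of the dual part. */
static void dq_transform_point(DualQuat q, const float v[3], float out[3])
{
    DualQuat p = { { 1.0f, 0.0f, 0.0f, 0.0f }, { 0.0f, v[0], v[1], v[2] } };
    DualQuat r = dq_mul(dq_mul(q, p), dq_conj_both(q));
    out[0] = r.b.x;
    out[1] = r.b.y;
    out[2] = r.b.z;
}
```

With a 90-degree rotation about z and a translation of (1, 2, 3), the point (1, 0, 0) is first rotated to (0, 1, 0) and then translated to (1, 3, 3). Production code would normally use an expanded closed form rather than two full dual quaternion products per vertex.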
Geometric Interpretation
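• Recall: $q = \cos\frac{\theta}{2} + \vec{s}\,\sin\frac{\theta}{2}$ ($\vec{s}$ = axis, $\theta$ = angle)
• $\rightarrow \hat{q} = \cos\frac{\hat{\theta}}{2} + \hat{s}\,\sin\frac{\hat{\theta}}{2}$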
Dual Quaternions
$\hat{\theta} = \theta_a + \theta_b\varepsilon$ : dual quaternion representing only a rotation
• $t = \frac{(0, t_x, t_y, t_z)}{2}$ : translation vector, in quaternion form
• $\theta_a$ : angle of rotation
• $\theta_b = \langle t, s_a \rangle$ : translation along $s_a$
$\hat{s} = s_a + s_b\varepsilon$ : unit dual quaternion with a 0 scalar part
• $s_a = (0, s_x, s_y, s_z)$ : direction of axis of rotation
• $s_b = \left(\frac{1}{2}\left((s_a \times t)\cot\frac{\theta_a}{2} + t\right)\right) \times s_a$ : moment of rotation axis
Screw Transform!
• Rotation about an axis followed by translation along that axis.
• All rigid transforms can be described this way.
Dual Quaternions
Simple Case:
$DQB(\hat{q}_0, \hat{q}_1, t) = \frac{(1 - t)\hat{q}_0 + t\hat{q}_1}{\|(1 - t)\hat{q}_0 + t\hat{q}_1\|}$
Dual Quaternion Blend Skinning
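[Diagram: $\hat{q}_0$, $\hat{q}_1$, and the blended $\hat{q}_{DQB}$]
$DQB(\hat{q}_0, …, \hat{q}_n, w_0, …, w_n) = \frac{w_0\hat{q}_0 + … + w_n\hat{q}_n}{\|w_0\hat{q}_0 + … + w_n\hat{q}_n\|}$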
Dual Quaternion Blend Skinning
Unlike with matrix blending, the result is always a rigid transform!
• Very accurate, but not perfect.
• Can introduce accelerations when input dual quaternions differ greatly.
• 8.15 degrees : Maximum rotational deviation
• 15.1% : Maximum translational deviation
• Modified SLERP can be used if absolute accuracy is required.
• $SLERP(\hat{q}_0, \hat{q}_1, t) = (\hat{q}_1\hat{q}_0^*)^t\,\hat{q}_0$
• Efficiency tradeoff usually not worth it.
Dual Quaternion Blend Skinning
Must handle antipodality!
Polarity rule: $\hat{q} \equiv -\hat{q}$
We want: $\forall\, \hat{q}_0, …, \hat{q}_n : \langle \hat{q}_i, \hat{q}_j \rangle \ge 0$
Fix up all dual quaternions prior to skinning.
Dual Quaternion Blend Skinning
[Diagram: $\hat{q}$ and $-\hat{q}$]
for ( all bones’ unit dual quaternions, dq[i] )
    if ( InnerProduct( dq[i], dq[parent[i]] ) < 0.0 )
        Negate( dq[i] );
Dual Quaternion Blend Skinning
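The fix-up loop above could be written in C roughly as follows. This is a sketch under stated assumptions: the `DualQuat` struct and `fix_antipodality` name are hypothetical, the root bone is marked with a parent index of -1, bones are ordered so that every parent precedes its children, and the inner product is taken over the non-dual parts only.

```c
/* Hypothetical storage: two 4-float quaternions per bone. */
typedef struct { float real[4]; float dual[4]; } DualQuat;

/* Flip any bone whose non-dual part points away from its parent's.
   Assumes parent[i] < i for all non-root bones, so each parent has
   already been fixed up by the time its children are visited. */
static void fix_antipodality(DualQuat *dq, const int *parent, int numBones)
{
    for (int i = 0; i < numBones; ++i) {
        const int p = parent[i];
        if (p < 0)
            continue;               /* root bone: nothing to compare to */

        float dot = 0.0f;
        for (int k = 0; k < 4; ++k)
            dot += dq[i].real[k] * dq[p].real[k];

        if (dot < 0.0f) {
            /* q and -q encode the same rigid transform, so negating
               both parts changes nothing except the blend behavior. */
            for (int k = 0; k < 4; ++k) {
                dq[i].real[k] = -dq[i].real[k];
                dq[i].dual[k] = -dq[i].dual[k];
            }
        }
    }
}
```

Comparing against the parent only guarantees a consistent sign along each chain of the hierarchy; whether that is sufficient for every pair of bones influencing one vertex depends on the rig.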
// Input: unit quaternion 'q0', translation vector 't'
// Output: unit dual quaternion 'dq'
// Quaternions are stored as (x, y, z, w): vector part in [0..2], scalar in [3].
static void QuatTrans2UDQ( const float q0[4], const float t[3], float dq[2][4] )
{
    // Non-Dual Part: dq[0] = q0
    for ( int i=0; i<4; i++ )
        dq[0][i] = q0[i];
    // Dual Part: dq[1] = ((0,t[0],t[1],t[2])/2)*q0
    dq[1][3] = -0.5f*(t[0]*q0[0] + t[1]*q0[1] + t[2]*q0[2]); // Scalar Component
    dq[1][0] = 0.5f*( t[0]*q0[3] + t[1]*q0[2] - t[2]*q0[1]); // Vector Component 0
    dq[1][1] = 0.5f*(-t[0]*q0[2] + t[1]*q0[3] + t[2]*q0[0]); // Vector Component 1
    dq[1][2] = 0.5f*( t[0]*q0[1] - t[1]*q0[0] + t[2]*q0[3]); // Vector Component 2
}
Generating a Dual Quaternion
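Going the other way is a useful check on the conversion: for a unit dual quaternion the translation is twice the vector part of $q_b q_a^*$. The helper below is a hypothetical companion, not something shown in the talk; it assumes the same (x, y, z, w) storage order as `QuatTrans2UDQ` and takes the two halves of `dq` as separate arguments.

```c
/* Hypothetical inverse of QuatTrans2UDQ's dual part.
   Input:  'real' and 'dual' halves of a unit dual quaternion,
           each stored as (x, y, z, w)
   Output: translation vector 't'
   t = 2 * vec( dual * conj(real) )
     = 2 * ( aw*bv - bw*av + av x bv )                                */
static void UDQ2Trans( const float real[4], const float dual[4], float t[3] )
{
    const float aw = real[3];
    const float bw = dual[3];

    /* Cross product of the two vector parts: av x bv */
    const float cx = real[1]*dual[2] - real[2]*dual[1];
    const float cy = real[2]*dual[0] - real[0]*dual[2];
    const float cz = real[0]*dual[1] - real[1]*dual[0];

    t[0] = 2.0f*( aw*dual[0] - bw*real[0] + cx );
    t[1] = 2.0f*( aw*dual[1] - bw*real[1] + cy );
    t[2] = 2.0f*( aw*dual[2] - bw*real[2] + cz );
}
```

Calling it as `UDQ2Trans( dq[0], dq[1], t )` on the output of `QuatTrans2UDQ` should return the original translation up to floating-point error.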
Dual Quaternion Blending
Dual Quaternion Blend Skinning
// Input: array of dual quaternions 'dqIn'
// Input: array of weights 'w', totaling 1.0
// Input: size of the above two arrays (> 1)
// Output: the blended dual quaternion 'dqOut'
static void DQB( const float dqIn[][2][4], float w[], int numDQ,
                 float dqOut[2][4] )
{
    // dqOut = w[0]*dqIn[0]
    Vec4Scale( dqIn[0][0], w[0], dqOut[0] );
    Vec4Scale( dqIn[0][1], w[0], dqOut[1] );
    for( int i = 1; i < numDQ; ++i )
    {
        // dqOut += w[i]*dqIn[i]
        Vec4Mad( dqOut[0], w[i], dqIn[i][0], dqOut[0] );
        Vec4Mad( dqOut[1], w[i], dqIn[i][1], dqOut[1] );
    }
}
Transformation Using a Dual Quaternion
Dual Quaternion Blend Skinning
// Input: unit dual quaternion 'dq'
// Input: input position 'vecIn'
// Output: rigidly transformed position 'vecOut'
// Note: vec3_t/vec4_t, the Vec3*/Vec4* helpers, and I_sqrt are
// engine-side definitions that are not shown here.
static void DQTransform( const float dq[2][4],
                         const vec3_t vecIn, vec3_t vecOut )
{
    vec4_t q0, q1;
    float a0, ae, recipDeLen;
    vec3_t d0, de, temp1, temp2, temp3, temp4, temp5;
    vec3_t temp6, temp7, temp8, temp9, temp10, temp11;

    recipDeLen = 1.0f / I_sqrt( dq[0][3]*dq[0][3]
                              + dq[0][0]*dq[0][0]
                              + dq[0][1]*dq[0][1]
                              + dq[0][2]*dq[0][2] );

    // Normalize both parts of the dual quaternion, based
    // on the length of the non-dual part.
    Vec4Scale( dq[0], recipDeLen, q0 );
    Vec4Scale( dq[1], recipDeLen, q1 );

    // Isolate the scalar and vector parts of both
    // quaternions. This is just for code clarity and can
    // be omitted for SIMD optimization.
    a0 = q0[3];
    ae = q1[3];
    memcpy( d0, &q0[0], sizeof( d0 ));
    memcpy( de, &q1[0], sizeof( de ));

    // Transform 'vecIn' by the dual quaternion to produce
    // 'vecOut'. vecOut = dq*v*conj(dq), where conj is the
    // quaternion & dual conjugate, expanded into vector form:
    // vecOut = v + 2*d0 x (d0 x v + a0*v) + 2*(a0*de - ae*d0 + d0 x de)
    Vec3Cross( d0, vecIn, temp1 );
    Vec3Mad( temp1, a0, vecIn, temp2 );
    Vec3Scale( de, a0, temp3 );
    Vec3Scale( d0, ae, temp4 );
    Vec3Cross( d0, de, temp5 );
    Vec3Sub( temp3, temp4, temp6 );
    Vec3Add( temp6, temp5, temp7 );
    Vec3Scale( temp7, 2.0f, temp8 );
    Vec3Scale( d0, 2.0f, temp9 );
    Vec3Cross( temp9, temp2, temp10 );
    Vec3Add( vecIn, temp10, temp11 );
    Vec3Add( temp11, temp8, vecOut );
}
[Chart: Instruction Counts (XB360 VMX), Matrix Skinning (column-major) vs. DQB Skinning]
Dual Quaternion Blend Skinning
[Chart: Instruction Counts (XB360 GPU), Matrix Skinning (row-major) vs. DQB Skinning; categories: Blending (2), Blending (3), Blending (4), Transform Pos, Transform Vec]
Dual Quaternion Blend Skinning
[Table: DQ Skinning vs. Matrix Skinning on GCN GPU; rows: Aggregate $ Efficiency, VGPR Count, Memory Stalls, DRAM Footprint]
DQ vs. Matrix Skinning
DQ Skinning is ~24% faster***
Dual Quaternion Blend Skinning
***: Depends heavily on vertex layout, tangent space quality,
number of bones, and weighting distributions.
Optional Optimizations:
• Compress quaternions
• 10:10:10:2 format for non-dual component
• Tune max waves/SIMD
• Generate skinning transforms on the GPU
Dual Quaternion Blend Skinning
Procedural Motions
Dual Quaternion Blend Skinning
Spore © EA
(2008)
IK
Dual Quaternion Blend Skinning
Especially when animations are
played on characters with different
or custom proportions.
Ragdolls: Can you spot all the artifacts DQB would resolve?
Dual Quaternion Blend Skinning
Dual Quaternion Blend Skinning
Pros
• GPU/SIMD friendly
• No asset changes required
• Cheaper transform blending
• More cache friendly
• Requires less memory/constants
• Conducive to procedural motions
• (Mostly) replaces the need for the rotational split bones mentioned earlier.
• Can be enabled selectively (per-LOD, per-submesh, high end machines only)
Dual Quaternion Blend Skinning
Cons
• Less intuitive than matrices
• Local scaling must be handled separately
• Actual vertex transform costs more ALU
• Still not 100% accurate
• Potential bulge artifacts
• Not widely adopted in games (yet)
• No more flat asses!
Skinning
Blend Shapes
Skinning
Geometry Caching
Skinning
• “Bulging-free dual quaternion skinning” (Kim, 2014)
Skinning
Skinning
1. $Bulge(v_t) = \mathrm{CalcBulge}(v_0, FKbones_0, FKbones_{min}, FKbones_{max}, ProceduralBones)$
2. Solve for: Bone weights on $v_0$ to minimize $Bulge(v_t)$ for all $t$.
3. Re-weight artist-selected vertices in Maya/Max.
Skinning
• The optimal model skinning approach can vary per platform.
• Give dual quaternion skinning a look.
• Don’t assume skinning is a “solved problem”.
(Unless you’re Leatherface)
Conclusion
Rulon@InfinityWard.com
Questions?

Umbra Ignite 2015: Rulon Raymond – The State of Skinning – a dive into modern approaches to model skinning

  • 1. The State of Skinning … Or How To Maintain Your Physique
  • 3. Rulon Raymond Sr. Engine Programmer Introduction
  • 4. 1) Review 2) Evolution of techniques on console HW 3) The new hotness (hint: it’s a Clifford Algebra) 4) Extensions DISCLAIMER: All screenshots and techniques presented are not associated with any specific title, project, or oragnization, unless otherwise stated. Outline
  • 6. What is Skinning? I Was Skinning Long Before 3D Animated Models Were All The Rage
  • 7.  Step 1: Generate a cool animated pose.  Step 2: ???  Step 3: Use fancy lighting and shaders to draw an animated model on-screen (i.e. profit) What is Skinning?
  • 9. Skinned Model, ready for drawing Model Vertices Bone Weights Bone Transforms What is Skinning?
  • 10. 𝑣′ = 𝑆𝑘𝑖𝑛(𝑊, 𝑇, 𝑣) What is Skinning? 𝑣: The initial vertex transform 𝑊: Array of bone weighting values 𝑇: Array of bone transforms 𝑣′: The final vertex transform
  • 12. • Sony Playstation (1995) • Geometry Transform Engine (GTE) Skinning on Consoles
  • 13. • Sony Playstation2 (2000) • Vector Unit 0 (VU0) Skinning on Consoles
  • 14.  Microsoft Xbox (2001)  NVIDIA GPU (DirectX 8.x) Skinning on Consoles
  • 15.  Microsoft Xbox 360 (2005)  PowerPC CPU Skinning Implementation
  • 16.  Sony PS3 ( 2006)  Synergistic Processing Units (SPU’s) Skinning on Consoles
  • 17. Why not use the GPU for skinning on Xbox 360 and PS3? The CPU’s/SPU’s are actually quite fast. Skinning Implementation 3X @3.2Ghz 6X @3.2Ghz (with many restrictions…)
  • 18. Why not use the GPU for skinning on Xbox 360 and PS3? Split Vertex Streams Skinning on Consoles Vertex Position, Tangent Space •Skinned Colors, UV’s, etc. •Constant – sent straight to GPU Stream 0 Stream 1
  • 19. Why not use the GPU for skinning on Xbox 360 and PS3? Unified Memory Architecture Skinning on Consoles // Just skinned a vertex. Now write it out as // three 16-byte vectors __stvx( skinnedVertexData0, vertsOutBuffer, 0 ); __stvx( skinnedVertexData1, vertsOutBuffer, 16 ); __stvx( skinnedVertexData2, vertsOutBuffer, 32 ); // Gah – why’d that take so long? // ~20% faster! // (F*&^% write-combine memory) __stvx( skinnedVertexData0, vertsOutBuffer, 0 ); _WriteBarrier(); __stvx( skinnedVertexData1, vertsOutBuffer, 16 ); _WriteBarrier(); __stvx( skinnedVertexData2, vertsOutBuffer, 32 );
  • 20. Why not use the GPU for skinning on Xbox 360 and PS3? So you can use the GPU for other things. Skinning on Consoles
  • 21.  Microsoft Xbox One (2013)  Sony PS4 (2013)  AMD GCN GPU Skinning on Consoles
  • 22. Skinning on Consoles GPU Frame Draw Calls IDLE Draw Calls Post FX IDLEGCN Compute Unit GCN Compute Unit
  • 23. Async Compute Skinning Skinning on Consoles GPU Frame Draw Calls Skinning Draw Calls Post FX SkinningGPU Compute Unit GPU Compute Unit
  • 24. • Generate Draw List (frame N) Visible Models • Async Compute Dispatch Thread. Model Skinning Workloads • GPU rendering (frame N-1) Skinned Model (frame N) Skinning on Consoles Async Compute Skinning
  • 27. The standard approach to real-time skinning, used in almost every modern 3D game. Linear Matrix Blend Skinning Suffers from some well- documented problems...
  • 28. The “candy wrapper” effect Linear Matrix Blend Skinning
  • 29. Mesh Volume Preservation Example: “flat ass syndrome” Linear Matrix Blend Skinning
  • 30. Q: Why do these problems exist? A: Let’s take a closer look at the underlying math… Linear Matrix Blend Skinning
  • 32. Linear Matrix Blend Skinning: Apply the property of distributivity: $v' = \left(\sum_{i=1}^{n} w_i M_{j_i}\right) v$. To keep it simple, let $M_{j_i}$ represent a rigid transform: no scale, shear, … This is the most common scenario for skinning in games.
  • 33. Linear Matrix Blend Skinning: A linear combination of rigid transforms DOES NOT yield a rigid transform! Orthonormal matrices aren’t closed under addition. Scaling values can creep into the final vertex transforms. Extreme cases can result in rank-deficient matrices. [Figure: the “candy wrapper” artifact, showing $v$, $M_{j_1} v$, $M_{j_2} v$, and the blended $v'$]
  • 34. Linear Matrix Blend Skinning: The most common workaround to these issues is the addition of new bones, either hand-animated or procedural. Split the rotation of a joint, relative to its parent, into even increments, for a single axis only. Example: Arm Twist Bone. Parented to the shoulder and consistently represents exactly half its twist (roll) motion.
  • 35. Adding these bones is not free!  Memory and processing overhead.  Exact amount depends on actual implementation. Linear Matrix Blend Skinning
  • 36.  Dual Quaternions to the rescue!  But what exactly are they?  Let’s start with a quick review of the vanilla variety of quaternions… Linear Matrix Blend Skinning
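  • 37. Quaternions (Hamilton, 1843): a 4D extension of complex numbers. $q = [a, b, c, d]$; $q = a + bi + cj + dk$, with $i^2 = j^2 = k^2 = ijk = -1$; $q = (r, \vec{v})$, $r \in \mathbb{R}$, $\vec{v} \in \mathbb{R}^3$; $q = \left(\cos\frac{\theta}{2},\ \vec{s}\,\sin\frac{\theta}{2}\right)$, $\theta \in \mathbb{R}$, $\vec{s} \in \mathbb{R}^3$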
  • 38. Quaternions: For our purposes all we care about is unit quaternions, which conveniently represent rotations. Conjugate: $q^* = a - bi - cj - dk$. For unit quaternions: $q^* = q^{-1}$ when $\|q\| = 1$.
  • 39. Quaternions: One important quaternion equation to note: $v' = q\,v\,q^*$, with $v = (0, x, y, z)$. Applies a rotation to a 3D point.
  • 40. Dual Numbers: $d = a + b\varepsilon$, $\varepsilon^2 = 0$. Similar in form to complex numbers. Stored as: $d = [a, b]$
  • 41. Dual Numbers: Conjugate: $d^* = a - b\varepsilon$. Multiplication: $d_0 d_1 = (a_0 + b_0\varepsilon)(a_1 + b_1\varepsilon) = a_0 a_1 + (a_0 b_1 + b_0 a_1)\varepsilon$
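  • 42. Dual Quaternions: Basically a quaternion whose elements are dual numbers. Quaternion form: $\hat{q} = \hat{w} + i\hat{x} + j\hat{y} + k\hat{z}$, where $\hat{w}$ is the scalar part (dual number) and $(\hat{x}, \hat{y}, \hat{z})$ is the vector part (dual vector). Dual number form: $\hat{q} = q_a + q_b\varepsilon$, where $q_a$ is the “non-dual part” and $q_b$ is the “dual part”. Most useful for skinning.
  • 43. Dual Quaternions: Multiplication: $\hat{p}\hat{q} = p_a q_a + (p_b q_a + p_a q_b)\varepsilon$. Quaternion Conjugate: $\hat{q}^* = q_a^* + q_b^*\varepsilon$. Dual Conjugate: $\overline{\hat{q}} = q_a - q_b\varepsilon$. Quaternion & Dual Conjugate: $\overline{\hat{q}}^* = q_a^* - q_b^*\varepsilon = (q_a - q_b\varepsilon)^*$
  • 44. Dual Quaternions: $\mathrm{Norm}(\hat{q}) = \|q_a\| + \frac{\langle q_a, q_b \rangle}{\|q_a\|}\varepsilon$. For unit dual quaternions: $\hat{q}^* = \hat{q}^{-1}$ when $\|\hat{q}\| = 1$.
  • 45. Dual Quaternions: Rigid Transforms: $\hat{q}_{rotation} = q_a + 0\varepsilon$; $\hat{q}_{translation} = (1,0,0,0) + \frac{(0, t_x, t_y, t_z)}{2}\varepsilon$; $\hat{q}_{rigid} = \hat{q}_{translation}\,\hat{q}_{rotation} = q_a + \frac{(0, t_x, t_y, t_z)}{2}\,q_a\,\varepsilon$
  • 46. Dual Quaternions: Transforming a 3D point: $\hat{v}' = \hat{q}_{rigid}\,\hat{v}\,\hat{q}_{rigid}^{-1}$, with $\hat{v} = (1,0,0,0) + (0, v_x, v_y, v_z)\varepsilon$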
  • 47. Dual Quaternions: Geometric Interpretation. Recall: $q = \cos\frac{\theta}{2} + s\,\sin\frac{\theta}{2}$ ($s$ = axis, $\theta$ = angle) → $\hat{q} = \cos\frac{\hat{\theta}}{2} + \hat{s}\,\sin\frac{\hat{\theta}}{2}$. $\hat{\theta} = \theta_a + \theta_b\varepsilon$: dual quaternion representing only a rotation; $t = \frac{(0, t_x, t_y, t_z)}{2}$: translation vector, in quaternion form; $\theta_a$: angle of rotation; $\theta_b = \langle t, s_a \rangle$: translation along $s_a$. $\hat{s} = s_a + s_b\varepsilon$: unit dual quaternion with a 0 scalar part; $s_a = (0, s_x, s_y, s_z)$: direction of axis of rotation; $s_b = \left(\frac{1}{2}\left(s_a \times t\,\cot\frac{\theta_a}{2} + t\right)\right) \times s_a$: moment of rotation axis
  • 48. Screw Transform!  Rotation about an axis followed by translation along that axis.  All rigid transforms can be described this way. Dual Quaternions
  • 49. Dual Quaternion Blend Skinning: Simple Case: $DQB(\hat{q}_0, \hat{q}_1, t) = \frac{(1 - t)\,\hat{q}_0 + t\,\hat{q}_1}{\|(1 - t)\,\hat{q}_0 + t\,\hat{q}_1\|}$ [Figure: $\hat{q}_0$, $\hat{q}_1$, and the blended $\hat{q}_{DQB}$]
  • 50. Dual Quaternion Blend Skinning: $DQB(\hat{q}_0, \dots, \hat{q}_n;\ w_0, \dots, w_n) = \frac{w_0\hat{q}_0 + \dots + w_n\hat{q}_n}{\|w_0\hat{q}_0 + \dots + w_n\hat{q}_n\|}$. Unlike with matrix blending, the result is always a rigid transform!
  • 51. Dual Quaternion Blend Skinning: Very accurate, but not perfect. Can introduce accelerations when input dual quaternions differ greatly. Maximum rotational deviation: 8.15 degrees. Maximum translational deviation: 15.1%. Modified SLERP can be used if absolute accuracy is required: $SLERP(\hat{q}_0, \hat{q}_1, t) = (\hat{q}_1\,\hat{q}_0^*)^t\,\hat{q}_0$. Efficiency tradeoff usually not worth it.
  • 52. Dual Quaternion Blend Skinning: Must handle antipodality! Polarity rule: $\hat{q} \equiv -\hat{q}$. We want: $\forall\, \hat{q}_0, \dots, \hat{q}_n : \langle \hat{q}_i, \hat{q}_j \rangle \ge 0$. Fix up all dual quaternions prior to skinning. [Figure: $\hat{q}$ and $-\hat{q}$]
        for ( all bones’ unit dual quaternions, dq[i] )
            if ( InnerProduct( dq[i], dq[parent[i]] ) < 0.0 )
                Negate( dq[i] );
  • 53. Dual Quaternion Blend Skinning: Generating a Dual Quaternion
        // Input: unit quaternion 'q0', translation vector 't'
        // Output: unit dual quaternion 'dq'
        static void QuatTrans2UDQ( const float q0[4], const float t[3], float dq[2][4] )
        {
            // Non-Dual Part: dq[0] = q0
            for ( int i=0; i<4; i++ )
                dq[0][i] = q0[i];

            // Dual Part: dq[1] = ((0,t[0],t[1],t[2])/2)*q0
            dq[1][3] = -0.5f*(t[0]*q0[0] + t[1]*q0[1] + t[2]*q0[2]);  // Scalar Component
            dq[1][0] = 0.5f*( t[0]*q0[3] + t[1]*q0[2] - t[2]*q0[1]);  // Vector Component 0
            dq[1][1] = 0.5f*(-t[0]*q0[2] + t[1]*q0[3] + t[2]*q0[0]);  // Vector Component 1
            dq[1][2] = 0.5f*( t[0]*q0[1] - t[1]*q0[0] + t[2]*q0[3]);  // Vector Component 2
        }
  • 54. Dual Quaternion Blend Skinning: Dual Quaternion Blending
        // Input: array of dual quaternions 'dqIn'
        // Input: array of weights 'w', totaling 1.0
        // Input: size of the above two arrays (> 1)
        // Output: the blended dual quaternion 'dqOut'
        static void DQB( const float dqIn[][2][4], float w[], int numDQ, float dqOut[2][4] )
        {
            // dqOut = w[0]*dqIn[0]
            Vec4Scale( dqIn[0][0], w[0], dqOut[0] );
            Vec4Scale( dqIn[0][1], w[0], dqOut[1] );

            for( int i = 1; i < numDQ; ++i )
            {
                // dqOut += w[i]*dqIn[i]
                Vec4Mad( dqOut[0], w[i], dqIn[i][0], dqOut[0] );
                Vec4Mad( dqOut[1], w[i], dqIn[i][1], dqOut[1] );
            }
        }
  • 55. Dual Quaternion Blend Skinning: Transformation Using a Dual Quaternion
        // Input: unit dual quaternion 'dq'
        // Input: input position 'vecIn'
        // Output: rigidly transformed position 'vecOut'
        static void DQTransform( const float dq[2][4], const vec3_t vecIn, vec3_t vecOut )
        {
            vec4_t q0, q1;
            float a0, ae, recipDeLen;
            vec3_t d0, de, temp1, temp2, temp3, temp4, temp5;
            vec3_t temp6, temp7, temp8, temp9, temp10, temp11;

            recipDeLen = 1.0f / I_sqrt( dq[0][3]*dq[0][3] + dq[0][0]*dq[0][0] +
                                        dq[0][1]*dq[0][1] + dq[0][2]*dq[0][2] );

            // Normalize both parts of the dual quaternion, based
            // on the length of the non-dual part.
            Vec4Scale( dq[0], recipDeLen, q0 );
            Vec4Scale( dq[1], recipDeLen, q1 );

            // Isolate the scalar and vector parts of both
            // quaternions. This is just for code clarity and can
            // be omitted for SIMD optimization.
            a0 = q0[3];
            ae = q1[3];
            memcpy( d0, &q0[0], sizeof( d0 ));
            memcpy( de, &q1[0], sizeof( de ));

            // Transform 'vecIn' by the dual quaternion
            // to produce 'vecOut'. vecOut = dq*v*dq^-1
            Vec3Cross( d0, vecIn, temp1 );
            Vec3Mad( temp1, a0, vecIn, temp2 );
            Vec3Scale( de, a0, temp3 );
            Vec3Scale( d0, ae, temp4 );
            Vec3Cross( d0, de, temp5 );
            Vec3Sub( temp3, temp4, temp6 );
            Vec3Add( temp6, temp5, temp7 );
            Vec3Scale( temp7, 2.0f, temp8 );
            Vec3Scale( d0, 2.0f, temp9 );
            Vec3Cross( temp9, temp2, temp10 );
            Vec3Add( vecIn, temp10, temp11 );
            Vec3Add( temp11, temp8, vecOut );
        }
  • 56. Dual Quaternion Blend Skinning [Chart: Instruction Counts (XB360 VMX), comparing Matrix Skinning (column-major) and DQB Skinning; axis from 0 to 35]
  • 58. Dual Quaternion Blend Skinning on GCN GPU [Table: DQ Skinning vs. Matrix Skinning, compared on Aggregate Cache ($) Efficiency, VGPR Count, Memory Stalls, and DRAM Footprint]
  • 59. DQ vs. Matrix Skinning DQ Skinning is ~24% faster*** Dual Quaternion Blend Skinning ***: Depends heavily on vertex layout, tangent space quality, number of bones, and weighting distributions.
  • 60. Optional Optimizations:  Compress quaternions  10:10:10:2 format for non-dual component  Tune max waves/SIMD  Generate skinning transforms on the GPU Dual Quaternion Blend Skinning
  • 65. Procedural Motions Dual Quaternion Blend Skinning Spore © EA (2008)
  • 66. IK Dual Quaternion Blend Skinning Especially when animations are played on characters with different or custom proportions.
  • 67. Ragdolls: Can you spot all the artifacts DQB would resolve? Dual Quaternion Blend Skinning
  • 73. Dual Quaternion Blend Skinning. Pros: GPU/SIMD friendly; no asset changes required; cheaper transform blending; more cache friendly; requires less memory/constants; conducive to procedural motions; (mostly) replaces the need for the rotational split bones mentioned earlier; can be enabled selectively (per-LOD, per-submesh, high end machines only). Cons: less intuitive than matrices; local scaling must be handled separately; actual vertex transform is more ALU; still not 100% accurate; potential bulge artifacts; not widely adopted in games (yet). No more flat asses!
  • 77.  “Bulging-free dual quaternion skinning” (Kim, 2014) Skinning
  • 79. Skinning: 1. $Bulge_{v_t} = \mathrm{CalcBulge}(v_0, FKbones_0, FKbones_{min}, FKbones_{max}, ProceduralBones)$ 2. Solve for: bone weights on $v_0$ to minimize $Bulge_{v_t}$ for all $t$. 3. Re-weight artist-selected vertices in Maya/Max.
  • 80.  The optimal model skinning approach can vary per platform.  Give dual quaternion skinning a look.  Don’t assume skinning is a “solved problem”. (Unless you’re Leatherface) Conclusion

Editor's Notes

  1. Hello everyone, and welcome to Umbra Ignite 2015! <thank Umbra for organizing this event, and let the audience know they have some awesome talks ahead of them>
  2. Hello, everyone. My name is Rulon Raymond, I’m a Sr. Engine Programmer at Infinity Ward, and today I’m going to talk on a subject that I feel doesn’t get a lot of attention outside of film or academia, despite its critical role in so many video games – Skinning. So, when I was contemplating what I should talk about at this event I had narrowed it down to two subjects. One was more high level, and the other more focused. I heard this crowd was a fan of technical talks so I rolled the dice and decided to go with the latter. To anyone hoping for a hand-wavey lecture on “the struggles of the modern game developer” – apologies.
  3. Outline: Review skinning The evolution of skinning techniques on console hardware Dual Quaternion Blend Skinning Extensions DISCLAIMER
  4. Let’s start by taking a step back to answer the question: what is skinning?
  5. It’s much more than Leatherface’s favorite pastime – it’s also a crucial rendering process present in almost any modern 3D game.
  6. Step 1: Combine a bunch of key-framed and procedural bone transforms to construct a pose. Step 2: <silence> Step 3: Profit!
  7. Here is a visual example of skinning in action. On the left we have a static character model. However, when combined with an animated skeletal pose, we are able to morph it into whatever shape the artists and animators desire.
  8. Each frame of the game, as part of the preparation for rendering objects, we take a bunch of bone weights, bone transforms, and model vertices – run them through a skinning process, and Voila! – a skinned model, ready for rendering.
  9. What skinning really boils down to is a function that operates on input vertex data, a series of bone transforms, and an array of weights to the different bone transforms, to produce a new vertex.
  10. Over the past 20 years, the tactics used to skin models in games have evolved quite a bit. I’m going to focus mainly on the console side of things, since it’s their architectures that have driven a lot of the decisions here. Also, I want to stress that the following approaches are not based on any particular game, but rather my own experiences combined with a survey of other titles.
  11. When Sony released the original Playstation in 1995, it included a dedicated “Geometry Transform Engine” coprocessor. Aside from the normal world view model projection transforms needed to show off those sweet flat-shaded polygons, it could be used for very basic skinning. There were a number of limitations here - the least of which being constrained to fixed point operation - but it was convincing enough to bring life to many characters.
  12. Co-processors were still in style when the PS2 hit the scene. This time around there were a couple more general purpose vector unit coprocessors that were a total pain to work with, but by far the best choice for all your skinning needs. Since VU1 was typically busy with work that the GPU should really have been capable of doing itself, a few lines of perfectly-written macrocode and DMA instructions was all it took to make efficient skinning a reality.
  13. The original X-Box was very PC-like. As such it was a no-brainer to shove skinning into a vertex shader and run it on the GPU. The trick then became reusing the skinning results across multiple draw calls. Basically avoiding the need to re-skin the same model multiple times if it needed to be rendered in a split-screen game, etc.
  14. For the Xbox 360, it seems the trend was to move skinning back off the GPU and onto its shiny new multi-core PowerPC CPU’s. I’ll get into the reasons why in just a second.
  15. The PS3’s SPUs were born to perform massive amounts of serialized computation, and are thus perfect for skinning. No contest here.
  16. I’d like to take a quick aside and dive a bit deeper into why the CPUs were often chosen for skinning models on the Xbox 360 and PS3. For one, the processors on these machines are quite fast. In fact, purely in terms of raw ALU throughput they’re technically more powerful than the Xbox One and PS4. Of course to realize that power, the skinning routines written for these consoles often need to be written in hand-tuned VMX intrinsics or assembly.
  17. With the newer graphics hardware and API, these consoles were able to catch up to PCs a bit and allow for individual vertex data to be split into separate streams. This means we only need to skin geometric vertex data, such as positions, normals, binormals, and tangents. Data that’s not typically affected by articulated transforms, such as texture coordinates, can be stored in separate buffers that are completely skipped over during skinning. When rendering, we just need to match up the skinned vertex stream with the rest of the vertex data. By iterating over a much smaller set of data each frame we’re able to save quite a bit of time. If you’re not already doing this, I’d highly recommend giving it a shot as it’s really a key skinning optimization that holds true with all modern platforms.
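A minimal sketch of the split-stream layout described in this note. The struct and function names are hypothetical, and the 48-byte skinned layout is only an assumption borrowed from the "three 16-byte vectors" written out on the Unified Memory Architecture slide; the attributes an engine actually puts in each stream will differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Stream 0 (hypothetical layout): rewritten by the skinning pass every frame. */
typedef struct {
    float position[4];
    float normal[4];
    float tangent[4];
} SkinnedVertex;

/* Stream 1 (hypothetical layout): constant data, sent straight to the GPU. */
typedef struct {
    uint32_t color;
    float    uv[2];
} StaticVertex;

/* Bytes the skinning pass has to write per frame for 'numVerts' vertices. */
static size_t SkinnedBytesPerFrame( size_t numVerts )
{
    return numVerts * sizeof( SkinnedVertex );
}

/* Bytes it would have to write if the constant attributes were interleaved too. */
static size_t InterleavedBytesPerFrame( size_t numVerts )
{
    return numVerts * ( sizeof( SkinnedVertex ) + sizeof( StaticVertex ) );
}
```

With these made-up layouts the saving is modest; the more constant attributes a vertex carries (extra UV sets, vertex colors), the more the split pays off.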
  18. By using general purpose processors for skinning, we are able to write directly to memory that the GPU can read at full speed as well. Since these processors use in-order execution, and most likely you’ll be writing to memory address spaces flagged as write-combine, it’s important to enforce the order that you’re writing out data. For instance, I was working on some fully optimized VMX-based skinning code and found the skinned vertex data was sometimes being committed to memory non-sequentially, even though it was processed that way. By adding a lightweight memory barrier around each vector write operation we saw an almost 20% gain in overall performance.
  19. Finally, this generation of consoles moved their GPU’s away from having fixed purpose processing units, to a unified shading architecture. This meant we could take GPU cycles that may have been used for skinning in a vertex shader, and instead put them to use for fancy post-processing fx, or other new-fangled render features to help your title stand out.
  20. With the recent arrival of new hardware, the question once again arose as to where is the best place to perform model skinning. The switch to more general purpose out-of-order CPU’s is great for most game code, but since the clock speeds have dropped a bit compared to their predecessors, hand-optimized assembly wasn’t going to cut it. Especially when you’re also looking to push the bar forward with even more bones and model data. Luckily, both of these consoles share a common GPU architecture – AMD’s “Graphics Core Next”, or GCN.
  21. So, this very, VERY, simplified diagram of a GPU frame shows how we are not maximizing throughput at certain times in the frame. During the first set of draw calls, and during the post processing effects, we have a compute unit pretty much idle. This may be a result of poor shader code, but it might also be unavoidable. Since optimizing shaders for maximum throughput is outside the scope of this talk, let’s just assume everything here is as optimal as it can be. Now if only there was something we could do with those wasted GPU resources….
  22. And it’s async Compute Skinning to the rescue! A nice feature of the GCN GPU’s is the introduction of asynchronous compute engines. These allow for independent scheduling and compute shader dispatching of workloads that can run alongside general draw calls. By using asynchronous dispatches we are able to fill in the gaps left by the renderer and fit skinning in at a very low cost. Even in a perfect scenario it won’t quite be free, but with careful scheduling and GPU resource divisions you can get close. Looking at this diagram, some might wonder how we can be so casual about when our skinning process takes place. If it’s concurrent with post-processing fx, isn’t that a bit late in the frame? Shouldn’t all models be skinned before they’re drawn?
  23. It turns out this was fine for us, for a handful of reasons. For one, we don’t want to actually kick off a skinning workload for a model until we know it’s going to be visible. This means that we can kick off skinning workloads while constructing draw lists for the GPU. As soon as we determine a skinned model is visible, we queue up its skinning compute work to an async compute queue, and rely on a custom background CPU thread to actually submit the work to the GPU, ideally at times when it will least impact outstanding render calls.
  24. Now that we’ve gone over the evolution of skinning on consoles, I’m going to switch gears a bit and dig deeper into exactly what all that skinning code we’re writing for the “processor du jour” is actually doing.
  25. … unfortunately for some, that means exploring some of the underlying math in modern skinning approaches. More specifically I’ll visit the path we’ve recently journeyed down at Infinity Ward, as a major shift in how we skin all of our models. I realize the inclusion of not just one, but many, actual math equations is a bit of a Powerpoint faux pas. However, I’ve decided to do it anyway after hearing from Farhad that this crowd was a huge fan of more technical talks. Also, I’ve always considered it a bit of a cop out when presenters just gloss over subjects that are heavy on math not everyone may be familiar with. If your eyes tend to bleed when presented with summations or equivalence conditions, you may want to close them for the next few minutes. Then blame Farhad :-p
  26. Almost all modern 3D games utilize a linear matrix blending approach to mesh skinning. Here we see just a couple possibly familiar examples. It’s tried and true, and has, for better or worse, become the de facto solution that few developers question. However, it does exhibit some problems that have become all too familiar to character artists and riggers.
  27. Though it’s ugly enough to where very few games actually ship with this issue in plain sight, anyone that’s worked with mesh skinning or animation has probably run into the “candy wrapper” effect. That’s the not-so-technical term used to describe joints that have collapsed on themselves when they are rotated further than anticipated when the mesh was authored. On the left we have an example with a pair of overly twisted arms that have produced some scrunched elbows. And on the right is a much more extreme case of what happens when an alien attempts a Poltergeist-style head spin.
  28. A much more omnipresent, but subtle side effect of linear matrix blend skinning is the failure to preserve mesh volume. This is most pronounced when bones are animated to poses that are substantially different from the pose used as the basis for authoring the skinning data. Here we can see how this malady has manifested itself as “flat ass syndrome” on this skater. His left leg has been rotated almost 90 degrees from his standing base pose which has caused his ass to suck deeper into the pelvic area. This, of course, is not desirable. I mean, imagine if it was Kim Kardashian performing this kickflip… she would never sign off on her likeness.
  29. Why do these problems exist in the first place? To gain a better understanding, let’s take a closer look at the linear matrix blend skinning routine.
  30. Here we have the mathematical formula for transforming a vertex by a series of weighted matrices. Every bone that influences a given vertex gets its matrix scaled by a weight, where the sum of all weights equals one, then transforms the vertex position. All these transformed values are subsequently summed up to produce the final skinned position. Of course any vertex can contain normals, binormals, and tangents, in addition to position, that may need this transform applied as well.
  31. If we take the equation on the previous slide and apply the property of distributivity, we can yield something a bit more useful. Here it’s obvious that the skinning is performed by transforming all vertices by a weighted sum of bone matrices. Also, for the remainder of this presentation all bone transforms will be assumed to be rigid – meaning they contain translation and rotation only - unless otherwise stated. This isn’t a requirement, but is the most common scenario in games and simplifies much of the underlying math.
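Written out, the step from the per-bone form to the blended-matrix form looks like this. The symbols follow the slides ($w_i$ are the weights, $M_{j_i}$ is the matrix of the $i$-th influencing bone); the per-bone starting form on the left is reconstructed from the spoken description rather than copied from a slide.

```latex
v' = \sum_{i=1}^{n} w_i \left( M_{j_i}\, v \right)
   = \left( \sum_{i=1}^{n} w_i\, M_{j_i} \right) v,
\qquad \sum_{i=1}^{n} w_i = 1
```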
  32. The biggest problem with this approach to mesh skinning is that a linear combination of rigid transform matrices does not yield a rigid transform. Even though we start with a set of orthonormal matrices, they are not closed under addition and as a result, scaling values can creep into the final blended transform. Extreme cases can even result in rank-deficiencies – which is when the 3 columns of the rotation matrix are no longer linearly independent. This is the specific cause of the most pronounced “candy wrapper” effect. [describe image] Generally speaking, when we have all these joints that are performing simple rotations, but when blended together end up doing much more, it’s not surprising that we are seeing some artifacts.
  33. Luckily, these problems have become so familiar that most artists (and programmers) see them as a fact of life and have devised some nifty workarounds. The most common workaround to these issues is the addition of new bones to the skeleton that help distribute joint transforms.
  34. Of course adding these helper bones to a rig is not free. If their motions are exported as part of the keyframed animations, they will increase the file size of all animations that are used by the modified rig. If they are procedural, that means a series of potentially complex transforms must be calculated each frame, thus increasing the time required to generate a final animated pose.
  35. It turns out we have an alternate approach to avoiding the artifacts inherent to linear matrix blend skinning – Dual Quaternions! But what exactly are they, and how will they help us with skinning? Let’s start with a quick review of the vanilla variety of quaternions.
  36. Quaternions were conjured up by this photogenic fellow, Sir William Hamilton, while he was traveling under a bridge in 1843 and are essentially a 4D extension of complex numbers. They’re often represented in a few different ways, depending on how they’re planned to be used. As a 4-vector: the most common way to store and manipulate them in games, as it fits in a standard SSE/VMX register. As a polynomial, where the coefficients adhere to the signature quaternion identity [recite]. As a real number “scalar part” paired with a 3-vector “imaginary part”. As a function of a unit axis and angle pair.
  37. For the purpose of this talk, the only quaternions we really care about lie on the unit hypersphere. They are commonly used to represent rotations, with well-documented routines for conversion to and from matrices, axis/angle pairs, and Euler angles. It’s because of this handy property that quaternions have become ubiquitous in game engines, particularly for dealing with skeletal animations. One noteworthy property of unit quaternions is that their conjugates, denoted by an asterisk superscript, are essentially equivalent to their inverses, which comes in handy for figuring out the inverse of supplied rotations.
  38. One last important quaternion equation to note is the one that applies a rotation to a 3D point. For a given point we create a quaternion with a 0 scalar part, and a vector part composed of the point’s x, y, and z coordinates. We’ll call that “v”. We then take this value and multiply it by a quaternion, “q”, that represents the desired rotation, along with its conjugate. Since quaternion multiplication is non-commutative, the order of operations does matter here. It might not be as efficient as applying a 3x3 rotation matrix but yields the same effect and is handy in cases where you’re forced to deal with quaternions alone.
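The rotation equation above can be sketched directly in C. This is an illustrative, unoptimized version; the function names are hypothetical, and the component layout `[x, y, z, w]` (scalar last) is an assumption chosen to match the indexing in the slide code, where `q0[3]` is the scalar part.

```c
/* Hamilton product out = a*b, with components stored as [x, y, z, w]. */
static void QuatMul( const float a[4], const float b[4], float out[4] )
{
    out[3] = a[3]*b[3] - a[0]*b[0] - a[1]*b[1] - a[2]*b[2];
    out[0] = a[3]*b[0] + a[0]*b[3] + a[1]*b[2] - a[2]*b[1];
    out[1] = a[3]*b[1] - a[0]*b[2] + a[1]*b[3] + a[2]*b[0];
    out[2] = a[3]*b[2] + a[0]*b[1] - a[1]*b[0] + a[2]*b[3];
}

/* Rotate point 'p' by UNIT quaternion 'q': v' = q * v * q^*, with v = (p, 0). */
static void QuatRotatePoint( const float q[4], const float p[3], float out[3] )
{
    const float v[4]     = {  p[0],  p[1],  p[2], 0.0f };
    const float qConj[4] = { -q[0], -q[1], -q[2], q[3] };
    float qv[4], r[4];

    QuatMul( q, v, qv );       /* order matters: the product is non-commutative */
    QuatMul( qv, qConj, r );

    out[0] = r[0];
    out[1] = r[1];
    out[2] = r[2];
}
```

For example, a quarter turn about Z, `q = (0, 0, sin 45°, cos 45°)`, takes the point (1, 0, 0) to (0, 1, 0).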
  39. We have one more topic to examine before getting into dual quaternions. Dual numbers are similar to complex numbers in that they take a real value and extend it with a special epsilon term, also called the “dual unit”. In complex numbers we had the “i” term where i^2=-1, and in dual numbers we have this epsilon term where epsilon^2=0, but epsilon itself is not equal to 0.
  40. The conjugate is also similar to that of complex numbers but multiplication of two dual numbers – something we will need shortly - is a bit unique. Notice how the dual part DOES NOT feed back into the non-dual part. As with quaternions, there’s definitely a wealth of information and complexity related to dual numbers, but this is all we’ll need to know for our particular skinning solution.
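As a quick illustration of the multiplication rule, here is a tiny dual number type in C. The type and function names are invented for this sketch. Because the epsilon-squared term vanishes, the product of the two dual parts never reaches the result.

```c
/* Dual number a + b*eps, where eps*eps = 0 (hypothetical type for illustration). */
typedef struct { float a; float b; } Dual;

/* (a0 + b0*eps)(a1 + b1*eps) = a0*a1 + (a0*b1 + b0*a1)*eps; the b0*b1 term drops out. */
static Dual DualMul( Dual d0, Dual d1 )
{
    Dual r;
    r.a = d0.a * d1.a;
    r.b = d0.a * d1.b + d0.b * d1.a;
    return r;
}

/* Conjugate: negate the dual part. */
static Dual DualConj( Dual d )
{
    Dual r;
    r.a = d.a;
    r.b = -d.b;
    return r;
}
```

For instance, (1 + 2ε)(3 + 4ε) = 3 + 10ε, and ε times ε is exactly zero.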
  41. Now it’s safe to define the wonder that is: the dual quaternion. They are basically a quaternion whose elements are dual numbers and can be expressed in either “quaternion form” or “dual number form”. [describe both forms] The “dual number form” is often the most useful representation for our skinning application since it allows us to represent a dual quaternion as a pair of standard ones.
  42. Here we have a few of the most notable dual quaternion operations. Note that the multiplication is non-commutative, just like regular quaternions. There are also two different types of conjugates for dual quaternions – the “quaternion conjugate”, where the conjugate is taken for both quaternion components, and the “dual conjugate”, where the “dual part” is simply negated. These two different conjugates can also be combined, which will be required by our skinning solution.
  43. For our use case, it’s most convenient to represent dual quaternions as a pair of quaternions. As with regular quaternions, for representing transforms we only care about unit dual quaternions so here’s how we get the norm. Note how it’s NOT just a straightforward “normalize both of the quaternion parts”. The “non dual” part is normalized as usual, but the “dual part” is normalized based on the length of the “NON dual” part. Also know that when we are dealing with unit dual quaternions the quaternion conjugate is the inverse, while the product we use to transform points needs both the dual and quaternion conjugates from the previous slide applied together.
  44. Now let’s take a look at how unit dual quaternions can be used to represent full 3D rigid transforms. Of course, a rigid transform is rotation + translation only – no scale. Local scaling must be handled separately. [describe the parts of the rigid transform]
  45. Similarly to regular quaternions, dual quaternions can be used to transform a given 3D point or vector without any conversion to matrices, axis/angle, etc. We create a dual quaternion from the position, where the non-dual part is the identity quaternion, and the dual part is its x, y, and z coordinates. Then it is multiplied by the dual quaternion representing the rigid transform on the left, and on the right by what the slide loosely calls its inverse: the result of applying the “dual” AND “quaternion” conjugate operations.
  46. Now that we’ve seen the algebra behind dual quaternion rigid transformations, it’s time to look at their geometric interpretation. Recall how a regular quaternion can be represented by an axis and an angle? Well, it turns out dual quaternions follow very closely in suit, where the axis and angle have been replaced by their dual quaternion counterparts. If we take this equation and break it down, we see that a dual quaternion can be described using a unit axis, a rotation around it, a translation along it, and its moment.
  47. It turns out this geometric interpretation is a convenient way to represent a screw transform. A screw transform is basically a rotation about an axis, followed by a translation along that axis. Though all rigid transforms can be described this way it’s not usually convenient unless you’re tackling specific problems in fields like mechanics, crystallography, or projective geometry.
  48. Now that we’ve seen how dual quaternions can be used to represent rigid transforms, it’s time to see how multiple dual quaternions can be blended together. After all, this is the sort of thing we need to do when determining the final transform for any given skinned vertex. To start, let’s just look at the simple case of interpolating between two dual quaternions. Notice how we’re essentially performing a component-wise linear interpolation, then normalizing to keep the result useful.
  49. Here we can see how blending is determined for N many dual quaternions. It’s nothing more than a normalized weighted sum of all input dual quaternions! Looking back at our geometric interpretation of dual quaternions as screw transforms, we can conclude that by blending the individual properties of the transform, we will always yield another valid transform.
  50. While this approach is leaps and bounds ahead of linear matrix blending, in terms of accuracy, it is still not perfect. This makes sense when you think about it. What we’re doing is averaging all these unit dual quaternions, which produces one that is no longer of unit length. By re-normalizing, we are essentially projecting the result back onto the dual unit hypersphere. This is very similar to what’s being done in most animation systems when rotational keyframes are interpolated. A bit of analysis yields the maximum possible error to be 8.15 degrees of rotation and 15.1% translation. In practice the actual error is far less. As with skeletal animation, the loss of accuracy is usually an acceptable trade-off, considering how efficient it is compared to the alternative of SLERPing everything. If absolute accuracy is required it is possible to use a modified SLERP routine on these dual quaternions.
  51. It’s very important that we remember to handle the unique issue of antipodality. Just like with regular unit quaternions, a unit dual quaternion will represent the same transform as its negation. Our DQB routine requires that all dual quaternions being blended together lie on the same dual hyper-hemisphere. Mathematically speaking, this means that for all dual quaternions being blended together, all pairs of them must yield positive inner products. It would probably be most efficient to perform this “hemispherization” step upon finalizing a pose to ensure all dual quaternions used for skinning adhere to this rule. [describe bottom logic]
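The fix-up loop from the slide might look like this in C. Two assumptions are baked in and are not from the talk: the inner product is taken over the non-dual (rotation) parts only, and bones are stored so that every parent index is smaller than its child's, with `-1` marking a root. The function name is hypothetical.

```c
/* Make every bone's dual quaternion lie on the same hemisphere as its parent's.
   Assumes parent[i] < i for all non-root bones, and parent[i] == -1 for roots. */
static void DQHemispherize( float dq[][2][4], const int parent[], int numBones )
{
    for ( int i = 0; i < numBones; i++ )
    {
        int p = parent[i];
        float dot = 0.0f;

        if ( p < 0 )
            continue;   /* roots have nothing to compare against */

        /* Inner product of the non-dual (rotation) parts. */
        for ( int c = 0; c < 4; c++ )
            dot += dq[i][0][c] * dq[p][0][c];

        /* dq and -dq are the same transform, so flipping the sign is free. */
        if ( dot < 0.0f )
            for ( int part = 0; part < 2; part++ )
                for ( int c = 0; c < 4; c++ )
                    dq[i][part][c] = -dq[i][part][c];
    }
}
```

Because each bone is compared against its already-corrected parent, one pass over the skeleton is enough under the ordering assumption above. It only guarantees consistency along parent-child links, which is usually where shared vertex influences live.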
  52. Now that we’ve got the mathematical foundation out of the way, it’s time to see how one might actually code all this up. Here we have a very simple function that takes a unit quaternion representing rotation, along with a translation vector, and produces a unit dual quaternion that expresses the entire rigid transformation. This process will need to run once on every bone’s transform as soon as the pose is finalized. Though I don’t have the code for reference, take my word that it’s much cheaper to generate a dual quaternion than a matrix, given the same inputs. Also, this code is obviously not written for efficiency, but it would be trivial to create a SIMD-optimized version.
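As a sanity check on the construction, the sketch below builds a dual quaternion from a rotation and a translation and then applies it to a point; the result should equal "rotate, then translate". `QuatTransToDQ` repeats the arithmetic of the slide's `QuatTrans2UDQ`, while `Cross3` and `DQTransformPoint` are hypothetical, compact stand-ins for the engine helpers and the longer transform routine shown in the slides. The input is assumed to already be a unit dual quaternion.

```c
static void Cross3( const float a[3], const float b[3], float out[3] )
{
    out[0] = a[1]*b[2] - a[2]*b[1];
    out[1] = a[2]*b[0] - a[0]*b[2];
    out[2] = a[0]*b[1] - a[1]*b[0];
}

/* dq[0] = q0, dq[1] = ((t, 0) / 2) * q0; layout [x, y, z, w]. */
static void QuatTransToDQ( const float q0[4], const float t[3], float dq[2][4] )
{
    for ( int i = 0; i < 4; i++ )
        dq[0][i] = q0[i];

    dq[1][3] = -0.5f*( t[0]*q0[0] + t[1]*q0[1] + t[2]*q0[2] );
    dq[1][0] =  0.5f*( t[0]*q0[3] + t[1]*q0[2] - t[2]*q0[1] );
    dq[1][1] =  0.5f*(-t[0]*q0[2] + t[1]*q0[3] + t[2]*q0[0] );
    dq[1][2] =  0.5f*( t[0]*q0[1] - t[1]*q0[0] + t[2]*q0[3] );
}

/* out = v + 2*d0 x (d0 x v + a0*v)          (rotation)
           + 2*(a0*de - ae*d0 + d0 x de)     (translation)
   where (d0, a0) is the non-dual part and (de, ae) the dual part. */
static void DQTransformPoint( float dq[2][4], const float v[3], float out[3] )
{
    const float *d0 = dq[0], *de = dq[1];
    float a0 = dq[0][3], ae = dq[1][3];
    float c1[3], tmp[3], c2[3], c3[3];

    Cross3( d0, v, c1 );
    for ( int i = 0; i < 3; i++ )
        tmp[i] = c1[i] + a0*v[i];
    Cross3( d0, tmp, c2 );
    Cross3( d0, de, c3 );

    for ( int i = 0; i < 3; i++ )
        out[i] = v[i] + 2.0f*c2[i] + 2.0f*( a0*de[i] - ae*d0[i] + c3[i] );
}
```

With a quarter turn about Z and a translation of (1, 2, 3), the point (1, 0, 0) rotates to (0, 1, 0) and then lands at (1, 3, 3).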
  53. This is a generic dual quaternion blend function that takes an array of dual quaternions, along with their associated weights, and calculates their final blended dual quaternion. In practice this function would probably be specialized for the known quantities of bone weights the game supports so that the “for” loop could be factored out.
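A minimal sketch of a generic blend along these lines: a weighted sum followed by re-normalization. It assumes the inputs already lie on the same hemisphere, and it normalizes by the length of the real part only, which is a common simplification. The names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical types: w is the scalar part, (x, y, z) the vector part.
struct Quat { float w, x, y, z; };
struct DualQuat { Quat real; Quat dual; };

inline void AddScaled(Quat& acc, const Quat& q, float s) {
    acc.w += s * q.w; acc.x += s * q.x; acc.y += s * q.y; acc.z += s * q.z;
}

inline void Scale(Quat& q, float s) {
    q.w *= s; q.x *= s; q.y *= s; q.z *= s;
}

// Weighted sum of dual quaternions, then division of both halves by the
// length of the summed real part so the result is unit length again.
inline DualQuat BlendDualQuats(const DualQuat* dqs, const float* weights, std::size_t count) {
    assert(count > 0);
    DualQuat sum = { Quat{ 0.0f, 0.0f, 0.0f, 0.0f }, Quat{ 0.0f, 0.0f, 0.0f, 0.0f } };
    for (std::size_t i = 0; i < count; ++i) {
        AddScaled(sum.real, dqs[i].real, weights[i]);
        AddScaled(sum.dual, dqs[i].dual, weights[i]);
    }
    const Quat& r = sum.real;
    const float lengthSq = r.w * r.w + r.x * r.x + r.y * r.y + r.z * r.z;
    assert(lengthSq > 0.0f);  // degenerate if the inputs cancel each other out
    const float invLength = 1.0f / std::sqrt(lengthSq);
    Scale(sum.real, invLength);
    Scale(sum.dual, invLength);
    return sum;
}
```

A version specialized for a fixed influence count would simply unroll the loop.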
  54. Here we have the code that will take a dual quaternion, presumably computed through dual quaternion blending, and use it to transform a given point. It’s not as efficient as it could be, as it was written with readability in mind, but it would be trivial to convert over to using pure SIMD instructions. Note that this routine should only be used for transforming position. When skinning vertex normals, binormals, and tangents you just want to perform a regular quaternion transform using the non-dual part of the dual quaternion.
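A readable sketch of both transforms, assuming a unit dual quaternion as input. The rotation uses the identity v' = v + 2 * qv × (qv × v + w * v), and the translation is the vector part of 2 * dual * conjugate(real). All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical types: w is the scalar part, (x, y, z) the vector part.
struct Vec3 { float x, y, z; };
struct Quat { float w, x, y, z; };
struct DualQuat { Quat real; Quat dual; };

inline Vec3 Cross(const Vec3& a, const Vec3& b) {
    return Vec3{ a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Rotates v by the unit quaternion q. Use this with dq.real for normals,
// binormals, and tangents.
inline Vec3 RotateVector(const Quat& q, const Vec3& v) {
    const Vec3 qv = { q.x, q.y, q.z };
    const Vec3 c = Cross(qv, v);
    const Vec3 inner = { c.x + q.w * v.x, c.y + q.w * v.y, c.z + q.w * v.z };
    const Vec3 outer = Cross(qv, inner);
    return Vec3{ v.x + 2.0f * outer.x, v.y + 2.0f * outer.y, v.z + 2.0f * outer.z };
}

// Transforms a position: rotate by the real part, then add the translation
// recovered from the dual part.
inline Vec3 TransformPoint(const DualQuat& dq, const Vec3& p) {
    const Quat& r = dq.real;
    const Quat& d = dq.dual;
    const Vec3 rv = { r.x, r.y, r.z };
    const Vec3 dv = { d.x, d.y, d.z };
    const Vec3 c = Cross(rv, dv);
    const Vec3 t = { 2.0f * (r.w * d.x - d.w * r.x + c.x),
                     2.0f * (r.w * d.y - d.w * r.y + c.y),
                     2.0f * (r.w * d.z - d.w * r.z + c.z) };
    const Vec3 rotated = RotateVector(r, p);
    return Vec3{ rotated.x + t.x, rotated.y + t.y, rotated.z + t.z };
}
```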
  55. Here we have a very simple performance comparison where we juxtapose the complexity of skinning via column-major matrices versus dual quaternions. Though the positional transform incurs a constant cost for all vertices, the cost of blending together bone transforms can vary depending on the number influencing a given vertex. Since some platforms are likely to hide stalls by re-ordering instructions or making use of more registers, the relative performance of DQB skinning may be a bit better than shown. Also note that the transforms for normals, binormals, and tangents are about 25% cheaper for both matrix and DQB skinning. While this chart confirms that matrix skinning will probably be a tad cheaper for most current applications, DQB skinning can surpass its efficiency when the number of bones influencing a vertex grows to around 10.
  56. Just for a quick comparison, here’s the same skinning logic when written for the Xbox 360 GPU. Aside from requiring 30% fewer shader constants, it’s pretty apparent that these operations are quite conducive to at least this generation of graphics hardware.
  57. While those stats are great for the XB360 and PS3, pure ALU instruction counts don’t hold the same level of importance on modern GPU’s. What matters instead is aggregate cache efficiency, VGPR count, memory stalls, and DRAM footprint. Luckily, dual quaternions outperform linear matrix blend skinning on all fronts. Sorry for not going into more detail here. Blame NDA’s.
  58. All combined, the savings varied with a number of circumstances, but on average we found we were able to skin 100,000 verts weighted to 2 bones about 24% faster than those same verts skinned with matrices. This is a very gross generalization, so I almost hate even bringing it up. Thus the multiple asterisks by that comment. While we found it to be at least a little faster for every one of our skinning use cases, there are unfortunately too many factors to flat out guarantee the same in every game engine or scenario.
  59. There is still room for optimization as well. Compress quats: store tangent space in the gbuffer as a 10:10:10:2 quaternion, where the 2 bits store which channel needs to be reconstructed. We found that always reconstructing the W component was too lossy. The dual part can probably be compressed as well by quantizing with the model bounds as a factor, though we haven’t tried this out. Be sure to tune the maximum number of waves per SIMD! Generate skinning transforms on the GPU. Now let’s take a look at some of our early A|B visual results.
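To illustrate the 10:10:10:2 idea, here is a sketch of the widely used “smallest three” encoding: drop the largest-magnitude component, store the other three in 10 bits each, and use the 2 spare bits to record which one was dropped. The exact bit layout, the rounding, and the function names are assumptions for illustration, not the shipped format.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Quat { float w, x, y, z; };  // hypothetical type

// Packs a unit quaternion into 32 bits: three 10-bit components plus a
// 2-bit index (top bits) naming the dropped, largest-magnitude component.
inline std::uint32_t PackQuat(const Quat& q) {
    const float c[4] = { q.x, q.y, q.z, q.w };
    assert(std::fabs(c[0] * c[0] + c[1] * c[1] + c[2] * c[2] + c[3] * c[3] - 1.0f) < 1e-3f);
    int largest = 0;
    for (int i = 1; i < 4; ++i) {
        if (std::fabs(c[i]) > std::fabs(c[largest])) largest = i;
    }
    // q and -q are the same rotation, so force the dropped component positive.
    const float sign = (c[largest] < 0.0f) ? -1.0f : 1.0f;
    std::uint32_t packed = static_cast<std::uint32_t>(largest) << 30;
    int shift = 0;
    for (int i = 0; i < 4; ++i) {
        if (i == largest) continue;
        // The remaining components lie in [-1/sqrt(2), 1/sqrt(2)]; map to [0, 1].
        float n = sign * c[i] * 1.41421356f * 0.5f + 0.5f;
        n = std::min(std::max(n, 0.0f), 1.0f);
        packed |= static_cast<std::uint32_t>(std::lround(n * 1023.0f)) << shift;
        shift += 10;
    }
    return packed;
}

// Reverses PackQuat, rebuilding the dropped component from unit length.
inline Quat UnpackQuat(std::uint32_t packed) {
    const int largest = static_cast<int>(packed >> 30);
    float c[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    float sumSq = 0.0f;
    int shift = 0;
    for (int i = 0; i < 4; ++i) {
        if (i == largest) continue;
        const float n = static_cast<float>((packed >> shift) & 1023u) / 1023.0f;
        c[i] = (n * 2.0f - 1.0f) * 0.70710678f;
        sumSq += c[i] * c[i];
        shift += 10;
    }
    c[largest] = std::sqrt(std::max(0.0f, 1.0f - sumSq));
    return Quat{ c[3], c[0], c[1], c[2] };
}
```

Because the dropped component is always the largest, the square root is reconstructed where it is best conditioned, which is consistent with the observation that always reconstructing W was too lossy.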
  60. Here we have a Call of Duty character with his arms over-twisted by 90 degrees. Notice how the bicep area is suffering from the “candy wrapper” effect as it has squeezed in on itself.
  61. This is the same pose, but with dual quaternion blend skinning applied. Notice how even though the texture gets a bit smeared, the biceps are no longer scrunched in unnatural ways. The inner wrist of the first-person hand also appears more full, for reasons I will get to shortly. [flip back and forth a few times]
  62. Here we have another Call of Duty character in an actual fight scene, using its normal linear matrix blend skinning routine. Looks ok, right?
  63. Here’s that same scene with dual quaternion blend skinning enabled. [flip back and forth a few times] This is a perfect example of how the new skinning solution can be used to preserve mesh volume. Notice how the shooter’s knee becomes visible, his ass and torso poof out, and his hand is more correctly aligned with the gun. Also notice the businessman in the background and how his knee looks a lot more natural. Even though these poses can be considered extreme, in that they are very different than the crucifix pose the character was skinned in, for this type of gameplay they are natural and have loads of room for quick improvements. Needless to say, when I showed these particular results to some character artists, they were quite excited.
  64. So the previous two examples might seem a bit too subtle or unrealistic to be worth the added expense. Let’s take a look at the benefits you can see when dual quaternion blend skinning is used in conjunction with procedural motions.
  65. DQB skinning can be used to improve the quality of meshes animated with IK – especially in cases where character customization is possible or you’re handling a variety of character proportions. GH story.
  66. Here is a screenshot of a few randomly thrown ragdolls in a Call of Duty engine. A close look would yield at least a couple of artifacts that DQB skinning would definitely help resolve, particularly in the legs and arms. Despite our best efforts to constrain the motions of ragdolls in-game, it’s almost certain that your character can end up in some pose you never intended. An arm may be over-twisted because of how it was landed on or a head may be wedged in a crevice and twisted sideways. Since the rigger never accounted for these situations when authoring the skinning data, they will most likely not look that great. With DQB you can have a little more confidence that the weird poses your ragdolls manage to fall into will at least look a little more natural.
  67. Procedurally animated cloth, hair, dangly chains, etc. can often look bent or scrunched when swinging around in-game. By simply applying DQB skinning to these troubled meshes, assuming they are driven by bones, we can ensure their shapes will be retained even during the most unexpected motions.
  68. So, after many rounds of profiling, and validating visuals, and getting the thumbs up from content, we rolled Call of Duty: Ghosts over to a full dual quaternion blend skinning pathway. We reduced overall bone count, increased fidelity and lighting quality, and greatly improved runtime performance. So everything’s perfect, right?
  69. WRONG!
  70. We already had a bunch of assets that were authored to work with linear blend skinning, and while this actually works pretty well for the most part, there were some that had to be revisited to address bulging artifacts. Here is a common example we encountered: The linear blend skinning results that the artist intended are in the upper left. The image in the upper right is skinned using dual quaternions, but with weights that were authored for linear blend skinning. Notice the unnatural bulge in the neck. To fix this, we had to re-weight the vertices in the neck more heavily to the neck bone than the head bone. With a quick pass at this we were able to achieve visually comparable results using only dual quaternions. (See the bottom picture.) Did this mean pencils down on dual quaternion skinning? Unfortunately not.
  71. We found it particularly difficult to combat the bulge effects that can occur in the underarm and side-chest area. Technically this can be solved by adding in procedural helper bones to explicitly reverse the bulge with arm motion. This kind of sucks since one of the benefits of switching to dual quaternion skinning was the removal of these types of joints. Luckily, this was a pretty isolated scenario so we conceded, added a couple helper bones, and shipped the game.
  72. Pros & cons. GPU/SIMD friendly – in fact it’s more GPU friendly than CPU, because quaternion multiplies can be accomplished in fewer instructions. Can be added for high-end platforms only, with no underlying asset changes, to increase visual fidelity.
  73. Before wrapping things up, there are a couple more topics I’d like to go over that can take existing skinning techniques to the next level, so to speak, and are generally relevant to the current state of skinning in video games. Blend shapes have been around in film for a while, but it’s really just been the past few years that this technique has become realistic for games. The basic concept here is that the artists generate a library of full-resolution morph targets, and then use a parametrically blended combination of them to represent any new shape they might want. A common use case here is facial animations, where performance-captured poses that correspond to individual phonemes or emotions can be blended together to create more complex and dynamic expressions.
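A minimal sketch of the blend shape idea, assuming each morph target stores full-resolution positions in the same vertex order as the base mesh and that the result is the base plus weighted per-target deltas. The names are hypothetical, and a production system would likely store sparse deltas instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };  // hypothetical type

// Returns base + sum over targets of weight * (target - base), per vertex.
inline std::vector<Vec3> BlendShapes(const std::vector<Vec3>& base,
                                     const std::vector<std::vector<Vec3>>& targets,
                                     const std::vector<float>& weights) {
    assert(targets.size() == weights.size());
    std::vector<Vec3> out = base;
    for (std::size_t t = 0; t < targets.size(); ++t) {
        assert(targets[t].size() == base.size());
        const float w = weights[t];
        for (std::size_t v = 0; v < base.size(); ++v) {
            out[v].x += w * (targets[t][v].x - base[v].x);
            out[v].y += w * (targets[t][v].y - base[v].y);
            out[v].z += w * (targets[t][v].z - base[v].z);
        }
    }
    return out;
}
```

The blended positions would then feed into skinning as the initial vertex positions.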
  74. Another extension to skinning, only recently possible for gaming hardware, is real-time geometry caching. This is where many, many gigabytes of arbitrary model transform data are reduced to something that can fit within the many budget requirements of a game. Here, rigid objects are simply key-framed, but anything that bends or folds, such as cloth, utilizes dense vertex animation. The most notable example of this tech is in “Ryse: Son of Rome”, and if you’re at all interested in this stuff, I’d highly recommend the talk Crytek gave on this very subject at last year’s SIGGRAPH.
  75. Finally - I’d like to revisit the bulging artifacts seen with dual quaternion blend skinning. We hired a few super talented character artists who came from a film background where they had become accustomed to a “Blend Smooth Skinning” feature in Maya that produces final skinning as a blended combination of dual quaternion blend skinning and classical linear skinning. They really disliked having to manually weight vertices using only dual quaternions, specifically for the purpose of combatting bulge. This was especially an issue with the procedural fixup bones under the arm, as they can be less intuitive to deal with than other joints. In the name of keeping runtime performance as fast as possible, we decided against adopting Maya’s hybrid approach.
  76. Luckily, a research paper came out last year to address this very problem. I’ll spare you all a slew of even more equations, but highly recommend checking it out if you decide to explore dual quaternion skinning for your game. Basically, what this paper proposes is a method of pre-calculating exactly how much a given vertex is susceptible to unwanted bulge. It then uses that information to apply positional and tangent basis correction as a post process to normal dual quaternion skinning. The post process itself is quite cheap, and something we could probably squeeze into our performance budget, however, if we didn’t need to…
  77. Remember those underarm procedural helper bones, and how tough it was to weight vertices to them? It turns out you can use this bulge-correcting math to automate that process. We’re currently ironing out a process in Maya that will determine the optimal vertex weighting to any nearby procedural joints in such a way that it will minimize runtime bulging. Sure it may mean a couple extra bones in the rig, but leaves the runtime code as-is.
  78. First you boil the bulge calculation down to a simple function that takes a vertex, all forward kinematic (non-procedural) bones that it’s weighted to, the extents of their motion, and all procedural bones in the rig. It’s important to note that all procedural bones used here have very well-defined parametric motions. The result is a single correctional dual quaternion. Next you solve for the vertex’s bone weights across its kinematic bones, as well as all nearby procedural bones, with the goal of minimizing the bulge factor. If the resultant minimum is not acceptable, adding additional helper bones to the rig can bring it down further. You can be slick here and apply a Simplex solver to do this work, but brute force iteration is unlikely to be very prohibitive, and yields acceptable results. Finally we need to repeat this for all vertices that the artist has selected in their tool of choice, and apply the results to their working data set.
  79. Give dual quaternion blend skinning a look. It was an all around win for us, but might not be right for you.
  80. One of the big reasons for creating this presentation was the lack of accessible material on the subject. However, if you are looking for some very detailed analysis of Dual Quaternion Blend Skinning, I’d suggest looking up the “Geometric Skinning with Approximate Dual Quaternion Blending” whitepaper. Also, if you do end up changing the in-game skinning solution it will no longer yield results identical to what the artists are seeing in their authoring tools. Luckily there are some plug-ins out there, at least for Maya and XSI, that can help give a more accurate preview when desired.
  81. That pretty much wraps up what I have to say. Are there any questions? If you happen to think of anything later on, feel free to shoot me an e-mail. Thanks for listening!