The Technology behind Shadow Warrior, ZTG 2014

The Technology behind Shadow Warrior
Jarosław Pleskot 
Senior Engine Programmer 
Flying Wild Hog 
2014
Facts about Shadow Warrior
• published by Devolver Digital
• 18 months production time
• team ~35 people (2 tech programmers, 6 total)
• modified Hard Reset’s engine (Roadhog)
Presentation overview 
Act I, Skinned decals generation and rendering 
Act II, Foliage authoring and rendering 
Act III, Seawater rendering
Act I, Skinned decals
Skinned decals – entry point
• In Hard Reset, decals only on non-skinned geometry (static or movable)
• Character destruction by showing/hiding parts of a model or by changing the texture
• A lot of blood and gore in Shadow Warrior, so skinned decals are a must
Skinned decals – two techniques
1. Deferred decals [1][2]
+ mesh generation not needed
+ small amount of data to store and pass to the graphics card
- decal floats when the mesh is animated
- can be projected onto other surfaces, which need to be masked out (additional G-buffer usage or additional passes)
Source: http://broniac.blogspot.com/2011/06/deferred-decals.html
Skinned decals – two techniques 
2. Geometry based [3] 
+ stable result when animating mesh 
+ cover only desired surface 
- mesh generation: time and memory 
- additional input data required to generate a decal
Skinned decals – input data
• Load the vertex buffer into CPU memory
• Generate adjacency for each triangle in the mesh
• 3 adjacent triangles
• A mesh can consist of many isolated elements => adjacency groups (store the first triangle index of each adjacency group)
struct STriangleAdjacency 
{ 
UInt32 m_adj0; 
UInt32 m_adj1; 
UInt32 m_adj2; 
UInt32 m_group; 
};
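The adjacency data above could be built roughly as follows. This is a minimal sketch, not the engine's code: it assumes triangles that share an edge also share vertex indices (a real skinned mesh would probably need position welding across UV seams first), it stores the three neighbours in an array instead of m_adj0..m_adj2, and BuildAdjacency and kNoAdjacency are hypothetical names.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using UInt32 = std::uint32_t;
constexpr UInt32 kNoAdjacency = 0xFFFFFFFFu; // assumption: marks an open (border) edge

struct STriangleAdjacency
{
    UInt32 m_adj[3]; // neighbours across edges (v0,v1), (v1,v2), (v2,v0)
    UInt32 m_group;  // index of the adjacency group (isolated mesh element)
};

// Builds per-triangle adjacency and returns the first triangle of each group
// through groupFirstTriangle.
std::vector<STriangleAdjacency> BuildAdjacency( const std::vector<UInt32>& indices,
                                                std::vector<UInt32>& groupFirstTriangle )
{
    const UInt32 triCount = static_cast<UInt32>( indices.size() / 3 );
    std::vector<STriangleAdjacency> adj(
        triCount, STriangleAdjacency{ { kNoAdjacency, kNoAdjacency, kNoAdjacency }, kNoAdjacency } );

    // Edges seen once so far: (low vertex, high vertex) -> (triangle, edge slot).
    std::map<std::pair<UInt32, UInt32>, std::pair<UInt32, UInt32>> openEdges;
    for ( UInt32 t = 0; t < triCount; ++t )
    {
        for ( UInt32 e = 0; e < 3; ++e )
        {
            UInt32 a = indices[t * 3 + e];
            UInt32 b = indices[t * 3 + ( e + 1 ) % 3];
            if ( a > b )
                std::swap( a, b );
            const auto key = std::make_pair( a, b );
            const auto it = openEdges.find( key );
            if ( it == openEdges.end() )
            {
                openEdges.emplace( key, std::make_pair( t, e ) );
            }
            else
            {
                // Second triangle on this edge: link both directions.
                adj[t].m_adj[e] = it->second.first;
                adj[it->second.first].m_adj[it->second.second] = t;
                openEdges.erase( it );
            }
        }
    }

    // Flood fill over the adjacency links to label the groups.
    groupFirstTriangle.clear();
    std::vector<UInt32> stack;
    for ( UInt32 seed = 0; seed < triCount; ++seed )
    {
        if ( adj[seed].m_group != kNoAdjacency )
            continue;
        const UInt32 group = static_cast<UInt32>( groupFirstTriangle.size() );
        groupFirstTriangle.push_back( seed );
        adj[seed].m_group = group;
        stack.push_back( seed );
        while ( !stack.empty() )
        {
            const UInt32 t = stack.back();
            stack.pop_back();
            for ( UInt32 n : adj[t].m_adj )
            {
                if ( n != kNoAdjacency && adj[n].m_group == kNoAdjacency )
                {
                    adj[n].m_group = group;
                    stack.push_back( n );
                }
            }
        }
    }
    return adj;
}
```

For an index buffer { 0,1,2, 2,1,3, 4,5,6 } this links the first two triangles across their shared edge and reports two groups starting at triangles 0 and 2.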
Skinned decals – generation
• Asynchronous (job based)
• Copy decal parameters and skinning matrices to the job
Basic algorithm 
for all adjacency groups 
{ 
find triangle closest to hit point 
expand decal by adding adjacent triangles until size reached 
and calculate UVs and TBN for each new triangle 
}
Skinned decals – generation
1. Find triangle closest to hit point
• Don't want to process the entire mesh
• Skeleton and weights == skinned mesh is naturally divided
• In a preprocess step, create a triangle list for every bone
• Iterate through the lists of the selected and adjacent bones
• Use skinning matrices to get world-space positions
Skinned decals – generation 
2. Expand decal 
add hit triangle to “open” list 
while open list not empty 
{ 
pop front and calculate its vertices’ positions 
if any inside decal (bounding box test, sizeZ = max( sizeX, sizeY )) 
{ 
add triangle to the output list with new UVs and TBN 
add adjacent triangles to “open” list if not already processed 
} 
}
Special case for the first triangle:
If the triangle's area is larger than the decal's area, it always passes the bounding box test
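The expansion step can be sketched in C++ as a breadth-first walk over the adjacency data. This is only an illustration under simplifying assumptions: vertex positions are already skinned to world space, the decal box is axis-aligned (DecalBox is a hypothetical type), UV and TBN generation is left out, and the hit triangle is always accepted rather than comparing areas.

```cpp
#include <array>
#include <cmath>
#include <cstdint>
#include <deque>
#include <vector>

struct Vec3 { float x, y, z; };
using Triangle = std::array<Vec3, 3>;           // skinned, world-space vertex positions
using Adjacency = std::array<std::uint32_t, 3>; // neighbour triangles, kNone for an open edge
constexpr std::uint32_t kNone = 0xFFFFFFFFu;

struct DecalBox { Vec3 center; Vec3 halfSize; }; // assumption: axis-aligned for brevity

bool Inside( const DecalBox& box, const Vec3& p )
{
    return std::fabs( p.x - box.center.x ) <= box.halfSize.x
        && std::fabs( p.y - box.center.y ) <= box.halfSize.y
        && std::fabs( p.z - box.center.z ) <= box.halfSize.z;
}

// Returns the indices of all triangles that end up in the decal.
std::vector<std::uint32_t> ExpandDecal( const std::vector<Triangle>& tris,
                                        const std::vector<Adjacency>& adj,
                                        std::uint32_t hitTriangle,
                                        const DecalBox& box )
{
    std::vector<std::uint32_t> output;
    std::vector<bool> visited( tris.size(), false );
    std::deque<std::uint32_t> open;
    open.push_back( hitTriangle );
    visited[hitTriangle] = true;
    while ( !open.empty() )
    {
        const std::uint32_t t = open.front();
        open.pop_front();
        // The hit triangle is always accepted: it may be larger than the
        // decal, in which case none of its vertices lies inside the box.
        bool accept = ( t == hitTriangle );
        for ( const Vec3& v : tris[t] )
            accept = accept || Inside( box, v );
        if ( !accept )
            continue;
        output.push_back( t ); // UVs and TBN would be calculated here
        for ( std::uint32_t n : adj[t] )
        {
            if ( n != kNone && !visited[n] )
            {
                visited[n] = true; // marks "already processed"
                open.push_back( n );
            }
        }
    }
    return output;
}
```

Marking a triangle as visited when it is queued, rather than when it is popped, keeps each triangle in the open list at most once.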
Skinned decals – dismemberment
• Character dismemberment implemented
• Decals must be split, how?
Skinned decals – dismemberment
First version
Store decal spawn info, recompute on destruction
- additional CPU time (5 enemies destroyed at once can produce 50 jobs)
- cannot show anything until the recomputed decal arrives == blink
Skinned decals – dismemberment
Second version
• Character dismemberment is hand-made by creating separate meshes
• Modify the spawn algorithm: create separate decal chunks for every visible mesh
• Input: adjacency per chunk
• On cutting, create new decals that reference the initial decal geometry
• Render the proper decal chunks
Skinned decals – dismemberment 
Second version 
+ no recomputation 
+ split decals available instantly 
- more draws for initial decal (merge on render) 
Modified algorithm 
for all visible chunks 
{ 
for all adjacency groups 
{ 
find triangle closest to hit point 
expand decal chunk by adding adjacent triangles until size 
reached 
and calculate UVs and TBN for each new triangle 
} 
}
Skinned decals – rendering
• Rendered through dynamic vertex buffers
• One pass (compose) or 2 passes (normal + compose)
• Possible animation through alpha test level shifting (lower alpha test reference value == bigger decal)
Skinned decals – details
• Decal size hard limit: 10k vertices
• Decal count limit: 100 decals (FIFO)
• Vertex memory: 30 MB total, in a pool
• Typical bullet decal (500 triangles) spawn time is around 0.5 ms on an Intel Core i7 (async, could still be improved)
• Big decals == skinned geometry rendered multiple times; avoid them and use other techniques, e.g. texture layering
• Use "clamp to border color" with alpha 0.0
Act II, Foliage system
Foliage system – entry point 
• In Hard Reset vegetation only in one area of 
one DLC level 
• In Shadow Warrior many open levels: forests, 
villages, towns, etc. 
• Vegetation made as level geometry == no 
LoD, no instancing, hard to create and control 
(overdraw)
Foliage system – entry point 
Requirements: 
● Instancing 
● Specific LoD system 
● Easy to plant (levels created in 3dsmax, 
gameplay in game editor)
Spawn meshes – meshes with relative 
foliage density stored as vertex color
Foliage system – planting 
Spawn meshes for level 2 in 3ds Max
Foliage system – planting 
• Render spawn meshes in top-down view to an 
image (density, position.z and normal) 
• In 50x50cm blocks generate random plant 
positions (ρ = ρmesh * ρblock, pos.z interpolated) 
• Set random yaw 
• Optionally align with normal vector 
• Store packed matrix 
All random values are static!
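One way to keep all random values static is to seed the generator from the block coordinates, so a block always yields the same plants. The sketch below illustrates that idea only; the hash, the stochastic rounding of the plant count and the names PlantBlock and NextFloat are assumptions, and height interpolation and normal alignment are left out.

```cpp
#include <cstdint>
#include <vector>

struct Plant { float x, y, yaw; };

// Small integer hash (assumption: any deterministic hash would do here).
std::uint32_t Hash( std::uint32_t v )
{
    v ^= v >> 16; v *= 0x7feb352du;
    v ^= v >> 15; v *= 0x846ca68bu;
    v ^= v >> 16;
    return v;
}

// Returns a float in [0, 1) and advances the state.
float NextFloat( std::uint32_t& state )
{
    state = Hash( state + 0x9e3779b9u );
    return float( state >> 8 ) * ( 1.0f / 16777216.0f );
}

// meshDensity: relative density read from the spawn mesh image (0..1).
// blockDensity: assumed to be the plant count of a fully dense block.
std::vector<Plant> PlantBlock( int blockX, int blockY, float meshDensity, float blockDensity )
{
    const float blockSize = 0.5f; // 50x50 cm blocks
    // The seed depends only on the block coordinates, so results are static.
    std::uint32_t state = Hash( std::uint32_t( blockX ) * 73856093u
                              ^ std::uint32_t( blockY ) * 19349663u );
    const float expected = meshDensity * blockDensity; // rho = rho_mesh * rho_block
    int count = int( expected );
    if ( NextFloat( state ) < expected - float( count ) )
        ++count; // stochastic rounding of the fractional part

    std::vector<Plant> plants;
    for ( int i = 0; i < count; ++i )
    {
        Plant p;
        p.x = ( float( blockX ) + NextFloat( state ) ) * blockSize;
        p.y = ( float( blockY ) + NextFloat( state ) ) * blockSize;
        p.yaw = NextFloat( state ) * 6.2831853f; // random yaw in radians
        plants.push_back( p );
    }
    return plants;
}
```

Calling PlantBlock twice with the same arguments returns identical plants, which is what makes regeneration after an edit stable.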
Foliage system – planting 
Many levels of foliage possible by splitting 
spawn meshes to separate 3dsmax objects
Foliage system – editor
Foliage system – storage 
Initially one quad tree per map, batch index 
and LoD level stored with transformation 
Changed to multi resolution grids (2 levels: 4x4 
and 64x64 meters, one grid per batch)
Foliage system – storage 
Grid node contains min/max Z coord and object 
ranges for low and high density arrays 
Transformation packed into 32 bytes 
struct SObject 
{ 
Half4 m_plane0; // 8 
Half4 m_plane1; // 16 
Half4 m_plane2; // 24 
Vec3Packed64 m_position;// 32 
};
Foliage system – storage 
typedef UInt64 Vec3Packed64; 
#define POSITION_PACKED_PACK_SCALE 200.0f 
#define POSITION_PACKED_UNPACK_SCALE ( 1.0f / POSITION_PACKED_PACK_SCALE ) 
// 21 bit max 
#define VEC3PACKED64_MASK 0x00000000001fffff 
#define VEC3PACKED64_SIGN_RECOVER_SHIFT 11 
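inline Vec3Packed64 Vec3Packed64Pack( const Vec3& vec, Float packScale )
{
// Convert through a signed integer first: converting a negative float
// directly to an unsigned type is undefined behavior in C++
return ( ( UInt64( Int64( vec.X * packScale ) ) & VEC3PACKED64_MASK ) << 42 )
| ( ( UInt64( Int64( vec.Y * packScale ) ) & VEC3PACKED64_MASK ) << 21 )
| ( UInt64( Int64( vec.Z * packScale ) ) & VEC3PACKED64_MASK );
}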
21 bits per component
±5.2 km with 5 mm resolution (pack scale 200: 2^20 / 200 = 5242.88 m, 1 / 200 m = 5 mm)
Foliage system – storage 
inline Vec3 Vec3Packed64Unpack( const Vec3Packed64 vecPacked, Float unpackScale ) 
{ 
Vec3 result; 
{ 
Int32 value = Int32( vecPacked & VEC3PACKED64_MASK ); 
value <<= VEC3PACKED64_SIGN_RECOVER_SHIFT; 
value >>= VEC3PACKED64_SIGN_RECOVER_SHIFT; 
result.Z = Float( value ) * unpackScale; 
} 
{ 
Int32 value = Int32( ( vecPacked >> 21 ) & VEC3PACKED64_MASK ); 
value <<= VEC3PACKED64_SIGN_RECOVER_SHIFT; 
value >>= VEC3PACKED64_SIGN_RECOVER_SHIFT; 
result.Y = Float( value ) * unpackScale; 
} 
{ 
Int32 value = Int32( ( vecPacked >> 42 ) & VEC3PACKED64_MASK ); 
value <<= VEC3PACKED64_SIGN_RECOVER_SHIFT; 
value >>= VEC3PACKED64_SIGN_RECOVER_SHIFT; 
result.X = Float( value ) * unpackScale; 
} 
return result; 
} 
The right shift must be arithmetic (sign-preserving)!
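A small round-trip sketch makes the range and resolution of the packed position concrete. It is not the original code: it fixes the pack scale at 200, converts through a signed integer before masking, and recovers the sign by testing bit 20 instead of relying on an arithmetic right shift. Pack and UnpackComponent are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>

struct Vec3 { float X, Y, Z; };
using Vec3Packed64 = std::uint64_t;

constexpr std::uint64_t kMask = 0x1fffffull; // 21 bits per component
constexpr float kPackScale = 200.0f;         // 200 steps per meter == 5 mm resolution

Vec3Packed64 Pack( const Vec3& v )
{
    // Quantize to a signed integer first, then keep the low 21 bits
    // (two's complement representation of the value).
    auto q = []( float f )
    {
        return std::uint64_t( std::int64_t( f * kPackScale ) ) & kMask;
    };
    return ( q( v.X ) << 42 ) | ( q( v.Y ) << 21 ) | q( v.Z );
}

// shift: 42 for X, 21 for Y, 0 for Z
float UnpackComponent( Vec3Packed64 packed, int shift )
{
    std::int64_t value = std::int64_t( ( packed >> shift ) & kMask );
    if ( value & 0x100000 ) // bit 20 is the sign bit of a 21-bit value
        value -= 0x200000;  // sign-extend without depending on shift behavior
    return float( value ) / kPackScale;
}
```

With 21 signed bits the representable range is -2^20 to 2^20 - 1 steps, which at 200 steps per meter is roughly ±5242 m.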
Foliage system – rendering 
• Dynamic vertex buffer, 8192 instances max 
• Several batches (one batch == all visible 
objects with the same mesh and materials) 
• LoD levels: 
Low – 20% density, range multiplier x1 
High – 100% density, range multiplier x1 
Ultra – 100% density, range multiplier x2 
• Gather with Z range ±15 meters 
• Dissolve out on last 5 meters
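The LoD table, the Z-range gather and the dissolve band can be written down as a few small functions. This is a sketch under assumptions: baseRange stands for a per-batch view range that the slides do not specify, the dissolve is taken to be linear, and all function names are hypothetical.

```cpp
#include <algorithm>

enum class FoliageLoD { Low, High, Ultra };

struct LoDSettings { float density; float rangeMultiplier; };

// Values taken from the LoD list above.
LoDSettings GetLoDSettings( FoliageLoD lod )
{
    switch ( lod )
    {
    case FoliageLoD::Low:   return { 0.2f, 1.0f };
    case FoliageLoD::High:  return { 1.0f, 1.0f };
    case FoliageLoD::Ultra: return { 1.0f, 2.0f };
    }
    return { 1.0f, 1.0f };
}

// 1 = fully visible, 0 = fully dissolved; fades over the last 5 meters.
float DissolveFactor( float distance, float baseRange, FoliageLoD lod )
{
    const float range = baseRange * GetLoDSettings( lod ).rangeMultiplier;
    const float dissolveLength = 5.0f;
    const float t = ( range - distance ) / dissolveLength;
    return std::max( 0.0f, std::min( 1.0f, t ) );
}

// Grid nodes store min/max Z; gather only nodes within ±15 m of the camera.
bool NodeInZRange( float nodeMinZ, float nodeMaxZ, float cameraZ )
{
    const float zRange = 15.0f;
    return nodeMaxZ >= cameraZ - zRange && nodeMinZ <= cameraZ + zRange;
}
```

With a base range of 50 m, an instance at 47.5 m is half dissolved on High and still fully visible on Ultra, where the range doubles to 100 m.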
Foliage system – details 
• Gather time for 8192 instances in 9 batches: 0.41 ms on an Intel Core i7 3.4 GHz
• GPU time: 1.22 ms (0.89 normal + 0.33 compose, 730k + 470k 
PSPixelsOut) on Radeon R9 270, 1920x1080 
• Memory usage: from 0.7 to 10 MB per level
Foliage system – results 
~4200 instances 
gathered
Foliage system – results 
~6400 instances 
gathered
Foliage system – results 
Not only plants. Our artists are very creative!
Act III, Seawater
Seawater – entry point 
• Docks location with stormy weather planned 
in Shadow Warrior 
• DX9 renderer (no hardware tessellation)
• Dedicated translucent water shader used in 
Hard Reset (simple waves, refraction, water 
fog, foam)
Seawater – mesh 
• 3 LoD levels (quad size: 0.5 x 0.5, 2x2, 8x8 
meters, LoD 0 dims: 48x48 meters) 
• Edge vertices stretched beyond camera far Z 
• 33k tris total 
• Mesh moved with camera, snapped to integer 
world coordinates (constant sampling 
positions) 
• Stencil test
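Snapping the mesh origin keeps every vertex on the same world-space sample points from frame to frame, so the waves do not shimmer as the camera moves. A minimal sketch, assuming the water lies in the XY plane and that snapping means flooring the camera position (SnapWaterMeshOrigin is a hypothetical name):

```cpp
#include <cmath>

struct Vec2 { float x, y; };

// Returns the water mesh origin for the current camera position.
// The origin only ever moves in whole-meter steps.
Vec2 SnapWaterMeshOrigin( const Vec2& cameraPos )
{
    return { std::floor( cameraPos.x ), std::floor( cameraPos.y ) };
}
```

A camera at (12.7, -3.2) places the mesh at (12, -4); moving the camera by less than a meter inside that cell leaves the mesh where it is.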
Seawater – vertex shader
[Diagram: vertex shader stages]
Position processing: Distortion, Filter, Asymmetry, Choppiness, Add vertex texture, Flatten edges
Normals: Distortion derivative, Filter, Flatten edges, Modulate, Affect and orthogonalize TBN
Foam multiplier: Distortion derivative w. phase offset, Filter, Bias and modulate
Seawater – distortion 
float4 fWaterGeomWave0; // xy - frequency, z - speed, w – amplitude 
… 
float DistortionFunc( float arg, float4 params ) 
{ 
float modBase = 0.5 + sin( arg * params.x ) * 0.5; 
float modArg = modBase * params.y - params.y; 
float modAmp = modBase * params.z - params.z; 
return sin( arg * ( 1.0 + modArg ) ) * ( 1.0 + modAmp ); 
} 
float DistortionDerivativeFunc( float arg, float4 params ) 
{ 
float modBase = 0.5 + sin( arg * params.x ) * 0.5; 
float modArg = modBase * params.y - params.y; 
float modAmp = modBase * params.z - params.z; 
return cos( arg * ( 1.0 + modArg ) ) * ( 1.0 + modAmp ); 
} 
Randomize waves by reusing wave parameters (HACK)
#define DISTORTION_0( arg ) DistortionFunc( arg, fWaterGeomWave0FunctionParam ) 
… 
#define D_DISTORTION_0( arg ) DistortionDerivativeFunc( arg, fWaterGeomWave0FunctionParam ) 
… 
float arg0 = dot( posWS.xy, fWaterGeomWave0.xy ) + time * fWaterGeomWave0.z; 
…
Seawater – waves 
• Sea waves are the signal, mesh is a sampling 
mechanism 
• Nyquist theorem: the sampling frequency must be at least 2 times higher than the peak signal frequency to avoid aliasing
• Different LoD == different sampling 
frequencies
Seawater – waves 
Solution: 
• Calculate the cutoff frequency for each vertex
• Pass it to a shader as a vertex attribute 
• Filter waves generated in vertex shader using 
this frequency limit 
struct WaterVertex 
{ 
Vec3 m_pos; 
Half2 m_uv0; 
Half2 m_uv1; 
Float m_geomSoftness; 
Float m_waveFreqLimit; 
};
Seawater – filter 
Diagonal direction has lowest 
sampling frequency 
Lerp cutoff frequencies on 
LoD boundaries 
Only fc0 and fc1 used in 
practice
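The per-vertex frequency limit could be derived from the quad size along these lines. This is a sketch under stated assumptions: frequencies are angular (radians per meter, matching their use inside sin()), the coarsest sample spacing is taken to be the quad diagonal as the slide states, and the function names are hypothetical.

```cpp
#include <cmath>

// Angular cutoff frequency (rad/m) for a grid of square quads.
float CutoffFrequency( float quadSize )
{
    const float pi = 3.14159265f;
    const float diagonal = quadSize * std::sqrt( 2.0f ); // coarsest sample spacing
    // Nyquist: at least two samples per wavelength, so the shortest
    // wavelength is 2 * diagonal and the limit is 2*pi / (2 * diagonal).
    return pi / diagonal;
}

// Blend between the limits of two neighbouring LoDs on their boundary;
// t = 0 on the finer side, t = 1 on the coarser side.
float BoundaryCutoff( float finerQuadSize, float coarserQuadSize, float t )
{
    const float fc0 = CutoffFrequency( finerQuadSize );
    const float fc1 = CutoffFrequency( coarserQuadSize );
    return fc0 + ( fc1 - fc0 ) * t;
}
```

For the 0.5 m and 2 m quads this gives limits of about 4.44 and 1.11 rad/m, a factor of four apart, matching the ratio of the quad sizes.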
Seawater – filter 
float DistortionFilter( vert_in i, float2 waveFreq ) 
{ 
float waveFreqEff = length( waveFreq.xy ); 
float val = -2.5 / i.m_waveFreqLimit * ( waveFreqEff - 
i.m_waveFreqLimit * 0.8 ); 
float filter = saturate( 0.5 + 0.5 * val ); 
return filter; 
}
#define CALC_FILTER_0 float filter0 = DistortionFilter( i, fWaterGeomWave0.xy ) 
#define CALC_FILTER_1 float filter1 = DistortionFilter( i, fWaterGeomWave1.xy ) 
#define CALC_FILTER_2 float filter2 = DistortionFilter( i, fWaterGeomWave2.xy ) 
#define CALC_FILTER_3 float filter3 = DistortionFilter( i, fWaterGeomWave3.xy ) 
#define FILTER_0( val ) filter0*val 
#define FILTER_1( val ) filter1*val 
#define FILTER_2( val ) filter2*val 
#define FILTER_3( val ) filter3*val
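Porting DistortionFilter to C++ makes its response easy to check by hand. The port below takes the frequency limit and the wave frequency as plain floats instead of the shader's vert_in and float2; otherwise the arithmetic is the same as in the shader code above.

```cpp
#include <algorithm>
#include <cmath>

float Saturate( float v )
{
    return std::max( 0.0f, std::min( 1.0f, v ) );
}

// Returns the weight (0..1) applied to a wave with frequency
// (waveFreqX, waveFreqY) at a vertex with the given frequency limit.
float DistortionFilter( float waveFreqLimit, float waveFreqX, float waveFreqY )
{
    const float waveFreqEff = std::sqrt( waveFreqX * waveFreqX + waveFreqY * waveFreqY );
    const float val = -2.5f / waveFreqLimit * ( waveFreqEff - waveFreqLimit * 0.8f );
    return Saturate( 0.5f + 0.5f * val );
}
```

Working through the formula: a wave passes at full strength up to 0.4 times the limit, is attenuated to one half at 0.8 times the limit, and is removed completely from 1.2 times the limit upward, so the filter is a linear ramp centered slightly below the cutoff.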
Seawater – wave asymmetry 
float distort = 0.0; 
distort += FILTER_0( DISTORTION_0( arg0 ) * fWaterGeomWave0.w ); 
… 
// FLATTENING STAGE 
posWS.z += distort * i.m_geomSoftness; 
// NORMAL DISTORTION CODE NEEDED TO AFFECT POSITION 
float cos0 = FILTER_0( D_DISTORTION_0( arg0 ) ); 
float2 diff0 = i.m_geomSoftness * normalize( fWaterGeomWave0.xy ) * cos0;
… 
// ASYMMETRY AND CHOPPINESS 
#define ASYMMETRY( arg, power ) ( arg > 0.0 ? power * arg*arg : 1.0 ) 
posWS.xy += diff0 * fWaterGeomWaveChoppiness.x * 
ASYMMETRY( cos0, fWaterGeomWaveAsymmetry.x );
Seawater – vertex texture 
• Displace vertices with Perlin noise to avoid wave tiling
• Only LoD 1 and LoD 2 
• Calculate proper mip level to 
avoid aliasing 
• 256x256 R32F vertex texture 
• 1024x1024 normals (read in PS)
Seawater – pixel shader 
• Translucent at the beginning, changed to opaque later on
• 2 x diffuse (water + foam)
• 2 sliding normals + Perlin noise normal
• Environment map
• Deferred lighting
Seawater – results 
• GPU time 0.80 ms (0.02 ms mask, 0.25 ms normal, 
0.53 ms compose) @ Radeon R9 270, 1920x1080 
• 0 pixels drawn (depth & stencil fail): 0.16 ms (0.00 ms mask, 0.07 ms normal, 0.09 ms compose)
• 0 pixels drawn (stencil fail): 0.39 ms (0.00 ms mask, 0.30 ms normal, 0.09 ms compose)
Special thanks 
Łukasz Zdunowski – Lead Artist 
Zbigniew Siatecki – Environment Artist 
Dominik Misiurski – FX Artist 
Artur Maksara – Producer 
… and the rest of our team.
References 
1. http://broniac.blogspot.com/2011/06/deferred-decals.html 
2. http://humus.name/index.php?page=3D&ID=83 
3. “Character Animation with Direct3D”, Carl Granberg, Charles River Media, 2009 
Questions? 
Contact: 
Email: jarek.pleskot AT flyingwildhog.com 
Facebook: Jarosław Pleskot 
Twitter: @JaroslawPleskot
The Technology behind Shadow Warrior, ZTG 2014

  • 1. The Technology behind Jarosław Pleskot Senior Engine Programmer Flying Wild Hog 2014
  • 2. Facts about Shadow Warrior  published by Devolver Digital  18 months production time  team ~35 people (2 tech programmers, 6 total)  modified Hard Reset’s engine (Roadhog)
  • 3. Presentation overview Act I, Skinned decals generation and rendering Act II, Foliage authoring and rendering Act III, Seawater rendering
  • 4. Act I, Skinned decals
  • 5. Skinned decals – entry point  In Hard Reset decals only on non-skinned geometry (static or movable)  Characters destruction by showing/hiding parts of a model or by changing texture  Lot of blood and gore in Shadow Warrior, must have skinned decals
  • 6. Skinned decals – two techniques 1. Deffered decals [1][2] + mesh generation not needed + small amount of data to store and pass to graphics card - decal floats when mesh is animated - can be projected on other surfaces, need to mask out (additional gbuffer usage or additional passes) Source: http://broniac.blogspot.com/2011/06/deferred-decals.html
  • 7. Skinned decals – two techniques 2. Geometry based [3] + stable result when animating mesh + cover only desired surface - mesh generation: time and memory - additional input data required to generate a decal
  • 8. Skinned decals – two techniques 2. Geometry based [3] + stable result when animating mesh + cover only desired surface - mesh generation: time and memory - additional input data required to generate a decal
  • 9. Skinned decals – input data  Load vertex buffer into CPU memory  Generating adjacency per each triangle in mesh  3 adjacent triangles  Mesh can consist of many isolated elements => adjacency groups (store first triangle index of each adjacency group) struct STriangleAdjacency { UInt32 m_adj0; UInt32 m_adj1; UInt32 m_adj2; UInt32 m_group; };
  • 10. Skinned decals – generation  Asynchronous (job based)  Copy decal parameters and skinning matrices to job Basic algorithm for all adjacency groups { find triangle closest to hit point expand decal by adding adjacent triangles until size reached and calculate UVs and TBN for each new triangle }
  • 11. Skinned decals – generation 1. Find triangle closest to hit point  Don't want to process entire mesh  Skeleton and weights == skinned mesh is naturally divided  In preprocess step create triangle list for every bone  Iterate through selected and adjacent bones' lists  Use skinning matrices to get worldspace positions
  • 12. Skinned decals – generation 2. Expand decal add hit triangle to “open” list while open list not empty { pop front and calculate its vertices’ positions if any inside decal (bounding box test, sizeZ = max( sizeX, sizeY )) { add triangle to the output list with new UVs and TBN add adjacent triangles to “open” list if not already processed } }
  • 13. Skinned decals – generation 2. Expand decal add hit triangle to “open” list while open list not empty { pop front and calculate its vertices’ positions if any inside decal (bounding box test, sizeZ = max( sizeX, sizeY )) { add triangle to the output list with new UVs and TBN add adjacent triangles to “open” list if not already processed } } Special case for first triangle: If triangle field > decal field always pass bounding box test
  • 14. Skinned decals – dismemberment  Character dismemberment implemented  Decals must be split, how?
  • 15. Skinned decals – dismemberment First version Store decal spawn info, recompute on destruction - additional CPU time (5 enemies destroyed at once can produce 50 jobs) - cannot show anything until recomputed decal arrive == blink
  • 16. Skinned decals – dismemberment Second version  Character dismemberment is hand-made by creating separate meshes  Modify spawn algorithm – create separate decal chunks for every visible mesh  Input: adjacency per chunk  On cutting create new decals that references initial decal geometry  Render proper decal chunks
  • 17. Skinned decals – dismemberment Second version + no recomputation + split decals available instantly - more draws for initial decal (merge on render) Modified algorithm for all visible chunks { for all adjacency groups { find triangle closest to hit point expand decal chunk by adding adjacent triangles until size reached and calculate UVs and TBN for each new triangle } }
  • 18. Skinned decals – rendering
      • Rendered through dynamic vertex buffers
      • One pass (compose) or 2 passes (normal + compose)
      • Possible animation through alpha test level shifting (lower alpha test reference value == bigger decal)
  • 19. Skinned decals – details
      • Decal size hard limit: 10k vertices
      • Decal count limit: 100 decals (FIFO)
      • Vertex memory: 30 MB total, in pool
      • Typical bullet decal (500 triangles) spawn time around 0.5 ms on Intel Core i7 (async, still can do better)
      • Big decals == skinned geometry rendered multiple times; avoid them and use other techniques, e.g. texture layering
      • Use “clamp to border color” with alpha 0.0
  • 20. Act II, Foliage system
  • 21. Foliage system – entry point
      • In Hard Reset vegetation only in one area of one DLC level
      • In Shadow Warrior many open levels: forests, villages, towns, etc.
      • Vegetation made as level geometry == no LoD, no instancing, hard to create and control (overdraw)
  • 23. Foliage system – entry point
      Requirements:
      • Instancing
      • Specific LoD system
      • Easy to plant (levels created in 3ds Max, gameplay in game editor)
      Spawn meshes: meshes with relative foliage density stored as vertex color
  • 24. Foliage system – planting Spawn meshes for level 2 in 3ds Max
  • 25. Foliage system – planting
      • Render spawn meshes in top-down view to an image (density, position.z and normal)
      • In 50x50 cm blocks generate random plant positions (ρ = ρmesh * ρblock, pos.z interpolated)
      • Set random yaw
      • Optionally align with normal vector
      • Store packed matrix
      All random values are static!
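One way to get the "static" random values mentioned above is to derive every random number from a hash of the block coordinates, so regenerating a level yields the same plants each time. The sketch below assumes that approach; the slides do not say how the engine does it. `Hash`, `ToUnitFloat`, `PlantBlock` and `maxPlantsPerBlock` are hypothetical, and height interpolation, normal alignment and matrix packing are omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Plant { float x, y, yaw; };

constexpr float kBlockSize = 0.5f; // 50 x 50 cm blocks, as on the slide

// Small integer mixer; an assumed choice, any stable hash would do.
inline std::uint32_t Hash(std::uint32_t v) {
    v ^= v >> 16; v *= 0x7feb352du;
    v ^= v >> 15; v *= 0x846ca68bu;
    v ^= v >> 16;
    return v;
}

// Maps the top 24 bits of a hash to a float in [0, 1).
inline float ToUnitFloat(std::uint32_t h) {
    return float(h >> 8) * (1.0f / 16777216.0f);
}

// Generates the plants of one block. density stands for rho_mesh * rho_block
// and must lie in [0, 1]; the same inputs always give the same plants.
std::vector<Plant> PlantBlock(std::int32_t bx, std::int32_t by,
                              float density, int maxPlantsPerBlock)
{
    assert(density >= 0.0f && density <= 1.0f);
    std::vector<Plant> plants;
    const std::uint32_t seed =
        Hash((std::uint32_t(bx) * 73856093u) ^ (std::uint32_t(by) * 19349663u));
    for (int i = 0; i < maxPlantsPerBlock; ++i) {
        const std::uint32_t h = Hash(seed + std::uint32_t(i) * 4u);
        if (ToUnitFloat(h) >= density)
            continue; // thin out candidates according to density
        Plant p;
        p.x = (float(bx) + ToUnitFloat(Hash(h + 1u))) * kBlockSize;
        p.y = (float(by) + ToUnitFloat(Hash(h + 2u))) * kBlockSize;
        p.yaw = ToUnitFloat(Hash(h + 3u)) * 6.2831853f; // random yaw in radians
        plants.push_back(p);
    }
    return plants;
}
```

Because nothing depends on global generator state, blocks can be generated in any order or in parallel and still match between runs.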
  • 26. Foliage system – planting
      Many levels of foliage are possible by splitting spawn meshes into separate 3ds Max objects
  • 28. Foliage system – storage
      Initially one quad tree per map, batch index and LoD level stored with transformation
      Changed to multi-resolution grids (2 levels: 4x4 and 64x64 meters, one grid per batch)
  • 29. Foliage system – storage
      Grid node contains min/max Z coord and object ranges for low and high density arrays
      Transformation packed into 32 bytes
      struct SObject
      {
          Half4 m_plane0;          // 8
          Half4 m_plane1;          // 16
          Half4 m_plane2;          // 24
          Vec3Packed64 m_position; // 32
      };
  • 30. Foliage system – storage
      typedef UInt64 Vec3Packed64;
      #define POSITION_PACKED_PACK_SCALE 200.0f
      #define POSITION_PACKED_UNPACK_SCALE ( 1.0f / POSITION_PACKED_PACK_SCALE )
      // 21 bit max
      #define VEC3PACKED64_MASK 0x00000000001fffff
      #define VEC3PACKED64_SIGN_RECOVER_SHIFT 11
      inline Vec3Packed64 Vec3Packed64Pack( const Vec3& vec, Float packScale )
      {
          return ( ( UInt64( vec.X * packScale ) & VEC3PACKED64_MASK ) << 42 ) |
                 ( ( UInt64( vec.Y * packScale ) & VEC3PACKED64_MASK ) << 21 ) |
                 ( UInt64( vec.Z * packScale ) & VEC3PACKED64_MASK );
      }
      21 bits per component: ±5.2 km range with 5 mm resolution (1/200 m steps)
  • 31. Foliage system – storage
      inline Vec3 Vec3Packed64Unpack( const Vec3Packed64 vecPacked, Float unpackScale )
      {
          Vec3 result;
          {
              Int32 value = Int32( vecPacked & VEC3PACKED64_MASK );
              value <<= VEC3PACKED64_SIGN_RECOVER_SHIFT;
              value >>= VEC3PACKED64_SIGN_RECOVER_SHIFT;
              result.Z = Float( value ) * unpackScale;
          }
          {
              Int32 value = Int32( ( vecPacked >> 21 ) & VEC3PACKED64_MASK );
              value <<= VEC3PACKED64_SIGN_RECOVER_SHIFT;
              value >>= VEC3PACKED64_SIGN_RECOVER_SHIFT;
              result.Y = Float( value ) * unpackScale;
          }
          {
              Int32 value = Int32( ( vecPacked >> 42 ) & VEC3PACKED64_MASK );
              value <<= VEC3PACKED64_SIGN_RECOVER_SHIFT;
              value >>= VEC3PACKED64_SIGN_RECOVER_SHIFT;
              result.X = Float( value ) * unpackScale;
          }
          return result;
      }
      The right shift must be arithmetic (sign-extending)!
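The packing scheme above can be exercised with a small round-trip sketch in standard C++. It is written under assumptions rather than copied from the engine: fixed-width standard types replace `UInt64`, `Int32` and the engine's `Vec3`, the float is converted through a signed integer first (converting a negative float straight to an unsigned type is undefined behaviour in C++), and the sign is recovered with an XOR-and-subtract step instead of the shift pair, so the result does not depend on how the compiler shifts signed values. All helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float X, Y, Z; };

constexpr float kPackScale = 200.0f;            // 1/200 m steps, as on the slide
constexpr float kUnpackScale = 1.0f / kPackScale;
constexpr std::uint64_t kMask21 = 0x1FFFFFull;  // 21 bits per component
constexpr std::int64_t kSignBit21 = 0x100000;   // bit 20 is the sign bit

inline std::uint64_t PackComponent(float v) {
    const std::int64_t q = std::int64_t(v * kPackScale); // truncates toward zero
    assert(q >= -kSignBit21 && q < kSignBit21);          // about +/- 5242 m
    return std::uint64_t(q) & kMask21;                   // two's complement, 21 bits
}

inline float UnpackComponent(std::uint64_t bits) {
    const std::int64_t raw = std::int64_t(bits & kMask21);
    const std::int64_t q = (raw ^ kSignBit21) - kSignBit21; // sign-extend 21 bits
    return float(q) * kUnpackScale;
}

inline std::uint64_t Pack(const Vec3& v) {
    return (PackComponent(v.X) << 42) |
           (PackComponent(v.Y) << 21) |
            PackComponent(v.Z);
}

inline Vec3 Unpack(std::uint64_t packed) {
    return { UnpackComponent(packed >> 42),
             UnpackComponent(packed >> 21),
             UnpackComponent(packed) };
}
```

With a scale of 200 the quantization step is 1/200 m, so a round trip through `Pack` and `Unpack` changes each coordinate by less than 5 mm.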
  • 32. Foliage system – rendering
      • Dynamic vertex buffer, 8192 instances max
      • Several batches (one batch == all visible objects with the same mesh and materials)
      • LoD levels:
          Low: 20% density, range multiplier x1
          High: 100% density, range multiplier x1
          Ultra: 100% density, range multiplier x2
      • Gather with Z range ±15 meters
      • Dissolve out on last 5 meters
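The dissolve over the last 5 meters can be expressed as a linear fade on distance. The sketch below is one plausible formulation, not the shipped shader: `DissolveFactor` and `baseRange` are hypothetical, since the slides give the range multipliers and the 5 meter band but no absolute draw distance.

```cpp
#include <algorithm>
#include <cassert>

// Returns 1 while an instance is fully visible and fades linearly to 0
// over the final dissolveLength meters before the LoD range ends.
inline float DissolveFactor(float distance, float baseRange,
                            float rangeMultiplier, float dissolveLength = 5.0f)
{
    assert(dissolveLength > 0.0f);
    const float maxRange = baseRange * rangeMultiplier; // x1 Low/High, x2 Ultra
    const float t = (maxRange - distance) / dissolveLength;
    return std::min(std::max(t, 0.0f), 1.0f);
}
```

For example, with an assumed base range of 40 m an instance at 37.5 m is half dissolved at the x1 multiplier and still fully visible at the Ultra x2 multiplier.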
  • 33. Foliage system – details
      • Gather time for 8192 instances in 9 batches: 0.41 ms on Intel Core i7 3.4 GHz
      • GPU time: 1.22 ms (0.89 normal + 0.33 compose, 730k + 470k PSPixelsOut) on Radeon R9 270, 1920x1080
      • Memory usage: from 0.7 to 10 MB per level
  • 34. Foliage system – results ~4200 instances gathered
  • 35. Foliage system – results ~6400 instances gathered
  • 36. Foliage system – results Not only plants. Our artists are very creative!
  • 38. Seawater – entry point
      • Docks location with stormy weather planned in Shadow Warrior
      • DX9 renderer (no hardware tessellation)
      • Dedicated translucent water shader used in Hard Reset (simple waves, refraction, water fog, foam)
  • 39. Seawater – mesh
      • 3 LoD levels (quad size: 0.5x0.5, 2x2, 8x8 meters, LoD 0 dims: 48x48 meters)
      • Edge vertices stretched beyond camera far Z
      • 33k tris total
      • Mesh moved with camera, snapped to integer world coordinates (constant sampling positions)
      • Stencil test
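Snapping the mesh origin can be sketched as flooring the camera position to a grid step. This is a minimal illustration assuming a 1 meter step for "integer world coordinates"; `Vec2` and `SnapMeshOrigin` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Moves the water mesh origin with the camera, but only in whole steps,
// so vertices always sample the wave function at the same world positions.
inline Vec2 SnapMeshOrigin(const Vec2& cameraPos, float step = 1.0f)
{
    assert(step > 0.0f);
    return { std::floor(cameraPos.x / step) * step,
             std::floor(cameraPos.y / step) * step };
}
```

Small camera movements inside one step leave the origin unchanged, which avoids the waves appearing to swim as the camera moves.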
  • 40. Seawater – vertex shader
      [Flow diagram; stages listed in slide order]
      Position processing: Distortion, Filter, Asymmetry, Choppiness, Add vertex texture, Flatten edges, Distortion derivative, Filter, Flatten edges, Modulate
      Normals: Distortion derivative w. phase offset, Filter, Bias and modulate, Foam multiplier, Affect and orthogonalize TBN
  • 41. Seawater – distortion
      float4 fWaterGeomWave0; // xy - frequency, z - speed, w - amplitude
      …
      float DistortionFunc( float arg, float4 params )
      {
          float modBase = 0.5 + sin( arg * params.x ) * 0.5;
          float modArg = modBase * params.y - params.y;
          float modAmp = modBase * params.z - params.z;
          return sin( arg * ( 1.0 + modArg ) ) * ( 1.0 + modAmp );
      }
      float DistortionDerivativeFunc( float arg, float4 params )
      {
          float modBase = 0.5 + sin( arg * params.x ) * 0.5;
          float modArg = modBase * params.y - params.y;
          float modAmp = modBase * params.z - params.z;
          return cos( arg * ( 1.0 + modArg ) ) * ( 1.0 + modAmp );
      }
      Randomize waves reusing wave parameters (HACK)
      #define DISTORTION_0( arg ) DistortionFunc( arg, fWaterGeomWave0FunctionParam )
      …
      #define D_DISTORTION_0( arg ) DistortionDerivativeFunc( arg, fWaterGeomWave0FunctionParam )
      …
      float arg0 = dot( posWS.xy, fWaterGeomWave0.xy ) + time * fWaterGeomWave0.z;
      …
  • 42. Seawater – waves
      • Sea waves are the signal, the mesh is a sampling mechanism
      • Nyquist theorem: sampling frequency must be at least 2 times higher than peak signal frequency to avoid aliasing
      • Different LoD == different sampling frequencies
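As a worked instance of the sampling limit, assume the sampling spacing along a grid axis equals the quad edge length d (an assumption; the slides do not give the formula). The highest wave frequency and the shortest wavelength the mesh can then represent are:

```latex
f_s = \frac{1}{d}, \qquad
f_{\max} \le \frac{f_s}{2} = \frac{1}{2d}, \qquad
\lambda_{\min} = \frac{1}{f_{\max}} = 2d

d = 0.5\,\mathrm{m} \Rightarrow \lambda_{\min} = 1\,\mathrm{m}, \qquad
d = 2\,\mathrm{m} \Rightarrow \lambda_{\min} = 4\,\mathrm{m}, \qquad
d = 8\,\mathrm{m} \Rightarrow \lambda_{\min} = 16\,\mathrm{m}
```

Under that assumption, a wave that is represented cleanly on the 0.5 m quads of LoD 0 can alias on the 2 m and 8 m quads, which is why each LoD needs its own frequency limit.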
  • 43. Seawater – waves
      Solution:
      • Calculate cutoff frequency for each vertex
      • Pass it to a shader as a vertex attribute
      • Filter waves generated in vertex shader using this frequency limit
      struct WaterVertex
      {
          Vec3 m_pos;
          Half2 m_uv0;
          Half2 m_uv1;
          Float m_geomSoftness;
          Float m_waveFreqLimit;
      };
  • 44. Seawater – filter
      • Diagonal direction has lowest sampling frequency
      • Lerp cutoff frequencies on LoD boundaries
      • Only fc0 and fc1 used in practice
  • 46. Seawater – filter
      float DistortionFilter( vert_in i, float2 waveFreq )
      {
          float waveFreqEff = length( waveFreq.xy );
          float val = -2.5 / i.m_waveFreqLimit * ( waveFreqEff - i.m_waveFreqLimit * 0.8 );
          float filter = saturate( 0.5 + 0.5 * val );
          return filter;
      }
      #define CALC_FILTER_0 float filter0 = DistortionFilter( i, fWaterGeomWave0.xy )
      #define CALC_FILTER_1 float filter1 = DistortionFilter( i, fWaterGeomWave1.xy )
      #define CALC_FILTER_2 float filter2 = DistortionFilter( i, fWaterGeomWave2.xy )
      #define CALC_FILTER_3 float filter3 = DistortionFilter( i, fWaterGeomWave3.xy )
      #define FILTER_0( val ) filter0*val
      #define FILTER_1( val ) filter1*val
      #define FILTER_2( val ) filter2*val
      #define FILTER_3( val ) filter3*val
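To see what the filter does numerically, the shader function can be ported to C++ and evaluated at a few frequencies. This is an illustrative port, not engine code: the per-vertex input is reduced to a plain `waveFreqLimit` argument and `Saturate` stands in for the HLSL intrinsic.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Stand-in for HLSL saturate(): clamps to [0, 1].
inline float Saturate(float v) { return std::min(std::max(v, 0.0f), 1.0f); }

// Same arithmetic as the shader's DistortionFilter, with the vertex's
// frequency limit passed directly instead of through vert_in.
inline float DistortionFilter(float waveFreqLimit, float waveFreqX, float waveFreqY)
{
    assert(waveFreqLimit > 0.0f);
    const float waveFreqEff = std::sqrt(waveFreqX * waveFreqX + waveFreqY * waveFreqY);
    const float val = -2.5f / waveFreqLimit * (waveFreqEff - waveFreqLimit * 0.8f);
    return Saturate(0.5f + 0.5f * val);
}
```

Evaluating the expression shows a linear ramp: the weight is 1 up to 0.4 times the limit, 0.5 at 0.8 times the limit, 0.25 at the limit itself, and 0 from 1.2 times the limit upward, so waves fade out gradually instead of switching off at the LoD boundary.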
  • 47. Seawater – wave asymmetry
      float distort = 0.0;
      distort += FILTER_0( DISTORTION_0( arg0 ) * fWaterGeomWave0.w );
      …
      // FLATTENING STAGE
      posWS.z += distort * i.m_geomSoftness;
      // NORMAL DISTORTION CODE NEEDED TO AFFECT POSITION
      float cos0 = FILTER_0( D_DISTORTION_0( arg0 ) );
      float2 diff0 = i.m_geomSoftness * normalize( fWaterGeomWave0.xy ) * cos0;
      …
      // ASYMMETRY AND CHOPPINESS
      #define ASYMMETRY( arg, power ) ( arg > 0.0 ? power * arg*arg : 1.0 )
      posWS.xy += diff0 * fWaterGeomWaveChoppiness.x * ASYMMETRY( cos0, fWaterGeomWaveAsymmetry.x );
  • 49. Seawater – vertex texture
      • Displace vertices with Perlin noise to avoid wave tiling
      • Only LoD 1 and LoD 2
      • Calculate proper mip level to avoid aliasing
      • 256x256 R32F vertex texture
      • 1024x1024 normals (read in PS)
  • 50. Seawater – pixel shader
      • Translucent at the beginning, changed to opaque later on
      • 2 x diffuse (water + foam)
      • 2 sliding normals + Perlin noise normal
      • Environment map
      • Deferred lighting
  • 51. Seawater – results
      • GPU time: 0.80 ms (0.02 ms mask, 0.25 ms normal, 0.53 ms compose) on Radeon R9 270, 1920x1080
      • 0 pixels drawn (depth & stencil fail): 0.16 ms (0.00 ms mask, 0.07 ms normal, 0.09 ms compose)
      • 0 pixels drawn (stencil fail): 0.39 ms (0.00 ms mask, 0.30 ms normal, 0.09 ms compose)
  • 52. Special thanks
      Łukasz Zdunowski – Lead Artist
      Zbigniew Siatecki – Environment Artist
      Dominik Misiurski – FX Artist
      Artur Maksara – Producer
      … and the rest of our team.
  • 53. References
      1. http://broniac.blogspot.com/2011/06/deferred-decals.html
      2. http://humus.name/index.php?page=3D&ID=83
      3. “Character Animation with Direct3D”, Carl Granberg, Charles River Media, 2009
      Questions?
      Contact:
      Email: jarek.pleskot AT flyingwildhog.com
      Facebook: Jarosław Pleskot
      Twitter: @JaroslawPleskot