SlideShare a Scribd company logo
Far Cry and DirectXFar Cry and DirectX
Carsten WenzelCarsten Wenzel
Far Cry uses the latest DX9 featuresFar Cry uses the latest DX9 features
• Shader Models 2.x / 3.0Shader Models 2.x / 3.0 
- Except for vertex textures and dynamicExcept for vertex textures and dynamic
flow controlflow control
• Geometry InstancingGeometry Instancing 
• Floating-point render targetsFloating-point render targets 
Dynamic flow control in PSDynamic flow control in PS
• To consolidate multiple lights into one pass,To consolidate multiple lights into one pass,
we ideally would want to do something likewe ideally would want to do something like
this…this…
float3float3 finalColfinalCol == 0;0;
float3float3 diffuseColdiffuseCol == tex2Dtex2D( diffuseMap, IN.diffuseUV.xy );( diffuseMap, IN.diffuseUV.xy );
float3float3 normalnormal == mulmul( IN.tangentToWorldSpace,( IN.tangentToWorldSpace,
tex2Dtex2D( normalMap, IN.bumpUV.xy ).xyz );( normalMap, IN.bumpUV.xy ).xyz );
forfor(( intint ii == 0; i < cNumLights; i++ )0; i < cNumLights; i++ )
{{
float3float3 lightCollightCol = LightColor[ i ];= LightColor[ i ];
float3float3 lightVeclightVec == normalizenormalize( cLightPos[ i ].xyz –( cLightPos[ i ].xyz –
IN.pos.xyz );IN.pos.xyz );
// …// …
// Attenuation, Specular, etc. calculated via// Attenuation, Specular, etc. calculated via
if( const_boolean )if( const_boolean )
// …// …
floatfloat nDotLnDotL == saturatesaturate(( dotdot( lightVec.xyz, normal ) );( lightVec.xyz, normal ) );
final += lightCol.xyz * diffuseCol.xyz * nDotL * atten;final += lightCol.xyz * diffuseCol.xyz * nDotL * atten;
}}
return(return( float4float4( finalCol, 1 ) );( finalCol, 1 ) );
Dynamic flow control in PSDynamic flow control in PS
• Welcome to the real world…Welcome to the real world…
– Dynamic indexing only allowed on inputDynamic indexing only allowed on input
registers; prevents passing light data viaregisters; prevents passing light data via
constant registers and index them in loopconstant registers and index them in loop
– Passing light info via input registers notPassing light info via input registers not
feasible as there are not enough of them (onlyfeasible as there are not enough of them (only
10)10)
– Dynamic branching is not freeDynamic branching is not free
Loop unrollingLoop unrolling
• We chose not to use dynamic branching and loopsWe chose not to use dynamic branching and loops
• Used static branching and unrolled loops insteadUsed static branching and unrolled loops instead
• Works well with Far Cry’s existing shader frameworkWorks well with Far Cry’s existing shader framework
• Shaders are precompiled for different light masksShaders are precompiled for different light masks
– 0-4 dynamic light sources per pass0-4 dynamic light sources per pass
– 3 different light types (spot, omni, directional)3 different light types (spot, omni, directional)
– 2 modification types per light (specular only, occlusion map)2 modification types per light (specular only, occlusion map)
• Can result in over 160 instructions after loop unrollingCan result in over 160 instructions after loop unrolling
when using 4 lightswhen using 4 lights
– Too long for ps_2_0Too long for ps_2_0
– Just fine for ps_2_a, ps_2_b and ps_3_0!Just fine for ps_2_a, ps_2_b and ps_3_0!
• To avoid run time stalls, use a pre-warmed shader cacheTo avoid run time stalls, use a pre-warmed shader cache
How the shader cache worksHow the shader cache works
• Specific shader depends on:Specific shader depends on:
1)1) Material typeMaterial type
(e.g. skin, phong, metal)(e.g. skin, phong, metal)
2)2) Material usage flagsMaterial usage flags
(e.g. bump-mapped, specular)(e.g. bump-mapped, specular)
3)3) Specific environmentSpecific environment
(e.g. light mask, fog)(e.g. light mask, fog)
How the shader cache worksHow the shader cache works
• Cache access:Cache access:
– Object to render already has shader handles? Use those!Object to render already has shader handles? Use those!
– Otherwise try to find the shader in memoryOtherwise try to find the shader in memory
– If that fails load from harddiskIf that fails load from harddisk
– If that fails generate VS/PS, store backup on harddiskIf that fails generate VS/PS, store backup on harddisk
– Finally, save shader handles in objectFinally, save shader handles in object
• Not the ideal solution butNot the ideal solution but
– Works reasonably well on existing hardwareWorks reasonably well on existing hardware
– Was easy to integrate without changing assetsWas easy to integrate without changing assets
• For the cache to be efficient…For the cache to be efficient…
– All used combinations of a shader should exist as pre-cached files onAll used combinations of a shader should exist as pre-cached files on
HDHD
• On the fly update causes stalls due to time required for shaderOn the fly update causes stalls due to time required for shader
compilation!compilation!
– However, maintaining the cache can become cumbersomeHowever, maintaining the cache can become cumbersome
Loop unrolling – Pros/ConsLoop unrolling – Pros/Cons
• Pros:Pros:
– Speed! Not branching dynamically saves quite a fewSpeed! Not branching dynamically saves quite a few
cyclescycles
– At the time, we found shader switching to be moreAt the time, we found shader switching to be more
efficient than dynamic branchingefficient than dynamic branching
• Cons:Cons:
– Needs sophisticated shader caching, due to number ofNeeds sophisticated shader caching, due to number of
shader combinations per light mask (244 aftershader combinations per light mask (244 after
presorting of combinations)presorting of combinations)
– Shader pre-compilation takes timeShader pre-compilation takes time
– Shader cache for Far Cry 1.3 requires about 430 MBShader cache for Far Cry 1.3 requires about 430 MB
(compressed down to ~23 MB in patch exe)(compressed down to ~23 MB in patch exe)
Geometry InstancingGeometry Instancing
• Potentially saves cost ofPotentially saves cost of n-1n-1 draw calls when renderingdraw calls when rendering nn
instances of an objectinstances of an object
• Far Cry uses it mainly to speed up vegetation renderingFar Cry uses it mainly to speed up vegetation rendering
• Per instance attributes:Per instance attributes:
– PositionPosition
– SizeSize
– Bending infoBending info
– Rotation (only if needed)Rotation (only if needed)
• Reduce the number of instance attributes! Two methods:Reduce the number of instance attributes! Two methods:
– Vertex shader constantsVertex shader constants
• Use for objects having more than 100 polygonsUse for objects having more than 100 polygons
– Attribute streamsAttribute streams
• Use for smaller objects (sprites, impostors)Use for smaller objects (sprites, impostors)
Instance Attributes in VS ConstantsInstance Attributes in VS Constants
• Best for objects with large numbers ofBest for objects with large numbers of
polygonspolygons
• Put instance index into additional streamPut instance index into additional stream
• UseUse SetStreamSourceFrequencySetStreamSourceFrequency to setupto setup
geometry instancing as follows…geometry instancing as follows…
SetStreamSourceFrequency( geomStream,SetStreamSourceFrequency( geomStream,
D3DSTREAMSOURCE_INDEXEDDATA | numInstances );D3DSTREAMSOURCE_INDEXEDDATA | numInstances );
SetStreamSourceFrequency( instStream,SetStreamSourceFrequency( instStream,
D3DSTREAMSOURCE_INSTANCEDATA | 1 );D3DSTREAMSOURCE_INSTANCEDATA | 1 );
• Be sure to reset the vertex stream frequencyBe sure to reset the vertex stream frequency
once you’re done,once you’re done, SSSF( strNum, 1 )SSSF( strNum, 1 )!!
const float4x4 cMatViewProj;
const float4 cPackedInstanceData[ numInstances ];
float4x4 matWorld;
float4x4 matMVP;
int i = IN.InstanceIndex;
matWorld[ 0 ] = float4( cPackedInstanceData[ i ].w, 0, 0,
cPackedInstanceData[ i ].x );
matWorld[ 1 ] = float4( 0, cPackedInstanceData[ i ].w, 0,
cPackedInstanceData[ i ].y );
matWorld[ 2 ] = float4( 0, 0, cPackedInstanceData[ i ].w,
cPackedInstanceData[ i ].z );
matWorld[ 3 ] = float4( 0, 0, 0, 1 );
matMVP = mul( cMatViewProj, matWorld );
OUT.HPosition = mul( matMVP, IN.Position );
VS Snippet to unpack attributes (positionVS Snippet to unpack attributes (position
& size) from VS constants to create& size) from VS constants to create
matMVP and transform vertexmatMVP and transform vertex
Instance Attribute StreamsInstance Attribute Streams
• Best for objects with few polygonsBest for objects with few polygons
• Put per instance data into additional streamPut per instance data into additional stream
• Setup vertex stream frequency as before andSetup vertex stream frequency as before and
reset when you’re donereset when you’re done
const float4x4 cMatViewProj;
float4x4 matWorld;
float4x4 matMVP;
matWorld[ 0 ] = float4( IN.PackedInstData.w, 0, 0,
IN.PackedInstData.x );
matWorld[ 1 ] = float4( 0, IN.PackedInstData.w, 0,
IN.PackedInstData.y );
matWorld[ 2 ] = float4( 0, 0, IN.PackedInstData.w,
IN.PackedInstData.z );
matWorld[ 3 ] = float4( 0, 0, 0, 1 );
matMVP = mul( cMatViewProj, matWorld );
OUT.HPosition = mul( matMVP, IN.Position );
VS Snippet to unpack attributes (positionVS Snippet to unpack attributes (position
& size) from attribute stream to create& size) from attribute stream to create
matMVP and transform vertexmatMVP and transform vertex
Geometry Instancing – ResultsGeometry Instancing – Results
• Depending on the amount ofDepending on the amount of
vegetation, rendering speedvegetation, rendering speed
increases up to 40% (when heavilyincreases up to 40% (when heavily
draw call limited)draw call limited)
• Allows us to increase sprite distanceAllows us to increase sprite distance
ratio, a nice visual improvement withratio, a nice visual improvement with
only a moderate rendering speed hitonly a moderate rendering speed hit
Scene drawn normally
Batches visualized – Vegetation objects
tinted the same way get submitted in one
draw call!
High Dynamic Range Rendering
High Dynamic Range RenderingHigh Dynamic Range Rendering
• Uses A16B16G16R16F render targetUses A16B16G16R16F render target
formatformat
• Alpha blending and filtering is essentialAlpha blending and filtering is essential
• Unified solution for post-processingUnified solution for post-processing
• Replaces whole range of post-processingReplaces whole range of post-processing
hacks (glare, flares)hacks (glare, flares)
HDR – ImplementationHDR – Implementation
• HDR in Far Cry follows standardHDR in Far Cry follows standard
approachesapproaches
– Kawase’s bloom filtersKawase’s bloom filters
– Reinhard’s tone mapping operatorReinhard’s tone mapping operator
– See DXSDK sampleSee DXSDK sample
• Performance hintPerformance hint
– For post processing try splitting your colorFor post processing try splitting your color
into rg, ba and write them into two MRTs ofinto rg, ba and write them into two MRTs of
format G16R16F. That’s more cacheformat G16R16F. That’s more cache
efficient on some cards.efficient on some cards.
Bloom from [Kawase03]Bloom from [Kawase03]
• Repeatedly apply small bRepeatedly apply small blur filterslur filters
• Composite bloom with original imageComposite bloom with original image
– Ideally in HDR space, followed by tone mappingIdeally in HDR space, followed by tone mapping
Texture samplingTexture sampling
pointspoints
Pixel being RenderedPixel being Rendered
11stst
passpass
22ndnd
passpass
33rdrd
passpass
From [Kawase03]From [Kawase03]
Increase Filter Size Each PassIncrease Filter Size Each Pass
No HDR
HDR (tone mapped scene +
bloom + stars)
[Reinhard02] – Tone Mapping[Reinhard02] – Tone Mapping
1)1) Calculate scene luminanceCalculate scene luminance
On GPU done by sampling the log()On GPU done by sampling the log()
values, scaling them down to 1x1 andvalues, scaling them down to 1x1 and
calculating the exp()calculating the exp()
2)2) Scale to target averageScale to target average
luminanceluminance αα
3)3) Apply tone mappingApply tone mapping
operatoroperator
• To simulate light adaptation replaceTo simulate light adaptation replace LumLumavgavg in step 2 and 3 by anin step 2 and 3 by an
adapted luminance value which slowly converges towardsadapted luminance value which slowly converges towards LumLumavgavg
• For further information attend Reinhard’s session called “ToneFor further information attend Reinhard’s session called “Tone
Reproduction In Interactive Applications” this Friday, March 11 atReproduction In Interactive Applications” this Friday, March 11 at
10:30am10:30am
HDR – Watch outHDR – Watch out
• Currently no FSAACurrently no FSAA
• Extremely fill rate hungryExtremely fill rate hungry
• Needs support for float buffer blendingNeeds support for float buffer blending
• HDR-aware productionHDR-aware production11
::
– Light mapsLight maps
– SkyboxSkybox
1)1) For prototyping, we actually modified our light map generatorFor prototyping, we actually modified our light map generator
to generate HDR maps and tried HDR skyboxes. They lookto generate HDR maps and tried HDR skyboxes. They look
great. However we didn’t include them in the patch because…great. However we didn’t include them in the patch because…
– Compressing HDR light map textures is challengingCompressing HDR light map textures is challenging
– Bandwidth requirements would have been even biggerBandwidth requirements would have been even bigger
– Far Cry patch size would have been hugeFar Cry patch size would have been huge
– No time to adjust and test all levelsNo time to adjust and test all levels
ConclusionConclusion
• Dynamic flow control in ps_3_0Dynamic flow control in ps_3_0
• Geometry InstancingGeometry Instancing
• High Dynamic Range RenderingHigh Dynamic Range Rendering
ReferencesReferences
• [Kawase03][Kawase03] Masaki Kawase, “FrameMasaki Kawase, “Frame
Buffer Postprocessing Effects inBuffer Postprocessing Effects in
DOUBLE-S.T.E.A.L (Wreckless),” GameDOUBLE-S.T.E.A.L (Wreckless),” Game
Developer’s Conference 2003Developer’s Conference 2003
• [Reinhard02][Reinhard02] Erik Reinhard, MichaelErik Reinhard, Michael
Stark, Peter Shirley and JamesStark, Peter Shirley and James
Ferwerda, “Photographic ToneFerwerda, “Photographic Tone
Reproduction for Digital Images,”Reproduction for Digital Images,”
SIGGRAPH 2002.SIGGRAPH 2002.
QuestionsQuestions
??????

More Related Content

What's hot

Fingerprinting Chemical Structures
Fingerprinting Chemical StructuresFingerprinting Chemical Structures
Fingerprinting Chemical Structures
Rajarshi Guha
 
Java Garbage Collection, Monitoring, and Tuning
Java Garbage Collection, Monitoring, and TuningJava Garbage Collection, Monitoring, and Tuning
Java Garbage Collection, Monitoring, and Tuning
Carol McDonald
 
Gdc2011 direct x 11 rendering in battlefield 3
Gdc2011 direct x 11 rendering in battlefield 3Gdc2011 direct x 11 rendering in battlefield 3
Gdc2011 direct x 11 rendering in battlefield 3
drandom
 
Windows to reality getting the most out of direct3 d 10 graphics in your games
Windows to reality   getting the most out of direct3 d 10 graphics in your gamesWindows to reality   getting the most out of direct3 d 10 graphics in your games
Windows to reality getting the most out of direct3 d 10 graphics in your games
changehee lee
 

What's hot (20)

SPU Assisted Rendering
SPU Assisted RenderingSPU Assisted Rendering
SPU Assisted Rendering
 
Borderless Per Face Texture Mapping
Borderless Per Face Texture MappingBorderless Per Face Texture Mapping
Borderless Per Face Texture Mapping
 
Lecture6.pptx
Lecture6.pptxLecture6.pptx
Lecture6.pptx
 
Fingerprinting Chemical Structures
Fingerprinting Chemical StructuresFingerprinting Chemical Structures
Fingerprinting Chemical Structures
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloaded
 
Introduction to Data Oriented Design
Introduction to Data Oriented DesignIntroduction to Data Oriented Design
Introduction to Data Oriented Design
 
A Step Towards Data Orientation
A Step Towards Data OrientationA Step Towards Data Orientation
A Step Towards Data Orientation
 
Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)Engineering fast indexes (Deepdive)
Engineering fast indexes (Deepdive)
 
Java Garbage Collection, Monitoring, and Tuning
Java Garbage Collection, Monitoring, and TuningJava Garbage Collection, Monitoring, and Tuning
Java Garbage Collection, Monitoring, and Tuning
 
PyTorch crash course
PyTorch crash coursePyTorch crash course
PyTorch crash course
 
그래픽 최적화로 가...가버렷! (부제: 배치! 배칭을 보자!) , Batch! Let's take a look at Batching! -...
그래픽 최적화로 가...가버렷! (부제: 배치! 배칭을 보자!) , Batch! Let's take a look at Batching! -...그래픽 최적화로 가...가버렷! (부제: 배치! 배칭을 보자!) , Batch! Let's take a look at Batching! -...
그래픽 최적화로 가...가버렷! (부제: 배치! 배칭을 보자!) , Batch! Let's take a look at Batching! -...
 
Tomasz Nurkiewicz - Programowanie reaktywne: czego się nauczyłem
Tomasz Nurkiewicz - Programowanie reaktywne: czego się nauczyłemTomasz Nurkiewicz - Programowanie reaktywne: czego się nauczyłem
Tomasz Nurkiewicz - Programowanie reaktywne: czego się nauczyłem
 
FlameWorks GTC 2014
FlameWorks GTC 2014FlameWorks GTC 2014
FlameWorks GTC 2014
 
Gdc2011 direct x 11 rendering in battlefield 3
Gdc2011 direct x 11 rendering in battlefield 3Gdc2011 direct x 11 rendering in battlefield 3
Gdc2011 direct x 11 rendering in battlefield 3
 
Spark and Shark: Lightning-Fast Analytics over Hadoop and Hive Data
Spark and Shark: Lightning-Fast Analytics over Hadoop and Hive DataSpark and Shark: Lightning-Fast Analytics over Hadoop and Hive Data
Spark and Shark: Lightning-Fast Analytics over Hadoop and Hive Data
 
Demystifying DataFrame and Dataset
Demystifying DataFrame and DatasetDemystifying DataFrame and Dataset
Demystifying DataFrame and Dataset
 
04 - Qt Data
04 - Qt Data04 - Qt Data
04 - Qt Data
 
To Swift 2...and Beyond!
To Swift 2...and Beyond!To Swift 2...and Beyond!
To Swift 2...and Beyond!
 
Visibility Optimization for Games
Visibility Optimization for GamesVisibility Optimization for Games
Visibility Optimization for Games
 
Windows to reality getting the most out of direct3 d 10 graphics in your games
Windows to reality   getting the most out of direct3 d 10 graphics in your gamesWindows to reality   getting the most out of direct3 d 10 graphics in your games
Windows to reality getting the most out of direct3 d 10 graphics in your games
 

Similar to Far cry 3

HES2011 - Aaron Portnoy and Logan Brown - Black Box Auditing Adobe Shockwave
HES2011 - Aaron Portnoy and Logan Brown - Black Box Auditing Adobe ShockwaveHES2011 - Aaron Portnoy and Logan Brown - Black Box Auditing Adobe Shockwave
HES2011 - Aaron Portnoy and Logan Brown - Black Box Auditing Adobe Shockwave
Hackito Ergo Sum
 
MongoDB Basic Concepts
MongoDB Basic ConceptsMongoDB Basic Concepts
MongoDB Basic Concepts
MongoDB
 
London devops logging
London devops loggingLondon devops logging
London devops logging
Tomas Doran
 
Making a game with Molehill: Zombie Tycoon
Making a game with Molehill: Zombie TycoonMaking a game with Molehill: Zombie Tycoon
Making a game with Molehill: Zombie Tycoon
Jean-Philippe Doiron
 

Similar to Far cry 3 (20)

Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
Building Reliable Cloud Storage with Riak and CloudStack - Andy Gross, Chief ...
 
PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.
 
HES2011 - Aaron Portnoy and Logan Brown - Black Box Auditing Adobe Shockwave
HES2011 - Aaron Portnoy and Logan Brown - Black Box Auditing Adobe ShockwaveHES2011 - Aaron Portnoy and Logan Brown - Black Box Auditing Adobe Shockwave
HES2011 - Aaron Portnoy and Logan Brown - Black Box Auditing Adobe Shockwave
 
Unraveling mysteries of the Universe at CERN, with OpenStack and Hadoop
Unraveling mysteries of the Universe at CERN, with OpenStack and HadoopUnraveling mysteries of the Universe at CERN, with OpenStack and Hadoop
Unraveling mysteries of the Universe at CERN, with OpenStack and Hadoop
 
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
 
Riak add presentation
Riak add presentationRiak add presentation
Riak add presentation
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
 
MongoDB Basic Concepts
MongoDB Basic ConceptsMongoDB Basic Concepts
MongoDB Basic Concepts
 
owasp lithuania chapter - exploit vs anti-exploit
owasp lithuania chapter - exploit vs anti-exploitowasp lithuania chapter - exploit vs anti-exploit
owasp lithuania chapter - exploit vs anti-exploit
 
London devops logging
London devops loggingLondon devops logging
London devops logging
 
Making a game with Molehill: Zombie Tycoon
Making a game with Molehill: Zombie TycoonMaking a game with Molehill: Zombie Tycoon
Making a game with Molehill: Zombie Tycoon
 
Approximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming ApplicationsApproximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming Applications
 
Eusecwest
EusecwestEusecwest
Eusecwest
 
HPTS talk on micro-sharding with Katta
HPTS talk on micro-sharding with KattaHPTS talk on micro-sharding with Katta
HPTS talk on micro-sharding with Katta
 
[Ruxcon 2011] Post Memory Corruption Memory Analysis
[Ruxcon 2011] Post Memory Corruption Memory Analysis[Ruxcon 2011] Post Memory Corruption Memory Analysis
[Ruxcon 2011] Post Memory Corruption Memory Analysis
 
Caffe framework tutorial2
Caffe framework tutorial2Caffe framework tutorial2
Caffe framework tutorial2
 
Volumetric Lighting for Many Lights in Lords of the Fallen
Volumetric Lighting for Many Lights in Lords of the FallenVolumetric Lighting for Many Lights in Lords of the Fallen
Volumetric Lighting for Many Lights in Lords of the Fallen
 
Jvm memory model
Jvm memory modelJvm memory model
Jvm memory model
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
Cdn cs6740
Cdn cs6740Cdn cs6740
Cdn cs6740
 

Recently uploaded

ASSISTING WITH THE USE OF URINAL BY ANUSHRI SRIVASTAVA.pptx
ASSISTING WITH THE USE OF URINAL BY ANUSHRI SRIVASTAVA.pptxASSISTING WITH THE USE OF URINAL BY ANUSHRI SRIVASTAVA.pptx
ASSISTING WITH THE USE OF URINAL BY ANUSHRI SRIVASTAVA.pptx
AnushriSrivastav
 
Antibiotic Stewardship by Anushri Srivastava.pptx
Antibiotic Stewardship by Anushri Srivastava.pptxAntibiotic Stewardship by Anushri Srivastava.pptx
Antibiotic Stewardship by Anushri Srivastava.pptx
AnushriSrivastav
 
CHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdf
CHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdfCHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdf
CHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdf
Sachin Sharma
 
Urinary Elimination BY ANUSHRI SRIVASTAVA.pptx
Urinary Elimination BY ANUSHRI SRIVASTAVA.pptxUrinary Elimination BY ANUSHRI SRIVASTAVA.pptx
Urinary Elimination BY ANUSHRI SRIVASTAVA.pptx
AnushriSrivastav
 
Integrated Mother and Neonate Childwood Illness Health Care
Integrated Mother and Neonate Childwood Illness  Health CareIntegrated Mother and Neonate Childwood Illness  Health Care
Integrated Mother and Neonate Childwood Illness Health Care
ASKatoch1
 
Benefits of Dentulu's Salivary Testing.pptx
Benefits of Dentulu's Salivary Testing.pptxBenefits of Dentulu's Salivary Testing.pptx
Benefits of Dentulu's Salivary Testing.pptx
Dentulu Inc
 

Recently uploaded (20)

ASSISTING WITH THE USE OF URINAL BY ANUSHRI SRIVASTAVA.pptx
ASSISTING WITH THE USE OF URINAL BY ANUSHRI SRIVASTAVA.pptxASSISTING WITH THE USE OF URINAL BY ANUSHRI SRIVASTAVA.pptx
ASSISTING WITH THE USE OF URINAL BY ANUSHRI SRIVASTAVA.pptx
 
Digital Healthcare: The Future of Medical Consultations
Digital Healthcare: The Future of Medical ConsultationsDigital Healthcare: The Future of Medical Consultations
Digital Healthcare: The Future of Medical Consultations
 
Call Girls in Jaipur (Rajasthan) call me [🔝89011-83002🔝] Escort In Jaipur ℂal...
Call Girls in Jaipur (Rajasthan) call me [🔝89011-83002🔝] Escort In Jaipur ℂal...Call Girls in Jaipur (Rajasthan) call me [🔝89011-83002🔝] Escort In Jaipur ℂal...
Call Girls in Jaipur (Rajasthan) call me [🔝89011-83002🔝] Escort In Jaipur ℂal...
 
A Community health , health for prisoners
A Community health  , health for prisonersA Community health  , health for prisoners
A Community health , health for prisoners
 
CHAPTER- 1 SEMESTER - V NATIONAL HEALTH PROGRAMME RELATED TO CHILD.pdf
CHAPTER- 1 SEMESTER - V NATIONAL HEALTH PROGRAMME RELATED TO CHILD.pdfCHAPTER- 1 SEMESTER - V NATIONAL HEALTH PROGRAMME RELATED TO CHILD.pdf
CHAPTER- 1 SEMESTER - V NATIONAL HEALTH PROGRAMME RELATED TO CHILD.pdf
 
Healthcare Companion Robots: Key Features and Functionalities, Benefits, Chal...
Healthcare Companion Robots: Key Features and Functionalities, Benefits, Chal...Healthcare Companion Robots: Key Features and Functionalities, Benefits, Chal...
Healthcare Companion Robots: Key Features and Functionalities, Benefits, Chal...
 
Myopia Management & Control Strategies.pptx
Myopia Management & Control Strategies.pptxMyopia Management & Control Strategies.pptx
Myopia Management & Control Strategies.pptx
 
#cALL# #gIRLS# In Chhattisgarh ꧁❤8901183002❤꧂#cALL# #gIRLS# Service In Chhatt...
#cALL# #gIRLS# In Chhattisgarh ꧁❤8901183002❤꧂#cALL# #gIRLS# Service In Chhatt...#cALL# #gIRLS# In Chhattisgarh ꧁❤8901183002❤꧂#cALL# #gIRLS# Service In Chhatt...
#cALL# #gIRLS# In Chhattisgarh ꧁❤8901183002❤꧂#cALL# #gIRLS# Service In Chhatt...
 
PT MANAGEMENT OF URINARY INCONTINENCE.pptx
PT MANAGEMENT OF URINARY INCONTINENCE.pptxPT MANAGEMENT OF URINARY INCONTINENCE.pptx
PT MANAGEMENT OF URINARY INCONTINENCE.pptx
 
R3 Stem Cells and Kidney Repair A New Horizon in Nephrology.pptx
R3 Stem Cells and Kidney Repair A New Horizon in Nephrology.pptxR3 Stem Cells and Kidney Repair A New Horizon in Nephrology.pptx
R3 Stem Cells and Kidney Repair A New Horizon in Nephrology.pptx
 
Antibiotic Stewardship by Anushri Srivastava.pptx
Antibiotic Stewardship by Anushri Srivastava.pptxAntibiotic Stewardship by Anushri Srivastava.pptx
Antibiotic Stewardship by Anushri Srivastava.pptx
 
QA Paediatric dentistry department, Hospital Melaka 2020
QA Paediatric dentistry department, Hospital Melaka 2020QA Paediatric dentistry department, Hospital Melaka 2020
QA Paediatric dentistry department, Hospital Melaka 2020
 
Virtual Health Platforms_ Revolutionizing Patient Care.pdf
Virtual Health Platforms_ Revolutionizing Patient Care.pdfVirtual Health Platforms_ Revolutionizing Patient Care.pdf
Virtual Health Platforms_ Revolutionizing Patient Care.pdf
 
CHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdf
CHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdfCHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdf
CHAPTER 1 SEMESTER V - ROLE OF PEADIATRIC NURSE.pdf
 
Dehradun ❤CALL Girls 8901183002 ❤ℂall Girls IN Dehradun ESCORT SERVICE❤
Dehradun ❤CALL Girls  8901183002 ❤ℂall  Girls IN Dehradun ESCORT SERVICE❤Dehradun ❤CALL Girls  8901183002 ❤ℂall  Girls IN Dehradun ESCORT SERVICE❤
Dehradun ❤CALL Girls 8901183002 ❤ℂall Girls IN Dehradun ESCORT SERVICE❤
 
Urinary Elimination BY ANUSHRI SRIVASTAVA.pptx
Urinary Elimination BY ANUSHRI SRIVASTAVA.pptxUrinary Elimination BY ANUSHRI SRIVASTAVA.pptx
Urinary Elimination BY ANUSHRI SRIVASTAVA.pptx
 
Valle Egypt Illustrates Consequences of Financial Elder Abuse
Valle Egypt Illustrates Consequences of Financial Elder AbuseValle Egypt Illustrates Consequences of Financial Elder Abuse
Valle Egypt Illustrates Consequences of Financial Elder Abuse
 
Integrated Mother and Neonate Childwood Illness Health Care
Integrated Mother and Neonate Childwood Illness  Health CareIntegrated Mother and Neonate Childwood Illness  Health Care
Integrated Mother and Neonate Childwood Illness Health Care
 
Management of psoriasis.pptx (Recent advances)
Management of psoriasis.pptx (Recent advances)Management of psoriasis.pptx (Recent advances)
Management of psoriasis.pptx (Recent advances)
 
Benefits of Dentulu's Salivary Testing.pptx
Benefits of Dentulu's Salivary Testing.pptxBenefits of Dentulu's Salivary Testing.pptx
Benefits of Dentulu's Salivary Testing.pptx
 

Far cry 3

  • 1. Far Cry and DirectXFar Cry and DirectX Carsten WenzelCarsten Wenzel
  • 2. Far Cry uses the latest DX9 featuresFar Cry uses the latest DX9 features • Shader Models 2.x / 3.0Shader Models 2.x / 3.0  - Except for vertex textures and dynamicExcept for vertex textures and dynamic flow controlflow control • Geometry InstancingGeometry Instancing  • Floating-point render targetsFloating-point render targets 
  • 3. Dynamic flow control in PSDynamic flow control in PS • To consolidate multiple lights into one pass,To consolidate multiple lights into one pass, we ideally would want to do something likewe ideally would want to do something like this…this… float3float3 finalColfinalCol == 0;0; float3float3 diffuseColdiffuseCol == tex2Dtex2D( diffuseMap, IN.diffuseUV.xy );( diffuseMap, IN.diffuseUV.xy ); float3float3 normalnormal == mulmul( IN.tangentToWorldSpace,( IN.tangentToWorldSpace, tex2Dtex2D( normalMap, IN.bumpUV.xy ).xyz );( normalMap, IN.bumpUV.xy ).xyz ); forfor(( intint ii == 0; i < cNumLights; i++ )0; i < cNumLights; i++ ) {{ float3float3 lightCollightCol = LightColor[ i ];= LightColor[ i ]; float3float3 lightVeclightVec == normalizenormalize( cLightPos[ i ].xyz –( cLightPos[ i ].xyz – IN.pos.xyz );IN.pos.xyz ); // …// … // Attenuation, Specular, etc. calculated via// Attenuation, Specular, etc. calculated via if( const_boolean )if( const_boolean ) // …// … floatfloat nDotLnDotL == saturatesaturate(( dotdot( lightVec.xyz, normal ) );( lightVec.xyz, normal ) ); final += lightCol.xyz * diffuseCol.xyz * nDotL * atten;final += lightCol.xyz * diffuseCol.xyz * nDotL * atten; }} return(return( float4float4( finalCol, 1 ) );( finalCol, 1 ) );
  • 4. Dynamic flow control in PSDynamic flow control in PS • Welcome to the real world…Welcome to the real world… – Dynamic indexing only allowed on inputDynamic indexing only allowed on input registers; prevents passing light data viaregisters; prevents passing light data via constant registers and index them in loopconstant registers and index them in loop – Passing light info via input registers notPassing light info via input registers not feasible as there are not enough of them (onlyfeasible as there are not enough of them (only 10)10) – Dynamic branching is not freeDynamic branching is not free
  • 5. Loop unrollingLoop unrolling • We chose not to use dynamic branching and loopsWe chose not to use dynamic branching and loops • Used static branching and unrolled loops insteadUsed static branching and unrolled loops instead • Works well with Far Cry’s existing shader frameworkWorks well with Far Cry’s existing shader framework • Shaders are precompiled for different light masksShaders are precompiled for different light masks – 0-4 dynamic light sources per pass0-4 dynamic light sources per pass – 3 different light types (spot, omni, directional)3 different light types (spot, omni, directional) – 2 modification types per light (specular only, occlusion map)2 modification types per light (specular only, occlusion map) • Can result in over 160 instructions after loop unrollingCan result in over 160 instructions after loop unrolling when using 4 lightswhen using 4 lights – Too long for ps_2_0Too long for ps_2_0 – Just fine for ps_2_a, ps_2_b and ps_3_0!Just fine for ps_2_a, ps_2_b and ps_3_0! • To avoid run time stalls, use a pre-warmed shader cacheTo avoid run time stalls, use a pre-warmed shader cache
  • 6. How the shader cache worksHow the shader cache works • Specific shader depends on:Specific shader depends on: 1)1) Material typeMaterial type (e.g. skin, phong, metal)(e.g. skin, phong, metal) 2)2) Material usage flagsMaterial usage flags (e.g. bump-mapped, specular)(e.g. bump-mapped, specular) 3)3) Specific environmentSpecific environment (e.g. light mask, fog)(e.g. light mask, fog)
  • 7. How the shader cache worksHow the shader cache works • Cache access:Cache access: – Object to render already has shader handles? Use those!Object to render already has shader handles? Use those! – Otherwise try to find the shader in memoryOtherwise try to find the shader in memory – If that fails load from harddiskIf that fails load from harddisk – If that fails generate VS/PS, store backup on harddiskIf that fails generate VS/PS, store backup on harddisk – Finally, save shader handles in objectFinally, save shader handles in object • Not the ideal solution butNot the ideal solution but – Works reasonably well on existing hardwareWorks reasonably well on existing hardware – Was easy to integrate without changing assetsWas easy to integrate without changing assets • For the cache to be efficient…For the cache to be efficient… – All used combinations of a shader should exist as pre-cached files onAll used combinations of a shader should exist as pre-cached files on HDHD • On the fly update causes stalls due to time required for shaderOn the fly update causes stalls due to time required for shader compilation!compilation! – However, maintaining the cache can become cumbersomeHowever, maintaining the cache can become cumbersome
  • 8. Loop unrolling – Pros/ConsLoop unrolling – Pros/Cons • Pros:Pros: – Speed! Not branching dynamically saves quite a fewSpeed! Not branching dynamically saves quite a few cyclescycles – At the time, we found shader switching to be moreAt the time, we found shader switching to be more efficient than dynamic branchingefficient than dynamic branching • Cons:Cons: – Needs sophisticated shader caching, due to number ofNeeds sophisticated shader caching, due to number of shader combinations per light mask (244 aftershader combinations per light mask (244 after presorting of combinations)presorting of combinations) – Shader pre-compilation takes timeShader pre-compilation takes time – Shader cache for Far Cry 1.3 requires about 430 MBShader cache for Far Cry 1.3 requires about 430 MB (compressed down to ~23 MB in patch exe)(compressed down to ~23 MB in patch exe)
  • 9. Geometry InstancingGeometry Instancing • Potentially saves cost ofPotentially saves cost of n-1n-1 draw calls when renderingdraw calls when rendering nn instances of an objectinstances of an object • Far Cry uses it mainly to speed up vegetation renderingFar Cry uses it mainly to speed up vegetation rendering • Per instance attributes:Per instance attributes: – PositionPosition – SizeSize – Bending infoBending info – Rotation (only if needed)Rotation (only if needed) • Reduce the number of instance attributes! Two methods:Reduce the number of instance attributes! Two methods: – Vertex shader constantsVertex shader constants • Use for objects having more than 100 polygonsUse for objects having more than 100 polygons – Attribute streamsAttribute streams • Use for smaller objects (sprites, impostors)Use for smaller objects (sprites, impostors)
  • 10. Instance Attributes in VS ConstantsInstance Attributes in VS Constants • Best for objects with large numbers ofBest for objects with large numbers of polygonspolygons • Put instance index into additional streamPut instance index into additional stream • UseUse SetStreamSourceFrequencySetStreamSourceFrequency to setupto setup geometry instancing as follows…geometry instancing as follows… SetStreamSourceFrequency( geomStream,SetStreamSourceFrequency( geomStream, D3DSTREAMSOURCE_INDEXEDDATA | numInstances );D3DSTREAMSOURCE_INDEXEDDATA | numInstances ); SetStreamSourceFrequency( instStream,SetStreamSourceFrequency( instStream, D3DSTREAMSOURCE_INSTANCEDATA | 1 );D3DSTREAMSOURCE_INSTANCEDATA | 1 ); • Be sure to reset the vertex stream frequencyBe sure to reset the vertex stream frequency once you’re done,once you’re done, SSSF( strNum, 1 )SSSF( strNum, 1 )!!
  • 11. const float4x4 cMatViewProj; const float4 cPackedInstanceData[ numInstances ]; float4x4 matWorld; float4x4 matMVP; int i = IN.InstanceIndex; matWorld[ 0 ] = float4( cPackedInstanceData[ i ].w, 0, 0, cPackedInstanceData[ i ].x ); matWorld[ 1 ] = float4( 0, cPackedInstanceData[ i ].w, 0, cPackedInstanceData[ i ].y ); matWorld[ 2 ] = float4( 0, 0, cPackedInstanceData[ i ].w, cPackedInstanceData[ i ].z ); matWorld[ 3 ] = float4( 0, 0, 0, 1 ); matMVP = mul( cMatViewProj, matWorld ); OUT.HPosition = mul( matMVP, IN.Position ); VS Snippet to unpack attributes (positionVS Snippet to unpack attributes (position & size) from VS constants to create& size) from VS constants to create matMVP and transform vertexmatMVP and transform vertex
  • 12. Instance Attribute StreamsInstance Attribute Streams • Best for objects with few polygonsBest for objects with few polygons • Put per instance data into additional streamPut per instance data into additional stream • Setup vertex stream frequency as before andSetup vertex stream frequency as before and reset when you’re donereset when you’re done
  • 13. const float4x4 cMatViewProj; float4x4 matWorld; float4x4 matMVP; matWorld[ 0 ] = float4( IN.PackedInstData.w, 0, 0, IN.PackedInstData.x ); matWorld[ 1 ] = float4( 0, IN.PackedInstData.w, 0, IN.PackedInstData.y ); matWorld[ 2 ] = float4( 0, 0, IN.PackedInstData.w, IN.PackedInstData.z ); matWorld[ 3 ] = float4( 0, 0, 0, 1 ); matMVP = mul( cMatViewProj, matWorld ); OUT.HPosition = mul( matMVP, IN.Position ); VS Snippet to unpack attributes (positionVS Snippet to unpack attributes (position & size) from attribute stream to create& size) from attribute stream to create matMVP and transform vertexmatMVP and transform vertex
  • 14. Geometry Instancing – ResultsGeometry Instancing – Results • Depending on the amount ofDepending on the amount of vegetation, rendering speedvegetation, rendering speed increases up to 40% (when heavilyincreases up to 40% (when heavily draw call limited)draw call limited) • Allows us to increase sprite distanceAllows us to increase sprite distance ratio, a nice visual improvement withratio, a nice visual improvement with only a moderate rendering speed hitonly a moderate rendering speed hit
  • 16. Batches visualized – Vegetation objects tinted the same way get submitted in one draw call!
  • 17. High Dynamic Range Rendering
  • 18. High Dynamic Range RenderingHigh Dynamic Range Rendering • Uses A16B16G16R16F render targetUses A16B16G16R16F render target formatformat • Alpha blending and filtering is essentialAlpha blending and filtering is essential • Unified solution for post-processingUnified solution for post-processing • Replaces whole range of post-processingReplaces whole range of post-processing hacks (glare, flares)hacks (glare, flares)
  • 19. HDR – ImplementationHDR – Implementation • HDR in Far Cry follows standardHDR in Far Cry follows standard approachesapproaches – Kawase’s bloom filtersKawase’s bloom filters – Reinhard’s tone mapping operatorReinhard’s tone mapping operator – See DXSDK sampleSee DXSDK sample • Performance hintPerformance hint – For post processing try splitting your colorFor post processing try splitting your color into rg, ba and write them into two MRTs ofinto rg, ba and write them into two MRTs of format G16R16F. That’s more cacheformat G16R16F. That’s more cache efficient on some cards.efficient on some cards.
  • 20. Bloom from [Kawase03]Bloom from [Kawase03] • Repeatedly apply small bRepeatedly apply small blur filterslur filters • Composite bloom with original imageComposite bloom with original image – Ideally in HDR space, followed by tone mappingIdeally in HDR space, followed by tone mapping
  • 21. Texture samplingTexture sampling pointspoints Pixel being RenderedPixel being Rendered 11stst passpass 22ndnd passpass 33rdrd passpass From [Kawase03]From [Kawase03] Increase Filter Size Each PassIncrease Filter Size Each Pass
  • 23. HDR (tone mapped scene + bloom + stars)
  • 24. [Reinhard02] – Tone Mapping[Reinhard02] – Tone Mapping 1)1) Calculate scene luminanceCalculate scene luminance On GPU done by sampling the log()On GPU done by sampling the log() values, scaling them down to 1x1 andvalues, scaling them down to 1x1 and calculating the exp()calculating the exp() 2)2) Scale to target averageScale to target average luminanceluminance αα 3)3) Apply tone mappingApply tone mapping operatoroperator • To simulate light adaptation replaceTo simulate light adaptation replace LumLumavgavg in step 2 and 3 by anin step 2 and 3 by an adapted luminance value which slowly converges towardsadapted luminance value which slowly converges towards LumLumavgavg • For further information attend Reinhard’s session called “ToneFor further information attend Reinhard’s session called “Tone Reproduction In Interactive Applications” this Friday, March 11 atReproduction In Interactive Applications” this Friday, March 11 at 10:30am10:30am
  • 25. HDR – Watch outHDR – Watch out • Currently no FSAACurrently no FSAA • Extremely fill rate hungryExtremely fill rate hungry • Needs support for float buffer blendingNeeds support for float buffer blending • HDR-aware productionHDR-aware production11 :: – Light mapsLight maps – SkyboxSkybox 1)1) For prototyping, we actually modified our light map generatorFor prototyping, we actually modified our light map generator to generate HDR maps and tried HDR skyboxes. They lookto generate HDR maps and tried HDR skyboxes. They look great. However we didn’t include them in the patch because…great. However we didn’t include them in the patch because… – Compressing HDR light map textures is challengingCompressing HDR light map textures is challenging – Bandwidth requirements would have been even biggerBandwidth requirements would have been even bigger – Far Cry patch size would have been hugeFar Cry patch size would have been huge – No time to adjust and test all levelsNo time to adjust and test all levels
  • 26. ConclusionConclusion • Dynamic flow control in ps_3_0Dynamic flow control in ps_3_0 • Geometry InstancingGeometry Instancing • High Dynamic Range RenderingHigh Dynamic Range Rendering
  • 27. ReferencesReferences • [Kawase03][Kawase03] Masaki Kawase, “FrameMasaki Kawase, “Frame Buffer Postprocessing Effects inBuffer Postprocessing Effects in DOUBLE-S.T.E.A.L (Wreckless),” GameDOUBLE-S.T.E.A.L (Wreckless),” Game Developer’s Conference 2003Developer’s Conference 2003 • [Reinhard02][Reinhard02] Erik Reinhard, MichaelErik Reinhard, Michael Stark, Peter Shirley and JamesStark, Peter Shirley and James Ferwerda, “Photographic ToneFerwerda, “Photographic Tone Reproduction for Digital Images,”Reproduction for Digital Images,” SIGGRAPH 2002.SIGGRAPH 2002.

Editor's Notes

  1. &amp;lt;number&amp;gt;
  2. &amp;lt;number&amp;gt;