
Rendering AAA-Quality Characters of Project A1


NDC 2016 Programming Session: "Rendering AAA-Quality Characters of Project A1" by Hyunwoo Ki.
This material is an English translation.

Published in: Engineering


  1. 1. Rendering AAA-Quality Characters of Project ‘A1’ Hyunwoo Ki Lead Graphics Programmer A1 / NEXON * Translated material in English.
  2. 2. • Unannounced Nexon project / new IP  In development  By game development experts  Target platform: high-end PC - 60 FPS at Ultra quality / low-spec PCs should run acceptably at Medium quality • We are announcing the project for the first time at this conference A1
  3. 3. • Programming Session: Character Rendering  Including resources and implementation in development  The world shown in this talk is temporary  All content may change during development A1@NDC 2016 • 04/27 15:20 Art Session by Art Director • 04/27 17:05 Programming Session by Senior Gameplay Programmer
  4. 4. • Competitive, stunning visuals  Unprecedented quality in Korea  We show current results and technology at this conference • UE4 + @@@  Use powerful rendering features of UE4  Plus, new rendering/animation/VFX features made by our team Visual of A1
  5. 5. • Skin, Hair, and Metal Rendering • Shadow Rendering • Low Level Shader Optimization • Run-time Rigged Physics Simulation Agenda
  6. 6. Skin Rendering
  7. 7. • Multiple scattering  Dipole diffusion approximation • Included in UE4  Integration of Jimenez’s implementation  We use it without any changes SSSSS Screen Space SubSurface Scattering Activision R&D. Property of Activision Publishing. Not Actual Gameplay
  8. 8. No SSSSS
  9. 9. • Softer lighting • Softer shadows • Translucent look SSSSS
  10. 10. • Specular along geometry silhouettes • Plastic-like look Base Normal Only
  11. 11. • Specular along skin surface details • More realistic With Detail Normal
  12. 12. • No transmission  Ignores irradiance from outside the visible surfaces  A limitation of the screen-space approach  There’s a solution but it isn’t included in UE4 • Low frequency lighting  Can’t handle strong scattering at a short distance Limit: UE4 SSSSS Burley, “Extending the Disney BRDF to a BSDF with Integrated Subsurface Scattering”, SIGGRAPH 2015
  13. 13. • Transmission (Backlit) • Back scattering • Higher frequency lighting Single Scattering Single Scattering Multiple Scattering (dipole)
  14. 14. Multiple Scattering
  15. 15. + =Single + Multiple Scattering * Single scattering is exaggerated to show different looks on the presentation screen.
  16. 16. ground truth Ki09 • Introduced in ShaderX7  Hyunwoo Ki, “Real-Time Subsurface Scattering using Shadow Maps”  Store translucent irradiance into multiple shadow maps (RSM/TSM style)  Approximate scattering distance by ray marching with shadow map projection  Estimate radiance using the stored irradiance and the scattering distance • Integrated this technique into UE4 with small changes  Deferred Single Scattering Single Scattering using Shadow Maps
  17. 17. • Shoot rays from the camera  Refract rays on the surfaces • Draw stepping distance samples  Exploit Quasi Monte Carlo sampling • Project $x_p$ onto shadow maps  To approximate the incident scattering distance, $s_i$ Review: [Ki09] Ray Marching
      $$L_o(x_o, \omega_o) = \int_A \int_{2\pi} S(x_i, \omega_i; x_o, \omega_o)\, L_i(x_i, \omega_i)\, (n \cdot \omega_i)\, d\omega_i\, dA(x_i)$$
      $$L_o^{(1)}(x_o, \omega_o) = \sigma_s(x_o)\, F_t(\omega_o) \int_{2\pi} \int_0^{\infty} e^{-s_o \sigma_t(x_o)}\, p(\omega_i, \omega_o)\, e^{-s_i \sigma_t(x_i)}\, F_t(\omega_i)\, L_i(x_i, \omega_i)\, ds_o\, d\omega_i$$
      $$L_o^{(1)}(x_o, \omega_o) \approx \sigma_s(x_o)\, F_t(\omega_o) \sum_{i=0}^{N} \sum_{j=0}^{M} e^{-s_o \sigma_t(x_o)}\, p(\omega_i, \omega_o)\, e^{-s_i \sigma_t(x_i)}\, E(x_i, \omega_i)$$
      $$s_o = -\frac{\log \xi}{\sigma_t}$$
  18. 18. 1) Deferred Shadows Pass:  Ki’s ray marching (QMC + shadow projection)  Assume that irradiance and normal are equal for all sampling steps -> Only use the shadow depth map (no additional buffers!)  Constant scattering parameters: limited by the G-buffer layout and for performance reasons  Output: scalar scattering transfer instead of shadow amount (single channel) 2) Deferred Lighting Pass:  SS = intensity * (scattering transfer * SSS color) * bidirectional Fresnel transmittance * HG phase function  Output: direct lighting + single scattering • Physically incorrect but plausible looks Deferred Single Scattering
      $$SS \approx \sigma_s\, F_t(\omega_o) \sum_{i=0}^{N} \sum_{j=0}^{M} p(\omega_i, \omega_o)\, E(x_i, \omega_i)\, e^{-(s_i + s_o)\, \sigma_t}$$
      (Slide annotations mark which factors are evaluated in 1) the Shadowing Pass and 2) the Lighting Pass.)
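The lighting-pass product on slide 18 can be sketched on the CPU as below. This is a minimal Python sketch and not the talk's shader: the function names, the scalar Fresnel term, and the argument layout are assumptions made for illustration. Only the product structure and the use of a Henyey-Greenstein (HG) phase function come from the slide.

```python
import math


def hg_phase(cos_theta, g):
    """Henyey-Greenstein phase function; g in (-1, 1) is the anisotropy."""
    denom = 1.0 + g * g - 2.0 * g * cos_theta
    return (1.0 - g * g) / (4.0 * math.pi * denom ** 1.5)


def single_scattering(intensity, transfer, sss_color, fresnel, cos_theta, g):
    """SS = intensity * (transfer * SSS color) * Fresnel transmittance * HG phase.

    transfer is the scalar output of the deferred shadows pass (assumed in [0, 1]),
    sss_color is an RGB tuple, fresnel is a scalar stand-in for the bidirectional
    Fresnel transmittance.
    """
    phase = hg_phase(cos_theta, g)
    return tuple(intensity * transfer * c * fresnel * phase for c in sss_color)
```

With `g = 0` the phase function is isotropic (`1 / (4 * pi)` in every direction); positive `g` favors forward scattering, which is what produces the backlit look on thin body parts.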
  19. 19. Appendix) Ray marching code
      float ComputeSingleScatteringUsingShadowMap(FGBufferData GBuffer, FShadowMapSamplerSettings Settings, float3 V, float3 L, float3 WorldPosition)
      {
          const int NumSamples = SHADOW_QUALITY * 3;
          const float Eta = 1.3;
          const float EtaInverse = 1.0 / Eta;
          const float ExtinctionCoefficient = 2.55;
          const float MeanFreePath = 1.0 / ExtinctionCoefficient;
          // Refract the view ray into the surface
          const float3 OutgoingDirection = -refract(V, -GBuffer.WorldNormal, EtaInverse);
          const float InverseNumSamples = 1.0f / (float) NumSamples;
          // Stratified sample in (0, 1); advanced every iteration, so it must not be const
          float Sample = InverseNumSamples * 0.5;
          float SingleScattering = 0;
          for (int i = 0; i < NumSamples; ++i)
          {
              // Distance along the refracted ray, drawn from the exponential falloff
              float RefractedOutgoingDistance = -log(Sample) * MeanFreePath;
              float3 ScatteringPoint = (OutgoingDirection * RefractedOutgoingDistance) + WorldPosition;
              // Project the scattering point into shadow space
              float4 ScatteringPointInShadowSpace = mul(float4(ScatteringPoint, 1.0f), WorldToShadowMatrix);
              ScatteringPointInShadowSpace.xy /= ScatteringPointInShadowSpace.w;
              float ShadowmapDepthAtScatteringPoint = Texture2DSampleLevel(Settings.ShadowDepthTexture, Settings.ShadowDepthTextureSampler, ScatteringPointInShadowSpace.xy, 0).r;
              // Approximate distance the light travelled inside the medium
              float IncidentDistance = max(0, abs((ShadowmapDepthAtScatteringPoint + Settings.ProjectionDepthBiasParameters.x) - ScatteringPointInShadowSpace.z)) * Settings.ProjectionDepthBiasParameters.y;
              float TravelPathLength = IncidentDistance + RefractedOutgoingDistance * DISTANCE_SCALE;
              float LightContribution = exp(-ExtinctionCoefficient * TravelPathLength);
              float Weight = exp(-ExtinctionCoefficient * RefractedOutgoingDistance);
              SingleScattering += LightContribution / Weight;
              Sample += InverseNumSamples;
          }
          return SingleScattering / (float) NumSamples * SINGLESCATTERING_INTENSITY;
      }
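The distance sampling in the loop above can be checked in isolation. The following Python sketch reproduces only the `Sample` stepping and the `-log(Sample) * MeanFreePath` mapping from the appendix code; the function name is hypothetical and the shadow-map lookup is left out.

```python
import math


def scattering_distances(num_samples, extinction_coefficient):
    """Stratified samples in (0, 1) mapped to distances along the refracted ray."""
    mean_free_path = 1.0 / extinction_coefficient
    inverse_num_samples = 1.0 / num_samples
    sample = 0.5 * inverse_num_samples  # midpoint of the first stratum
    distances = []
    for _ in range(num_samples):
        distances.append(-math.log(sample) * mean_free_path)
        sample += inverse_num_samples
    return distances
```

Because the mapping inverts the exponential attenuation, the average sampled distance approaches the mean free path as the sample count grows, and most samples land close to the surface where single scattering contributes the most.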
  20. 20. • Transmission on thin body parts  Backlit effects • Brighter skin surfaces  With texture masking / artistic choice • Added approx. 1 ms per light  At a closeup view (worst case) Rendering Results
  21. 21. * Single scattering is exaggerated to show different looks on the presentation screen.
  22. 22. • Using dithered, temporal sampling  Like volumetric lighting: Killzone: Shadow Fall, Lords of the Fallen, INSIDE, etc.  Temporal reprojection • Replace shadow rendering for skin  Currently we use conventional PCF (UE4 default)  But we expect that volumetric attenuation by SS can represent shadowing Future Work: Single Scattering
  23. 23. Hair Rendering
  24. 24. • Layered card mesh with alpha texture • + Hair strands mesh • Optional textures:  Color, normal, roughness, AO, specular noise, etc. • Similar to Destiny and The Order: 1886 Modeling and Texturing
  25. 25. • Transparency  Alpha blending: per-pixel drawing order  Lighting: deferred lighting? (incompatible)  Shadowing: deferred shadowing? (incompatible) • Physically based shading model  GGX is not a good fit for hair Problem
  26. 26. Alpha Test : Alpha Blend
  27. 27. Order of Alpha Blending
  28. 28. • Need per-pixel, sorted alpha blending • Choice: K-Buffer style  Per-Pixel Linked List (PPLL)  DX11 Unordered Access View (UAV)  Based on an AMD TressFX 2.0 sample  We integrated this into UE4 Order Independent Transparency (OIT)
  29. 29. • Head UAV: RWTexture2D<uint>  Head indices for each linked list of pixels on the screen • PPLL UAV: RWStructuredBuffer<FOitLinkedListDataElement>  Container to store all fragments being shaded  An element is added when each fragment is drawn Review: PPLL OIT
      // Allocate a slot in the shared fragment pool
      uint PixelCount = OitLinkedListPPLLDataUAV.IncrementCounter();
      int2 UAVTargetIndex = int2(SVPosition.xy);
      // Make the new slot the head of this pixel's list and fetch the previous head
      uint OldStartOffset;
      InterlockedExchange(OitLinkedListHeadAddrUAV[UAVTargetIndex], PixelCount, OldStartOffset);
      // Link the new element to the previous head before storing it
      NewElement.Next = OldStartOffset;
      OitLinkedListPPLLDataUAV[PixelCount] = NewElement;
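The two-buffer layout of the per-pixel linked list can be mimicked on the CPU to make the insertion order clear. The Python class below is an illustrative sketch only: the class and method names are hypothetical, and a single thread stands in for the atomic `IncrementCounter` and `InterlockedExchange` calls.

```python
INVALID_UINT = 0xFFFFFFFF  # assumed end-of-list marker


class PixelLinkedLists:
    def __init__(self, width, height):
        self.width = width
        self.heads = [INVALID_UINT] * (width * height)  # head UAV: one index per pixel
        self.nodes = []                                 # PPLL UAV: shared fragment pool

    def add_fragment(self, x, y, depth, color):
        index = len(self.nodes)            # stands in for IncrementCounter()
        pixel = y * self.width + x
        old_head = self.heads[pixel]       # stands in for InterlockedExchange()
        self.heads[pixel] = index
        self.nodes.append({"depth": depth, "color": color, "next": old_head})

    def fragments(self, x, y):
        """Walk one pixel's list from the most recently added fragment."""
        result = []
        index = self.heads[y * self.width + x]
        while index != INVALID_UINT:
            node = self.nodes[index]
            result.append((node["depth"], node["color"]))
            index = node["next"]
        return result
```

Fragments come back in reverse submission order, not depth order, which is why a separate sorting pass is needed before blending.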
  30. 30. • K-Buffer  Do manual alpha blending for the front-most K fragments: sorted  Do manual alpha blending for the remaining fragments: unsorted • See the references for details Review: PPLL OIT
      float4 BlendTransparency(float4 FragmentColor, float4 FinalColor)
      {
          float4 OutColor;
          // FinalColor.xyz * (1 - alpha) + FragmentColor.xyz * alpha
          OutColor.xyz = mad(-FinalColor.xyz, FragmentColor.w, FinalColor.xyz) + FragmentColor.xyz * FragmentColor.w;
          // Remaining transmittance: FinalColor.w * (1 - alpha)
          OutColor.w = mad(-FinalColor.w, FragmentColor.w, FinalColor.w);
          return OutColor;
      }
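A CPU sketch of the K-buffer resolve may help show how the sorted and unsorted groups combine. The blend function mirrors `BlendTransparency` above; everything else (function names, the tuple layout, blending the unsorted remainder first) is an assumption for illustration and not taken from the talk or from TressFX.

```python
import heapq


def blend_transparency(fragment, final):
    """fragment = (r, g, b, alpha); final = (r, g, b, remaining transmittance)."""
    r, g, b, a = fragment
    fr, fg, fb, fw = final
    return (fr * (1.0 - a) + r * a,
            fg * (1.0 - a) + g * a,
            fb * (1.0 - a) + b * a,
            fw * (1.0 - a))


def resolve_pixel(fragments, background, k=8):
    """fragments: list of (depth, rgba) in arbitrary order; smaller depth is nearer."""
    indexed = list(enumerate(fragments))
    # The K nearest fragments, returned in ascending depth order
    nearest = heapq.nsmallest(k, indexed, key=lambda item: item[1][0])
    nearest_ids = {i for i, _ in nearest}
    remainder = [frag for i, frag in indexed if i not in nearest_ids]

    final = (background[0], background[1], background[2], 1.0)
    for _depth, rgba in remainder:                   # unsorted: submission order
        final = blend_transparency(rgba, final)
    for _i, (_depth, rgba) in reversed(nearest):     # sorted: far to near
        final = blend_transparency(rgba, final)
    return final
```

Only the K nearest layers are order-correct; the remainder is blended in whatever order it was stored, which is acceptable when those layers are mostly hidden behind the front ones.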
  31. 31. • Store radiance computed by forward lighting into PPLL / Force early-Z • Sort and blend in the following pass • PPLL layout: DX11 StructuredBuffer  Pack HDR radiance from half4 to uint: for memory, bandwidth, and cache alignment (128 bit) UE4 Integration
      // Unpacked layout
      struct FOitLinkedListDataElement
      {
          half4 Radiance;
          uint NormalAndOpacity;
          uint Depth;
          uint Next;
      };
      // Packed layout: four uints = 128 bits
      struct FOitLinkedListDataElement
      {
          uint RadianceRGY32;
          uint NormalAndOpacity;
          uint Depth;
          uint Next;
      };
      float3 UnpackRGY32(uint PackedColor)
      {
          const float ONE_OVER_255 = 1.0f / 255.0f;
          float3 RGB;
          RGB.r = (PackedColor & 0xff000000) >> 24;
          RGB.g = (PackedColor & 0x00ff0000) >> 16;
          RGB.b = 255.0f - (RGB.r + RGB.g);
          // f16tof32 reads the low 16 bits
          float Luminance = f16tof32(PackedColor);
          return RGB * Luminance * ONE_OVER_255;
      }
      uint PackRGY32(float3 Color)
      {
          // Guard against division by zero for black pixels
          float Luminance = max(Color.r + Color.g + Color.b, 1e-5f);
          Color.rg /= Luminance;
          uint PackedValue = uint(Color.r * 255.0f + 0.5f) << 24 | uint(Color.g * 255.0f + 0.5f) << 16;
          PackedValue |= f32tof16(Luminance);
          return PackedValue;
      }
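The RGY32 packing can be round-tripped on the CPU to get a feel for its precision. This Python sketch follows the bit layout of `PackRGY32` and `UnpackRGY32` above (8 bits red share, 8 bits green share, 16-bit half-float channel sum); the helper names, the zero-luminance guard, and the clamp on the blue share are assumptions added for the sketch.

```python
import struct


def f32_to_f16_bits(value):
    """Bit pattern of value rounded to IEEE half precision."""
    return int.from_bytes(struct.pack("<e", value), "little")


def f16_bits_to_f32(bits):
    """Decode the low 16 bits as an IEEE half-precision float."""
    return struct.unpack("<e", (bits & 0xFFFF).to_bytes(2, "little"))[0]


def pack_rgy32(color):
    r, g, b = color
    luminance = r + g + b
    if luminance <= 0.0:
        return 0  # assumed handling of black pixels
    r8 = int(r / luminance * 255.0 + 0.5)
    g8 = int(g / luminance * 255.0 + 0.5)
    return (r8 << 24) | (g8 << 16) | f32_to_f16_bits(luminance)


def unpack_rgy32(packed):
    r8 = (packed >> 24) & 0xFF
    g8 = (packed >> 16) & 0xFF
    b8 = max(0, 255 - (r8 + g8))  # clamp: rounding can push r8 + g8 to 256
    luminance = f16_bits_to_f32(packed)
    return tuple(c * luminance / 255.0 for c in (r8, g8, b8))
```

The blue channel is never stored; it is recovered as the remainder of the channel sum, so it carries the accumulated rounding error of the other two channels.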
  32. 32. Rendering Results
  33. 33. • Serious pixel overdraw due to layered geometry • Optimization: single pass Opacity Thresholding using UAV  Observation: Alpha blending of hair is not for transparency but for rendering thin strands with clean silhouettes, and hair textures have almost opaque texels (especially in the inner layers) except at the hair tips  Goal: Reduce drawing of invisible pixels occluded by almost opaque pixels from the same geometry  Unordered Opacity Thresholding Reducing Pixel Overdraw
  34. 34. • Set the opacity threshold for each hair material: ex) 0.95 • Do manual Z test with an additional UAV Z buffer RWTexture2D<uint> in PS  Try to write Z if opacity of a fragment is higher than the opacity threshold  Always do Z test with the UAV Z buffer Unordered Opacity Thresholding
      const uint DepthAsUint = asuint(SVPosition.w);
      const uint DepthToWrite = (Opacity > OpacityThreshold) ? DepthAsUint : INVALID_UINT;
      uint OldMinDepthAsUint;
      InterlockedMin(OitOpacityThresholdingDepthUAV[int2(SVPosition.xy)], DepthToWrite, OldMinDepthAsUint);
      if (DepthAsUint > OldMinDepthAsUint)
      {
          discard;
      }
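The effect of submission order on this test is easy to see in a single-pixel simulation. The Python sketch below replays the shader logic sequentially for one pixel; the function name and the use of plain integers for depth are assumptions (in the shader the positive float depth is reinterpreted as a uint, which preserves ordering).

```python
INVALID_UINT = 0xFFFFFFFF


def shade_fragments(fragments, opacity_threshold=0.95):
    """fragments: (depth, opacity) pairs for one pixel, in rasterization order.

    Returns the fragments that survive the manual Z test.
    """
    min_opaque_depth = INVALID_UINT  # this pixel's entry in the UAV Z buffer
    survivors = []
    for depth, opacity in fragments:
        depth_to_write = depth if opacity > opacity_threshold else INVALID_UINT
        old_min = min_opaque_depth                        # value returned by InterlockedMin
        min_opaque_depth = min(min_opaque_depth, depth_to_write)
        if depth > old_min:
            continue                                      # discard
        survivors.append((depth, opacity))
    return survivors
```

A far fragment is rejected only if a nearer, almost opaque fragment was rasterized before it, so the same geometry can save a lot or nothing depending on triangle order.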
  35. 35. • Added costs to read and write the UAV but reduced the total rendering cost • +5~10% rendering speed and -15% memory usage  Reduced heavy lighting costs  Reduced PPLL size  No additional draw calls  The performance gain is unpredictable because it depends on the rasterization order Rendering Results
  36. 36. • More efficient unordered opacity thresholding  Exploit the order of triangle indices and locality?  Multiple meshes: adding per-geometry draw calls but reducing per-pixel over draws • Use ROV for DX12 Future Work: OIT
  37. 37. • Per-pixel lighting for transparent materials • Based on an experimental feature of UE4 • Limited usage  For hair / For glass and others in the future • Supports approximated shadows  Transparent Deferred Shadows Forward+ Lighting
  38. 38. • 16x16 tiled culling • Forward light data  Constant buffer: faster than Structured Buffer (NV)  128 bit stride: cache efficiency, under 64KB  SOA? AOS? Forward+ Lighting
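As a rough illustration of 16x16 tiled culling, the Python sketch below bins lights into screen tiles. It assumes each light has already been reduced to a screen-space circle `(center_x, center_y, radius)` in pixels, which is a simplification chosen for this sketch; the talk does not describe its culling test, and the function name is hypothetical.

```python
def cull_lights_to_tiles(width, height, lights, tile_size=16):
    """Return {(tile_x, tile_y): [light indices]} for circles overlapping each tile."""
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    tiles = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for index, (cx, cy, radius) in enumerate(lights):
        for (tx, ty), light_list in tiles.items():
            # Closest point of the tile rectangle to the circle center
            nearest_x = min(max(cx, tx * tile_size), (tx + 1) * tile_size)
            nearest_y = min(max(cy, ty * tile_size), (ty + 1) * tile_size)
            dx, dy = cx - nearest_x, cy - nearest_y
            if dx * dx + dy * dy <= radius * radius:
                light_list.append(index)
    return tiles
```

Each pixel then only loops over the light list of its own tile during forward shading, which is what keeps per-pixel lighting affordable for transparent materials.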
  39. 39. Forward+ Lighting
  40. 40. Forward+ Lighting + Shadowing + Scattering
  41. 41. • Problems of forward shadowing  Adds complex nested dynamic branches / GPR pressure  Increases forward light data (CB): shadow matrix and other parameters  Uses a lot of VRAM simultaneously (CSM and cube shadow maps) • Transparent shadow approximation:  Volumetric attenuation on the front-most transparent pixels  Integrated with the deferred shadow pass  Inaccurate but acceptable looks Transparent Deferred Shadows
  42. 42. 1) Transparent Z Pre Pass  Using depth rendering material proxy if needed  The buffer is used for post processing as well 2) Shadow Depth and Deferred Shadows Passes  Shadow Depth Pass: Also using depth rendering material proxy  Deferred Shadows Pass: Compute volume lighting attenuation if transparent Z is less than opaque Z  Resolve the deferred shadowing buffer into a texture array for each light 3) Forward+ Lighting Pass  Fetching shadow amount from the texture array and weighting according to opacity Transparent Deferred Shadows
      float DeferredShadow = DeferredShadowsTextureArray.SampleLevel(PointSampler, float3(ScreenUV, LightIndex), 0).x;
      DeferredShadow = lerp(1, DeferredShadow, Opacity);
  43. 43. Deferred Shadows : Transparent Deferred Shadows
  44. 44. Final Rendering * =
  45. 45. • Many lights and various materials for forward+ lighting • Faster reading forward light data • Better shadowing for inner layers  Attenuation by screen Z? • Solve conflict of storage for transparent shadows and single scattering  Currently single scattering is ignored when hair strands are on the skin Future Work: Lighting and Shadowing
  46. 46. • Marschner’s: hair strand = cylinder • Three components of specular  Primary reflection: R  Secondary reflection: TRT  Transmission: TT • + fake light scattering Physically Based Shading Model
  47. 47. Hair Lighting
  48. 48. R: Primary Reflection
  49. 49. TRT: Secondary Reflection
  50. 50. TT: Transmission
  51. 51. • Longitudinal Scattering  Using ALU  Gaussian function instead of lookup table • Azimuthal Scattering  Very complex math  2D function -> lookup table with constant material properties  Texture 2D Array: U = CosPhi, V = CosTheta, Index = HairProfileID ALU : Lookup Table
      float3 ComputeLongitudinalScattering(float3 Theta, float Roughness)
      {
          const float bR = DecodeHairLongitudinalWidth(Roughness);
          const float3 Beta3 = float3(bR, bR * 0.5, bR * 2);
          return exp(-0.5 * Square(Theta) / Square(Beta3)) / (sqrt(2.0 * PI) * Beta3);
      }
  52. 52. • Define per-hair properties  Create a lookup table for azimuthal scattering according to this asset  G buffer-free  Reusable • Lookup table  Scalar light transfer for TRT and TT / performance reason  R: R, G: TRT, B: TT, A: Unused Hair Profile Asset
  53. 53. • To give scalar TRT/TT spectral color • Physically, volumetric attenuation by transmission  Angle: light, camera and tangent  Thickness: 1 - opacity (to brighten hair tips)  Travel distance of light: 2 * TT = TRT Color Shift
      const float Thickness = (1.0 - Opacity);
      const float ColorTintFactor = saturate(CosThetaD) + Square(Thickness) + 1e-4;
      const float2 BaseColorTintPower = float2(0.6, 1.2) / ColorTintFactor;
      float3 TTColorTint = pow(BaseColor, BaseColorTintPower.x);
      float3 TRTColorTint = pow(BaseColor, BaseColorTintPower.y);
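The color shift above can be evaluated numerically to see how the two tints relate. This Python sketch transcribes the slide's formula directly; the function names and the sample colors used with it are assumptions for illustration.

```python
def saturate(value):
    return min(max(value, 0.0), 1.0)


def color_tints(base_color, opacity, cos_theta_d):
    """Return (TT tint, TRT tint) for an RGB base color with channels in [0, 1]."""
    thickness = 1.0 - opacity
    tint_factor = saturate(cos_theta_d) + thickness * thickness + 1e-4
    tt_power = 0.6 / tint_factor
    trt_power = 1.2 / tint_factor   # twice the TT exponent: 2 * TT = TRT travel distance
    tt_tint = tuple(c ** tt_power for c in base_color)
    trt_tint = tuple(c ** trt_power for c in base_color)
    return tt_tint, trt_tint
```

Because the TRT exponent is twice the TT exponent, the TRT tint equals the TT tint squared, so the secondary highlight is always more saturated and darker. Lower opacity raises the tint factor, which lowers both exponents and brightens hair tips.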
  54. 54. • Fake Scattering: similar to UE 4.11  Volumetric attenuation and color shift using transparent deferred shadows  Phase function with camera and light vectors: from forward to backward scattering  Per-light scattering effect Scattering: Direct Lighting
      float3 ScatteringLighting = 0;
      if (bShadowed)
      {
          float HGPhaseFunc = HGPhaseFunctionSchlick(VoL, ScatteringAnisotropy);
          float3 ScatteringColor = GBuffer.BaseColor * Shadow;
          float3 ScatterAttenuation = saturate(pow(ScatteringColor / Luminance(ScatteringColor), 1.0 - Shadow));
          ScatteringLighting = ScatteringColor * ScatterAttenuation * HGPhaseFunc * ScatteringIntensity;
      }
  55. 55. Fake Scattering
  56. 56. • Screen Space Volume Photon Mapping  Use OIT PPLL as the volume photon map: radiance, normal and depth  Gather nearest photons on the front-most pixels after sorting  3 x 3 Gaussian kernel  Isotropic phase function: hard to handle per-material properties  Can’t filter radiance from other geometry but it will be attenuated  Flickering due to UAV writing -> reduced by TAA Scattering: Indirect Lighting
      float2 PhotonScreenUV = InScreenUV + float2(SEARCH_RADIUS * dx, SEARCH_RADIUS * dy);
      int2 PhotonAddress = int2(PhotonScreenUV * View.ViewSizeAndInvSize.xy + View.ViewSizeAndInvSize.zw);
      uint PhotonListIndex = OitLinkedListHeadAddressSRV[PhotonAddress];
      if (PhotonListIndex == INVALID_UINT)
      {
          continue;
      }
      FOitLinkedListDataElement Photon = OitLinkedListDataSRV[PhotonListIndex];
      float3 PhotonRadiance = UnpackRGY32(Photon.RadianceRGY32);
      float3 PhotonNormal = GetNormalFromPack(Photon.NormalAndOpacity);
      float CosTheta = dot(FrontmostNormal, PhotonNormal);
      float3 PhotonPositionWS = ConstructWorldPosition(PhotonScreenUV, asfloat(Photon.Depth));
      float R3 = distance(FrontmostPositionWS, PhotonPositionWS) * PHOTON_DISTANCE_SCALE;
      float3 PhotonContribution = 3.0 * PhotonRadiance;
      PhotonContribution /= R3;
      PhotonContribution *= HGPhaseFunctionSchlick(CosTheta, MATERIAL_ANISOTROPY); // isotropic
      GaussPhotonScattering += PhotonContribution * GaussianWeights[GaussianWeightIndex];
  57. 57. Screen Space Volume Photon Mapping
  58. 58. Diffuse Lighting • Tangent Lambertian DiffuseLighting = max(0, sqrt(1 - SinThetaI * SinThetaI)) * EnergyConservingWrappedDiffuse(N, L, 1) * DiffuseIntensity
  59. 59. Final Rendering
  60. 60. • Specular parameters  Noise: fuzzy highlight  Roughness: width and strength of highlight  Shift: position of highlight peak  + binding a hair profile asset • Diffuse, fake scattering and textures • Other hair parameters -> To create hair in various styles Hair Material
  61. 61. • Better scattering • Material LOD • Finding the best method for a certain hair style Future Work: Hair Shading
  62. 62. Metal Rendering
  63. 63. • Experimental • Anisotropic GGX • Far Cry 4 style • To all lighting components  Direct lighting, sky lighting, reflection environment, SSR, etc.  Using tangent irradiance map? Anisotropic Specular
  64. 64. Shadow Rendering
  65. 65. • For a character viewer and a lobby  EVSM: Exponential Variance Shadow Maps • For the game scene  Sun: PCSS  Other types of lights: PCF / No changes from UE4  Additional shadows: Screen Space Inner Shadows (new feature) Scene-Specific Shadows
  66. 66. • Experimental • Pre-filtered shadows  No changes from the original algorithm  Nice look but light leaks and low performance • Optimization  CSM split limits: maximum 2  Scissor test: larger than screen space shadow bounds EVSM
  67. 67. • For Sun in the game scene • One of the effects of time-of-day lighting changes  Day and night cycle in the gameplay  Different blur size by time: sharp at noon, soft at sunrise and sunset  Different blur size by distance between occluders and receivers • Optimization  CSM split limits: maximum 2 • Temporal reprojection  Reduce flickering due to the moving Sun, and sampling artifacts  Lerp according to the difference between the previous and current frame PCSS
      float2 Attenuation = Texture2DSample(SunLightAttenuationTexture, TextureSamplerPoint, PrevScreenUV).xy;
      float2 ShadowAndSSSTransmission = float2(Shadow, SSSTransmission);
      const float2 TAAFactor = Square(1.0 - abs(ShadowAndSSSTransmission - Attenuation));
      ShadowAndSSSTransmission = lerp(ShadowAndSSSTransmission, Attenuation, 0.5 * TAAFactor);
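The temporal blend used for PCSS is small enough to work through by hand. The Python sketch below applies the same formula to a single scalar; the function name is hypothetical, and history rejection or reprojection are not modelled.

```python
def temporal_blend(current, previous):
    """Blend the current shadow value toward last frame's value.

    The history weight is 0.5 * (1 - |current - previous|)^2, so similar values
    are averaged and large differences keep the current frame.
    """
    taa_factor = (1.0 - abs(current - previous)) ** 2
    history_weight = 0.5 * taa_factor
    return current + (previous - current) * history_weight
```

For example, with `current = 0.6` and `previous = 0.4` the factor is `0.8^2 = 0.64`, the history weight is `0.32`, and the result is `0.6 - 0.2 * 0.32 = 0.536`. When a pixel flips from fully lit to fully shadowed, the factor is zero and no history is used, which limits ghosting.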
  68. 68. • Shadows in shadows:  Darker shadows on the environment occluded by characters  Better looks when a character stands on shadowed surfaces  Directionality: the difference from AO • Using scene depth  No preprocessing or asset settings  Compare with capsule-based approaches: The Order: 1886 or UE 4.11 Screen Space Inner Shadows
  69. 69. • G-buffer changes: add caster and receiver bit masks 1) Stencil Masking Pass  Write stencil at pixels shadowed by the Sun  Ignore unlit pixels 2) Shadow Rendering Pass: SSR styled ray marching  Shoot rays along the half vector between Sun and Sky  Limit max tracing distance: artistic choice and performance win  Temporal reprojection to the previous frame’s buffer Screen Space Inner Shadows (Figure labels: L, Sky, H, Sun, Shadow, Inner Shadow)
  70. 70. • ½ resolution buffer • Separable Gaussian blur • Approximately 0.5 ms • Shadow amount is applied for sky lighting and GI with a receiver mask Screen Space Inner Shadows
  71. 71. • Important for game visuals  Look  Day and night cycle • How to reduce costs and flickering?  High draw calls  Moving Sun Future Work: Shadow Rendering
  72. 72. Run-time Rigged Physics Simulation
  73. 73. • Movement of short hair • Trembling body fat and cloth wrinkles • To reduce the workload of artists Goal
  74. 74. • Define simulator assets in the editor  Currently we support spring simulation only • Group vertices  According to vertex colors roughly painted by artists, and the simulator assets • Sample simulator bones  In the bounds of a vertex group / according to a density setting / snap to the nearest vertex  Poisson distribution / deterministic sampling • Rig the simulator bones  Simulator bone -> becomes a child of the nearest bone in a character  Vertex -> rigged with the nearest simulator bone  Distance based skinning weights Algorithm
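The rigging step of the algorithm above can be sketched as a nearest-neighbor search. This Python sketch is only an illustration under stated assumptions: the function name, the `radius` parameter, and the linear distance falloff are hypothetical, since the talk says only that skinning weights are distance based.

```python
import math


def rig_vertices(vertices, simulator_bones, character_bones, radius):
    """Return (parent index per simulator bone, (bone index, weight) per vertex)."""

    def nearest(point, candidates):
        return min(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))

    # Each simulator bone becomes a child of the nearest character bone
    bone_parents = [nearest(bone, character_bones) for bone in simulator_bones]

    # Each vertex is rigged to its nearest simulator bone
    bindings = []
    for vertex in vertices:
        bone = nearest(vertex, simulator_bones)
        distance = math.dist(vertex, simulator_bones[bone])
        weight = max(0.0, 1.0 - distance / radius)  # assumed linear falloff
        bindings.append((bone, weight))
    return bone_parents, bindings
```

In this sketch the remaining weight `1 - weight` would stay with the vertex's original skinning, so vertices far from any simulator bone are unaffected by the simulation.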
  75. 75. • Enriching animation with various methods  Module based animation  Procedural animation  Physics simulation • They may be presented at other conferences  ex) NDC 2017? Animation Techniques of A1
  76. 76. Low Level Shader Optimization
  77. 77. • Using the GPU profiler of UE4  Checking rendering costs and doing high level shader optimization • Using RenderDoc, Intel GPA, and AMD GPU PerfStudio  Debugging shader code and doing low level shader optimization • Rewriting shader code by hand with optimization references Approach
  78. 78. • Only critical parts of the shaders written by Epic Games  So that we can keep upgrading to new engine versions continuously  We will check all shader code before shipping • Currently we focus on shaders written by us • This talk shows examples of our optimization Target Code
  79. 79. Static Branch with Preprocessor
      Before:
      float3 L = (LightPositionAndIsDirectional.w == 1) ? -LightPositionAndIsDirectional.xyz : normalize(LightPositionAndIsDirectional.xyz - OpaqueWorldPosition);
      After:
      #if USE_FADE_PLANE // CSM case = directional light
      float3 L = -LightPositionAndIsDirectional.xyz;
      #else
      float3 L = (LightPositionAndIsDirectional.w == 1) ? -LightPositionAndIsDirectional.xyz : normalize(LightPositionAndIsDirectional.xyz - OpaqueWorldPosition);
      #endif
  80. 80. Vectorization and Explicit MAD
      Before:
      float2 LookupUV = float2(CosPhiD, CosThetaD) * 0.5 + 0.5;
      float Backlit = saturate(-CosThetaD * 0.5 + 0.5);
      After:
      float3 LookupUVandBacklit = saturate(mad(float3(CosPhiD, CosThetaD, -CosThetaD), 0.5, 0.5));
  81. 81. Share Preceding Computation
      Before:
      for (int X = 0; X < NumSamplesSqrt; ++X) {
          for (int Y = 0; Y < NumSamplesSqrt; Y++) {
              float2 ShadowOffset = TexelSize * StepSize * float2(X, Y);
      After:
      const float2 BaseTexelSize = TexelSize * StepSize;
      …
      for (int X = 0; X < NumSamplesSqrt; ++X) {
          for (int Y = 0; Y < NumSamplesSqrt; Y++) {
              float2 ShadowOffset = BaseTexelSize * float2(X, Y);
  82. 82. Rearrange Scalar/Vector Operation
      Before:
      float3 DiffuseLighting = (TangentDiffuse * GBuffer.DiffuseColor) / PI * NoLWrapped * ShadowColor;
      After:
      float3 DiffuseLighting = GBuffer.DiffuseColor * (TangentDiffuse / PI * NoLWrapped * ShadowColor);
  83. 83. Use Modifiers as Input
      Before:
      Force += -normalize(Position) * Simulation.GravityStrength;
      After:
      Force += normalize(-Position) * Simulation.GravityStrength;
  84. 84. Rearrange Code Lines
      Before:
      // Some long work…
      // clip(OpacityMask);
      After:
      // As early as possible after the shader main starts…
      // clip(OpacityMask);
  85. 85. Use early termination if possible Early-Z, Stencil, and Discard
      if (GBuffer.ShadingModelID != 0)
      {
          discard;
      }
      float Attenuation = Texture2DSample(SunLightAttenuationTexture, TextureSamplerPoint, ScreenUV).x;
      if (Attenuation > 0.99)
      {
          discard;
      }
      -------
      [EARLYDEPTHSTENCIL]
      void OrderIndependentTransparencyCompositePixelMain(
          float2 InScreenUV: TexCoord0,
          float4 InSVPosition: SV_Position,
          out float4 OutColor: SV_Target0)
      {
          …
  86. 86. Data packing and cache line alignment ALU VS. lookup table Misc.: Mentioned Above
      #if OIT_PACK_RADIANCE
      uint RadianceRGY32;
      #else
      half4 Radiance;
      #endif
      float3 ComputeLongitudinalScattering(float3 Theta, float Roughness)
      {
          const float bR = DecodeHairLongitudinalWidth(Roughness);
          const float3 Beta3 = float3(bR, bR * 0.5, bR * 2);
          return exp(-0.5 * Square(Theta) / Square(Beta3)) / (sqrt(2.0 * PI) * Beta3);
      }
  87. 87. • Continuous work • Optimization of shaders written by Epic Games • Although the aggressive and smart optimization of a shader compiler is amazing, it sometimes produces unwanted results. • We need to write GPU-friendly code explicitly and check the disassembly Future Work: Low Level Shader Optimization
  88. 88. • New IP of Nexon • AAA-Quality Graphics • With Cutting Edge Graphics Technology • In Development Summary of This Talk
  89. 89. • Colleagues of A1 • Support Team of Epic Korea • Authors of References Acknowledgement
  90. 90. Thanks WE ARE HIRING!
  91. 91. References
      • AMD TressFX Hair  http://www.amd.com/en-us/innovations/software-technologies/technologies-gaming/tressfx
      • Burke, “Hair In Destiny”, SIGGRAPH 2014  http://advances.realtimerendering.com/destiny/siggraph2014/heads/
      • Burley, “Extending the Disney BRDF to a BSDF with Integrated Subsurface Scattering”, SIGGRAPH 2015  http://blog.selfshadow.com/publications/s2015-shading-course/#course_content
      • d’Eon, “An Energy-Conserving Hair Reflectance Model”, EGSR 2011  http://www.eugenedeon.com/
      • GCN Performance Tweets  http://developer.amd.com/wordpress/media/2013/05/GCNPerformanceTweets.pdf
      • Wexler, “GPU-Accelerated High-Quality Hidden Surface Removal”, Graphics Hardware 2005  http://developer.amd.com/wordpress/media/2013/05/GCNPerformanceTweets.pdf
      • Jensen, “A Practical Model for Subsurface Light Transport”, SIGGRAPH 2001  http://www.graphics.stanford.edu/papers/bssrdf/
      • Jimenez, “Next Generation Character Rendering”, GDC 2013  http://www.iryoku.com/stare-into-the-future
      • Ki, “Real-Time Subsurface Scattering using Shadow Maps”, ShaderX7  http://amzn.to/1TxzadP
      • Marschner, “Light Scattering from Human Hair Fibers”, SIGGRAPH 2003  https://www.cs.cornell.edu/~srm/publications/SG03-hair-abstract.html
      • McAuley, “Rendering the World of Far Cry 4”, GDC 2015  http://www.gdcvault.com/play/1022235/Rendering-the-World-of-Far
      • Moon, “Simulating Multiple Scattering in Hair Using a Photon Mapping Approach”, SIGGRAPH 2006  https://www.cs.cornell.edu/~srm/publications/SG06-hair.pdf
      • “More Explosions, More Chaos, and Definitely More Blowing Stuff Up: Optimizations and New DirectX Features in ‘Just Cause 3’”  https://software.intel.com/sites/default/files/managed/20/d5/2016_GDC_Optimizations-and-DirectX-features-in-JC3_v0-92_X.pdf
      • Nguyen, “Hair Animation and Rendering in the Nalu Demo”, GPU Gems 2  http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter23.html
      • NVIDIA GameWorks Blog  https://developer.nvidia.com/gameworks/blog
      • Persson, “Low-level Shader Optimization for Next-Gen and DX11”, GDC 2014  http://www.humus.name/index.php?page=Articles
      • Persson, “Low-Level Thinking in High-Level Shading Languages”, GDC 2013  http://www.humus.name/index.php?page=Articles
      • Pettineo, “A Sampling of Shadow Techniques”  https://mynameismjp.wordpress.com/2013/09/10/shadow-maps/
      • Phail-Liff, “Melton and Moustaches: The Character Art and Shot Lighting Pipelines of The Order: 1886”  http://www.readyatdawn.com/presentations/
      • Thibieroz, “Grass, Fur and All Things Hairy”, GDC 2014  http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Grass-Fur-and-All-Things-Hairy-Thibieroz-Hillesland.ppsx
      • Unreal Engine 4  https://www.unrealengine.com/
