The next generation of GPU APIs for Game Engines



An overview of the new GPU API pipeline for developing real-time game engines.
Developing for DirectX 12, Vulkan, or Metal requires a redesign of the game engine. In return, developers gain key benefits such as reduced power consumption, better CPU and GPU utilization, and multithreading across multiple GPU devices.

Published in: Technology
  1. The Next Generation of GPU APIs for Game Engines. Pooya Eimandar, Fanap, March 2018
  2. POOYA EIMANDAR
     • Lead developer of Wolf.Engine (an open source 3D game engine)
     • Project manager and lead developer of Project Falcon since 2017
     • CEO at BaziPardaz Ltd since 2011
     • Founder at
     • Member of Microsoft Partner Network since 2013
     • Author at Packt Publications and GameDev.Net
     • Lecturer at The University of Applied Science and Technology, National Foundation of Computer Games (2013 – 2015)
     • Lecturer at Iran Game Development Institute (2014 – 2015)
     • Lead developer of Persian Game Engine (2010 – 2014)
     • Member of the IGDF jury panel for the best computer games technology (2014 – 2015)
  3. PUBLICATIONS
     • (codename Black Kitten)
     • The use of motion sensors in medical and health industry. (Jan 27, 2014, First Conference of Game & Medical Health)
     • DirectX Graphics Diagnostic. (Oct 11, 2013, GameDev.Net)
  4. HISTORY
     • October 1958: physicist William Higinbotham created the first video game
     • 1970s: the golden age of arcade games, powered by Fujitsu's MB14241, the Atari 2600's Television Interface Adaptor, and others
     • 1992: Silicon Graphics Inc. started developing OpenGL in 1991 and released it in January 1992
     • 1995: Microsoft DirectX was released as the Windows Game SDK for Windows 95
  5. HLSL PIPELINE
     Vertex shader instruction limits:
     • vs_1_1 supports 128 instructions, such as add, vs, log, mov, max, min, m4x4
     • vs_2_0 supports 256 instructions
     • vs_3_0 supports a minimum of 512 instructions, up to the number of slots in D3DCAPS9.MaxVertexShader30InstructionSlots
     • vs_4_0 and later versions: no restriction
     Stages: Memory Resources (Buffer, Texture), Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Stream Output, Rasterizer, Pixel Shader, Output Merger
     GLSL PIPELINE
     Stage names that differ from HLSL: Control Shader (Hull Shader), Evaluation Shader (Domain Shader), Fragment Shader (Pixel Shader)
  6. THE EVOLUTION OF GPU APIS
     • 1992 OpenGL: fixed-function pipeline
     • 1995 DirectX on Microsoft Windows 95; Microsoft stopped updating its own OpenGL support (still at v1.1 today)
     • 1996 3dfx's Glide: geometry and texture mapping
     • 1998 DirectX 6: IHV independent, multitexturing
     • 1999 DirectX 7: hardware transform and lighting (T&L), cube maps
     • 2000 DirectX 8: programmable shaders
     • 2002 DirectX 9: floating-point texture mapping, multiple render targets, multiple-element textures, texture lookups in the vertex shader, and stencil buffer techniques
     • 2004 OpenGL 2.0: GLSL
     • 2006 DirectX 10: major update
     • 2009 DirectX 11: compute shader
     • 2010 OpenGL 3.3 and OpenGL 4.0: designed to target hardware able to support Direct3D 10/11
     • 2012 DirectX 11.1: Direct2D + Direct3D, integrated with WinRT
     • 2013 DirectX 11.2: Direct2D geometry rasterization, swap chain composition
     • 2014 DirectX 11.3: Xbox One
     • 2014 Apple Metal: released for iOS 8
     • 2015 Apple Metal: released for Mac OS X El Capitan
     • 2015 DirectX 12: Windows 10
     • 2016 Vulkan: the next generation of OpenGL
     • 2017 OpenGL 4.6: released on the 25th anniversary of OpenGL
  7. THE NEXT GENERATION
     • 2013: AMD originally developed Mantle in cooperation with DICE
     • Mantle was designed as an alternative to Direct3D and OpenGL
     • 2015: Mantle's public SDK was suspended as DirectX 12 and the Mantle-derived Vulkan (the next generation of OpenGL) rose in popularity
     Platform support:
     • DirectX 12: Xbox One (DX11.X), Xbox One X, Windows 10
     • Metal: since 2014 on Apple iOS 8, since 2015 on Mac OS X El Capitan
     • Vulkan: Windows 7/8/8.1/10, Linux, Android (almost cross-platform)
  8. WHY DO WE NEED TO MIGRATE TO NEW APIS?
     • Low driver overhead
     • Minimal runtime validation
     • Multithreaded recording of GPU command buffers across CPU cores
     • Explicit memory management: host-local memory, device-local memory, and memory shared between CPU and GPU
     • Explicit access to multiple GPUs
     Traditional APIs (DirectX 11, OpenGL, OpenGL ES): Application → traditional graphics driver, which controls the GPU by managing memory, contexts, and so on → GPU
     Next-gen APIs (DirectX 12, Vulkan, Metal): Application, which is responsible for memory allocation and for managing the threads that record command buffers → direct GPU control → GPU
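The multithreaded recording point above can be illustrated with a small CPU-only model. This is a sketch under stated assumptions, not code from any graphics API: `CommandList`, `record_range`, and `record_in_parallel` are hypothetical names, and a command is just a string. Each worker thread records into its own list, so no locking is needed during recording, and the lists are handed back in a fixed submission order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an API command buffer: an ordered list of commands.
struct CommandList {
    std::vector<std::string> commands;
};

// Record draw commands for objects [first, last) into one command list.
// Each worker owns its list exclusively, so recording needs no locks.
void record_range(CommandList& list, std::size_t first, std::size_t last) {
    for (std::size_t i = first; i < last; ++i) {
        list.commands.push_back("draw object " + std::to_string(i));
    }
}

// Split object_count objects across worker_count threads and return the
// per-thread lists in the order they would be submitted.
std::vector<CommandList> record_in_parallel(std::size_t object_count,
                                            std::size_t worker_count) {
    assert(worker_count > 0);
    std::vector<CommandList> lists(worker_count);
    std::vector<std::thread> workers;
    // Ceiling division so that every object is assigned to some worker.
    const std::size_t chunk = (object_count + worker_count - 1) / worker_count;
    for (std::size_t w = 0; w < worker_count; ++w) {
        const std::size_t first = std::min(w * chunk, object_count);
        const std::size_t last = std::min(first + chunk, object_count);
        workers.emplace_back(record_range, std::ref(lists[w]), first, last);
    }
    for (std::thread& t : workers) {
        t.join();
    }
    return lists;
}
```

Calling `record_in_parallel(10, 4)` would give four lists covering objects 0-2, 3-5, 6-8, and 9. In a real engine each list would correspond to a command buffer allocated from a per-thread pool.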
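  9. SAMPLE CODE: OpenGL
     void load_texture_from_memory_rgba(_In_ uint8_t* pRGBAData)
     {
         // _texture_id is assumed to be a GLuint created earlier with glGenTextures
         glBindTexture(GL_TEXTURE_2D, _texture_id);
         // arguments: target, level, internal format, width, height, border, format, type, data
         glTexImage2D(GL_TEXTURE_2D, 0, GL_SRGB8_ALPHA8, _width, _height, 0, GL_RGBA, GL_UNSIGNED_BYTE, pRGBAData);
         glGenerateMipmap(GL_TEXTURE_2D);
     }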
  10. SAMPLE CODE: Vulkan
      void create_image()
      {
          const VkImageCreateInfo _image_create_info =
          {
              VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, // Type
              nullptr,                             // Next
              0,                                   // Flags
              _image_type,                         // ImageType
              _format,                             // Format
              { _width, _height, _depth },         // Extent
              _mip_map_levels,                     // MipLevels
              _layer_count,                        // ArrayLayers
              VK_SAMPLE_COUNT_1_BIT,               // Samples
              VK_IMAGE_TILING_OPTIMAL,             // Tiling
              _usage_flags,                        // Usage
              VK_SHARING_MODE_EXCLUSIVE,           // SharingMode
              0,                                   // QueueFamilyIndexCount
              nullptr,                             // QueueFamilyIndices
              VK_IMAGE_LAYOUT_UNDEFINED            // InitialLayout
          };
          // _image is the VkImage handle; the image view is created later
          vkCreateImage(vk_device, &_image_create_info, nullptr, &_image);
      }

      void load_texture_from_memory_rgba(_In_ uint8_t* pRGBAData)
      {
          create_image();
          allocate_memory();
          // bind the image to its memory
          vkBindImageMemory(vk_device, _image, _memory, 0);
          copy_data_to_texture_2D(pRGBAData);
          create_sampler();
          create_image_view();
      }
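The slide does not show what `allocate_memory()` does. One step such a function usually needs is choosing a memory type that both the resource allows and the application wants. The following is a minimal sketch of that selection, assuming nothing about the author's implementation: `find_memory_type` and the three constants are hypothetical names, the constants mirror the numeric values of the corresponding `VK_MEMORY_PROPERTY_*` bits, and a plain vector of flags stands in for the device's reported memory types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirrors of Vulkan memory property bits. The values match
// VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
// and VK_MEMORY_PROPERTY_HOST_COHERENT_BIT.
constexpr std::uint32_t DEVICE_LOCAL  = 0x1;
constexpr std::uint32_t HOST_VISIBLE  = 0x2;
constexpr std::uint32_t HOST_COHERENT = 0x4;

// Return the index of the first memory type that is allowed by type_bits
// (bit i set means memory type i is acceptable for the resource) and that
// has every flag in required. Returns -1 if no memory type qualifies.
int find_memory_type(const std::vector<std::uint32_t>& type_flags,
                     std::uint32_t type_bits,
                     std::uint32_t required) {
    assert(type_flags.size() <= 32);  // type_bits has one bit per memory type
    for (std::size_t i = 0; i < type_flags.size(); ++i) {
        const bool allowed = (type_bits & (1u << i)) != 0;
        const bool suitable = (type_flags[i] & required) == required;
        if (allowed && suitable) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In real Vulkan code the flags would come from `vkGetPhysicalDeviceMemoryProperties` and the allowed bits from `vkGetImageMemoryRequirements`. A texture that is sampled often would typically ask for device-local memory, while a staging buffer would ask for host-visible memory.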
  11. DIRECT GPU COMPONENTS
      • Objects: Device, Queue, Heap Memory, Image, Image View, Buffer, Sampler, Frame Buffer, Render Pass, Graphics Pipeline, Descriptor Set Pool, Descriptor Set, Command Buffer Pool, Main Command Buffer, Secondary Command Buffer, Barrier Synchronization
      • Commands recorded into the main command buffer: Begin Render Pass, Bind Graphics Pipeline, Set Dynamic States, Bind to Buffers, Update Buffer, Bind Descriptor Sets, Draw, Execute Commands, End Render Pass
  12. MEMORY ALLOCATION: Heap Memory holds three Memory Chunks, one each for Buffer 1, Buffer 2, and Buffer 3 (Vertex Buffer, Index Buffer, Uniform)
  13. MEMORY ALLOCATION: a single Memory Chunk holds Buffer 1, Buffer 2, and Buffer 3 (Vertex Buffer, Index Buffer, Uniform)
  14. MEMORY ALLOCATION: a single Memory Chunk holds a single Buffer that contains the Vertex Buffer, Index Buffer, and Uniform data
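Packing several resources into one buffer, as in the last memory allocation slide, comes down to computing an aligned offset for each sub-range. The sketch below shows only that arithmetic, and the names `align_up`, `Request`, `Region`, and `pack_regions` are hypothetical. The sizes and alignments in the usage note are made-up values; real alignments come from the device limits and the resource's memory requirements.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Round offset up to the next multiple of alignment.
// The alignment must be a non-zero power of two.
std::uint64_t align_up(std::uint64_t offset, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

// What a resource asks for (hypothetical type).
struct Request {
    std::uint64_t size;
    std::uint64_t alignment;
};

// Where the resource ends up inside the shared buffer (hypothetical type).
struct Region {
    std::uint64_t offset;
    std::uint64_t size;
};

// Place the requests back to back in one buffer while respecting each
// alignment. total_size receives the number of bytes the buffer needs.
std::vector<Region> pack_regions(const std::vector<Request>& requests,
                                 std::uint64_t& total_size) {
    std::vector<Region> regions;
    regions.reserve(requests.size());
    std::uint64_t cursor = 0;
    for (const Request& r : requests) {
        cursor = align_up(cursor, r.alignment);
        regions.push_back({cursor, r.size});
        cursor += r.size;
    }
    total_size = cursor;
    return regions;
}
```

For example, a 1000-byte vertex range (alignment 16), a 300-byte index range (alignment 4), and a 64-byte uniform range (alignment 256) land at offsets 0, 1000, and 1536, and the buffer needs 1600 bytes. The draw code then binds the same buffer three times with different offsets.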
  15. THREAD SYNCHRONIZATION
      Four synchronization types:
      • Fences communicate the completion of command buffer submissions on queues back to the application.
      • Semaphores are generally associated with resources or groups of resources and can be used to marshal ownership of shared data. Their status is not visible to the host. (Queues
      • Events provide a finer-grained synchronization primitive that can be signaled at command-level granularity by both device and host, and can be waited upon by either.
      • Barriers provide execution and memory synchronization between sets of commands.
      Diagram labels: Main Command Buffer, Execute Commands, five cmd buffers, Fence 1, Fence 2, Barrier Synchronization
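The fence behaviour described above (the device signals, the application waits) can be modelled on the CPU with a mutex and a condition variable. This is an analogy only, not the Vulkan API: the `Fence` class and `submit_and_wait` are hypothetical, and a worker thread stands in for the GPU queue.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Toy fence: signaled once by the "device", waited on by the "host".
class Fence {
public:
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_all();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return signaled_; });
    }
    bool is_signaled() {
        std::lock_guard<std::mutex> lock(mutex_);
        return signaled_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Simulate one queue submission: a worker thread stands in for the GPU,
// "executes" the commands by summing them, then signals the fence.
// The host blocks on the fence before it reads the result.
int submit_and_wait(const std::vector<int>& commands) {
    Fence fence;
    int result = 0;
    std::thread gpu([&] {
        for (int c : commands) {
            result += c;
        }
        fence.signal();
    });
    fence.wait();
    assert(fence.is_signaled());
    // Safe to read: the write to result happened before signal(), and
    // signal() and wait() synchronize through the same mutex.
    const int observed = result;
    gpu.join();
    return observed;
}
```

The point of the model is the ordering guarantee: the host never reads `result` before the worker has finished, just as an application should not reuse or read back a resource until the fence for its submission has signaled.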
  16. GRAPHICS PIPELINE
      Main Command Buffer → Bind Graphics Pipeline → Graphics Pipeline
      The graphics pipeline is a snapshot of all GPU states:
      • Rasterization state
      • Shader stages
      • Vertex input
      • Tessellation state
      • Multisample state
      • Depth and stencil state
      • Color blend state
  17. DRAW
      Main Command Buffer → Draw
      Draw methods:
      • Direct draw
        • Set the vertex and index buffers
        • Call draw
        • Too slow when there are many draw calls
      • Instanced draw
        • Set up an instance buffer
        • Draw all objects with the same instance buffer
        • Every object is drawn with the same number of vertices and indices
      • Indirect draw
        • The buffer of draw parameters can be generated and updated offline, with no need to update the command buffers that contain the actual drawing functions
        • A single indirect call draws all objects with their associated vertex and index buffers
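The indirect draw method above relies on a buffer of per-object draw parameters. As a rough sketch of how such a buffer might be filled on the CPU, the code below builds one record per mesh. `IndirectCommand` is a hypothetical struct whose field order mirrors `VkDrawIndexedIndirectCommand`; `Mesh` and `build_indirect_commands` are also hypothetical, and the sketch assumes all meshes are packed back to back into one shared vertex buffer and one shared index buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the field layout of VkDrawIndexedIndirectCommand.
struct IndirectCommand {
    std::uint32_t index_count;
    std::uint32_t instance_count;
    std::uint32_t first_index;
    std::int32_t  vertex_offset;
    std::uint32_t first_instance;
};
static_assert(sizeof(IndirectCommand) == 20, "five tightly packed 32-bit fields");

// Hypothetical mesh description: only the counts matter for this sketch.
struct Mesh {
    std::uint32_t vertex_count;
    std::uint32_t index_count;
};

// Build one indirect command per mesh. first_index and vertex_offset are
// running sums, because the meshes share one index and one vertex buffer.
std::vector<IndirectCommand> build_indirect_commands(const std::vector<Mesh>& meshes) {
    std::vector<IndirectCommand> commands;
    commands.reserve(meshes.size());
    std::uint32_t first_index = 0;
    std::int32_t vertex_offset = 0;
    for (std::size_t i = 0; i < meshes.size(); ++i) {
        assert(meshes[i].index_count > 0);  // an empty mesh would be a no-op draw
        commands.push_back({meshes[i].index_count,
                            1,  // one instance per mesh
                            first_index,
                            vertex_offset,
                            static_cast<std::uint32_t>(i)});
        first_index += meshes[i].index_count;
        vertex_offset += static_cast<std::int32_t>(meshes[i].vertex_count);
    }
    return commands;
}
```

The resulting array would be copied into a GPU buffer and consumed by a single indirect draw call. Changing which objects are drawn then means rewriting this buffer rather than re-recording the command buffer.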
  18. WOLF ENGINE
      • The Wolf Engine is the next generation of Persian Game Engine, a cross-platform open source game engine. Wolf is a comprehensive set of open source C++ libraries for rendering and game development.
      • Scripting language: Lua
      • Language bindings: PyWolf, a Python binding for Wolf Engine
