DirectGMA on AMD’s FirePro™ GPUs

Learn more about DirectGMA in this blog post: bit.ly/AMDDirectGMA

AMD has introduced Direct Graphics Memory Access (DirectGMA), which:
‒ Makes a portion of the GPU memory accessible to other devices
‒ Allows devices on the bus to write directly into this area of GPU memory
‒ Allows GPUs to write directly into the memory of remote devices on the bus supporting DirectGMA
‒ Provides a driver interface to allow 3rd party hardware vendors to support data exchange with an AMD GPU using DirectGMA
‒ and more

Published in: Technology

  1. DIRECTGMA ON AMD’S FIREPRO™ GPUS | BRUNO STEFANIZZI | SEP 2014
  2. INTRODUCTION TO DIRECTGMA (DIRECTGMA ON AMD’S FIREPRO™ GPUS | SEPTEMBER 8, 2014)
     • Applications that need low-latency data exchange between the GPU and other devices have always wanted GPU memory exposed directly to those devices. This is why AMD has introduced DirectGMA (Direct Graphics Memory Access), which:
       ‒ Makes a portion of the GPU memory accessible to other devices
       ‒ Allows devices on the bus to write directly into this area of GPU memory
       ‒ Allows GPUs to write directly into the memory of remote devices on the bus supporting DirectGMA
       ‒ Provides a driver interface to allow 3rd party hardware vendors to support data exchange with an AMD GPU using DirectGMA
       ‒ Is supported through these APIs: OpenGL, OpenCL™, DirectX®
       ‒ Is supported on these operating systems: Windows® 7 64-bit and Linux® 64-bit
       ‒ Is supported on these cards: AMD FirePro™ W series (W5x00 and above) and all AMD FirePro™ S series cards
  3. INTRODUCTION TO DIRECTGMA
     • Peer-to-Peer Transfers between GPUs: Use high-speed DMA transfers to copy data between the memories of two GPUs on the same system/PCIe bus.
     • Peer-to-Peer Transfers between GPU and FPGAs: Use high-speed DMA transfers to copy data between the memories of the GPU and the FPGA memory.
     • DirectGMA for Video: Optimized pipeline for frame-based devices such as frame grabbers, video switchers, HD-SDI capture, and CameraLink devices. See our SDI webpage.
  4. AMD’S DIRECTGMA P2P
     • Direct communication between PCI cards
     • Bidirectional DirectGMA P2P requires memory on both cards
     [Diagram labels: CPU, PCI Bus]
  5. DIRECTGMA IN OPENGL
     • The OpenGL extension AMD_BUS_ADDRESSABLE_MEMORY provides access to DirectGMA
     • The functions (named with the AMD suffix, as in the code on the following slides) are:
         void glMakeBuffersResidentAMD(sizei n, uint* buffers, uint64* baddr, uint64* maddr);
         void glBufferBusAddressAMD(enum target, sizeiptr size, uint64 surfbusaddress, uint64 markerbusaddress);
         void glWaitMarkerAMD(uint buf, uint value);
         void glWriteMarkerAMD(uint buf, uint value, uint64 offset);
     • The new tokens are:
         GL_BUS_ADDRESSABLE_MEMORY_AMD
         GL_EXTERNAL_PHYSICAL_MEMORY_AMD
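Extension functions such as these are normally resolved at run time rather than linked directly. The following is a minimal sketch of how the four entry points might be gathered into one table. It assumes the AMD-suffixed names used in the later code slides, uses local stand-in typedefs instead of the real GL headers, and takes a hypothetical `ProcLoader` callback that a real program would implement by wrapping `wglGetProcAddress` or `glXGetProcAddress`. `DirectGmaGl` and `loadDirectGmaGl` are illustrative names, not part of any AMD API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local stand-ins for the GL scalar types so the sketch compiles without GL headers.
using GlEnum = unsigned int;
using GlUint = unsigned int;
using GlSizei = int;
using GlSizeiptr = std::ptrdiff_t;
using GlUint64 = std::uint64_t;

// Function-pointer types mirroring the four signatures listed on the slide.
// (Real GL headers add a calling-convention macro such as APIENTRY.)
using MakeBuffersResidentFn = void (*)(GlSizei n, GlUint* buffers, GlUint64* baddr, GlUint64* maddr);
using BufferBusAddressFn = void (*)(GlEnum target, GlSizeiptr size, GlUint64 surfbusaddress, GlUint64 markerbusaddress);
using WaitMarkerFn = void (*)(GlUint buf, GlUint value);
using WriteMarkerFn = void (*)(GlUint buf, GlUint value, GlUint64 offset);

// Hypothetical loader callback: returns the address of a named entry point, or nullptr.
using ProcLoader = void* (*)(const char* name);

// Table of resolved entry points (illustrative helper type).
struct DirectGmaGl {
    MakeBuffersResidentFn makeBuffersResident = nullptr;
    BufferBusAddressFn bufferBusAddress = nullptr;
    WaitMarkerFn waitMarker = nullptr;
    WriteMarkerFn writeMarker = nullptr;

    // True only when every entry point was found.
    bool complete() const {
        return makeBuffersResident && bufferBusAddress && waitMarker && writeMarker;
    }
};

// Resolve all four entry points through the supplied loader.
inline DirectGmaGl loadDirectGmaGl(ProcLoader load) {
    assert(load != nullptr);
    DirectGmaGl api;
    api.makeBuffersResident = reinterpret_cast<MakeBuffersResidentFn>(load("glMakeBuffersResidentAMD"));
    api.bufferBusAddress = reinterpret_cast<BufferBusAddressFn>(load("glBufferBusAddressAMD"));
    api.waitMarker = reinterpret_cast<WaitMarkerFn>(load("glWaitMarkerAMD"));
    api.writeMarker = reinterpret_cast<WriteMarkerFn>(load("glWriteMarkerAMD"));
    return api;
}
```

Checking `complete()` once after loading gives a single place to fall back to a non-DirectGMA path when the driver does not expose the extension.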
  6. DIRECTGMA IN OPENGL | CREATING A BUFFER TO RECEIVE DATA
     • To receive data, a buffer needs to be created that can be accessed by other devices on the bus
     • The physical address of this buffer needs to be known so that a remote device can write to this address

         glGenBuffers(m_uiNumBuffers, m_pBuffer);

         m_pBufferBusAddress = new unsigned long long[m_uiNumBuffers];
         m_pMarkerBusAddress = new unsigned long long[m_uiNumBuffers];

         for (unsigned int i = 0; i < m_uiNumBuffers; i++)
         {
             glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_pBuffer[i]);
             glBufferData(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_uiBufferSize, 0, GL_DYNAMIC_DRAW);
         }

         // Call makeResident when all BufferData calls were submitted.
         glMakeBuffersResidentAMD(m_uiNumBuffers, m_pBuffer, m_pBufferBusAddress, m_pMarkerBusAddress);

         // Make sure that the buffer creation really succeeded
         if (glGetError() != GL_NO_ERROR)
             return false;

         glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, 0);
  7. DIRECTGMA IN OPENGL | USING A BUFFER ON A REMOTE DEVICE
     • To write into a buffer on a remote device, we need to create an OpenGL buffer and assign it the physical addresses of the memory on the remote device

         glGenBuffers(m_uiNumBuffers, m_pBuffer);

         for (unsigned int i = 0; i < m_uiNumBuffers; i++)
         {
             glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_pBuffer[i]);
             glBufferBusAddressAMD(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_uiBufferSize,
                                   m_pBufferBusAddress[i], m_pMarkerBusAddress[i]);

             if (glGetError() != GL_NO_ERROR)
                 return false;
         }

         glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, 0);
  8. DIRECTGMA IN OPENGL | GPU TO GPU COPY
     • Create one thread per GPU. Each thread creates its own context. One thread acts as the data sink, the other as the source.
     • On the sink GPU a GL_BUS_ADDRESSABLE_MEMORY_AMD buffer is created
     • On the source GPU a GL_EXTERNAL_PHYSICAL_MEMORY_AMD buffer is created

     GPU 0: Sink

         glGenBuffers(m_uiNumBuffers, m_pSinkBuffer);

         for (unsigned int i = 0; i < m_uiNumBuffers; i++)
         {
             glBindBuffer(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_pSinkBuffer[i]);
             glBufferData(GL_BUS_ADDRESSABLE_MEMORY_AMD, m_uiBufferSize, 0, GL_DYNAMIC_DRAW);
         }

         // Call makeResident when all BufferData calls were submitted.
         // Pass the sink buffers generated above (the slide passes m_pBuffer here).
         glMakeBuffersResidentAMD(m_uiNumBuffers, m_pSinkBuffer, m_pBufferBusAddress, m_pMarkerBusAddress);

     GPU 1: Source

         glGenBuffers(m_uiNumBuffers, m_pSourceBuffer);

         for (unsigned int i = 0; i < m_uiNumBuffers; i++)
         {
             glBindBuffer(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_pSourceBuffer[i]);
             glBufferBusAddressAMD(GL_EXTERNAL_PHYSICAL_MEMORY_AMD, m_uiBufferSize,
                                   m_pBufferBusAddress[i], m_pMarkerBusAddress[i]);
         }
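The source thread can only assign bus addresses after the sink thread has made its buffers resident, so the two threads need some handoff for the address arrays. The slides do not show how this is done. One possible approach, sketched here with `std::promise` and `std::future`, is for the sink to publish the addresses once and for the source to block until they arrive. The address values are placeholders standing in for what `glMakeBuffersResidentAMD` would return, and `BusAddresses`, `sinkPublish` and `runHandoff` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Bus addresses for a set of sink buffers (illustrative container).
struct BusAddresses {
    std::vector<std::uint64_t> surface;  // one surface bus address per buffer
    std::vector<std::uint64_t> marker;   // one marker bus address per buffer
};

// Sink thread: in a real program the arrays would be filled by
// glMakeBuffersResidentAMD; placeholder values are used here.
inline void sinkPublish(std::promise<BusAddresses> out, unsigned numBuffers) {
    BusAddresses addrs;
    for (unsigned i = 0; i < numBuffers; ++i) {
        addrs.surface.push_back(0x100000000ull + i * 0x10000ull);  // placeholder
        addrs.marker.push_back(0x200000000ull + i * 0x1000ull);    // placeholder
    }
    out.set_value(std::move(addrs));  // hand the addresses to the source
}

// Runs the sink on its own thread and plays the source role on the caller's
// thread: block until the addresses are published, then return them.
inline BusAddresses runHandoff(unsigned numBuffers) {
    std::promise<BusAddresses> promise;
    std::future<BusAddresses> future = promise.get_future();
    std::thread sink(sinkPublish, std::move(promise), numBuffers);
    BusAddresses received = future.get();  // source waits here
    sink.join();
    // The source would now call glBufferBusAddressAMD once per buffer.
    return received;
}
```

A one-shot future fits this case because the addresses are produced once at setup time and do not change per frame.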
  9. DIRECTGMA IN OPENGL | GPU TO GPU COPY
     • The source creates data and copies it into the GL_EXTERNAL_PHYSICAL_MEMORY_AMD buffer that has its data store on the sink device
     • The sink device receives the data and copies it into a texture to be displayed

     GPU 0: Sink

         // Submit draw calls that do not require data sent by the source
         …
         glBindTexture(GL_TEXTURE_2D, m_uiTexture);
         glBindBuffer(GL_PIXEL_UNPACK_BUFFER, uiBufferId);

         // Indicate that the following commands will need the data transferred by the source
         glWaitMarkerAMD(uiBufferId, uiTransferId);

         // Copy buffer into texture
         glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, m_uiTextureWidth, m_uiTextureHeight, m_nExtFormat, m_nType, NULL);

         // Draw using received texture

     GPU 1: Source

         // Draw
         …
         ++uiTransferId;

         // Bind buffer that has its data store on the sink GPU
         glBindBuffer(GL_PIXEL_PACK_BUFFER, uiBufferId);

         // Copy local buffer into remote buffer
         glReadPixels(0, 0, m_uiBufferWidth, m_uiBufferHeight, m_nExtFormat, m_nType, NULL);

         // Write marker
         glWriteMarkerAMD(uiBufferId, uiTransferId, ullMarkerBusAddress);
         glFlush();
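The write-marker/wait-marker pairing above is a producer/consumer handshake: the source finishes writing the payload and only then publishes a transfer id, while the sink does not touch the payload until it has seen that id. The following CPU-only analogy models this with an atomic counter and two threads. It is not GPU code and calls no GL functions. It also assumes the wait is satisfied once the marker is greater than or equal to the requested value, which the slides do not state. All names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// A buffer plus its marker, standing in for the sink-side surface and marker memory.
struct SharedBuffer {
    std::vector<std::uint32_t> data;
    std::atomic<std::uint32_t> marker{0};
    explicit SharedBuffer(std::size_t n) : data(n, 0) {}
};

// Source: write the payload, then publish the transfer id
// (the CPU analogue of glReadPixels followed by glWriteMarkerAMD).
inline void sourceTransfer(SharedBuffer& buf, std::uint32_t transferId) {
    for (auto& px : buf.data) px = transferId;
    buf.marker.store(transferId, std::memory_order_release);
}

// Sink: wait until the marker reaches transferId, then consume the payload
// (the CPU analogue of glWaitMarkerAMD followed by glTexSubImage2D).
inline std::uint64_t sinkConsume(SharedBuffer& buf, std::uint32_t transferId) {
    while (buf.marker.load(std::memory_order_acquire) < transferId) {
        std::this_thread::yield();
    }
    std::uint64_t sum = 0;
    for (auto px : buf.data) sum += px;
    return sum;
}

// Run one transfer with the sink and the source on separate threads.
inline std::uint64_t runOneTransfer(std::size_t pixels, std::uint32_t transferId) {
    SharedBuffer buf(pixels);
    std::uint64_t result = 0;
    std::thread sink([&] { result = sinkConsume(buf, transferId); });
    std::thread source([&] { sourceTransfer(buf, transferId); });
    source.join();
    sink.join();
    return result;
}
```

The release store and acquire load play the role of ordering the marker write after the data write, so the sink never reads a partially written buffer.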
  10. DIRECTGMA IN OPENGL | OVERLAPPING EXECUTION
     [Timeline diagram with the stages: GPU 1 render, GPU 1 transfer, GPU 0 render, GPU 0 use buffer, GPU 0 wait]
  11. DIRECTGMA IN OPENCL
     • The OpenCL extension CL_AMD_BUS_ADDRESSABLE_MEMORY provides access to DirectGMA
     • The functions are:
         cl_int clEnqueueWaitSignalAMD(cl_command_queue command_queue, cl_mem mem_object, uint value, cl_uint num_events, …
         cl_int clEnqueueWriteSignalAMD(cl_command_queue command_queue, cl_mem mem_object, uint value, cl_ulong offset, …
         cl_int clEnqueueMakeBuffersResidentAMD(cl_command_queue command_queue, cl_uint num_mem_objects, cl_mem* mem_objects, cl_bool blocking_make_resident, cl_bus_address_amd* bus_addresses, cl_uint num_events, …
     • The new tokens are:
         CL_BUS_ADDRESSABLE_MEMORY_AMD
         CL_EXTERNAL_PHYSICAL_MEMORY_AMD
  12. DIRECTGMA | DX9
     • The DirectGMA functionality in DX9 is made available through a so-called communication surface
     • The process for using it is as follows:
       ‒ Create a 1x1 offscreen plain surface of format FOURCC_SDIF
       ‒ Lock the surface. On lock, the driver will allocate and return a pointer to an AMDDX9SDICOMMPACKET structure. This structure is the communication surface.
       ‒ Assign and cast the pBits pointer to a locally created AMDDX9SDICOMMPACKET pointer.
     • The most essential commands are:
         AMD_SDI_CMD_GET_CAPS_DATA
         AMD_SDI_CMD_CREATE_SURFACE_LOCAL_BEGIN
         AMD_SDI_CMD_CREATE_SURFACE_LOCAL_END
         AMD_SDI_CMD_CREATE_SURFACE_REMOTE_BEGIN
         AMD_SDI_CMD_CREATE_SURFACE_REMOTE_END
         AMD_SDI_CMD_QUERY_PHY_ADDRESS_LOCAL
         AMD_SDI_CMD_SYNC_WAIT_MARKER
         AMD_SDI_CMD_SYNC_WRITE_MARKER
  13. DIRECTGMA | DX9
     • Running a DirectGMA command:

         HRESULT RunSDICommand(IN LPDIRECT3DDEVICE9 pd3dDevice, IN AMDDX9SDICMD sdiCmd,
                               IN PBYTE pInBuf, IN DWORD dwInBufSize,
                               IN PBYTE pOutBuf, IN DWORD dwOutBufSize)
         {
             HRESULT hr;
             PAMDDX9SDICOMMPACKET pCommPacket;
             D3DLOCKED_RECT lockedRect;
             LPDIRECT3DSURFACE9 pCommSurf = NULL;

             // Create the 1x1 communication surface
             hr = pd3dDevice->CreateOffscreenPlainSurface(1, 1, (D3DFORMAT) FOURCC_SDIF,
                                                          D3DPOOL_DEFAULT, &pCommSurf, NULL);
             if (FAILED(hr))
                 return hr;  // no surface was created, nothing to release

             // Lock the surface to obtain the command packet from the driver
             hr = pCommSurf->LockRect(&lockedRect, NULL, 0);
             if (FAILED(hr))
             {
                 REL(pCommSurf);
                 return hr;
             }

             // Fill in the command packet; the result is reported through pResult
             pCommPacket = (PAMDDX9SDICOMMPACKET)(lockedRect.pBits);
             pCommPacket->dwSign = 'SDIF';
             pCommPacket->pResult = &hr;
             pCommPacket->sdiCmd = sdiCmd;
             pCommPacket->pOutBuf = pOutBuf;
             pCommPacket->dwOutBufSize = dwOutBufSize;
             pCommPacket->pInBuf = pInBuf;
             pCommPacket->dwInBufSize = dwInBufSize;

             pCommSurf->UnlockRect();
             REL(pCommSurf);

             return hr;
         }
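The packet signature above is written as the multi-character literal `'SDIF'`, whose value is implementation-defined in C++. As a hedged illustration of why this matters, the sketch below computes the two common packings of the same four characters: the little-endian form, which uses the same arithmetic as the Windows `MAKEFOURCC` macro, and the form with the first character in the most significant byte, which is how GCC documents multi-character constants and how other mainstream compilers typically evaluate them. The slides do not say which value `FOURCC_SDIF` has or which form the driver checks, so both function names here are hypothetical helpers for inspection only.

```cpp
#include <cassert>
#include <cstdint>

// Widen one character to an unsigned 32-bit value without sign extension.
constexpr std::uint32_t byteOf(char c) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(c));
}

// First character in the least significant byte
// (same arithmetic as the Windows MAKEFOURCC macro).
constexpr std::uint32_t fourccLittleEndian(char a, char b, char c, char d) {
    return byteOf(a) | (byteOf(b) << 8) | (byteOf(c) << 16) | (byteOf(d) << 24);
}

// First character in the most significant byte
// (the usual value of a multi-character literal such as 'SDIF').
constexpr std::uint32_t fourccBigEndian(char a, char b, char c, char d) {
    return (byteOf(a) << 24) | (byteOf(b) << 16) | (byteOf(c) << 8) | byteOf(d);
}

// The two packings of the same four characters differ.
static_assert(fourccLittleEndian('S', 'D', 'I', 'F') == 0x46494453u, "little-endian packing");
static_assert(fourccBigEndian('S', 'D', 'I', 'F') == 0x53444946u, "big-endian packing");
```

Writing the constant through an explicit helper, or using the value from AMD's header directly, avoids depending on how a particular compiler evaluates the literal.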
  14. DIRECTGMA | DX9
     • Create a local surface that can be accessed by a remote device

         hr = RunSDICommand(pd3dDevice, AMD_SDI_CMD_CREATE_SURFACE_LOCAL_BEGIN, NULL, 0, NULL, 0);
         if (SUCCEEDED(hr))
         {
             // Create SDI_LOCAL resources here
             hr = pd3dDevice->CreateTexture(width, height, 1, usage, format, D3DPOOL_DEFAULT, ppTex, NULL);
             if (SUCCEEDED(hr))
             {
                 hr = MakeAllocDoneViaDumpDraw(pd3dDevice, *ppTex);
                 hr = RunSDICommand(pd3dDevice, AMD_SDI_CMD_CREATE_SURFACE_LOCAL_END, NULL, 0,
                                    (PBYTE)pAttrib, sizeof(AMDDX9SDISURFACEATTRIBUTES));
                 if (SUCCEEDED(hr))
                 {
                     // The start of this call is cut off on the slide; only its arguments survive
                     … pAttrib->surfaceHandle,
                       pAttrib->surfaceAddr.surfaceBusAddr,
                       pAttrib->surfaceAddr.markerBusAddr);
                 }
             }
         }
         return hr;
  15. DIRECTGMA | DX10 DX11
     • AMD’s DirectGMA extension is accessed by way of the IAmdDxExt interface. In order to create this interface, the extension client must do the following:
       ‒ Include the “AmdDxExtSDIApi.h” file
       ‒ Get the exported function AmdDxExtCreate() from the DXX driver using GetProcAddress()
       ‒ Call AmdDxExtCreate() to create an IAmdDxExt interface
       ‒ Get and use the desired specific extension interfaces
       ‒ Close the AMD DirectX extension interface IAmdDxExt once it is no longer needed:
         ‒ Release the SDI interface IAmdDxExtSDI
         ‒ Release the extension interface IAmdDxExt
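The shutdown steps above imply an order: the SDI interface is released before the extension interface it was obtained from. One way to make that order automatic is to hold each interface in a smart pointer whose deleter calls `Release()`, and to declare the pointers in acquisition order so that C++ destroys them in reverse. The sketch below shows the idea with a mock type. `MockInterface`, `Releaser` and `ReleasePtr` are hypothetical names, and the sketch assumes the real interfaces expose a COM-style `Release()` method, which the slides suggest but do not show.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a COM-style interface; records when it is released.
struct MockInterface {
    std::string name;
    std::vector<std::string>* log;
    void Release() {
        log->push_back(name);
        delete this;
    }
};

// Deleter that calls Release() instead of delete.
struct Releaser {
    template <class T>
    void operator()(T* p) const { p->Release(); }
};

template <class T>
using ReleasePtr = std::unique_ptr<T, Releaser>;

// Acquire the extension interface, then the SDI interface, and let scope exit
// release them in reverse order. Returns the observed release order.
inline std::vector<std::string> useAndRelease() {
    std::vector<std::string> log;
    {
        ReleasePtr<MockInterface> ext(new MockInterface{"IAmdDxExt", &log});
        ReleasePtr<MockInterface> sdi(new MockInterface{"IAmdDxExtSDI", &log});
        // ... use the SDI interface here ...
    }  // sdi is destroyed first, then ext
    return log;
}
```

Because locals are destroyed in reverse declaration order, the release order stays correct even when the function returns early on an error.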
  16. DIRECTGMA | DX10 DX11
     • The following DirectGMA functions are provided:
         HRESULT CreateSDIAdapterSurfaces(AmdDxRemoteSDISurfaceList *pList);
         HRESULT QuerySDIAllocationAddress(AmdDxSDIQueryAllocInfo *pInfo);
         HRESULT MakeResidentSDISurfaces(AmdDxLocalSDISurfaceList *pList);
         BOOL WriteMarker(ID3D10Resource *pResource, AmdDxMarkerInfo *pMarkerInfo);
         BOOL WaitMarker(ID3D10Resource *pResource, UINT val);
         BOOL WriteMarker11(ID3D11Resource *pResource, AmdDxMarkerInfo *pMarkerInfo);
         BOOL WaitMarker11(ID3D11Resource *pResource, UINT val);
  17. DISCLAIMER & ATTRIBUTION
     The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
     The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
     AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
     AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
     ATTRIBUTION
     © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.