OPTIMIZING FFMPEG AND HANDBRAKE USING OPENCL
SRIKANTH GOLLAPUDI & MICHAEL WOOTTON

FFMPEG
  

INTRODUCTION

•  FFMPEG is a very popular open source multimedia software library used to record, convert and stream audio and video.
•  Used by popular open source projects such as HandBrake, VLC player and Chrome.
•  Single-stop solution for
   ‒  Decoding different codec formats (audio and video)
   ‒  Handling various container formats (mp4, wmv, avi, m2ts, m2ps, etc.)
   ‒  Encoding to popular video and audio codec formats (H.264, VC-1, MPEG-2, etc.)
   ‒  Different video filtering algorithms (deshake, scale, unsharp, etc.)
   ‒  Managing different pixel formats (NV12, RGB, YV12, etc.)
   ‒  Cross-platform support (Windows and Linux)
  

2 | PRESENTATION TITLE | November 19, 2013 | CONFIDENTIAL
FFMPEG – TYPICAL USAGE SCENARIO AND PROCESSING INVOLVED

Imagine a video edit using FFMPEG:

Video Decode -> Video Shake Removal -> Sharp/Blur -> Scale -> Video Encode
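As a rough illustration, the edit described above could map onto a single ffmpeg invocation. The sketch below only builds the argument list. `deshake`, `unsharp` and `scale` are libavfilter filter names; the file names, the 1280x720 target and the choice of `libx264` as encoder are assumptions made for the example, not details from the slides.

```python
import shlex


def build_ffmpeg_command(src, dst, width=1280, height=720):
    """Return an argument list for decode -> deshake -> unsharp -> scale -> encode."""
    # Filters in a simple chain run left to right and are separated by commas.
    filter_chain = ",".join(["deshake", "unsharp", f"scale={width}:{height}"])
    return [
        "ffmpeg",
        "-i", src,            # input file; ffmpeg picks the decoder from the stream
        "-vf", filter_chain,  # video filter chain
        "-c:v", "libx264",    # assumed H.264 software encoder (must be built in)
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(build_ffmpeg_command("input.mp4", "output.mp4")))
```

Every stage in this command runs on the CPU by default, which is the baseline the rest of the talk sets out to accelerate.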
  
FFMPEG – TYPICAL USAGE SCENARIO AND PROCESSING INVOLVED

Imagine a video edit using FFMPEG

HETEROGENEOUS SOLUTION

[Diagram: the pipeline Video Decode -> Video Shake Removal -> Sharp/Blur -> Scale -> Video Encode placed on an AMD APU, with blocks labeled HW Decoder, GPU, CPU and HW Encoder]
  
FFMPEG – SCOPE FOR ACCELERATION

Leverage heterogeneous compute

•  Accelerate video decode and encode using HW accelerators
   ‒  Load on CPU to perform decode and encode is taken off
   ‒  Power savings => longer battery life
•  Accelerate video processing filters using the GPU
   ‒  Increased performance compared to the CPU implementation
   ‒  Application runs at higher fps
   ‒  Possible to apply more filters to achieve better video quality
•  Use the CPU for serial processing and control
   ‒  Efficient usage of resources
  

	
  

FFMPEG – OUR WORK

•  AMD and MulticoreWare Inc. worked on accelerating FFMPEG
•  Enable usage of the hardware decoder
   ‒  To support decoding of H.264, VC-1, MPEG-2 and MPEG-4 Part 2 codecs
   ‒  Windows
      ‒  Integration of the DXVA2 API into ffmpeg.exe
      ‒  DXVA2 functionality is already available in ffmpeg's libavcodec library
      ‒  Extremely difficult for application developers to make use of the DXVA2 API in libavcodec
         ‒  Needs deep understanding of the DXVA2 API and specific codec-level knowledge
      ‒  Coded up all the steps needed to use the HW decoder via DXVA2 in the ffmpeg.exe app
      ‒  Created a command line option for ffmpeg.exe to enable usage of HW-assisted decode
•  Make use of the DirectX(R) 9 to OpenCL™ interop APIs available in OpenCL™ 1.2
   ‒  This ensures the decoded frame is retained in GPU memory and passed on to the OpenCL™ filter
  
	
  
FFMPEG – OUR WORK

•  Introduced OpenCL™ in ffmpeg
   ‒  Created OpenCL™ infrastructure in libavutil to enable usage of OpenCL™ in ffmpeg
•  Acceleration of video processing filters on the GPU using OpenCL™
   ‒  Added OpenCL™ implementations for the following filters in libavfilter
      ‒  Deshake - helps remove camera shake from hand-holding a camera, moving on a vehicle, etc.
      ‒  Unsharp - sharpen or blur the input video
      ‒  Scale - scale (resize) the input video
      ‒  Denoise - high precision/quality 3D denoise filter; aims to reduce image noise, producing smooth images
      ‒  Yadif - deinterlace the input video
      ‒  Tinterlace - temporal field interlacing
      ‒  Gradfun - fix the banding artifacts introduced by truncation to 8-bit color depth
•  Optimization of the ffmpeg pipeline to run decode, filters and encode in parallel
	
  
FFMPEG – PERFORMANCE

•  Performance numbers of the transcode pipeline using ffmpeg on an A10-6800K APU

[Bar chart: FPS (axis 0 to 60) for Accelerated ffmpeg vs. Original ffmpeg (CPU); data labels 55, 57, 29, 22, 1.3, 23, 16, 1.2]
  
FFMPEG – STATUS

•  FFmpeg 2.0 contains the OpenCL work
   ‒  OpenCL framework in libavutil
   ‒  Deshake and unsharp OpenCL implementations in libavfilter
•  DXVA2 patch is under review
•  Further optimizations and tuning in progress for other filters
  

FFMPEG – CHALLENGES

•  Introducing OpenCL into ffmpeg
   ‒  Reviewers were not well versed in OpenCL
•  Retaining data in GPU memory across the pipeline
   ‒  Architectural changes to the ffmpeg software are needed for this
•  Recompilation of kernels on every run
   ‒  ffmpeg does not allow saving compiled binary files on the local machine
•  The ffmpeg software needs pipeline-level optimizations to benefit from a heterogeneous platform

FFMPEG – FUTURE WORK

•  Add support for HW-assisted encode (H.264)
   ‒  AMD is going to give out a C++ API, called AMF, to access the HW encoder
   ‒  More details available in the talk tomorrow: Innovating with AMD Multimedia Technologies (MM-4095)
•  Optimize the OpenCL implementation of filters for better performance
•  Explore using HSA features to boost performance
•  Optimize memory transfers
   ‒  Retain buffers in device memory across the Decode, Filter and Encode modules
  

Handbrake Improvements
	
  
WHAT IS HANDBRAKE?

•  Open source video transcoder
•  Converts videos from most popular formats
•  Selectable output format and bitrates
•  Video resizing
•  Video filters
   ‒  Deinterlacing
   ‒  Decomb
   ‒  Deblock
   ‒  Grayscale
   ‒  Cropping
  

CURRENT ENHANCEMENTS

•  Hardware video decode
   ‒  Input video decoded via DXVA2
   ‒  Utilizes UVD on AMD GPUs and APUs
•  OpenCL™-accelerated video resolution changes
   ‒  Video frames are resized using OpenCL kernels
   ‒  Example: 1080p converted to 720p
  

IMPROVING OPENCL SCALING

•  The OpenCL scaling enhancement was under-performing
•  Identified issues:
   ‒  Image format conversion
   ‒  Buffer staging
   ‒  Separable scaling using two kernels
  

OPENCL SCALING IMPROVEMENTS

Reduce Memory Copies:

•  Modify the existing HandBrake buffer system
•  Identify which buffers will contain video data (vs. audio, captions, etc.)
•  Video buffers are allocated out of pinned host memory
•  Non-OpenCL-aware code writes data to the correct place
•  Kernels can directly read/write the buffers via zero copy
  

OPENCL SCALING IMPROVEMENTS

Switch to a Single Kernel:

•  Eliminate the two-kernel approach
•  Process blocks of data rather than lines
•  Support HandBrake native image packing
•  Use LDS to further reduce global memory accesses

RESULTS

•  The single kernel completes quickly
•  No extra memory copies are required
•  Kernel execution time to scale one frame (1080p -> 720p)*
   ‒  AMD A10-6800K – 2.4 ms
   ‒  AMD HD7750 – 1.0 ms
•  Application performance on A10-6800K

Feature          Performance (FPS)   Improvement over SW
Software         36.08               0.0
Scaling          39.64               9.9%
UVD              40.53               12.3%
Scaling + UVD    44.95               23.9%

* All times measured on a development system
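As a quick check on how the improvement column relates to the FPS column, the sketch below recomputes the relative gain over the software baseline from the listed FPS values. It assumes the improvement is defined as (fps - baseline) / baseline, which the slide does not state explicitly.

```python
def improvement_percent(fps, baseline_fps):
    """Relative speed-up over the baseline, in percent."""
    return (fps - baseline_fps) / baseline_fps * 100.0


# FPS values copied from the results table (A10-6800K).
fps_by_feature = {
    "Software": 36.08,
    "Scaling": 39.64,
    "UVD": 40.53,
    "Scaling + UVD": 44.95,
}
baseline = fps_by_feature["Software"]
for feature, fps in fps_by_feature.items():
    print(f"{feature:<14} {fps:6.2f} FPS  {improvement_percent(fps, baseline):5.1f}%")

# Ceiling on throughput if the 2.4 ms scaling kernel were the only per-frame cost.
kernel_ms = 2.4
print(f"kernel-only ceiling: {1000.0 / kernel_ms:.0f} frames/s")
```

Under that definition the first two rows come out at about 9.9% and 12.3%, matching the slide, while the combined row comes out at about 24.6% rather than the 23.9% shown, which may reflect rounding or a different measurement run. The 2.4 ms kernel time is roughly a tenth of the per-frame time at about 40 FPS, so after the changes the scaling kernel itself is a small part of the frame budget.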
  
THANK YOU

Questions
  

DISCLAIMER & ATTRIBUTION

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.

The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.

AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION
© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.
  

More Related Content

What's hot

Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...AMD Developer Central
 
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerPL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerAMD Developer Central
 
HSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterHSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterAMD Developer Central
 
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...AMD Developer Central
 
Leverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesLeverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesAMD Developer Central
 
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahGS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahAMD Developer Central
 
CC-4006, Deliver Hardware Accelerated Applications Using RemoteFX vGPU with W...
CC-4006, Deliver Hardware Accelerated Applications Using RemoteFX vGPU with W...CC-4006, Deliver Hardware Accelerated Applications Using RemoteFX vGPU with W...
CC-4006, Deliver Hardware Accelerated Applications Using RemoteFX vGPU with W...AMD Developer Central
 
MM-4097, OpenCV-CL, by Harris Gasparakis, Vadim Pisarevsky and Andrey Pavlenko
MM-4097, OpenCV-CL, by Harris Gasparakis, Vadim Pisarevsky and Andrey PavlenkoMM-4097, OpenCV-CL, by Harris Gasparakis, Vadim Pisarevsky and Andrey Pavlenko
MM-4097, OpenCV-CL, by Harris Gasparakis, Vadim Pisarevsky and Andrey PavlenkoAMD Developer Central
 
Final lisa opening_keynote_draft_-_v12.1tb
Final lisa opening_keynote_draft_-_v12.1tbFinal lisa opening_keynote_draft_-_v12.1tb
Final lisa opening_keynote_draft_-_v12.1tbr Skip
 
PG-4039, RapidFire API, by Dmitry Kozlov
PG-4039, RapidFire API, by Dmitry KozlovPG-4039, RapidFire API, by Dmitry Kozlov
PG-4039, RapidFire API, by Dmitry KozlovAMD Developer Central
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAMD Developer Central
 
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...AMD Developer Central
 
IS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
IS-4081, Rabbit: Reinventing Video Chat, by Philippe ClavelIS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
IS-4081, Rabbit: Reinventing Video Chat, by Philippe ClavelAMD Developer Central
 
HC-4020, Enhancing OpenCL performance in AfterShot Pro with HSA, by Michael W...
HC-4020, Enhancing OpenCL performance in AfterShot Pro with HSA, by Michael W...HC-4020, Enhancing OpenCL performance in AfterShot Pro with HSA, by Michael W...
HC-4020, Enhancing OpenCL performance in AfterShot Pro with HSA, by Michael W...AMD Developer Central
 
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...AMD Developer Central
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...AMD Developer Central
 
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornDirect3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornAMD Developer Central
 
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...AMD Developer Central
 
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben SanderPT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben SanderAMD Developer Central
 
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin CoumansGS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin CoumansAMD Developer Central
 

What's hot (20)

Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
 
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerPL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
 
HSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterHSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben Gaster
 
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
HC-4021, Efficient scheduling of OpenMP and OpenCL™ workloads on Accelerated ...
 
Leverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesLeverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math Libraries
 
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahGS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
 
CC-4006, Deliver Hardware Accelerated Applications Using RemoteFX vGPU with W...
CC-4006, Deliver Hardware Accelerated Applications Using RemoteFX vGPU with W...CC-4006, Deliver Hardware Accelerated Applications Using RemoteFX vGPU with W...
CC-4006, Deliver Hardware Accelerated Applications Using RemoteFX vGPU with W...
 
MM-4097, OpenCV-CL, by Harris Gasparakis, Vadim Pisarevsky and Andrey Pavlenko
MM-4097, OpenCV-CL, by Harris Gasparakis, Vadim Pisarevsky and Andrey PavlenkoMM-4097, OpenCV-CL, by Harris Gasparakis, Vadim Pisarevsky and Andrey Pavlenko
MM-4097, OpenCV-CL, by Harris Gasparakis, Vadim Pisarevsky and Andrey Pavlenko
 
Final lisa opening_keynote_draft_-_v12.1tb
Final lisa opening_keynote_draft_-_v12.1tbFinal lisa opening_keynote_draft_-_v12.1tb
Final lisa opening_keynote_draft_-_v12.1tb
 
PG-4039, RapidFire API, by Dmitry Kozlov
PG-4039, RapidFire API, by Dmitry KozlovPG-4039, RapidFire API, by Dmitry Kozlov
PG-4039, RapidFire API, by Dmitry Kozlov
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
 
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
 
IS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
IS-4081, Rabbit: Reinventing Video Chat, by Philippe ClavelIS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
IS-4081, Rabbit: Reinventing Video Chat, by Philippe Clavel
 
HC-4020, Enhancing OpenCL performance in AfterShot Pro with HSA, by Michael W...
HC-4020, Enhancing OpenCL performance in AfterShot Pro with HSA, by Michael W...HC-4020, Enhancing OpenCL performance in AfterShot Pro with HSA, by Michael W...
HC-4020, Enhancing OpenCL performance in AfterShot Pro with HSA, by Michael W...
 
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
 
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornDirect3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
 
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
 
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben SanderPT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
PT-4059, Bolt: A C++ Template Library for Heterogeneous Computing, by Ben Sander
 
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin CoumansGS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
GS-4150, Bullet 3 OpenCL Rigid Body Simulation, by Erwin Coumans
 

Similar to Optimizing FFMPEG and Handbrake for Hardware Acceleration

CE-4117, HSA Optimizations and Impact on end User Experiences for AfterShot P...
CE-4117, HSA Optimizations and Impact on end User Experiences for AfterShot P...CE-4117, HSA Optimizations and Impact on end User Experiences for AfterShot P...
CE-4117, HSA Optimizations and Impact on end User Experiences for AfterShot P...AMD Developer Central
 
MOVED: RDK/WPE Port on DB410C - SFO17-206
MOVED: RDK/WPE Port on DB410C - SFO17-206MOVED: RDK/WPE Port on DB410C - SFO17-206
MOVED: RDK/WPE Port on DB410C - SFO17-206Linaro
 
Advanced Video Production with FOSS
Advanced Video Production with FOSSAdvanced Video Production with FOSS
Advanced Video Production with FOSSKirk Kimmel
 
Utf 8'en'ibm sametime 9 - voice and video deployment
Utf 8'en'ibm sametime 9 - voice and video deployment Utf 8'en'ibm sametime 9 - voice and video deployment
Utf 8'en'ibm sametime 9 - voice and video deployment a8us
 
Accelerate graphics performance with ozone-gbm on Intel based Linux desktop s...
Accelerate graphics performance with ozone-gbm on Intel based Linux desktop s...Accelerate graphics performance with ozone-gbm on Intel based Linux desktop s...
Accelerate graphics performance with ozone-gbm on Intel based Linux desktop s...Joone Hur
 
Embedded Recipes 2018 - Upstream multimedia on amlogic so cs from fiction t...
Embedded Recipes 2018 - Upstream multimedia on amlogic so cs   from fiction t...Embedded Recipes 2018 - Upstream multimedia on amlogic so cs   from fiction t...
Embedded Recipes 2018 - Upstream multimedia on amlogic so cs from fiction t...Anne Nicolas
 
Kernel Recipes 2014 - Testing Video4Linux Applications and Drivers
Kernel Recipes 2014 - Testing Video4Linux Applications and DriversKernel Recipes 2014 - Testing Video4Linux Applications and Drivers
Kernel Recipes 2014 - Testing Video4Linux Applications and DriversAnne Nicolas
 
"Building Complete Embedded Vision Systems on Linux—From Camera to Display," ...
"Building Complete Embedded Vision Systems on Linux—From Camera to Display," ..."Building Complete Embedded Vision Systems on Linux—From Camera to Display," ...
"Building Complete Embedded Vision Systems on Linux—From Camera to Display," ...Edge AI and Vision Alliance
 
Elc Europe 2020 : u-boot- porting and maintaining a bootloader for a multimed...
Elc Europe 2020 : u-boot- porting and maintaining a bootloader for a multimed...Elc Europe 2020 : u-boot- porting and maintaining a bootloader for a multimed...
Elc Europe 2020 : u-boot- porting and maintaining a bootloader for a multimed...Neil Armstrong
 
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono..."The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...Edge AI and Vision Alliance
 
DCC Labs Company Presentation
DCC Labs Company PresentationDCC Labs Company Presentation
DCC Labs Company PresentationDCC Labs
 
FIWARE Global Summit - Real-time Media Stream Processing Using Kurento
FIWARE Global Summit - Real-time Media Stream Processing Using KurentoFIWARE Global Summit - Real-time Media Stream Processing Using Kurento
FIWARE Global Summit - Real-time Media Stream Processing Using KurentoFIWARE
 
High Quality 360 Video Rendering and Streaming
High Quality 360 Video Rendering and StreamingHigh Quality 360 Video Rendering and Streaming
High Quality 360 Video Rendering and StreamingITU
 
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...Intel® Software
 
Simplifying and accelerating converged media with Open Visual Cloud
Simplifying and accelerating converged media with Open Visual CloudSimplifying and accelerating converged media with Open Visual Cloud
Simplifying and accelerating converged media with Open Visual CloudLiz Warner
 
01 high bandwidth acquisitioncomputing compressionall in a box
01 high bandwidth acquisitioncomputing compressionall in a box01 high bandwidth acquisitioncomputing compressionall in a box
01 high bandwidth acquisitioncomputing compressionall in a boxYutaka Kawai
 
FIWARE Global Summit - Real-time Media Stream Processing Using Kurento
FIWARE Global Summit - Real-time Media Stream Processing Using KurentoFIWARE Global Summit - Real-time Media Stream Processing Using Kurento
FIWARE Global Summit - Real-time Media Stream Processing Using KurentoFIWARE
 
FIWARE Global Summit - Real-time Media Stream Processing Using Kurento
FIWARE Global Summit - Real-time Media Stream Processing Using KurentoFIWARE Global Summit - Real-time Media Stream Processing Using Kurento
FIWARE Global Summit - Real-time Media Stream Processing Using KurentoFIWARE
 
LCU14 310- Cisco ODP v2
LCU14 310- Cisco ODP v2LCU14 310- Cisco ODP v2
LCU14 310- Cisco ODP v2Linaro
 

Optimizing FFMPEG and Handbrake for Hardware Acceleration

  • 1. OPTIMIZING FFMPEG AND HANDBRAKE USING OPENCL
      SRIKANTH GOLLAPUDI & MICHAEL WOOTTON
  • 2. FFMPEG INTRODUCTION
      • FFMPEG is a very popular open source multimedia software library used to record, convert and stream audio and video.
      • Used by popular open source projects such as Handbrake, VLC player and Chrome.
      • Single-stop solution for
          ‒ Decoding different codec formats (audio and video)
          ‒ Handling various container formats (mp4, wmv, avi, m2ts, m2ps etc.)
          ‒ Encoding to popular video and audio codec formats (H.264, VC-1, MPEG-2 etc.)
          ‒ Different video filtering algorithms (Deshake, Scale, Unsharp etc.)
          ‒ Managing different pixel formats (NV12, RGB, YV12 etc.)
          ‒ Cross-platform support (Windows and Linux)
      2 | PRESENTATION TITLE | November 19, 2013 | CONFIDENTIAL
  • 3. FFMPEG – TYPICAL USAGE SCENARIO AND PROCESSING INVOLVED
      Imagine a video edit using FFMPEG:
      Video Decode → Video shake removal → Sharp/Blur → Scale → Video Encode
  • 4. FFMPEG – TYPICAL USAGE SCENARIO AND PROCESSING INVOLVED
      Imagine a video edit using FFMPEG:
      Video Decode → Video shake removal → Sharp/Blur → Scale → Video Encode
      [Diagram: the pipeline stages mapped onto the HW Decoder, GPU, CPU and HW Encoder of an AMD APU]
      HETEROGENEOUS SOLUTION
  • 5. FFMPEG – SCOPE FOR ACCELERATION
      Leverage heterogeneous compute
      • Accelerate video decode and encode using HW accelerators
          ‒ Load on the CPU to perform decode and encode is taken off
          ‒ Power savings => longer battery life
      • Accelerate video processing filters using the GPU
          ‒ Increased performance compared to the CPU implementation
          ‒ Application runs at higher fps
          ‒ Possible to apply more filters to achieve better video quality
      • Use the CPU for serial processing and control
          ‒ Efficient usage of resources
  • 6. FFMPEG – OUR WORK
      • AMD and Multicoreware Inc. worked on accelerating FFMPEG
      • Enable usage of the hardware decoder
          ‒ To support decoding of H.264, VC-1, MPEG2 and MPEG-4 Part 2 codecs
          ‒ Windows
              ‒ Integration of the DXVA2 API into ffmpeg.exe
              ‒ DXVA2 functionality already available in ffmpeg's libavcodec library
              ‒ Extremely difficult for application developers to make use of the DXVA2 API in libavcodec
                  ‒ Needs deep understanding of the DXVA2 API and specific codec-level knowledge
              ‒ Coded up all the necessary steps needed to use the HW decoder using DXVA2 in the ffmpeg.exe app
              ‒ Created a command line option for ffmpeg.exe to enable usage of HW-assisted decode
      • Make use of the DirectX® 9 to OpenCL™ interop APIs available in OpenCL™ 1.2
          ‒ This ensures the decoded frame is retained in GPU memory and passed on to the OpenCL™ filter
  • 7. FFMPEG – OUR WORK
      • Introduced OpenCL™ in ffmpeg
          ‒ Created OpenCL™ infrastructure in libavutil to enable usage of OpenCL™ in ffmpeg
      • Acceleration of video processing filters on the GPU using OpenCL™
          ‒ Added OpenCL™ implementations for the following filters in libavfilter
              ‒ Deshake: helps remove camera shake from hand-holding a camera, moving on a vehicle, etc.
              ‒ Unsharp: sharpen or blur the input video
              ‒ Scale: scale (resize) the input video
              ‒ Denoise: high precision/quality 3D denoise filter; aims to reduce image noise, producing smooth images
              ‒ Yadif: deinterlace the input video
              ‒ Tinterlace: temporal field interlacing
              ‒ Gradfun: fix the banding artifacts introduced by truncation to 8-bit color depth
      • Optimization of the ffmpeg pipeline to run decode, filters and encode in parallel
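The last point on slide 7, running decode, filters and encode in parallel, can be pictured as a three-stage producer/consumer pipeline. The sketch below is only an illustration of that structure, not the ffmpeg implementation: the names `stage` and `run_pipeline`, the queue depth and the stand-in stage functions are all assumptions introduced here. In Python the stages would not speed up CPU-bound work; the point is that each stage can work on a different frame at the same time, as the HW decoder, GPU and encoder would.

```python
import queue
import threading

_END = object()  # hypothetical end-of-stream marker

def stage(work, inbox, outbox):
    """Apply `work` to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _END:
            outbox.put(_END)  # propagate shutdown to the next stage
            return
        outbox.put(work(item))

def run_pipeline(packets, decode, apply_filters, encode, depth=4):
    """Run decode -> filters -> encode concurrently, one thread per stage."""
    q_in = queue.Queue(maxsize=depth)    # bounded: limits frames in flight
    q_dec = queue.Queue(maxsize=depth)
    q_filt = queue.Queue(maxsize=depth)
    q_out = queue.Queue()                # unbounded so the last stage never blocks
    threads = [
        threading.Thread(target=stage, args=(decode, q_in, q_dec)),
        threading.Thread(target=stage, args=(apply_filters, q_dec, q_filt)),
        threading.Thread(target=stage, args=(encode, q_filt, q_out)),
    ]
    for t in threads:
        t.start()
    for p in packets:
        q_in.put(p)
    q_in.put(_END)
    for t in threads:
        t.join()
    results = []
    while True:
        item = q_out.get()
        if item is _END:
            return results
        results.append(item)

if __name__ == "__main__":
    # Stand-in stage functions; real stages would decode, filter and encode frames.
    print(run_pipeline(range(5), lambda p: p * 2, lambda f: f + 1, lambda f: f"enc{f}"))
```

Because each stage is a single thread reading from a FIFO queue, frame order is preserved end to end.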
  • 8. FFMPEG – PERFORMANCE
      Performance numbers of the transcode pipeline using ffmpeg on an A10-6800K APU
      [Bar chart: FPS (axis 0 to 60) for accelerated ffmpeg vs. original ffmpeg (CPU). Bar labels in the extraction: 55, 57, 29, 22, 23, 16, 1.3, 1.2; their pairing with individual bars is not recoverable.]
  • 9. FFMPEG – STATUS
      • FFmpeg 2.0 contains the OpenCL work
          ‒ OpenCL framework in libavutil
          ‒ Deshake and unsharp OpenCL implementations in libavfilter
      • DXVA2 patch is under review
      • Further optimizations and tuning in progress for other filters
  • 10. FFMPEG – CHALLENGES
      • Introducing OpenCL into ffmpeg
          ‒ Reviewers were not well versed in OpenCL
      • Retaining data in GPU memory in the pipeline
          ‒ FFmpeg software architectural changes needed for this
      • Recompilation of kernels on every run
          ‒ FFmpeg does not allow saving compiled binary files on the local machine
      • FFmpeg software needs pipeline-level optimizations to benefit from a heterogeneous platform
  • 11. FFMPEG – FUTURE WORK
      • Add support for HW-assisted encode (H.264)
          ‒ AMD is going to release a C++ API, called AMF, to access the HW encoder
          ‒ More details available in the talk tomorrow: Innovating with AMD Multimedia Technologies (MM-4095)
      • Optimize the OpenCL implementation of filters for better performance
      • Explore using HSA features to boost performance
      • Optimize memory transfers
          ‒ Retain buffers in device memory across the Decode, Filter and Encode modules
  • 13. WHAT IS HANDBRAKE?
      • Open source video transcoder
      • Converts videos from most popular formats
      • Selectable output format and bitrates
      • Video resizing
      • Video filters
          ‒ Deinterlacing
          ‒ Decomb
          ‒ Deblock
          ‒ Grayscale
          ‒ Cropping
  • 14. CURRENT ENHANCEMENTS
      • Hardware video decode
          ‒ Input video decoded via DXVA2
          ‒ Utilizes UVD on AMD GPUs and APUs
      • OpenCL™ accelerated video resolution changes
          ‒ Video frames are resized using OpenCL kernels
          ‒ Example: 1080p converted to 720p
  • 15. IMPROVING OPENCL SCALING
      • The OpenCL scaling enhancement was under-performing
      • Identified issues:
          ‒ Image format conversion
          ‒ Buffer staging
          ‒ Separable scaling using two kernels
  • 16. OPENCL SCALING IMPROVEMENTS
      Reduce memory copies:
      • Modify the existing HandBrake buffer system
      • Identify which buffers will contain video data (vs. audio, captions, etc.)
      • Video buffers are allocated out of pinned host memory
      • Non-OpenCL-aware code writes data to the correct place
      • Kernels can directly read/write the buffers via zero copy
  • 17. OPENCL SCALING IMPROVEMENTS
      Switch to a single kernel:
      • Eliminate the two-kernel approach
      • Process blocks of data rather than lines
      • Support HandBrake native image packing
      • Use LDS to further reduce global memory accesses
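To make the single-kernel idea on slide 17 concrete, here is a CPU-side sketch of a one-pass resize that walks the output in square tiles, roughly the way GPU work-groups would each own a block of output pixels. It is a minimal illustration under stated assumptions: the slides do not say which interpolation HandBrake's kernel uses, so bilinear filtering is assumed here, and the function name `scale_bilinear`, the `tile` parameter and the list-of-lists image layout are hypothetical. LDS caching and HandBrake's native image packing are not modelled.

```python
def scale_bilinear(src, out_w, out_h, tile=16):
    """Resize a 2D grayscale image in one pass, visiting the output tile by tile.

    Each output pixel is computed directly from four source pixels, so no
    intermediate (horizontally scaled) image is written, unlike a separable
    two-pass scaler.
    """
    in_h, in_w = len(src), len(src[0])
    dst = [[0.0] * out_w for _ in range(out_h)]
    sx = in_w / out_w  # horizontal scale factor, e.g. 1.5 for 1080p -> 720p
    sy = in_h / out_h
    for ty in range(0, out_h, tile):          # one tile ~ one GPU work-group
        for tx in range(0, out_w, tile):
            for y in range(ty, min(ty + tile, out_h)):
                # Map the output pixel centre into source coordinates, clamped to the image.
                fy = min(max((y + 0.5) * sy - 0.5, 0.0), in_h - 1)
                y0 = int(fy)
                y1 = min(y0 + 1, in_h - 1)
                wy = fy - y0
                for x in range(tx, min(tx + tile, out_w)):
                    fx = min(max((x + 0.5) * sx - 0.5, 0.0), in_w - 1)
                    x0 = int(fx)
                    x1 = min(x0 + 1, in_w - 1)
                    wx = fx - x0
                    top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
                    bot = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
                    dst[y][x] = top * (1 - wy) + bot * wy
    return dst

if __name__ == "__main__":
    print(scale_bilinear([[0, 2], [4, 6]], 1, 1))
```

On a GPU, the pixels of a tile share most of their source neighbourhood, which is what makes staging that neighbourhood in local memory (LDS) worthwhile.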
  • 18. RESULTS
      • The single kernel completes quickly
      • No extra memory copies are required
      • Kernel execution time to scale one frame (1080p -> 720p)*
          ‒ AMD A10-6800K: 2.4 ms
          ‒ AMD HD7750: 1.0 ms
      • Application performance on A10-6800K

          Feature         Performance (FPS)   Improvement over SW
          Software        36.08               0.0
          Scaling         39.64               9.9%
          UVD             40.53               12.3%
          Scaling + UVD   44.95               23.9%

      * All times measured on a development system
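The figures on slide 18 can be sanity-checked with a few lines of arithmetic. The sketch below recomputes the improvement over the software baseline from the FPS column and converts the per-frame kernel times into the frame rate the scaling kernel alone could sustain; the helper names are introduced here for illustration. Recomputing from the FPS column reproduces 9.9% and 12.3%, but gives about 24.6% for Scaling + UVD rather than the 23.9% on the slide; the slide does not explain the difference.

```python
BASELINE_FPS = 36.08  # "Software" row on slide 18

def improvement_pct(fps, baseline=BASELINE_FPS):
    """Percentage improvement of `fps` over the software baseline."""
    return (fps / baseline - 1.0) * 100.0

def kernel_fps_ceiling(kernel_ms):
    """Frames per second the scaling kernel alone could process."""
    return 1000.0 / kernel_ms

if __name__ == "__main__":
    for name, fps in [("Scaling", 39.64), ("UVD", 40.53), ("Scaling + UVD", 44.95)]:
        print(f"{name}: {improvement_pct(fps):.1f}%")
    for device, ms in [("A10-6800K", 2.4), ("HD7750", 1.0)]:
        print(f"{device}: {kernel_fps_ceiling(ms):.0f} fps ceiling for scaling alone")
```

At 2.4 ms per frame the kernel could scale roughly 417 frames per second on its own, far above the 44.95 FPS of the full application, so on these numbers scaling is no longer the limiting stage.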
  • 19. THANK YOU
      Questions
  • 20. DISCLAIMER & ATTRIBUTION
      The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.

      The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

      AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.

      AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

      ATTRIBUTION
      © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.