
UIImageView vs Metal #tryswiftconf

Slides for a talk @ try! Swift Tokyo 2018

[Abstract] Metal is an API that provides access to the GPU. Apple announced it as 10x faster than OpenGL. In this session, I'll explain the basics of Metal, then compare its graphics rendering performance with UIImageView.

Even if you don't use the API directly, your app implicitly benefits from Metal. Comparing it to a familiar class will make you conscious of the GPU layer that we usually overlook.



UIImageView vs Metal #tryswiftconf

1. try! Swift Tokyo 2018 Shuichi Tsutsumi @shu223 UIImageView vs Metal
2. Shuichi Tsutsumi @shu223 • iOS Developer - Fyusion Inc. - Freelance
3. Today's Goal • Learn "how to use Metal" • Be conscious of the GPU layer through Metal
4. Agenda • Compare the graphics rendering performance of Metal to UIImageView → learn a lot about the GPU 1. UIKit is well optimized with the GPU. 2. Also consider the GPU when measuring performance. 3. Pay attention to the processing flow between CPU and GPU. 4. Be careful where the resource is.
5. imageView.image = image
6. What's happening?
7. [Diagram] Processor → Frame Buffer → Screen: the processor writes the pixel data for one frame (pixels × resolution) into the frame buffer 60 times/sec, and the screen draws it.
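To get a feel for the scale involved, here is a back-of-the-envelope calculation (my own illustration, not from the slides; the screen size, bytes per pixel, and refresh rate are assumed values):

    // Hypothetical example: how much pixel data the processor writes per second.
    // Assumed: 1334 x 750 screen, 4 bytes per pixel (BGRA), 60 frames per second.
    let width = 1334, height = 750
    let bytesPerPixel = 4
    let framesPerSecond = 60
    let bytesPerFrame = width * height * bytesPerPixel        // about 4 MB per frame
    let bytesPerSecond = bytesPerFrame * framesPerSecond      // about 240 MB per second
    print("\(bytesPerFrame) bytes/frame, \(bytesPerSecond) bytes/sec")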
8. Difference Between CPU and GPU: The CPU is a sports car • Very fast • Can't process many tasks in parallel. The GPU is a bus • Not as fast as the CPU • Can process many identical tasks in parallel.
9. • The CPU is very fast and good for any task (a general-purpose processor) - However, if it is used to process everything, it easily reaches 100% load. → Utilize the GPU as much as possible when the task suits the GPU (= can be computed in parallel).
10. [Diagram] The same flow as slide 7, but now the GPU is the processor writing the pixel data for one frame into the frame buffer 60 times/sec.
  11. 11. What’s ?
  12. 12. Provide access to GPU GPU Your app ???
  13. 13. What’s the difference from OpenGL?
14. OpenGL • Cross-platform • Supports many vendors' GPUs
15. Metal • Developed by Apple • Optimized for Apple's hardware • 10x faster than OpenGL
16. Sounds great!
17. Metal Implementation
18. imageView.image = image
19. To achieve this with Metal…
20.

    // Properties and setup: create the device and command queue, load the texture,
    // and configure the MTKView.
    private let device = MTLCreateSystemDefaultDevice()!

    private func setup() {
        commandQueue = device.makeCommandQueue()
        let textureLoader = MTKTextureLoader(device: device)
        texture = try! textureLoader.newTexture(name: "highsierra",
                                                scaleFactor: view.contentScaleFactor,
                                                bundle: nil)
        mtkView.device = device
        mtkView.delegate = self
        mtkView.colorPixelFormat = texture.pixelFormat
    }

    // MTKViewDelegate: blit the texture into the drawable each time the view draws.
    func draw(in view: MTKView) {
        guard let drawable = view.currentDrawable else { return }
        guard let commandBuffer = commandQueue.makeCommandBuffer() else { fatalError() }
        guard let blitEncoder = commandBuffer.makeBlitCommandEncoder() else { fatalError() }
        let w = min(texture.width, drawable.texture.width)
        let h = min(texture.height, drawable.texture.height)
        blitEncoder.copy(from: texture,
                         sourceSlice: 0, sourceLevel: 0,
                         sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                         sourceSize: MTLSizeMake(w, h, texture.depth),
                         to: drawable.texture,
                         destinationSlice: 0, destinationLevel: 0,
                         destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
        blitEncoder.endEncoding()
        commandBuffer.present(drawable)
        commandBuffer.commit()
        commandBuffer.waitUntilCompleted()
    }
21. This is the minimum implementation.
22. [The full Metal code from slide 20 again, contrasted with the single UIKit line] imageView.image = image
23. [The same contrast] imageView.image = image 💡
24. My idea: "MetalImageView", a Metal wrapper class to draw an image ✓ As easy to use as UIImageView ✓ Metal accelerated: metalImageView.texture = texture
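As a rough idea of what such a wrapper could look like, here is a minimal sketch (my own illustration, not the author's actual MetalImageView; the class shape and property names are assumptions, and error handling is omitted):

    import UIKit
    import MetalKit

    // Hypothetical sketch: an MTKView subclass that blits a texture to the screen,
    // so it can be used roughly like UIImageView.
    final class MetalImageView: MTKView {
        private lazy var commandQueue = device!.makeCommandQueue()!

        // Setting a texture triggers a redraw, analogous to imageView.image = image.
        var texture: MTLTexture? {
            didSet {
                if let texture = texture { colorPixelFormat = texture.pixelFormat }
                setNeedsDisplay()
            }
        }

        override init(frame frameRect: CGRect, device: MTLDevice?) {
            super.init(frame: frameRect, device: device ?? MTLCreateSystemDefaultDevice())
            configure()
        }

        required init(coder: NSCoder) {
            super.init(coder: coder)
            device = MTLCreateSystemDefaultDevice()
            configure()
        }

        private func configure() {
            enableSetNeedsDisplay = true   // draw on demand, not 60 times/sec
            framebufferOnly = false        // allow blitting into the drawable's texture
        }

        override func draw(_ rect: CGRect) {
            guard let texture = texture,
                  let drawable = currentDrawable,
                  let commandBuffer = commandQueue.makeCommandBuffer(),
                  let blitEncoder = commandBuffer.makeBlitCommandEncoder() else { return }
            let w = min(texture.width, drawable.texture.width)
            let h = min(texture.height, drawable.texture.height)
            blitEncoder.copy(from: texture,
                             sourceSlice: 0, sourceLevel: 0,
                             sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                             sourceSize: MTLSize(width: w, height: h, depth: texture.depth),
                             to: drawable.texture,
                             destinationSlice: 0, destinationLevel: 0,
                             destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
            blitEncoder.endEncoding()
            commandBuffer.present(drawable)
            commandBuffer.commit()
        }
    }

Usage would then be close to the UIKit one-liner: metalImageView.texture = texture.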
25. Powered by
26. Performance comparison with UIImageView
27. Sample app for the comparison • Render large images in table cells - 5120 x 3200 (elcapitan.jpg) - 1245 x 1245 (sierra.png)
28. Measuring code (the span between time1 and time2 is the measured interval; the two branches render with Metal and with UIImageView):

    let time1 = CACurrentMediaTime()
    if isMetal {
        // Render with Metal
        let metalCell = cell as! MetalTableViewCell
        metalCell.metalImageView.textureName = name
    } else {
        // Render with UIImageView
        let uikitCell = cell as! TableViewCell
        uikitCell.uiImageView.image = UIImage(named: name)
    }
    let time2 = CACurrentMediaTime()
    print("time: \(time2 - time1)")
29. Results (iPhone 6s) • Metal is 10x-20x faster! Time to render an image: UIImageView 0.4-0.6 msec, Metal 0.02-0.05 msec
30. Something weird: the Metal version is more laggy and choppy. [Video comparison: UIImageView vs. Metal]
31. [The measuring code from slide 28 again]
32. Basic Concept
33. 1. Load image data into memory for the GPU (& CPU) 2. The CPU creates GPU commands as a command buffer 3. Push it to the GPU 4. The GPU processes the commands
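To make the four steps concrete, here is a rough mapping onto the Metal calls from slide 20 (my own annotation; the texture name and surrounding setup mirror the assumptions of that slide):

    import MetalKit

    // Rough sketch: where each of the four steps lives in code.
    func loadAndDraw(in view: MTKView, device: MTLDevice, commandQueue: MTLCommandQueue) throws {
        // 1. Load image data into memory the GPU (& CPU) can access.
        let textureLoader = MTKTextureLoader(device: device)
        let texture = try textureLoader.newTexture(name: "highsierra",
                                                   scaleFactor: view.contentScaleFactor,
                                                   bundle: nil)

        // 2. The CPU encodes GPU commands into a command buffer.
        guard let drawable = view.currentDrawable,
              let commandBuffer = commandQueue.makeCommandBuffer(),
              let blitEncoder = commandBuffer.makeBlitCommandEncoder() else { return }
        let w = min(texture.width, drawable.texture.width)
        let h = min(texture.height, drawable.texture.height)
        blitEncoder.copy(from: texture,
                         sourceSlice: 0, sourceLevel: 0,
                         sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                         sourceSize: MTLSize(width: w, height: h, depth: texture.depth),
                         to: drawable.texture,
                         destinationSlice: 0, destinationLevel: 0,
                         destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
        blitEncoder.endEncoding()
        commandBuffer.present(drawable)

        // 3. Push the command buffer to the GPU.
        commandBuffer.commit()

        // 4. The GPU processes the commands; here the CPU simply waits for completion.
        commandBuffer.waitUntilCompleted()
    }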
34. [The measuring code from slide 28 again]
35. 1. Load image data into memory for the GPU (& CPU) 2. The CPU creates GPU commands as a command buffer 3. Push it to the GPU 4. The GPU processes the commands → the GPU-side steps are NOT considered by the measuring code!
36. Fixed measuring code • Measure the time until the GPU processing is completed:

    func draw(in view: MTKView) {
        // Prepare the command buffer
        ...

        // Submit (push) the command buffer to the GPU
        commandBuffer.commit()

        // Wait until the GPU processing is completed
        commandBuffer.waitUntilCompleted()

        // Calculate the total time
        let endTime = CACurrentMediaTime()
        print("Time: \(endTime - startTime)")
    }
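A side note, not from the slides: if blocking the CPU with waitUntilCompleted() is undesirable, the GPU time can also be read asynchronously from the command buffer's completion handler (gpuStartTime/gpuEndTime require iOS 10.3 or later):

    // Sketch: measure GPU execution time without blocking the CPU.
    // Register the handler before committing the command buffer.
    commandBuffer.addCompletedHandler { buffer in
        let gpuTime = buffer.gpuEndTime - buffer.gpuStartTime
        print("GPU time: \(gpuTime)")
    }
    commandBuffer.commit()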
37. Results • Metal is SLOWER!? - Less than 30 fps even in the best case → my implementation probably has problems • UIImageView is fast enough anyway. Time to render an image: UIImageView 0.4-0.6 msec, Metal 40-200 msec
38. Why is UIImageView so fast?
39. UIKit internally uses Metal (from the WWDC17 Platforms State of the Union)
40. • UIKit has been updated and is well optimized. • You should use UIKit rather than building a custom UI component with low-level APIs (e.g. Metal), unless there is a particular reason to believe the custom one can do better.
41. Point 1: UIKit is well optimized with the GPU
42. Point 2: Also consider the GPU when measuring performance
43. Why was MetalImageView (my Metal wrapper) so slow? What was the problem?
44. Profile using Instruments (Metal System Trace)
45. [Trace] On CPU: create command buffers, submit command buffers. On GPU: process shaders.
46. [Trace: On CPU / On GPU lanes]
47. Problem 1
48. [Trace] Resize (MPSImageLanczosScale) and Render (MTLBlitCommandEncoder), with an unexpected interval between them inside the measured time.
49. Current processing flow (the problem) 1. Resize with MPSImageLanczosScale 2. After 1 completes, call setNeedsDisplay() 3. draw(in:) of MTKViewDelegate is called 4. Render to the screen in draw(in:) (see the sketch below)
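A sketch of what that problematic pattern can look like (my own reconstruction of the flow described above, not the author's exact source; the texture, view, and queue names are placeholders):

    import UIKit
    import MetalKit
    import MetalPerformanceShaders

    // Problematic pattern: resize and render reach the GPU as two separate
    // command buffers, with a CPU round trip (setNeedsDisplay) in between.
    func resizeThenDisplay(_ source: MTLTexture, into scaled: MTLTexture,
                           view: MTKView, device: MTLDevice, commandQueue: MTLCommandQueue) {
        // 1. Resize with MPSImageLanczosScale, in its own command buffer.
        guard let resizeBuffer = commandQueue.makeCommandBuffer() else { return }
        let scaler = MPSImageLanczosScale(device: device)
        scaler.encode(commandBuffer: resizeBuffer, sourceTexture: source, destinationTexture: scaled)

        resizeBuffer.addCompletedHandler { _ in
            // 2. Only after the resize has completed on the GPU...
            DispatchQueue.main.async {
                view.setNeedsDisplay()   // 3. ...ask the view to redraw, which calls draw(in:),
            }                            // 4. where a second command buffer renders to the screen.
        }
        resizeBuffer.commit()
    }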
50. The CPU is waiting for the GPU [Trace: On CPU / On GPU lanes]
51. [Diagram] The Resize pass and the Render pass each go through the full cycle on their own: 2. The CPU creates GPU commands as a command buffer → 3. Push it to the GPU → 4. The GPU processes the commands.
52. Fix: Combine the commands • Encode both commands - Resize - Render - into a single command buffer • Push that one command buffer to the GPU (see the sketch below)
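A hedged sketch of the described fix (my own reconstruction; commandQueue, device, sourceTexture, and scaledTexture are assumed to be properties, as in the earlier slides): the MPS resize and the blit are encoded into one command buffer and committed once.

    import MetalKit
    import MetalPerformanceShaders

    // Sketch: resize and render encoded into a single command buffer.
    func draw(in view: MTKView) {
        guard let drawable = view.currentDrawable,
              let commandBuffer = commandQueue.makeCommandBuffer() else { return }

        // Resize: the MPS kernel is encoded into this command buffer.
        let scaler = MPSImageLanczosScale(device: device)
        scaler.encode(commandBuffer: commandBuffer,
                      sourceTexture: sourceTexture,
                      destinationTexture: scaledTexture)

        // Render: blit the scaled texture into the drawable, still in the same buffer.
        guard let blitEncoder = commandBuffer.makeBlitCommandEncoder() else { return }
        let w = min(scaledTexture.width, drawable.texture.width)
        let h = min(scaledTexture.height, drawable.texture.height)
        blitEncoder.copy(from: scaledTexture,
                         sourceSlice: 0, sourceLevel: 0,
                         sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                         sourceSize: MTLSize(width: w, height: h, depth: scaledTexture.depth),
                         to: drawable.texture,
                         destinationSlice: 0, destinationLevel: 0,
                         destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
        blitEncoder.endEncoding()

        // One commit: the GPU runs resize and render back to back.
        commandBuffer.present(drawable)
        commandBuffer.commit()
    }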
53. [Diagram] Now Resize and Render share one cycle: 2. The CPU creates GPU commands as a command buffer → 3. Push it to the GPU → 4. The GPU processes the commands.
54. [Trace] Resize and Render
55. [Trace] Before: Resize and Render with an unexpected interval → Combine → After: Resize+Render back to back
56. Point 3: Pay attention to the processing flow between CPU and GPU
57. Problem 2
58. Speculation: loading textures is the bottleneck(?) (step 1: load image data into memory for the GPU (& CPU))
59. Measure the time to load textures:

    let startTime = CACurrentMediaTime()
    textureLoader.newTexture(name: name,
                             scaleFactor: scaleFactor,
                             bundle: nil) { (texture, error) in
        let endTime = CACurrentMediaTime()
        print("Time to load \(name): \(endTime - startTime)")
    }

• Results: 20-500 msec → it's the bottleneck!
60. Fix: Cache the loaded textures • UIImage(named:) caches internally, too • "Caching loaded image data" is NOT a Metal/GPU-specific idea
61. The Metal/GPU-specific point is where the resource lives: it should be in memory the GPU (& CPU) can access. OK: private var cachedTextures: [String: MTLTexture] = [:] NG: private var cachedImages: [String: UIImage] = [:] (see the sketch below)
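A minimal sketch of such a texture cache (my own illustration; the function and loader names are assumptions, only the cachedTextures declaration comes from the slide):

    import MetalKit

    // Cache loaded MTLTextures (GPU-accessible resources), keyed by asset name.
    private var cachedTextures: [String: MTLTexture] = [:]

    func texture(named name: String, loader: MTKTextureLoader, scaleFactor: CGFloat) throws -> MTLTexture {
        // Return the cached texture if it was already loaded.
        if let cached = cachedTextures[name] { return cached }
        // Otherwise pay the 20-500 ms loading cost once and keep the result.
        let texture = try loader.newTexture(name: name, scaleFactor: scaleFactor, bundle: nil)
        cachedTextures[name] = texture
        return texture
    }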
62. After adopting the cache
63. Point 4: Be careful where the resource is.
64. Wrap up
65. Today's Goal • Learn "how to use Metal" • Be conscious of the GPU layer through Metal
66. • Compared the graphics rendering performance of Metal to UIImageView → learned a lot about the GPU 1. UIKit is well optimized with the GPU. 2. Also consider the GPU when measuring performance. 3. Pay attention to the processing flow between CPU and GPU. 4. Be careful where the resource is.
67. Thank you! https://github.com/shu223
