SIGGRAPH Asia 2012: GPU-accelerated Path Rendering


Presented at SIGGRAPH Asia 2012 in Singapore on Friday, 30 November 14:15 - 16:00 during the "Points and Vectors" session.

Find the paper at or on Slideshare.

For thirty years, resolution-independent 2D standards (e.g. PostScript, SVG) have relied largely on CPU-based algorithms for the filling and stroking of paths. Learn about our approach to accelerate path rendering with our GPU-based "Stencil, then Cover" programming interface. We've built and productized our OpenGL-based system.

Published in: Technology
  • NV_path_rendering provides a new third pipeline—in addition to the vertex and pixel pipelines—for rendering pixels

    1. GPU-accelerated Path Rendering. Mark Kilgard & Jeff Bolz, NVIDIA Corporation. November 30, 2012
    2. GPUs are good at a lot of stuff
    3. Games: Battlefield 3, EA
    4. Data visualization
    5. Product design: Catia
    6. Physics simulation: CUDA N-Body [Nyland et al., GPU Gems 3, 2007]
    7. Interactive ray tracing: OptiX [Parker et al., SIGGRAPH 2010]
    8. Game physics: PhysX [Tonge et al., SIGGRAPH 2012]
    9. Molecular modeling: NCSA
    10. Impressive stuff
    11. What about advancing 2D graphics?
    12. Can GPUs render & improve the immersive web?
    13. Complete Web Pages Rendered via OpenGL without Pre-rendered Glyph Bitmaps and all on GPU
    14. No tricks. Every glyph is rendered from its outline; no render-to-texture. Not just zoomed & rotated, also perspective. Magnify & minify with no transitional pixelization or tile popping artifacts. Synced to refresh rate; 60 Hz updates.
    15. Live demo! Panels: web page; zoomed in; projected; control points of TrueType glyphs visualized.
    16. What is path rendering? A rendering approach: resolution-independent two-dimensional graphics; occlusion & transparency depend on rendering order (the so-called “Painter’s Algorithm”); the basic primitive is a path to be filled or stroked; a path is a sequence of path commands; commands are moveto, lineto, curveto, arcto, closepath, etc. Standards. Content: PostScript, PDF, TrueType fonts, Flash, Scalable Vector Graphics (SVG), HTML5 Canvas, Silverlight, Office drawings. APIs: Apple Quartz 2D, Khronos OpenVG, Microsoft Direct2D, Cairo, Skia, Qt::QPainter, Anti-grain Graphics.
    17. Path Rendering Standards. Document Printing and Exchange: Open XML Paper (XPS). Resolution-Independent Fonts: OpenType, TrueType. Immersive Web Experience: Flash, Scalable Vector Graphics, HTML 5. 2D Graphics Programming Interfaces: Java 2D API, QtGui API, Mac OS X 2D API, Khronos API. Office Productivity Applications: Adobe Illustrator, Inkscape (Open Source).
    18. Seminal Path Rendering Paper. John Warnock & Douglas Wyatt, Xerox PARC. Presented at SIGGRAPH 1982. Warnock founded Adobe months later. (Pictured: John Warnock, Adobe founder.)
    19. Reasons to GPU-accelerate Path Rendering. Increasing screen resolutions; increasing screen densities; multi-touch; power wall; immersive 2D web content. More functionality with less latency… …with less power.
    20. Live Demo. Classic PostScript content; complex text rendering; Flash content; New York Times rendered from its resolution-independent form.
    21. Live demo! Gradients with blending; dashed stroking; complex gradient content; maps with text; dragon, and zoomed dragon; 3D dice, but really 2D + gradients.
    22. Last Year’s SIGGRAPH Results in Real-time. “Digital Micrography” by Ron Maharik, Mikhail Bessmeltsev, Alla Sheffer, Ariel Shamir, and Nathan Carr, SIGGRAPH 2011. “Girl with Words in Her Hair” scene: 591 paths; 338,507 commands; 1,244,474 scalar coordinates.
    23. Our Contributions. A novel “stencil, then cover” programming interface for path rendering, well-suited to acceleration by GPUs (our NV_path_rendering API). Our programming interface’s efficient implementation within OpenGL to avoid CPU bottlenecks (productized, shipping in GeForce/Quadro drivers). Accompanying algorithms to handle: tessellation-free stenciled stroking of paths; standard stroking embellishments such as dashing; clipping paths to arbitrary paths; mixing 3D and path rendering.
    24. Notable Prior Art. Loop & Blinn 2005: Resolution independent curve rendering using programmable graphics hardware. Kokojima, et al. 2006: Resolution independent rendering of deformable vector objects using graphics hardware. Rueda, et al. 2008: GPU-based rendering of curved polygons using simplicial coverings.
    25. CPU vs. GPU at Rendering Tasks over Time. [Two charts, 1998 to 2012, showing the CPU and GPU shares (0% to 100%) of the work: one for pipelined 3D interactive rendering, one for path rendering.] Goal of our research is to make path rendering a GPU task. Render all interactive pixels, whether 3D or 2D or web content, with the GPU.
    26. Our Approach: “Stencil, then Cover” (StC). Map the path rendering task from a sequential algorithm… …to a pipelined and massively parallel task. Break path rendering into two steps. First, “stencil” the path’s coverage into the stencil buffer. Second, conservatively “cover” the path: test against the path coverage determined in the 1st step, shade the path, and reset the stencil value to render the next path. [Diagram: Step 1: Stencil; Step 2: Cover; repeat.]
    27. Our Implemented System: NV_path_rendering. OpenGL extension to GPU-accelerate path rendering. Uses “stencil, then cover” (StC) approach via OpenGL calls: create a path object; Step 1: “stencil” the path object into the stencil buffer (GPU provides fast stenciling of filled or stroked paths); Step 2: “cover” the path object and stencil test against its coverage stenciled by the prior step (application can configure arbitrary shading during the step; more details later). Supports the union of functionality of all major path rendering standards: includes all stroking embellishments; includes first-class text and font support; allows functionality to mix with traditional 3D and programmable shading.
    28. [Pipeline diagram. The application feeds three pipelines. Pixel pipeline: pixel assembly (unpack), pixel operations, pixel pack. Vertex pipeline: vertex assembly, vertex operations, primitive assembly, primitive operations, transform feedback. Path pipeline: path specification, transform path, fill/stroke stenciling, fill/stroke covering. Shared stages: rasterization, fragment operations, texture memory, raster operations, framebuffer, read back, display.]
    29. Stencil Fill Process Visualized. Visualization of “invisible” stencil-only geometry generated during stencil step. Net result of stencil increments and decrements is path’s winding number.
    30. Cover Fill Geometry Visualized
    31. Stroking Approach. Stroked line segments are straightforward: drawn as rectangles into the stencil buffer. Curved stroked segments are involved: curved segments are broken into stroked quadratic segments; hulls are formed around each stroked quadratic segment; an intricate fragment discard shader solves the cubic equation for every sample to determine the sample’s containment in the quadratic stroke segment; if contained, the sample’s stencil sample is updated. Caps & joins are also drawn into the stencil buffer. Covering geometry is computed as union of rectangles, hulls, and cap/join geometry.
    32. Quadratic Stroking Hulls Visualized. Simple quadratic Bezier segment, moving control points. Drawn with stroking. Non-convex hull used for the stroking stencil step is visualized.
    33. Intricate Path’s Stroking Example. Join style geometry; zoomed stroking; same zoom: stencil hull geometry.
    34. Excellent Geometric Fidelity for Stroking. Correct stroking is hard. Lots of CPU implementations approximate stroking. GPU-accelerated stroking avoids such short-cuts. GPU has FLOPS to compute true stroke point containment. [Comparison images of stroking with a tight end-point curve: GPU-accelerated, OpenVG reference, Cairo, Qt.]
    35. Combined for a Complex Scene With Many Paths. Panels: Stencil Fill Geometry; Cover Fill Geometry; Filling-only Result; Stencil Stroke Geometry; Cover Stroke Geometry; Stroking-only Result. Complete Tiger: 240 paths; 2,510 commands; 12,174 coordinates.
    36. NV_path_rendering Compared to Alternatives, With Release 300 driver. Configuration: GPU: GeForce 480 GTX (GF100); CPU: Core i7 950 @ 3.07 GHz. [Two charts of frames per second (0 to 2,000) versus window resolution in pixels (100x100 to 1100x1100). First chart: NV_path_rendering at 16x, 8x, 4x, 2x, and 1x. Second chart: alternative APIs rendering the same content: Cairo, Qt, Skia Bitmap, Skia Ganesh FBO (16x), Skia Ganesh Aliased (1x), Direct2D GPU, Direct2D WARP.] Alternative approaches are all much slower.
    37. Detail on Alternatives: same results, changed Y axis. Configuration: GPU: GeForce 480 GTX (GF100); CPU: Core i7 950 @ 3.07 GHz. [Two charts of frames per second versus window resolution in pixels (100x100 to 1100x1100) for the alternative APIs rendering the same content (Cairo, Qt, Skia Bitmap, Skia Ganesh FBO (16x), Skia Ganesh Aliased (1x), Direct2D GPU, Direct2D WARP): one with a Y axis of 0 to 250, one with a Y axis of 0 to 2,000. Chart annotation: “Fast, but unacceptable quality”.]
    38. Across a range of scenes… Release 300 GeForce GTX 480 Speedups over Alternatives. [Chart of the ratios NVpr16/Cairo, NVpr16/SkiaBitmap, NVpr16/SkiaGanesh, NVpr16/Direct2D GPU, and NVpr16/Direct2D WARP at window resolutions from 100x100 to 1100x1100 for the scenes tiger, Welsh_dragon, Celtic_round_dogs, butterfly, spikes, American_Samoa, cowboy, Embrace_the_World, Buonaparte, Yokozawa, Cougar, tiger_clipped_by_he.] The axis is logarithmic (0.10 to 1000.00) and shows how many TIMES faster NV_path_rendering is than that competitor.
    39. Partial Solutions Not Enough. Path rendering has 30 years of heritage and history. Can’t do a 90% solution and expect software to change. Trying to “mix” CPU and GPU methods doesn’t work. Expensive to move software; needs to be an unambiguous win. Must surpass CPU approaches on all fronts: performance; quality; functionality; conformance to standards; more power efficient; enable new applications. Inspiration: Perceptive Pixel. (Pictured: John Warnock, Adobe founder.)
    40. Dashing Content Examples. Artist made windows with dashed line segment. Same cake missing dashed stroking details. Technical diagrams and charts often employ dashing. Frosting on cake is dashed elliptical arcs with round end caps for “beaded” look; flowers are also dashing. Dashing character outlines for quilted look. All content shown is fully GPU rendered.
    41. First-class, Resolution-independent Font Support. Fonts are a standard, first-class part of all path rendering systems. Foreign to 3D graphics systems such as OpenGL and Direct3D, but natural for path rendering, because letter forms in fonts have outlines defined with paths: TrueType, PostScript, and OpenType fonts all use outlines to specify glyphs. NV_path_rendering makes font support easy: can specify a range of path objects with a specified font and a sequence or range of Unicode character points. No requirement for applications to use a font API to load glyphs. You can also load glyphs “manually” from your own glyph outlines.
    42. Rendering Paths Clipped to Some Other Arbitrary Path. Example: clipping the PostScript tiger to a heart constructed from two cubic Bezier curves. Panels: unclipped tiger; tiger with pink background clipped to heart.
    43. Complex Clipping Example. The tiger is 240 paths; the cowboy clip is the union of 1,366 paths. Result of clipping the tiger to the union of all the cowboy paths.
    44. NV_path_rendering is more than just matching CPU vector graphics. 3D and vector graphics mix. Superior quality (comparison images: GPU vs. CPU competitors). 2D in perspective is free. Arbitrary programmable shader on paths: bump mapping.
    45. Mixing 3D Depth Buffering and Path Rendering. PostScript tigers surrounding Utah teapot, plus overlaid TrueType font rendering. No textures involved, no multi-pass.
    46. Live demo! Solid or wireframe teapots. Very fast. Teapots + tigers in same 3D scene. Zoom on tigers: all the detail is there.
    47. Handling Uncommon Path Rendering Functionality: Projection. Projection “just works” because GPU does everything with perspective-correct interpolation.
    48. Example of Bump Mapping on Path Rendered Text. Phrase “Brick wall!” is path rendered and bump mapped with a Cg fragment shader. [Image label: light source position.]
    49. Handling Common Path Rendering Functionality: Filtering. GPUs are highly efficient at image filtering: fast texture mapping; mipmapping; anisotropic filtering; wrap modes. CPUs aren’t really. [Comparison images: Qt, Cairo, GPU; label: Moiré artifacts.]
    50. Anti-aliasing Discussion. Good anti-aliasing is a big deal for path rendering, particularly for font rendering at small point sizes, where features of glyphs are often on the scale of a pixel or less. NV_path_rendering uses multiple stencil samples per pixel for reasonable antialiasing; otherwise, image quality is poor. 4 samples/pixel is the bare minimum; 8 or 16 samples/pixel is pretty sufficient, but 16 requires expensive 2x2 supersampling of 4x multisampling, and 16x is quite memory intensive. Alternative: quality vs. performance tradeoff. Fast enough to render multiple passes to improve quality. Approaches: accumulation buffer; alpha accumulation.
    51. Real Flash Scene. Panels: conflation artifacts abound, rendered by Skia; same scene, GPU-rendered without conflation. Conflation is aliasing, & edge coverage percents are unpredictable in general; this means conflated pixels flicker when animated slowly.
    52. Improved Color Space: sRGB Path Rendering. Modern GPUs have native support for perceptually-correct sRGB framebuffer blending and sRGB texture filtering. No reason to tolerate uncorrected linear RGB color artifacts! More intuitive for artists to control. Negligible expense for GPU to perform sRGB-correct rendering. However, quite expensive for software path renderers to perform sRGB rendering; not done in practice. Radial color gradient example moving from saturated red to blue: the linear RGB transition between saturated red and saturated blue has a dark purple region; sRGB gives a perceptually smooth transition from saturated red to saturated blue.
    53. Trying Out NV_path_rendering. Operating system support: Windows (2000, XP, Vista, Windows 7), Linux, FreeBSD, and Solaris; unfortunately no Mac support. GPU support: GeForce 8 and up (easy rule: all CUDA-capable GPUs); most efficient on Fermi and Kepler GPUs; current performance can be expected to improve. Shipping since NVIDIA’s Release 275 drivers; available since summer 2011. New Release 300+ drivers have remarkable NV_path_rendering performance. Try it, you’ll like it. There’s an SDK freely available with example code!
    54. Future Work. Using NV_path_rendering in actual web and 2D applications. Standardizing the programming interface. Moving these algorithms to mobile devices: path rendering test bed on Nexus 7.
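
Slides 26 and 29 describe filling as two passes: stencil increments and decrements accumulate the path's winding number, then a conservative cover pass shades wherever the stencil is set and resets it. The following is a minimal CPU sketch of that idea in Python, not NVIDIA's implementation. The names `edge_crossing`, `stencil_fill`, and `cover_fill`, the one-sample-per-pixel grid, the signed integer stencil, and the non-zero fill rule are all assumptions made for illustration. It handles polygonal contours only and uses a triangle fan per contour as the "invisible" stencil-only geometry.

```python
import math


def edge_crossing(p, a, b):
    """Signed crossing of edge a->b by a horizontal ray from p (half-open in y)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if a[1] <= p[1] < b[1] and cross > 0:   # upward edge, p on its left
        return 1
    if b[1] <= p[1] < a[1] and cross < 0:   # downward edge, p on its right
        return -1
    return 0


def stencil_fill(stencil, contours):
    """Step 1 (assumed form): add each fan triangle's +1/-1/0 winding to the stencil."""
    for contour in contours:
        anchor = contour[0]
        for i in range(1, len(contour) - 1):
            tri = (anchor, contour[i], contour[i + 1])
            for y, row in enumerate(stencil):
                for x in range(len(row)):
                    p = (x + 0.5, y + 0.5)  # one sample at the pixel centre
                    row[x] += sum(
                        edge_crossing(p, tri[k], tri[(k + 1) % 3]) for k in range(3)
                    )


def cover_fill(stencil, color, contours, paint):
    """Step 2 (assumed form): walk a bounding box, shade where stencil != 0, reset it."""
    xs = [v[0] for c in contours for v in c]
    ys = [v[1] for c in contours for v in c]
    x0, x1 = max(0, math.floor(min(xs))), min(len(stencil[0]), math.ceil(max(xs)))
    y0, y1 = max(0, math.floor(min(ys))), min(len(stencil), math.ceil(max(ys)))
    for y in range(y0, y1):
        for x in range(x0, x1):
            if stencil[y][x] != 0:      # stencil test against the step-1 coverage
                color[y][x] = paint     # "shade" the sample
                stencil[y][x] = 0       # reset so the next path starts clean


SIZE = 8
stencil = [[0] * SIZE for _ in range(SIZE)]
color = [["."] * SIZE for _ in range(SIZE)]
outer = [(1, 1), (7, 1), (7, 7), (1, 7)]
hole = [(3, 3), (3, 5), (5, 5), (5, 3)]   # opposite orientation cuts a hole

stencil_fill(stencil, [outer, hole])
winding = [row[:] for row in stencil]      # keep a copy to inspect winding numbers
cover_fill(stencil, color, [outer, hole], "#")
print("\n".join("".join(row) for row in color))
```

The fan triangles overlap and extend outside concave shapes, but their signed contributions cancel everywhere except where the winding number is non-zero, which is why no tessellation of the path is needed. A hardware stencil buffer is typically a small unsigned integer with wrapping increment and decrement; the signed Python integers here stand in for that.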
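
Slide 31 says a fragment shader solves a cubic equation per sample to decide whether the sample lies inside a stroked quadratic segment. One way such a test can be formulated is sketched below in Python. This is an illustrative reconstruction, not the shader from the talk: the names `real_cubic_roots` and `quadratic_stroke_contains` and the tolerances are assumptions. For a quadratic Bezier B(t) and sample p, the feet of perpendiculars from p onto the curve satisfy (B(t) - p) · B'(t) = 0, which is a cubic in t. The sample is in the stroke body (caps and joins excluded) if some root in [0, 1] lies within half the stroke width of p.

```python
import math


def real_cubic_roots(a, b, c, d):
    """Real roots of a*t^3 + b*t^2 + c*t + d = 0 (falls back to lower degree)."""
    if abs(a) < 1e-12:
        if abs(b) < 1e-12:
            return [] if abs(c) < 1e-12 else [-d / c]
        disc = c * c - 4 * b * d
        if disc < 0:
            return []
        s = math.sqrt(disc)
        return [(-c - s) / (2 * b), (-c + s) / (2 * b)]
    b, c, d = b / a, c / a, d / a
    shift = b / 3                      # t = u - shift gives u^3 + p*u + q = 0
    p = c - b * b / 3
    q = 2 * b ** 3 / 27 - b * c / 3 + d
    disc = (q / 2) ** 2 + (p / 3) ** 3
    if disc > 0:                       # one real root (Cardano)
        s = math.sqrt(disc)
        cbrt = lambda v: math.copysign(abs(v) ** (1 / 3), v)
        return [cbrt(-q / 2 + s) + cbrt(-q / 2 - s) - shift]
    if p == 0:                         # triple root
        return [-shift]
    r = 2 * math.sqrt(-p / 3)          # three real roots (trigonometric form)
    phi = math.acos(max(-1.0, min(1.0, 3 * q / (p * r))))
    return [r * math.cos((phi - 2 * math.pi * k) / 3) - shift for k in range(3)]


def quadratic_stroke_contains(p, p0, p1, p2, width):
    """True if p is within width/2 of the curve along a normal at some t in [0, 1]."""
    # Power-basis form: B(t) = A*t^2 + Bq*t + p0
    ax, ay = p0[0] - 2 * p1[0] + p2[0], p0[1] - 2 * p1[1] + p2[1]
    bx, by = 2 * (p1[0] - p0[0]), 2 * (p1[1] - p0[1])
    dx, dy = p0[0] - p[0], p0[1] - p[1]
    roots = real_cubic_roots(
        2 * (ax * ax + ay * ay),
        3 * (ax * bx + ay * by),
        bx * bx + by * by + 2 * (ax * dx + ay * dy),
        bx * dx + by * dy,
    )
    for t in roots:
        if -1e-9 <= t <= 1 + 1e-9:
            qx = (ax * t + bx) * t + dx    # B(t) - p, x component
            qy = (ay * t + by) * t + dy    # B(t) - p, y component
            if math.hypot(qx, qy) <= width / 2:
                return True
    return False


P0, P1, P2 = (0.0, 0.0), (2.0, 4.0), (4.0, 0.0)   # arch peaking at (2, 2)
print(quadratic_stroke_contains((2.0, 2.4), P0, P1, P2, 1.0))
print(quadratic_stroke_contains((2.0, 3.0), P0, P1, P2, 1.0))
```

Checking every root rather than only the nearest one matters: the stroke is the region swept by the normal segment, so any perpendicular foot within half the width places the sample inside. Samples near the ends of the segment that have no root in [0, 1] are left to the cap and join geometry the slide mentions.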
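
Slide 52 contrasts a red-to-blue gradient that passes through dark purple with an sRGB-correct one. The Python sketch below illustrates the arithmetic behind that contrast under one reading of the slide: the "uncorrected" result is taken to mean interpolating the stored sRGB-encoded values directly, and the corrected result decodes to linear light, interpolates, and re-encodes. The function names are hypothetical; the transfer functions are the standard sRGB piecewise formulas.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded channel in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode a linear-light channel in [0, 1] to sRGB."""
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def to_8bit(c):
    """Quantize a channel in [0, 1] to 0..255, rounding half up."""
    return int(c * 255 + 0.5)


def mix_encoded(a, b, t):
    """Uncorrected: interpolate the encoded values as if they were linear."""
    return tuple(to_8bit(x + (y - x) * t) for x, y in zip(a, b))


def mix_linear_light(a, b, t):
    """sRGB-correct: decode, interpolate in linear light, then encode."""
    return tuple(
        to_8bit(linear_to_srgb(srgb_to_linear(x) + (srgb_to_linear(y) - srgb_to_linear(x)) * t))
        for x, y in zip(a, b)
    )


RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
naive_mid = mix_encoded(RED, BLUE, 0.5)
correct_mid = mix_linear_light(RED, BLUE, 0.5)
print("uncorrected midpoint:", naive_mid)
print("sRGB-correct midpoint:", correct_mid)
```

The uncorrected midpoint has noticeably lower red and blue values than the corrected one, which matches the dark purple band the slide describes. On a GPU with sRGB framebuffer and texture support, the decode and encode steps are done by fixed-function hardware, which is the basis for the slide's claim that the cost is negligible.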