Bringing GPU to the Web (HTML5 Dev Conference) Oct13

This presentation explores three open standards that bring the power of the GPU to the Web, with cutting-edge examples of each:
- WebGL is a significant advance in the evolution of 3D on the Web, enabling foundational, GPU-accelerated 3D to be delivered by the browser without the need for a plug-in;
- WebCL is a direct JavaScript binding to the OpenCL standard framework for heterogeneous parallel computation in web applications;
- NVIDIA has spearheaded research and development into innovative OpenGL functionality that enables full GPU acceleration of vector-based APIs such as SVG.


    Presentation Transcript

    • Harnessing the Power of the GPU in Web Applications. Neil Trevett: Khronos President; NVIDIA Vice President, Mobile Content © Copyright Khronos Group 2013 - Page 1
    • GPUs are everywhere the Web goes. Making full use of GPUs is essential for any modern computing platform. But traditionally the Web has not made effective use of GPUs. That is changing… © Copyright Khronos Group 2013 - Page 2
    • Mobile is the New Epicenter of Innovation © 2013 NVIDIA - Page 3
    • Mobile Web is a Real Time Application. Apple iPhone (320x480, 153K pixels, 163 DPI) + Apple iPad (1024x768, 786K pixels, 132 DPI) = Apple iPad Mini (2048x1536, 3100K pixels, 326 DPI). Buttery smooth touch interaction needs continuous 60Hz updates. In 5 years the number of pixels to process on mobile screens has gone up by a factor of TWENTY. Need GPU acceleration for everything Web! © 2013 NVIDIA - Page 4
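The "factor of twenty" on this slide can be checked from the quoted resolutions. The following is a minimal Python sketch, not part of the original deck; the function names are illustrative, and the per-second figure assumes every pixel is redrawn on each 60Hz update.

```python
# Screen resolutions quoted on the slide: (width, height) in pixels.
SCREENS = {
    "iPhone": (320, 480),
    "iPad": (1024, 768),
    "iPad Mini": (2048, 1536),
}

def pixel_count(width, height):
    """Total number of pixels on a screen."""
    return width * height

def pixels_per_second(width, height, refresh_hz=60):
    """Pixels to produce per second, assuming a full redraw on every update."""
    return pixel_count(width, height) * refresh_hz

# Growth from the smallest to the largest screen on the slide.
growth = pixel_count(*SCREENS["iPad Mini"]) / pixel_count(*SCREENS["iPhone"])

for name, (w, h) in SCREENS.items():
    print(f"{name}: {pixel_count(w, h)} pixels, {pixels_per_second(w, h)} pixels/s at 60Hz")
print(f"growth factor: {growth:.2f}")
```

The ratio works out to 3,145,728 / 153,600 = 20.48, which matches the slide's rounded claim.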
    • Mobile SOC Performance Increases. [Chart: CPU/GPU aggregate performance on a logarithmic scale (1, 10, 100) against device shipping dates, 2011 to 2015.] Tegra 2: dual A9. Tegra 3: quad A9, power saver 5th core (HTC One X+, Google Nexus 7). Tegra 4: quad A15 (NVIDIA Shield). Logan: full Kepler GPU, CUDA 5.5, OpenGL 4.4. Parker: Denver 64-bit CPU, Maxwell GPU. 100x perf increase in four years © 2013 NVIDIA - Page 5
    • NVIDIA Logan Mobile SOC Kepler GPU Architecture now on PC and Mobile. Can run essentially the same code – scaled for different power constraints © 2013 NVIDIA - Page 6
    • How are GPUs Accessible to the Web? Hardware composition: within the browser stack, under the hood. Vector acceleration for SVG: using NVIDIA OpenGL extensions. 3D developer functionality: OpenGL ES functionality through JavaScript. Compute acceleration: offloading compute-intensive code to the GPU. Compression and streaming of 3D assets: for network transmission. Camera, vision and sensor processing: future JavaScript bindings to native APIs? © 2013 NVIDIA - Page 7
    • Khronos Connects Software to Silicon ROYALTY-FREE, OPEN STANDARD APIs for advanced hardware acceleration Low level silicon to software interfaces needed on every platform Graphics, video, audio, compute, vision, sensor and camera processing Defines the forward looking roadmap for the silicon community Shipping on billions of devices across multiple operating systems Rigorous conformance tests for cross-vendor consistency Khronos is OPEN for any company to join and participate Acceleration APIs BY the Industry FOR the Industry © Copyright Khronos Group 2013 - Page 8
    • Khronos Standards and AR. Over 100 companies defining royalty-free APIs to connect software to silicon. 3D Asset Authoring: advanced authoring pipelines; glTF 3D asset transmission format with streaming and compression. Native Visual Computing: gaming and professional apps; advanced scene construction. Acceleration in the Browser: WebGL for 3D in browsers; WebCL, heterogeneous computing for the web. Sensor Processing: Camera Control API; mobile vision acceleration; on-device sensor fusion © Copyright Khronos Group 2013 - Page 9
    • Mobile OS Adoption of Khronos APIs. Android: OpenGL ES 2.0 shipping in Android 2.2; OpenSL ES 1.0 (subset) shipping in Android 2.3; OpenMAX AL 1.0 (subset) shipping in Android 4.0; EGL 1.4 shipping under SDK -> NDK; WebGL in Opera and Firefox now, Chrome soon. Apple: OpenGL 3.2 on MacOS; OpenCL 1.2 on MacOS; OpenGL ES 3.0 on iOS; WebGL can be enabled in MacOS Safari; iOS5 enables WebGL for iAds © Copyright Khronos Group 2013 - Page 10
    • WebGL – 3D on the Web – No Plug-in! • Leveraging HTML 5 and <canvas> element - WebGL defines JavaScript binding to OpenGL ES 2.0 - Enables a 3D context for the canvas • Low-level foundational Web API for accessing the GPU - Flexibility and direct GPU access - Enables higher-level frameworks and middleware Availability of OpenGL and OpenGL ES on almost every web-capable device JavaScript binding to OpenGL ES 2.0 Increasing JavaScript performance. HTML 5 Canvas Tag © Copyright Khronos Group 2013 - Page 11
    • WebGL Implementation Anatomy. Content (JavaScript, HTML, CSS, ...): content downloaded from the Web. JavaScript middleware: middleware can make WebGL accessible to non-expert 3D programmers; much WebGL content uses the three.js library: http://threejs.org/ Browser (HTML5, JavaScript, CSS): provides WebGL functionality alongside other HTML5 technologies, no plug-in required. OS-provided drivers (OpenGL ES 2.0, OpenGL, DX9/ANGLE): WebGL on Windows can use Direct3D; for example, the ANGLE open source project implements OpenGL ES 2.0 over DX9 © Copyright Khronos Group 2013 - Page 12
    • WebGL Availability in Browsers - Microsoft – “where you have IE11, you have WebGL – turned on by default and working all the time” - Microsoft - WebGL also enabled for Windows applications - web app framework and web view - Chrome on Android now shipping with WebGL - Chrome OS - WebGL is the only cross-platform API to program the GPU - Apple - WebGL is present, but must be explicitly turned on in Mac Safari and is only exposed on iOS for iAds © Copyright Khronos Group 2013 - Page 13
    • Microsoft PhotoSynth2 • Demonstrated at Build 2013 http://channel9.msdn.com/Events/Build/2013/4-072 1:50 © Copyright Khronos Group 2013 - Page 14
    • WebGL on Logan Android Tablet © Copyright Khronos Group 2013 - Page 15
    • WebGL on Logan Android Tablet © Copyright Khronos Group 2013 - Page 16
    • Cross-OS Portability. Preferred development environments are not designed for portability: SDK with Dalvik (Java), Objective C, C# and DirectX. Native code (C/C++) is portable, but apps must cope with different available APIs and libraries. HTML5 (HTML/CSS) provides cross-platform portability. GPU accessibility through WebGL available soon on ~90% of mobile systems © Copyright Khronos Group 2013 - Page 17
    • OpenGL 3D API Family Tree. [Timeline figure, 2002 to 2013.] Desktop 3D: OpenGL 1.3, 1.5, 2.0, 2.1, 3.0, 3.1, 3.2, 3.3, 4.0, 4.1, 4.2, 4.3, 4.4, GL-Next. Mobile 3D: OpenGL ES 1.0, OpenGL ES 1.1 (fixed function 3D pipeline), OpenGL ES 2.0 (programmable vertex and fragment shaders), OpenGL ES 3.0, ES-Next. Web: WebGL 1.0 (OpenGL ES 2.0 content), WebGL 2.0 (OpenGL ES 3.0 content). WebGL 2.0 is in development now and will bring OpenGL ES 3.0 functionality to the Web. ES3 is backward compatible so new features can be added incrementally. OpenGL 4.4 is a superset of DX11. http://www.khronos.org/webgl/public-mailing-list/ http://www.khronos.org/registry/webgl/specs/latest/ http://www.khronos.org/webgl/wiki/Testing/Conformance © Copyright Khronos Group 2013 - Page 18
    • OpenGL ES 3.0 Highlights • Better looking, faster performing games and apps – at lower power - Incorporates proven features from OpenGL 3.3 / 4.x - 32-bit integers and floats in shader programs - NPOT, 3D textures, depth textures, texture arrays - Multiple Render Targets for deferred rendering, Occlusion Queries - Instanced Rendering, Transform Feedback … • Make life better for the programmer - Tighter requirements for supported features to reduce implementation variability • Backward compatible with OpenGL ES 2.0 - OpenGL ES 2.0 apps continue to run unmodified • Standardized Texture Compression - #1 developer request! © Copyright Khronos Group 2013 - Page 19
    • Texture Compression is Key • Texture compression saves precious resources - Network bandwidth, device memory space AND device memory bandwidth • Developers need the same texture compression EVERYWHERE - Otherwise portable apps, such as WebGL, need multiple copies of the same texture. [Figure: quality against pervasive deployment, 2008-2010, 2012-2013, 2014->.] DXTC/S3TC (Windows) and PVRTC (iOS): NOT royalty-free; platform fragmentation. ETC1: mandated in Android Froyo (400M devices); royalty-free BUT only optional in ES; only 4bpp | 3 channel; no alpha support. ETC2 / EAC: MANDATED in OpenGL ES 3.0 and OpenGL 4.3; royalty-free; backward compatible with ETC1; ETC2: 4bpp | 3 channel; EAC: 4 (8) bpp | 1 (2) channel; COMBINED: RGBA 8bpp | 4 channel; does not have 1-2 bit compression WITH ALPHA. ASTC: OpenGL ES 3.0 and OpenGL 4.3 extensions -> core once proven; royalty-free; best quality; independent control of bit-rate and # channels; 1 to 4 channel; 1-8bpp in fine steps © Copyright Khronos Group 2013 - Page 20
    • ASTC – Universal Texture Standard • Adaptive Scalable Texture Compression (ASTC) - Quality significantly exceeds S3TC or PVRTC at same bit rate • Industry-leading orthogonal compression rate and format flexibility - 1 to 4 color components: R / RG / RGB / RGBA - Choice of bit rate: from 8bpp to <1bpp in fine steps • ASTC is royalty-free and so is available to be universally adopted - Shipping as OpenGL/OpenGL ES extension today for industry feedback Original 24bpp 8bpp ASTC Compression 3.56bpp 2bpp © Copyright Khronos Group 2013 - Page 21
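The bit rates on the ASTC slide (8bpp, 3.56bpp, 2bpp) follow from ASTC's fixed 128-bit block applied to different block footprints. The Python sketch below works through that arithmetic. The 1024x1024 texture size and the helper names are assumptions chosen for illustration, not taken from the slides.

```python
import math

# Every ASTC block occupies 128 bits regardless of its pixel footprint.
ASTC_BLOCK_BITS = 128

def astc_bits_per_pixel(block_w, block_h):
    """Bit rate for a given 2D ASTC block footprint."""
    return ASTC_BLOCK_BITS / (block_w * block_h)

def astc_size_bytes(width, height, block_w, block_h):
    """Compressed size; partial blocks at the edges still cost a full block."""
    blocks = math.ceil(width / block_w) * math.ceil(height / block_h)
    return blocks * ASTC_BLOCK_BITS // 8

def uncompressed_size_bytes(width, height, bits_per_pixel=24):
    """Size of the raw texture, 24bpp RGB by default."""
    return width * height * bits_per_pixel // 8

# Hypothetical 1024x1024 texture, for illustration only.
W, H = 1024, 1024
print("raw 24bpp:", uncompressed_size_bytes(W, H), "bytes")
for bw, bh in [(4, 4), (6, 6), (8, 8)]:
    print(f"ASTC {bw}x{bh}: {astc_bits_per_pixel(bw, bh):.2f} bpp,",
          astc_size_bytes(W, H, bw, bh), "bytes")
```

The same texture therefore shrinks from 3 MiB raw to 1 MiB at 8bpp and 256 KiB at 2bpp, which is the saving in network bandwidth and device memory that the previous slide argues for.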
    • Why Khronos for WebGL? • Hardware API standards must take into account silicon design cycles - Multi-year pipeline of APIs that affect chips that take $100Ms to execute - Deep insights into silicon and driver architectures - Rigorous conformance tests and infrastructure • Khronos is committed to being a good citizen in the larger Web community - Opened Khronos WebGL processes to enable cooperation with web community • Khronos is the industry forum to drive hardware consensus and cooperation - Help create foundational support for higher-level Web standards that access hardware capabilities © Copyright Khronos Group 2013 - Page 22
    • OpenCL – Portable Heterogeneous Computing • Native framework for programming diverse parallel computing resources - CPU, GPU, DSP etc. • OpenCL C kernel language - Very close to C99 • APIs to discover compute resources and distribute kernels - Across all available compute resources. [Diagram: OpenCL kernel code dispatched to CPUs, GPU, DSP and hardware.] One code tree can be executed on CPUs, GPUs, DSPs and hardware. Dynamically interrogate system load and load balance work across available processors © Copyright Khronos Group 2013 - Page 23
    • OpenCL as Parallel Compute Foundation. Languages and frameworks layered on OpenCL: C++ AMP (Shevlin Park: uses Clang and LLVM); OpenCL HLM (C++ syntax/compiler extensions); WebCL (JavaScript binding to OpenCL for initiation of OpenCL C kernels); Aparapi (Java language extensions for parallelism); River Trail (language extensions to JavaScript); PyOpenCL (Python wrapper around OpenCL); Harlan (high level language for GPU programming); compiler directives for Fortran, C and C++. OpenCL provides vendor optimized, cross-platform, cross-vendor access to heterogeneous compute resources © Copyright Khronos Group 2013 - Page 24
    • WebCL – Parallel Computing for the Web • JavaScript bindings to OpenCL APIs - Enables initiation of Kernels written in OpenCL C within the browser http://www.youtube.com/user/SamsungSISA#p/a/u/1/9Ttux1A-Nuc © Copyright Khronos Group 2013 - Page 25
    • Leveraging Proven Native APIs into HTML5 • Khronos and W3C exploring liaison - Leverage proven native API investments into the Web - Fast API development and deployment - Designed by the hardware community - Familiar foundation reduces developer learning curve. [Diagram pairing JavaScript and native APIs: HTML Canvas with Path Rendering; WebVX? with Vision Processing; WebCAM(!) with Camera Control (camera control and video processing); WebStream? with Sensor Fusion. Legend: native APIs shipping or Khronos working group; JavaScript API shipping, acceleration being developed or work underway; possible future JavaScript APIs or acceleration.] © Copyright Khronos Group 2013 - Page 26
    • StreamInput - Sensor Fusion • Defines access to high-quality fused sensor stream and context changes - Implementers can optimize and innovate generation of the sensor stream. [Stack diagram: applications; OS sensor APIs (e.g. Android SensorManager or iOS CoreMotion) and middleware (e.g. augmented reality engines, gaming engines); StreamInput; sensors and sensor hubs.] Platforms can provide increased access to improved sensor data stream, driving faster, deeper sensor usage by applications. Middleware engines need platform-portable access to native, low-level sensor data stream. Low-level native API defines access to fused sensor data stream and context-awareness. Mobile or embedded platforms without sensor fusion APIs can provide direct application access to StreamInput. StreamInput implementations compete on sensor stream quality, reduced power consumption, environment triggering and context detection, enabling sensor subsystem vendors to increase ADDED VALUE © Copyright Khronos Group 2013 - Page 27
    • OpenVX – Power Efficient Vision Acceleration • Complementary to OpenCV open source project - Which is great for prototyping • OpenVX is tightly specified API with conformance - Portable, production-grade vision functions • OpenVX enables graph of vision processing - Each Node in graph can be implemented in software or accelerated hardware • Nodes may be fused and optimized - e.g. implementation may stripe execution over image sections in cache. [Diagram: application, OpenCV open source library and other higher-level CV libraries sit above a graph of OpenVX nodes, which is provided by the open source sample implementation or hardware vendor implementations.] © Copyright Khronos Group 2013 - Page 28
    • Typical Imaging Pipeline • Processing pre- and post-ISP can be done on CPU, GPU, DSP - E.g. using OpenCL or OpenVX • But applications have often had limited control over the actual camera and ISP - ISP controls camera via 3A algorithms - Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF). [Pipeline diagram: lens -> CMOS sensor with color filter array -> Bayer data -> pre-processing -> Image Signal Processor (ISP) -> RGB/YUV -> post-processing -> app, with 3A driving lens, sensor and aperture control.] Need for advanced camera control API: - to provide more flexible app camera control - over more types of camera sensors - with tighter integration with the rest of the system © Copyright Khronos Group 2013 - Page 29
    • Khronos APIs for Augmented Reality. Sensor Fusion: MEMS sensors; precision timestamps on all sensor samples. Camera Control API: advanced camera control and stream generation. Vision Processing. Application on CPUs, GPUs and DSPs. Audio Rendering. 3D Rendering and Video Composition on GPU. EGLStream: stream data between APIs. AR needs not just advanced sensor processing, vision acceleration, computation and rendering, but also for all these subsystems to work efficiently together. W3C Augmented Web Community Group is discussing many of these issues for the Web, e.g. leveraging WebRTC in the short term: http://w3.org/community/ar © Copyright Khronos Group 2013 - Page 30
    • 3D Needs a Transmission Format! • Compression and streaming of 3D assets becoming essential - Mobile and connected devices need access to increasingly large asset databases • 3D is the last media type to define a compressed format - 3D is more complex – diverse asset types and use cases • Needs to be royalty-free - Avoid an ‘internet video codec war’ scenario • Eventually enable hardware implementations of successful codecs - High-performance and low power – but pragmatic adoption strategy is key. Audio: MP3. Video: H.264. Images: JPEG. 3D: ? An effective and widely adopted codec ignites previously unimagined opportunities for a media type © Copyright Khronos Group 2013 - Page 31
    • glTF – OpenGL Transmission Format • Binary file format for efficient transmission of 3D assets - Reduce network bandwidth and minimize client processing overhead • Run-time neutral - DO NOT IMPLY OR MANDATE ANY RUN-TIME BEHAVIOR - Can be used by any app or run-time – usually WebGL accelerated • Scalable to handle compression and streaming - Though baseline format does not include compression • ‘Direct load efficiency’ for WebGL - Little or NO processing to drop glTF data into WebGL client • Carry conditioned data from any authoring format - Prototyping and optimizing efficient handling of COLLADA assets. From authoring to playback: a standards-based content pipeline for rich native and Web 3D applications © Copyright Khronos Group 2013 - Page 32
    • COLLADA and glTF Open Source Ecosystem. On GitHub: OpenCOLLADA importer/exporter (https://github.com/KhronosGroup/OpenCOLLADA) and COLLADA conformance tests (https://github.com/KhronosGroup/COLLADA-CTS) for tool interop; COLLADA2GLTF translator (https://github.com/KhronosGroup/glTF), alongside other authoring formats. Web-based tools; pervasive WebGL deployment; Three.js glTF importer; Rest3D initiative © Copyright Khronos Group 2013 - Page 33
    • WebGL as Test-bed for 3D Asset Compression • Integrating and benchmarking 3D geometry compression formats with glTF - Baseline is GZIP • Scalable Complexity 3D Mesh Compression codec MPEG-SC3DMC - Royalty-free graphics compression technology from MPEG (MIT License) - Open3DGC is an efficient JavaScript and C/C++ implementation - Converter using Open3DGC to compress 3D meshes, skinning and animations - Available at https://github.com/fabrobinet/glTF-webgl-viewer • WebGL-loader is Google's lightweight compression for WebGL content © Copyright Khronos Group 2013 - Page 34
    • Compression Efficiency – Early Results
      Format | CAD Models (Mbytes) | 3D Scanned Models (Mbytes) | MPEG dataset (Mbytes)
      OBJ | 1310 (100%) | 736 (100%) | 600 (100%)
      Gzip | 336 (26%) | 204 (28%) | 157 (26%)
      Webgl-loader | 219 (17%) | 117 (16%) | 103 (17%)
      Open3DGC | 67 (5%) | 22 (3%) | 22 (4%)
      Webgl-loader + Gzip | 80 (6%) | 38 (5%) | 26 (4%)
      Open3DGC is 5x-9x more efficient than Gzip and 1.2x-1.5x more efficient than webgl-loader © Copyright Khronos Group 2013 - Page 35
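The efficiency multiples quoted on this slide can be recomputed from the table's sizes. This Python sketch assumes the five size values in each row map to the formats in the order the transcript lists them (OBJ, Gzip, Webgl-loader, Open3DGC, Webgl-loader + Gzip). The helper name `ratio` is illustrative.

```python
# Sizes in Mbytes as quoted on the slide, keyed by dataset then format.
SIZES_MB = {
    "CAD Models": {"OBJ": 1310, "Gzip": 336, "Webgl-loader": 219,
                   "Open3DGC": 67, "Webgl-loader + Gzip": 80},
    "3D Scanned Models": {"OBJ": 736, "Gzip": 204, "Webgl-loader": 117,
                          "Open3DGC": 22, "Webgl-loader + Gzip": 38},
    "MPEG dataset": {"OBJ": 600, "Gzip": 157, "Webgl-loader": 103,
                     "Open3DGC": 22, "Webgl-loader + Gzip": 26},
}

def ratio(dataset, baseline, codec):
    """How many times smaller `codec` output is than `baseline` output."""
    row = SIZES_MB[dataset]
    return row[baseline] / row[codec]

for dataset in SIZES_MB:
    vs_gzip = ratio(dataset, "Gzip", "Open3DGC")
    vs_loader = ratio(dataset, "Webgl-loader + Gzip", "Open3DGC")
    print(f"{dataset}: {vs_gzip:.1f}x vs Gzip, {vs_loader:.1f}x vs Webgl-loader + Gzip")
```

With this column mapping, the Gzip comparison spans roughly 5x to 9x, in line with the slide. The comparison against gzipped webgl-loader output is closer to 1.2x on two datasets and higher on the scanned models, so the slide's 1.2x-1.5x range is best read as approximate.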
    • Decoding Speed • For mobile - need to balance file size AND decompression processing - Extensive processing can take more time/power than transmission • OpenCTM is also promising but LZMA is very processor intensive - Work may lead to LZMA in hardware?
      Platform | Hand (100K Tri.) | Dilo (54K Tri.) | Octopus (34K Tri.)
      Win7 64-bit, 10GB RAM, i7-2600 CPU @ 3.4GHz | 130 ms | 86 ms | 65 ms
      Samsung Galaxy S4, Android 4.2.2 | 1045 ms | 768 ms | 457 ms
      © Copyright Khronos Group 2013 - Page 36
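The trade-off this slide describes, where a smaller file is not automatically faster once decode time is counted, can be made concrete with a simple additive model. The Python sketch below is an illustration only: the file sizes, bandwidth and decode times are hypothetical and are not measurements from the slides.

```python
def time_to_display_ms(size_bytes, bandwidth_bytes_per_s, decode_ms):
    """Transfer time plus decode time, assuming the two steps do not overlap."""
    transfer_ms = 1000.0 * size_bytes / bandwidth_bytes_per_s
    return transfer_ms + decode_ms

# Hypothetical link speed and two hypothetical codecs for the same asset.
BANDWIDTH = 1_000_000  # bytes per second (assumed)
light_codec = time_to_display_ms(2_000_000, BANDWIDTH, decode_ms=100)
heavy_codec = time_to_display_ms(1_000_000, BANDWIDTH, decode_ms=1500)

print(f"larger file, cheap decode: {light_codec:.0f} ms")
print(f"smaller file, costly decode: {heavy_codec:.0f} ms")
```

Under these assumed numbers, the codec that halves the file size still loses overall because its decode cost exceeds the transfer time it saves.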
    • Path Rendering Acceleration. Offload the CPU so the application can run as fast as possible. Make maximum use of the GPU for best performance and power. Approach 1, CPU creates paths and CPU renders paths: - Software scanline renderers can be high quality and portable - CPU has to process the complete pipeline, stealing cycles from the application - Software rendering limits performance. Approach 2, CPU creates paths, CPU tessellates paths into polygons, and the GPU uses standard 3D commands to process polygons: - Tessellation loads the CPU, stealing cycles from the application, so perf is sometimes slower than software alone - Tessellation consumes a lot of data and memory bandwidth = power - Quality can be compromised due to tessellation accuracy. Approach 3, CPU creates paths and new OpenGL path commands are defined to process paths directly on the GPU: - Maximum CPU offload - Compact data format sent to GPU renderer - GPU provides excellent performance and power - GPU can increase quality and functionality © 2013 NVIDIA - Page 37
    • NV_path_rendering OpenGL Extension Brings Path processing directly to OpenGL No tessellation necessary Goals Functionally complete for key standards: SVG, Canvas, PostScript etc. Much faster—often 4x to 100x faster than CPUs Enhanced quality – can avoid approximations needed by CPU renderers Lower power by leveraging dedicated hardware New functionality – e.g. mix 2D paths with 3D and programmable shading © 2013 NVIDIA - Page 38
    • Stencil then Cover Approach. Create a path object and pass it directly to the GPU: cubic & quadratic Bezier segments, line segments, partial elliptical arcs. Step 1, Stencil: GPU “stencils” the path object into the stencil buffer; GPU provides massively parallel stenciling of filled or stroked paths; calculate winding rule or containment at every sub-pixel sample in parallel. Step 2, Cover: “cover” the path object and stencil test against its coverage; test against path coverage determined in the 1st step and shade the path. Repeat. Uses GPU MSAA anti-aliasing: 8 or 16 samples/pixel gives good quality © 2013 NVIDIA - Page 39
    • Excellent Geometric Fidelity for Stroking. Correct stroking is hard. Lots of CPU implementations approximate stroking. GPU-accelerated stroking avoids such short-cuts. GPU has FLOPS to compute true stroke point containment. [Comparison images: GPU-accelerated, OpenVG reference, Cairo, Qt; stroking with tight end-point curve.] © 2013 NVIDIA - Page 40
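The two steps above can be illustrated on the CPU with a small Python sketch. This is not the NV_path_rendering API and not how the GPU implements it: it assumes a path made only of straight line segments (a closed polygon), one sample per pixel centre instead of MSAA, and a non-zero fill rule. All function names are hypothetical.

```python
import math

def winding_number(px, py, polygon):
    """Signed count of how many times the closed polygon winds around (px, py)."""
    wn = 0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        # cross > 0 means the sample lies to the left of the directed edge.
        cross = (x1 - x0) * (py - y0) - (px - x0) * (y1 - y0)
        if y0 <= py < y1 and cross > 0:
            wn += 1  # edge crosses the sample's scanline going up
        elif y1 <= py < y0 and cross < 0:
            wn -= 1  # edge crosses the sample's scanline going down
    return wn

def stencil_then_cover(polygon, width, height, color, framebuffer):
    """Fill `polygon` into `framebuffer` using a stencil pass, then a cover pass."""
    # Step 1: stencil. Record the winding number at every pixel-centre sample.
    stencil = [[winding_number(x + 0.5, y + 0.5, polygon) for x in range(width)]
               for y in range(height)]
    # Step 2: cover. Visit the path's bounding box, shade samples whose stencil
    # value passes the non-zero test, and reset the stencil for the next path.
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    x_lo, x_hi = max(0, math.floor(min(xs))), min(width, math.ceil(max(xs)))
    y_lo, y_hi = max(0, math.floor(min(ys))), min(height, math.ceil(max(ys)))
    for y in range(y_lo, y_hi):
        for x in range(x_lo, x_hi):
            if stencil[y][x] != 0:
                framebuffer[y][x] = color
                stencil[y][x] = 0
    return framebuffer

square = [(1, 1), (4, 1), (4, 4), (1, 4)]
fb = stencil_then_cover(square, 6, 6, "red", [[None] * 6 for _ in range(6)])
covered = sum(row.count("red") for row in fb)
print(covered, "pixels shaded")
```

The stencil loop here is sequential; the point of the slide is that a GPU evaluates that per-sample containment test for all samples in parallel, and for curved segments as well as lines.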
    • Micrography “Girl with Words in Her Hair” 591 paths 338,507 commands 1,244,474 coordinates Ron Maharik, Mikhail Bessmeltsev, Alla Sheffer, Ariel Shamir and Nathan Carr SIGGRAPH 2011 © 2013 NVIDIA - Page 41
    • More Details on nvpr Functionality union of all major path rendering standards Enables mixing traditional functionality with 3D and programmable shading Point sampling for path filling is exact No approximations due to tessellation or subdivision Path stroking is exact Line segments & quadratic Bezier segments stroking is exact All stroke cap + join styles supported Dashing fully supported Minimal pre-computation required NO tessellation involved, NO recursive subdivision Fast to animate, morph, or edit paths © 2013 NVIDIA - Page 42
    • Enhanced Quality on GPU. Stroking approximations avoided by GPU: [comparison images of Cairo, Qt and Skia output, annotated "weird big holes" and "feathers?", against NV_path_rendering]. GPU offers jittered sampling for free: regular grid on CPU gives sub-optimal antialiasing; jitter pattern on GPU gives better antialiasing. Proper gradient filtering on GPU: GPUs are great at texturing (mip-mapping, anisotropic filtering, wrap modes); Moiré artifacts in Cairo, similar for Qt & Skia. Eliminate conflation artifacts: multiple color AND stencil samples per pixel; color bleeding and conflation artifacts on CPU, conflation free on GPU © 2013 NVIDIA - Page 43
    • Comparing Performance © 2013 NVIDIA - Page 44
    • [Chart: Comparative Performance (Logarithmic Scale), 0.10 to 1000.00. Series: NVpr16/Cairo, NVpr16/SkiaBitmap, NVpr16/SkiaGanesh, NVpr16/Direct2D GPU, NVpr16/Direct2D WARP. Test scenes: tiger, Welsh_dragon, Celtic_round_dogs, butterfly, spikes, American_Samoa, cowboy, Buonaparte, Embrace_the_World, Yokozawa, Cougar, tiger_clipped_by_he, each at window sizes from 100x100 to 1100x1100. GeForce GTX 480, release drivers V.300, x16 MSAA.] © 2013 NVIDIA - Page 45
    • New GPU Functionality: Projective Transformation; Fast Arbitrary Path Clipping; light source position for BUMP Mapping; Programmable Shading (paint in GLSL, for filter and blending acceleration); Fully sRGB Correct Rendering (a linear RGB transition between saturated red and saturated blue has a dark purple region; sRGB gives a perceptually smooth transition from saturated red to saturated blue); Mixing depth tested Text, 3D, and Paths © 2013 NVIDIA - Page 46
    • Mixing 2D and 3D © 2013 NVIDIA - Page 47
    • Resolution-independent Font Support. Fonts are a standard, first-class part of all path rendering systems, but foreign to 3D graphics systems such as OpenGL and Direct3D. NV_path_rendering has built-in font support: can specify a range of path objects with a specified font and a sequence or range of Unicode character points. No requirement for applications to use a font API to load glyphs. You can also load glyphs “manually” from your own glyph outlines. Functionality provides OS portability © 2013 NVIDIA - Page 48
    • Path Geometric Queries. glIsPointInFillPathNV: determine if an object-space (x,y) position is inside or outside the path, given a winding number mask. glIsPointInStrokePathNV: determine if an object-space (x,y) position is inside the stroke of a path; accounts for dash pattern, joins, and caps. glGetPathLengthNV: returns an approximation of the geometric length of a given sub-range of path segments. glPointAlongPathNV: returns the object-space (x,y) position and 2D tangent vector at a given offset into a specified path object; useful for “text follows a path”. Queries are modeled after OpenVG queries © 2013 NVIDIA - Page 49
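To show what the length and point-along-path queries compute, here is a Python sketch for a path made only of straight segments. It illustrates the geometry and is not a binding to glGetPathLengthNV or glPointAlongPathNV. The function names, the open-polyline input and the return conventions are all assumptions made for this example.

```python
import math

def path_length(points):
    """Total length of an open polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def point_along_path(points, distance):
    """Position and unit tangent at arc-length `distance` from the start.

    Returns ((x, y), (tx, ty)), or None if `distance` is negative or lies
    beyond the end of the path.
    """
    if distance < 0:
        return None
    remaining = distance
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.dist((x0, y0), (x1, y1))
        if seg > 0 and remaining <= seg:
            t = remaining / seg
            position = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            tangent = ((x1 - x0) / seg, (y1 - y0) / seg)
            return position, tangent
        remaining -= seg
    return None

# An L-shaped path: 3 units right, then 4 units up.
path = [(0, 0), (3, 0), (3, 4)]
print(path_length(path))
print(point_along_path(path, 5.0))
```

For "text follows a path", a caller would step along the path by each glyph's advance width, place the glyph at the returned position, and rotate it to align with the returned tangent.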
    • Open Source Accelerated SVG Renderer Partial SVG Renderer - pr_svg Path filling, transformations and grouping Path stroking with all stroking embellishments Clipping – including clipping paths to other arbitrary paths Painting with linear/radial gradients and images Basic compositing Coming in next update: markers and text Stuff that’s missing from pr_svg Filters, Blending, Opacity groups, Animation, JavaScript integration Not hard, just best done in context of a browser NVIDIA welcomes any community involvement http://developer.nvidia.com/nv-path-rendering © 2013 NVIDIA - Page 50
    • More Information Best drivers: OpenGL 4.4 www.nvidia.com/drivers Grab the latest drivers for your OS & GPU Runs on any CUDA-capable GPU (GeForce 8 onwards) Developer resources http://developer.nvidia.com/nv-path-rendering Whitepapers, FAQ, specification NVprSDK—software development kit NVprDEMOs—pre-compiled Windows demos YouTube videos demonstrate various NVpr DEMOs Email: nvpr-support@nvidia.com © 2013 NVIDIA - Page 51
    • Standardization and Adoption Pipeline. NVIDIA is proposing nvpr to the OpenGL working group at Khronos to create an open, royalty-free, cross-platform foundation for vector graphics acceleration. Stage 1, Vendor Extension to OpenGL (nvpr is here!): initial functionality proposal; prove concepts; solicit industry feedback. Stage 2, OpenGL Extension or Core: OpenGL vector acceleration adopted into OpenGL and OpenGL ES. Stage 3, pervasive multi-vendor availability: widespread application usage inspires silicon optimizations; vector acceleration pervasive on desktop and mobile. Context: desktop and mobile displays typically >300 DPI; mobile silicon is CUDA/OpenCL capable © 2013 NVIDIA - Page 52
    • Path Rendering Acceleration on Android Tablet © 2013 NVIDIA - Page 53
    • Summary • Open standards such as WebGL and WebCL are enabling web applications to reach the power of the GPU through JavaScript • GPU acceleration will soon become vital for Web applications wanting to leverage advanced use of camera and sensors • Acceleration of path primitives directly on GPUs will drive browser performance for new classes of applications and devices • Work is starting on 3D asset streaming and compression standards, to enable 3D as a social media type on the web • The Web and hardware communities have a significant opportunity to leverage each other's efforts for the benefit of the industry • Khronos is committed to enabling the hardware community to be a good citizen in creating the next generation of accelerated web standards ntrevett@nvidia.com © Copyright Khronos Group 2013 - Page 54