SlideShare a Scribd company logo
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Marc Fauconneau
March 04, 2015
High-Quality, Fast DX11 Texture
Compression with ISPC
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Copyright © 2015 Intel Corporation. All rights reserved.
 *Other names and brands may be claimed as the property of others.
 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED,
BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH
PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS
OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR
INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
 A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S
PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS,
OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY
CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS
NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.
 Intel may make changes to specifications and product descriptions at any time, without notice.
 All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
 Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized
errata are available on request.
 Any code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not
authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.
 Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps.
 Performance claims: Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and
MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult
other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more
information go to
http://www.Intel.com/performance
 Iris™ graphics is available on select systems. Consult your system manufacturer.
 Intel, Intel Inside, the Intel logo, Intel Core and Iris are trademarks of Intel Corporation in the United States and other countries.
Legal
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Introduction to DX11 Texture Compression
 The Fast ISPC Texture Compressor
 Overview of the DX11 Formats
 Fast Texture compression: Algorithms
 Effective SIMD acceleration with ISPC
 Future Work
Outline
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Introduction to DX11 Texture Compression
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 4x4 Blocks, 128 bits per block
 BC7
– RGB or RGBA, LDR (8-bit)
– 4:1 compression for R8G8B8A8
– Very high quality RGB (far better than BC1/S3TC, due to more bits)
– Great replacement for BC3 (higher quality, same size)
 BC6H
– RGB, HDR (16-bit)
– 4:1 compression for R9G9B9E5_SHAREDEXP (8:1 for 4ch FP16)
DX11 Texture Formats
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
BC7 in action
BC3 BC7 Source
RGB
Alpha
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
BC6H in action
BC6H
Source
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Fast ISPC Texture Compressor
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Driven by ISV demand for a good, fast compressor
 State of the art compressor
– Efficient pruning scheme
– SIMD implementation using ISPC (Intel SPMD Compiler)
 Permissive license, easy to use
– Has been integrated into several content pipelines
ISPC Texture Compressor
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Performance of available BC7 compressors on Kodak dataset (RGB only)
 Independent testing confirms those results: http://gamma.cs.unc.edu/FasTC/
State of the art
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Performance of available BC7 compressors on Kodak dataset (RGB only)
 Independent testing confirms those results: http://gamma.cs.unc.edu/FasTC/
State of the art
300x faster
30x faster,
~same power
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Builds as a DLL with a C interface
– Encoder settings are highly configurable
– A set of profiles spanning a wide curve of
performance/quality trade-offs is included
 No external dependencies (other than ISPC)
 No hidden multithreading: Responsibility of the calling
application (everyone has their preferences)
Easy to integrate
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
D3D11_MAPPED_SUBRESOURCE uncompData, compData;
deviceContext->Map(uncompTex, 0, D3D11_MAP_READ_WRITE, 0, &uncompData);
deviceContext->Map(compStgTex, 0, D3D11_MAP_READ_WRITE, 0, &compData);
rgba_surface input;
input.ptr = (BYTE*)uncompData.pData;
input.stride = uncompData.RowPitch;
input.width = uncompTexDesc.Width;
input.height = uncompTexDesc.Height;
BYTE* output = (BYTE*)compData.pData;
bc6h_enc_settings settings;
GetProfile_bc6h_basic(&settings);
CompressBlocksBC6H(input, output, &settings);
Mandatory sample code
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
D3D11_MAPPED_SUBRESOURCE uncompData, compData;
deviceContext->Map(uncompTex, 0, D3D11_MAP_READ_WRITE, 0, &uncompData);
deviceContext->Map(compTex, 0, D3D11_MAP_READ_WRITE, 0, &compData);
rgba_surface input;
input.ptr = (BYTE*)uncompData.pData;
input.stride = uncompData.RowPitch;
input.width = uncompTexDesc.Width;
input.height = uncompTexDesc.Height;
BYTE* output = (BYTE*)compData.pData;
bc6h_enc_settings settings;
GetProfile_bc6h_basic(&settings);
CompressBlocksBC6H(input, output, &settings);
Mandatory sample code
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 https://software.intel.com/en-us/articles/fast-ispc-texture-compressor-update
Sample application
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Overview of the DX11 Formats
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 This is not a reference
– See MSDN for that:
https://msdn.microsoft.com/en-us/library/hh308955(v=vs.85).aspx
 I will not talk about specifics like bit packing…
 Those formats build upon BC1 aka DXT1 aka S3TC
– Per-pixel interpolation between two endpoints
Overview of the DX11 Formats
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Each block is set to one mode (14 in BC6H, 8 in BC7)
– It uniquely defines how all values in the block are packed
 Some modes enable specific features:
– Multiple partitions
– RGBA endpoints (BC7)
– Encoding two sets of weights (BC7)
– Differential endpoint encoding (BC6H)
Coding tools
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Like in BC1 aka DXT1 aka S3TC
– More variation in endpoint/weight sizes
– Avoids the unequal channel quantization problem (RGB565)
 This technique exploits local color correlation/contrast
– It fails if pixels are not clustered along a line in RGB(A) space
– It introduces more error in high-contrast blocks
Endpoint interpolation
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Endpoint interpolation
RGB Cube Block
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Endpoint interpolation
RGB Cube Block
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Some modes split the 4x4 block into 2 or 3 partitions
– Each partition has it’s own pair of endpoints
– Pre-defined table with 64 ways to split
 This drastically reduces error in blocks presenting
difficult combinations of colors or contrast
Multiple partitions
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
First 32 partitions
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 RGB or RGBA endpoints
– Many modes only encode RGB, assume Alpha=1.0
– Very effective for transparent textures with opaque regions
 Some modes can encode two sets of weights
– A “vector” channel and “scalar” channel (any of R,G,B,A)
– Bounds error for uncorrelated alpha (behaves like BC3)
– Also very effective at representing certain RGB blocks
BC7 specifics
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Multiple partitions (modes 0, 1, 2, 3, 7)
– 3 partitions, RGB, 3 or 2 bits per weight (mode 0, 2)
– 2 partitions, RGB, 3 or 2 bits per weight (mode 1, 3)
– 2 partition, RGBA, 2 bits per weight (mode 7)
 Two sets of weights (modes 4, 5)
– 1 partition, RGBA, 2+3 or 3+2 bits per weight (mode 4)
– 1 partition, RGBA, 2+2 bits per weight (mode 5)
 Vanilla (mode 6)
– 1 partition, RGBA, 4 bits per weight (mode 6)
BC7 modes
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Differential endpoint encoding
– The first endpoint is subtracted from all other endpoints
– Deltas can fit in less bits (amount defined by each mode)
– Some modes allow more bits in a given channel
 Effective on blocks with low dynamic range
– Absolute encoding almost never used in practice
BC6H specifics
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Differential endpoint encoding (mode 1-9, 12-14)
– 2 partition, 3 bits per weight (mode 1-9)
– 1 partition, 4 bits per weight (mode 12-14)
 Absolute endpoint encoding (mode 10, 11)
– 2 partition, 3 bits per weight (mode 10)
– 1 partition, 4 bits per weight (mode 11)
BC6H modes
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Fast Texture compression: Algorithms
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Local optimization
– The encoder processes each 4x4 block in isolation
 Parameter search
– Mode, Partition, Endpoints, Weights…
 Encoding
– Endpoint swaps, swizzles, bit packing…
Fast Texture compression: Algorithms
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Essentially a nested loop: Mode>Partition>Endpoints
 Mode
– BC7: Exhaustive search
– BC6H: First-fit (with tweaks)
 Partition
– Fast partition pruning
 Endpoints, Weights
– Iterative refinement with PCA initialization
Parameter search
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 For each partition, find the best endpoints and weights
– Integer bilinear programming (NP-hard)
 But we can find the “best” endpoints given weights
– Least squares (trivial)
 And the best weights given endpoints
– Scalar quantization (trivial)
 This is iterative refinement (this is what stb_dxt.h does)
Iterative refinement
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 We need something to start the iterations
– What if we didn’t have to quantize weights?
– This would reduce to projecting all points to a line
 Finding the “best line” is easy (first PCA component)
– We use power iteration
 Use the two farthest projections as initial endpoints
PCA initialization
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
2D Toy example
PCA
Initial Endpoints
Scalar Quantization
Least squares
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 PCA is a lower bound for endpoint/weight search
– Could use it to skip partitions, but the pruning is too shallow
– However, it’s a decent proxy for a partition’s optimum
 We rank partitions by bound, and only try the best
– Aggressive pruning has minimal impact on quality
 Small trick: The sum of covariance matrices is constant
Fast partition pruning
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Most of the modes vary endpoint bit allocation
– We can take a guess based on local dynamic range
– Use the narrowest mode that works (first-fit)
 Do this for both 1 partition and 2 partitions modes
 Only UF16 implemented so far
– Is there a big usage case for signed HDR values (SF16) ?
BC6H specifics
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 The encoder is fairly configurable
– Iterative refinement iterations
– Fast pruning thresholds
– Alpha channel can be ignored
– Many of the shortcuts can be turned off
– Some modes can be completely skipped
 A range of presets is available (profiles)
Encoder parameters
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Effective SIMD acceleration with ISPC
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Effective way to use vector ISAs of modern processors
– Scales from SSE2 to AVX2
– Not Intel-specific!
– Supports many platforms, including Jaguar
 Code is easy to maintain, quite portable
Intel SPMD Program Compiler
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Each program instance is mapped to a 4x4 block
– Typically 4 or 8 blocks processed in parallel
 It’s inefficient to run different code paths per block
– Gather/scatter is more efficient, but still avoid if possible
 The encoder was designed under those constraints
– There is no scalar code path
SIMD-friendly implementation
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Fast partition pruning requires partial sorting
– We use selection sort, 1 scatter per sorted element
 Blocks will typically have divergent partitions
– PCA initialization and endpoint optimization (LS)
 For each partition, process all pixels but mask out inactive ones
– Weight optimization (scalar quantization)
 For each pixel, use gather to get the right endpoints
SIMD-friendly search
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 Each mode requires very different bit-packing
– By using exhaustive search we avoid divergent paths
– Encode every time, but only keep the best 128-bit result
 But implicit bits positions depend on the partition
– This makes the bit positions variable
– Instead we use a fixed encoding in 130 bits
– The difference is resolved using 160-bit shifts (one or two)
SIMD-friendly encoding
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Future Work
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
 New supported formats in the next release:
– ETC1, ETC2
– ASTC LDR
 Don’t hesitate to ask if you‘re looking for a specific
feature: We’ll implement it if there’s enough demand.
 Try it out for yourself:
– Google: “fast ISPC texture compressor”
– https://software.intel.com/en-us/articles/fast-ispc-texture-compressor-update
What now?
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Questions
(and maybe answers)
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.

More Related Content

What's hot

Partitioning tables and indexing them
Partitioning tables and indexing them Partitioning tables and indexing them
Partitioning tables and indexing them
Hemant K Chitale
 
Oracle SQL Developer Tips and Tricks: Data Edition
Oracle SQL Developer Tips and Tricks: Data EditionOracle SQL Developer Tips and Tricks: Data Edition
Oracle SQL Developer Tips and Tricks: Data Edition
Jeff Smith
 
01 intel processor architecture core
01 intel processor architecture core01 intel processor architecture core
01 intel processor architecture coresssuhas
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
Joel Brewer
 
08 Dynamic SQL and Metadata
08 Dynamic SQL and Metadata08 Dynamic SQL and Metadata
08 Dynamic SQL and Metadata
rehaniltifat
 
Azure IoT Edge: a breakthrough platform and service running cloud intelligenc...
Azure IoT Edge: a breakthrough platform and service running cloud intelligenc...Azure IoT Edge: a breakthrough platform and service running cloud intelligenc...
Azure IoT Edge: a breakthrough platform and service running cloud intelligenc...
Microsoft Tech Community
 
Intel core i7 processor
Intel core i7 processorIntel core i7 processor
Intel core i7 processor
Abhishek Ashish
 
Introduction to SQLAlchemy and Alembic Migrations
Introduction to SQLAlchemy and Alembic MigrationsIntroduction to SQLAlchemy and Alembic Migrations
Introduction to SQLAlchemy and Alembic Migrations
Jason Myers
 
How to use source control with apex?
How to use source control with apex?How to use source control with apex?
How to use source control with apex?
Oliver Lemm
 
Geneos Corporate Overview June 2010
Geneos Corporate Overview June 2010Geneos Corporate Overview June 2010
Geneos Corporate Overview June 2010ctom
 
FPGA Tabanlı Sinyal ve Görüntü İşleme
FPGA Tabanlı Sinyal ve Görüntü İşlemeFPGA Tabanlı Sinyal ve Görüntü İşleme
FPGA Tabanlı Sinyal ve Görüntü İşleme
Mehmet Ali Çavuşlu
 
Basics of digital verilog design(alok singh kanpur)
Basics of digital verilog design(alok singh kanpur)Basics of digital verilog design(alok singh kanpur)
Basics of digital verilog design(alok singh kanpur)
Alok Singh
 
Oracle material (1)
Oracle material (1)Oracle material (1)
Oracle material (1)
Ashok Kumar
 
Verilog HDL Verification
Verilog HDL VerificationVerilog HDL Verification
Verilog HDL Verification
dennis gookyi
 
WebGL 2.0 Reference Guide
WebGL 2.0 Reference GuideWebGL 2.0 Reference Guide
WebGL 2.0 Reference Guide
The Khronos Group Inc.
 
IP PCIe
IP PCIeIP PCIe
IP PCIe
SILKAN
 
PCIe DL_layer_3.0.1 (1)
PCIe DL_layer_3.0.1 (1)PCIe DL_layer_3.0.1 (1)
PCIe DL_layer_3.0.1 (1)
Rakeshkumar Sachdev
 
[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning
PgDay.Seoul
 
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Gavin Guo
 

What's hot (20)

Partitioning tables and indexing them
Partitioning tables and indexing them Partitioning tables and indexing them
Partitioning tables and indexing them
 
Intel Core i7
Intel Core i7Intel Core i7
Intel Core i7
 
Oracle SQL Developer Tips and Tricks: Data Edition
Oracle SQL Developer Tips and Tricks: Data EditionOracle SQL Developer Tips and Tricks: Data Edition
Oracle SQL Developer Tips and Tricks: Data Edition
 
01 intel processor architecture core
01 intel processor architecture core01 intel processor architecture core
01 intel processor architecture core
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
 
08 Dynamic SQL and Metadata
08 Dynamic SQL and Metadata08 Dynamic SQL and Metadata
08 Dynamic SQL and Metadata
 
Azure IoT Edge: a breakthrough platform and service running cloud intelligenc...
Azure IoT Edge: a breakthrough platform and service running cloud intelligenc...Azure IoT Edge: a breakthrough platform and service running cloud intelligenc...
Azure IoT Edge: a breakthrough platform and service running cloud intelligenc...
 
Intel core i7 processor
Intel core i7 processorIntel core i7 processor
Intel core i7 processor
 
Introduction to SQLAlchemy and Alembic Migrations
Introduction to SQLAlchemy and Alembic MigrationsIntroduction to SQLAlchemy and Alembic Migrations
Introduction to SQLAlchemy and Alembic Migrations
 
How to use source control with apex?
How to use source control with apex?How to use source control with apex?
How to use source control with apex?
 
Geneos Corporate Overview June 2010
Geneos Corporate Overview June 2010Geneos Corporate Overview June 2010
Geneos Corporate Overview June 2010
 
FPGA Tabanlı Sinyal ve Görüntü İşleme
FPGA Tabanlı Sinyal ve Görüntü İşlemeFPGA Tabanlı Sinyal ve Görüntü İşleme
FPGA Tabanlı Sinyal ve Görüntü İşleme
 
Basics of digital verilog design(alok singh kanpur)
Basics of digital verilog design(alok singh kanpur)Basics of digital verilog design(alok singh kanpur)
Basics of digital verilog design(alok singh kanpur)
 
Oracle material (1)
Oracle material (1)Oracle material (1)
Oracle material (1)
 
Verilog HDL Verification
Verilog HDL VerificationVerilog HDL Verification
Verilog HDL Verification
 
WebGL 2.0 Reference Guide
WebGL 2.0 Reference GuideWebGL 2.0 Reference Guide
WebGL 2.0 Reference Guide
 
IP PCIe
IP PCIeIP PCIe
IP PCIe
 
PCIe DL_layer_3.0.1 (1)
PCIe DL_layer_3.0.1 (1)PCIe DL_layer_3.0.1 (1)
PCIe DL_layer_3.0.1 (1)
 
[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning[Pgday.Seoul 2020] SQL Tuning
[Pgday.Seoul 2020] SQL Tuning
 
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
 

Similar to How to create a high quality, fast texture compressor using ISPC

Embree Ray Tracing Kernels
Embree Ray Tracing KernelsEmbree Ray Tracing Kernels
Embree Ray Tracing Kernels
Intel® Software
 
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
Gael Hofemeier
 
Efficient Rendering with DirectX* 12 on Intel® Graphics
Efficient Rendering with DirectX* 12 on Intel® GraphicsEfficient Rendering with DirectX* 12 on Intel® Graphics
Efficient Rendering with DirectX* 12 on Intel® Graphics
Gael Hofemeier
 
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRaySoftware-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Intel® Software
 
In The Trenches Optimizing UE4 for Intel
In The Trenches Optimizing UE4 for IntelIn The Trenches Optimizing UE4 for Intel
In The Trenches Optimizing UE4 for Intel
Intel® Software
 
More explosions, more chaos, and definitely more blowing stuff up
More explosions, more chaos, and definitely more blowing stuff upMore explosions, more chaos, and definitely more blowing stuff up
More explosions, more chaos, and definitely more blowing stuff up
Intel® Software
 
How Funcom Increased Play Time in Lego Minifigures by 40%
How Funcom Increased Play Time in Lego Minifigures by 40%How Funcom Increased Play Time in Lego Minifigures by 40%
How Funcom Increased Play Time in Lego Minifigures by 40%
Gael Hofemeier
 
Intel NFVi Enabling Kit Demo/Lab
Intel NFVi Enabling Kit Demo/LabIntel NFVi Enabling Kit Demo/Lab
Intel NFVi Enabling Kit Demo/Lab
Michelle Holley
 
What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?
Michelle Holley
 
QATCodec: past, present and future
QATCodec: past, present and futureQATCodec: past, present and future
QATCodec: past, present and future
boxu42
 
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel Software Brasil
 
Make your unity game faster, faster
Make your unity game faster, fasterMake your unity game faster, faster
Make your unity game faster, faster
Intel® Software
 
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
Igor José F. Freitas
 
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
tdc-globalcode
 
Play faster and longer: How Square Enix maximized Android* performance and ba...
Play faster and longer: How Square Enix maximized Android* performance and ba...Play faster and longer: How Square Enix maximized Android* performance and ba...
Play faster and longer: How Square Enix maximized Android* performance and ba...
Gael Hofemeier
 
Real-Time Game Optimization with Intel® GPA
Real-Time Game Optimization with Intel® GPAReal-Time Game Optimization with Intel® GPA
Real-Time Game Optimization with Intel® GPA
Intel® Software
 
Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018
AWS User Group Bengaluru
 
Intel XDK - Philly JS
Intel XDK - Philly JSIntel XDK - Philly JS
Intel XDK - Philly JS
Ian Maffett
 
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year HorizonGary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
AugmentedWorldExpo
 
TDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura Intel
TDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura IntelTDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura Intel
TDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura Intel
tdc-globalcode
 

Similar to How to create a high quality, fast texture compressor using ISPC (20)

Embree Ray Tracing Kernels
Embree Ray Tracing KernelsEmbree Ray Tracing Kernels
Embree Ray Tracing Kernels
 
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
OIT to Volumetric Shadow Mapping, 101 Uses for Raster-Ordered Views using Dir...
 
Efficient Rendering with DirectX* 12 on Intel® Graphics
Efficient Rendering with DirectX* 12 on Intel® GraphicsEfficient Rendering with DirectX* 12 on Intel® Graphics
Efficient Rendering with DirectX* 12 on Intel® Graphics
 
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRaySoftware-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
Software-defined Visualization, High-Fidelity Visualization: OpenSWR and OSPRay
 
In The Trenches Optimizing UE4 for Intel
In The Trenches Optimizing UE4 for IntelIn The Trenches Optimizing UE4 for Intel
In The Trenches Optimizing UE4 for Intel
 
More explosions, more chaos, and definitely more blowing stuff up
More explosions, more chaos, and definitely more blowing stuff upMore explosions, more chaos, and definitely more blowing stuff up
More explosions, more chaos, and definitely more blowing stuff up
 
How Funcom Increased Play Time in Lego Minifigures by 40%
How Funcom Increased Play Time in Lego Minifigures by 40%How Funcom Increased Play Time in Lego Minifigures by 40%
How Funcom Increased Play Time in Lego Minifigures by 40%
 
Intel NFVi Enabling Kit Demo/Lab
Intel NFVi Enabling Kit Demo/LabIntel NFVi Enabling Kit Demo/Lab
Intel NFVi Enabling Kit Demo/Lab
 
What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?
 
QATCodec: past, present and future
QATCodec: past, present and futureQATCodec: past, present and future
QATCodec: past, present and future
 
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
Intel® Trace Analyzer e Collector (ITAC) - Intel Software Conference 2013
 
Make your unity game faster, faster
Make your unity game faster, fasterMake your unity game faster, faster
Make your unity game faster, faster
 
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
 
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
 
Play faster and longer: How Square Enix maximized Android* performance and ba...
Play faster and longer: How Square Enix maximized Android* performance and ba...Play faster and longer: How Square Enix maximized Android* performance and ba...
Play faster and longer: How Square Enix maximized Android* performance and ba...
 
Real-Time Game Optimization with Intel® GPA
Real-Time Game Optimization with Intel® GPAReal-Time Game Optimization with Intel® GPA
Real-Time Game Optimization with Intel® GPA
 
Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018
 
Intel XDK - Philly JS
Intel XDK - Philly JSIntel XDK - Philly JS
Intel XDK - Philly JS
 
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year HorizonGary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
Gary Brown (Movidius, Intel): Deep Learning in AR: the 3 Year Horizon
 
TDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura Intel
TDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura IntelTDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura Intel
TDC2018SP | Trilha IA - Inteligencia Artificial na Arquitetura Intel
 

Recently uploaded

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 

Recently uploaded (20)

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 

How to create a high quality, fast texture compressor using ISPC

  • 1. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Marc Fauconneau March 04, 2015 High-Quality, Fast DX11 Texture Compression with ISPC
  • 2. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Copyright © 2015 Intel Corporation. All rights reserved.  *Other names and brands may be claimed as the property of others.  INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.  A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.  Intel may make changes to specifications and product descriptions at any time, without notice.  All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.  Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.  Any code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.  Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps.  Performance claims: Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.Intel.com/performance  Iris™ graphics is available on select systems. Consult your system manufacturer.  Intel, Intel Inside, the Intel logo, Intel Core and Iris are trademarks of Intel Corporation in the United States and other countries. Legal
  • 3. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Introduction to DX11 Texture Compression  The Fast ISPC Texture Compressor  Overview of the DX11 Formats  Fast Texture compression: Algorithms  Effective SIMD acceleration with ISPC  Future Work Outline
  • 4. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Introduction to DX11 Texture Compression
  • 5. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  4x4 Blocks, 128 bits per block  BC7 – RGB or RGBA, LDR (8-bit) – 4:1 compression for R8G8B8A8 – Very high quality RGB (far better than BC1/S3TC, due to more bits) – Great replacement for BC3 (higher quality, same size)  BC6H – RGB, HDR (16-bit) – 4:1 compression for R9G9B9E5_SHAREDEXP (8:1 for 4ch FP16) DX11 Texture Formats
  • 6. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. BC7 in action BC3 BC7 Source RGB Alpha
  • 7. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. BC6H in action BC6H Source
  • 8. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Fast ISPC Texture Compressor
  • 9. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Driven by ISV demand for a good, fast compressor  State of the art compressor – Efficient pruning scheme – SIMD implementation using ISPC (Intel SPMD Compiler)  Permissive license, easy to use – Has been integrated into several content pipelines ISPC Texture Compressor
  • 10. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Performance of available BC7 compressors on Kodak dataset (RGB only)  Independent testing confirms those results: http://gamma.cs.unc.edu/FasTC/ State of the art
  • 11. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Performance of available BC7 compressors on Kodak dataset (RGB only)  Independent testing confirms those results: http://gamma.cs.unc.edu/FasTC/ State of the art 300x faster 30x faster, ~same power
  • 12. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Builds as a DLL with a C interface – Encoder settings are highly configurable – A set of profiles spanning a wide curve of performance/quality trade-offs is included  No external dependencies (other than ISPC)  No hidden multithreading: Responsibility of the calling application (everyone has their preferences) Easy to integrate
  • 13. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. D3D11_MAPPED_SUBRESOURCE uncompData, compData; deviceContext->Map(uncompTex, 0, D3D11_MAP_READ_WRITE, 0, &uncompData); deviceContext->Map(compStgTex, 0, D3D11_MAP_READ_WRITE, 0, &compData); rgba_surface input; input.ptr = (BYTE*)uncompData.pData; input.stride = uncompData.RowPitch; input.width = uncompTexDesc.Width; input.height = uncompTexDesc.Height; BYTE* output = (BYTE*)compData.pData; bc6h_enc_settings settings; GetProfile_bc6h_basic(&settings); CompressBlocksBC6H(input, output, &settings); Mandatory sample code
  • 14. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. D3D11_MAPPED_SUBRESOURCE uncompData, compData; deviceContext->Map(uncompTex, 0, D3D11_MAP_READ_WRITE, 0, &uncompData); deviceContext->Map(compTex, 0, D3D11_MAP_READ_WRITE, 0, &compData); rgba_surface input; input.ptr = (BYTE*)uncompData.pData; input.stride = uncompData.RowPitch; input.width = uncompTexDesc.Width; input.height = uncompTexDesc.Height; BYTE* output = (BYTE*)compData.pData; bc6h_enc_settings settings; GetProfile_bc6h_basic(&settings); CompressBlocksBC6H(input, output, &settings); Mandatory sample code
  • 15. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  https://software.intel.com/en-us/articles/fast-ispc-texture-compressor-update Sample application
  • 16. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Overview of the DX11 Formats
  • 17. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  This is not a reference – See MSDN for that: https://msdn.microsoft.com/en-us/library/hh308955(v=vs.85).aspx  I will not talk about specifics like bit packing…  Those formats build upon BC1 aka DXT1 aka S3TC – Per-pixel interpolation between two endpoints Overview of the DX11 Formats
  • 18. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Each block is set to one mode (14 in BC6H, 8 in BC7) – It uniquely defines how all values in the block are packed  Some modes enable specific features: – Multiple partitions – RGBA endpoints (BC7) – Encoding two sets of weights (BC7) – Differential endpoint encoding (BC6H) Coding tools
  • 19. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Like in BC1 aka DXT1 aka S3TC – More variation in endpoint/weight sizes – Avoids the unequal channel quantization problem (RGB565)  This technique exploits local color correlation/contrast – It fails if pixels are not clustered along a line in RGB(A) space – It introduces more error in high-contrast blocks Endpoint interpolation
  • 20. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Endpoint interpolation RGB Cube Block
  • 21. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Endpoint interpolation RGB Cube Block
  • 22. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Some modes split the 4x4 block into 2 or 3 partitions – Each partition has it’s own pair of endpoints – Pre-defined table with 64 ways to split  This drastically reduces error in blocks presenting difficult combinations of colors or contrast Multiple partitions
  • 23. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. First 32 partitions
  • 24. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  RGB or RGBA endpoints – Many modes only encode RGB, assume Alpha=1.0 – Very effective for transparent textures with opaque regions  Some modes can encode two sets of weights – A “vector” channel and “scalar” channel (any of R,G,B,A) – Bounds error for uncorrelated alpha (behaves like BC3) – Also very effective at representing certain RGB blocks BC7 specifics
  • 25. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Multiple partitions (modes 0, 1, 2, 3, 7) – 3 partitions, RGB, 3 or 2 bits per weight (mode 0, 2) – 2 partitions, RGB, 3 or 2 bits per weight (mode 1, 3) – 2 partition, RGBA, 2 bits per weight (mode 7)  Two sets of weights (modes 4, 5) – 1 partition, RGBA, 2+3 or 3+2 bits per weight (mode 4) – 1 partition, RGBA, 2+2 bits per weight (mode 5)  Vanilla (mode 6) – 1 partition, RGBA, 4 bits per weight (mode 6) BC7 modes
  • 26. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Differential endpoint encoding – The first endpoint is subtracted from all other endpoints – Deltas can fit in less bits (amount defined by each mode) – Some modes allow more bits in a given channel  Effective on blocks with low dynamic range – Absolute encoding almost never used in practice BC6H specifics
  • 27. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Differential endpoint encoding (mode 1-9, 12-14) – 2 partition, 3 bits per weight (mode 1-9) – 1 partition, 4 bits per weight (mode 12-14)  Absolute endpoint encoding (mode 10, 11) – 2 partition, 3 bits per weight (mode 10) – 1 partition, 4 bits per weight (mode 11) BC6H modes
  • 28. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Fast Texture compression: Algorithms
  • 29. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Local optimization – The encoder processes each 4x4 block in isolation  Parameter search – Mode, Partition, Endpoints, Weights…  Encoding – Endpoint swaps, swizzles, bit packing… Fast Texture compression: Algorithms
  • 30. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Essentially a nested loop: Mode>Partition>Endpoints  Mode – BC7: Exhaustive search – BC6H: First-fit (with tweaks)  Partition – Fast partition pruning  Endpoints, Weights – Iterative refinement with PCA initialization Parameter search
  • 31. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  For each partition, find the best endpoints and weights – Integer bilinear programming (NP-hard)  But we can find the “best” endpoints given weights – Least squares (trivial)  And the best weights given endpoints – Scalar quantization (trivial)  This is iterative refinement (this is what stb_dxt.h does) Iterative refinement
  • 32. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  We need something to start the iterations – What if we didn’t have to quantize weights? – This would reduce to projecting all points to a line  Finding the “best line” is easy (first PCA component) – We use power iteration  Use the two farthest projections as initial endpoints PCA initialization
  • 33. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. 2D Toy example PCA Initial Endpoints Scalar Quantization Least squares
  • 34. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  PCA is a lower bound for endpoint/weight search – Could use it to skip partitions, but the pruning is too shallow – However, it’s a decent proxy for a partition’s optimum  We rank partitions by bound, and only try the best – Aggressive pruning has minimal impact on quality  Small trick: The sum of covariance matrices is constant Fast partition pruning
  • 35. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Most of the modes vary endpoint bit allocation – We can take a guess based on local dynamic range – Use the narrowest mode that works (first-fit)  Do this for both 1 partition and 2 partitions modes  Only UF16 implemented so far – Is there a big usage case for signed HDR values (SF16) ? BC6H specifics
  • 36. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  The encoder is fairly configurable – Iterative refinement iterations – Fast pruning thresholds – Alpha channel can be ignored – Many of the shortcuts can be turned off – Some modes can be completely skipped  A range of presets is available (profiles) Encoder parameters
  • 37. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Effective SIMD acceleration with ISPC
  • 38. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Effective way to use vector ISAs of modern processors – Scales from SSE2 to AVX2 – Not Intel-specific! – Supports many platforms, including Jaguar  Code is easy to maintain, quite portable Intel SPMD Program Compiler
  • 39. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Each program instance is mapped to a 4x4 block – Typically 4 or 8 blocks processed in parallel  It’s inefficient to run different code paths per block – Gather/scatter is more efficient, but still avoid if possible  The encoder was designed under those constraints – There is no scalar code path SIMD-friendly implementation
  • 40. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Fast partition pruning requires partial sorting – We use selection sort, 1 scatter per sorted element  Blocks will typically have divergent partitions – PCA initialization and endpoint optimization (LS)  For each partition, process all pixels but mask out inactive ones – Weight optimization (scalar quantization)  For each pixel, use gather to get the right endpoints SIMD-friendly search
  • 41. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  Each mode requires very different bit-packing – By using exhaustive search we avoid divergent paths – Encode every time, but only keep the best 128-bit result  But implicit bits positions depend on the partition – This makes the bit positions variable – Instead we use a fixed encoding in 130 bits – The difference is resolved using 160-bit shifts (one or two) SIMD-friendly encoding
  • 42. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Future Work
  • 43. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.  New supported formats in the next release: – ETC1, ETC2 – ASTC LDR  Don’t hesitate to ask if you‘re looking for a specific feature: We’ll implement it if there’s enough demand.  Try it out for yourself: – Google: “fast ISPC texture compressor” – https://software.intel.com/en-us/articles/fast-ispc-texture-compressor-update What now?
  • 44. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Questions (and maybe answers)
  • 45. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.