DreamWorks Animation
Intel® Software
DreamWorks Animation*: Slashing the cost of 3d Matrix Math using X-Form (Transform) Building Blocks
DreamWorks Animation
1.
DreamWorks Animation*: Slashing the cost of 3d Matrix Math using X-Form (Transform) Building Blocks
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
4.
DreamWorks Animation: Slashing the cost of 3d Matrix Math using X-Form (Transform) Building Blocks
Alex Wells (presenter) & Martin Watt (DWA)
August 12 & 13, 2015
8.
DWA* Character Animation Speedup After XBB
[Figure: Before vs. After comparison]
Motion System Speedup: 1.6x
Overall Speedup: 1.2x
9.
Agenda
– Motion System in DWA Character Animation
– Observed performance bottlenecks in Motion System
– 3d Matrix transforms
– How would an ideal transform behave
– XBB representation
– XBB deferred evaluation
– Results
10.
How is a skeleton represented for animation?
To represent the bones of a skeleton in 3d space, an animation tool builds a Hierarchy of Joints that records how they are connected.
– Typically a Directed Acyclic Graph of Joints
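As a rough illustration of such a hierarchy, the sketch below stores joints in a flat array where each joint records the index of its parent. This is a single-parent tree, which is a special case of a Directed Acyclic Graph. The names `Joint`, `depthOf`, and `makeArm` are hypothetical and are not taken from the DWA Motion System.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical joint record: names and layout are illustrative only.
struct Joint {
    std::string name;
    int parent;  // index of the parent joint, or -1 for the root
};

// Count how many joints lie between `index` and the root of the hierarchy.
// Assumes the parent links contain no cycles, so the walk terminates.
int depthOf(const std::vector<Joint>& skeleton, int index) {
    assert(index >= 0 && index < static_cast<int>(skeleton.size()));
    int depth = 0;
    for (int j = skeleton[index].parent; j != -1; j = skeleton[j].parent) {
        ++depth;
    }
    return depth;
}

// Build a tiny example arm: root -> shoulder -> elbow -> wrist.
std::vector<Joint> makeArm() {
    return {{"root", -1}, {"shoulder", 0}, {"elbow", 1}, {"wrist", 2}};
}
```

Storing parent indices rather than pointers keeps the sketch short; a production system may well use a different layout.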
11.
How is each Joint represented?
Relative to its parent Joint (in Local Space), each Joint needs to model:
– Rotational Euler Angles (around the X, Y, and Z axes) & Order
– Scale (of the X, Y, and Z axes)
– Shear (along the X, Y, and Z axes)
– Translation (X, Y, and Z components)
Animation curves change values over time and drive the Joint’s attributes (rotation, translation, etc.).
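The attributes listed above could be grouped into a record like the one sketched below, with an animation curve supplying one of the values at a given time. All names here (`JointAttributes`, `RotationOrder`, `Key`, `evaluate`) are hypothetical, and the linear interpolation between keys is an assumption made for brevity; the slides do not say how DWA's curves are evaluated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Order in which the three Euler rotations are applied.
enum class RotationOrder { XYZ, XZY, YXZ, YZX, ZXY, ZYX };

struct Vec3 { double x, y, z; };

// Hypothetical per-joint attributes, all relative to the parent joint (Local Space).
struct JointAttributes {
    Vec3 rotation{0.0, 0.0, 0.0};     // Euler angles around X, Y, and Z
    RotationOrder order = RotationOrder::XYZ;
    Vec3 scale{1.0, 1.0, 1.0};        // scale of the X, Y, and Z axes
    Vec3 shear{0.0, 0.0, 0.0};        // shear along X, Y, and Z
    Vec3 translation{0.0, 0.0, 0.0};  // X, Y, and Z components
};

// One key of an animation curve: a value pinned at a point in time.
struct Key { double time; double value; };

// Evaluate a curve at `time`. Assumes keys are sorted by strictly increasing
// time and uses linear interpolation (an assumption, not the DWA method).
double evaluate(const std::vector<Key>& keys, double time) {
    assert(!keys.empty());
    if (time <= keys.front().time) return keys.front().value;
    if (time >= keys.back().time) return keys.back().value;
    for (std::size_t i = 1; i < keys.size(); ++i) {
        if (time <= keys[i].time) {
            const Key& a = keys[i - 1];
            const Key& b = keys[i];
            const double t = (time - a.time) / (b.time - a.time);
            return a.value + t * (b.value - a.value);
        }
    }
    return keys.back().value;
}
```

In use, a curve would drive one attribute per frame, for example `joint.translation.x = evaluate(curve, frameTime);`.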
12.
How does the skeleton influence the skin?
Deformers, which compute the final 3d vertices of a character’s skin, need a “Frame” of reference to apply offsets from.
The “World Space” Position and Orientation of the Joints from the Hierarchy (skeleton) provide that “Frame” of reference.
13.
Representing a “Frame” of reference
A 4x4 Matrix can represent the Position and Orientation of a Joint in World Space.
When used in this manner, the 4x4 Matrix is commonly referred to as a 3d transform (x-form).
A 4x4 Matrix is typically implemented literally as a 4x4 array of floating point values.

    struct Matrix4x4 {
        double m[4][4];
    };
14.
Why a 4x4 Matrix?
Rotation, Scale, Shear, and Translation can all be represented as 4x4 Matrices.
Multiple 4x4 Matrices can be concatenated (multiplied) together into a single 4x4 matrix.
3d points and 3d vectors (offsets) can be multiplied through a 4x4 Matrix to transform them to the “World Space” position and orientation that the matrix represents.
For each Joint
– matrices representing Scale, Shear, Rotation, and Translation are combined together into a single “Local Space” 4x4 matrix.
15.
How To Calculate The World Space Transform Of A Joint?
By recursively combining the “Local Space” transform of a Joint with its parent Joint’s “Local Space” until the root of the hierarchy is reached, a 4x4 matrix can be accumulated that represents the World Space of that Joint.
As there are many joints, it pays off to cache a “World Space” 4x4 Matrix at each joint, so that a recursive walk up the hierarchy can stop early if a clean “World Space” has been cached.
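The cached recursive walk described on this slide can be sketched roughly as follows. This is an illustrative sketch only, not DWA's Hierarchy code: the `Joint` layout, the `worldClean` flag, the `invalidate()` propagation to children, and the `recomputeCount` instrumentation are all assumptions made for the example. It follows the slides' row-vector convention (translation in the last row), so a Joint's World Space is its Local Space multiplied by its parent's World Space.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Matrix4x4 {
    double m[4][4];
};

// Translation matrix in the row-vector convention used by the slides.
inline Matrix4x4 makeTranslation(double x, double y, double z) {
    Matrix4x4 t = {{{1.0, 0.0, 0.0, 0.0},
                    {0.0, 1.0, 0.0, 0.0},
                    {0.0, 0.0, 1.0, 0.0},
                    {x, y, z, 1.0}}};
    return t;
}

// General 4x4 concatenation (a * b).
inline Matrix4x4 multiply(const Matrix4x4& a, const Matrix4x4& b) {
    Matrix4x4 r;
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            double sum = 0.0;
            for (int k = 0; k < 4; ++k) sum += a.m[i][k] * b.m[k][j];
            r.m[i][j] = sum;
        }
    }
    return r;
}

// Hypothetical Joint: caches its World Space and recomputes only when dirty.
struct Joint {
    Joint* parent;
    std::vector<Joint*> children;
    Matrix4x4 local;
    Matrix4x4 cachedWorld;
    bool worldClean;
    int recomputeCount;  // instrumentation for this example only

    explicit Joint(const Matrix4x4& localSpace, Joint* parentJoint = nullptr)
        : parent(parentJoint), local(localSpace), cachedWorld(localSpace),
          worldClean(false), recomputeCount(0) {
        if (parent) parent->children.push_back(this);
    }

    // Changing a Joint's Local Space dirties it and everything below it.
    void setLocal(const Matrix4x4& localSpace) {
        local = localSpace;
        invalidate();
    }

    void invalidate() {
        worldClean = false;
        for (std::size_t i = 0; i < children.size(); ++i) children[i]->invalidate();
    }

    // Walk up the hierarchy, stopping at the first clean cached World Space.
    const Matrix4x4& worldSpace() {
        if (!worldClean) {
            cachedWorld = parent ? multiply(local, parent->worldSpace()) : local;
            worldClean = true;
            ++recomputeCount;
        }
        return cachedWorld;
    }
};
```

In this sketch, changing only a leaf Joint leaves its ancestors' caches clean, so the next `worldSpace()` call on the leaf performs a single matrix multiply instead of re-walking to the root.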
16.
Hierarchy is the core of DWA’s Motion System
Thousands of operations on Hierarchy objects build up a complex skeleton.
– Imagine how many bones are used to represent a 4 legged creature with a tail & wings.
Each time step, thousands of Joint attributes change, invalidating a Hierarchy’s cached World Space and Local Space transforms.
Due to the recursion, there is little opportunity for data vectorization or threading.
17.
Motion System Is On The Critical Path
Despite heavy parallelization of the Deformation System (green & yellow), it can’t start until the Motion System (red) finishes assembling a Hierarchy.
18.
Wall Time Spent in Each Category
Motion System dwarfs the other systems.
Amdahl’s law keeps our threading & vectorization improvements in the Deformation System from having a larger overall impact.
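For reference, Amdahl's law in its usual form is given below. The symbols are introduced here for illustration and do not appear in the slides: p is the fraction of wall time spent in the part being accelerated, and s is the speedup achieved on that part.

```latex
S_{\text{overall}} = \frac{1}{(1 - p) + \dfrac{p}{s}},
\qquad
\lim_{s \to \infty} S_{\text{overall}} = \frac{1}{1 - p}
```

As a purely hypothetical example, if only half of the wall time (p = 0.5) sits in the accelerated part, the overall speedup can never exceed 2x however large s becomes, which is why work on the serial Motion System matters.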
19.
Time Spent inside each type of Operator
“hier_apply_fk_around_pivot” is the hottest operator
– Operates on a Hierarchy
– Verified in Intel® VTune™ Amplifier XE
Several other “hier” related operations take up other top hot spots.
20.
Matrix Concatenation (Multiplication)
Typical implementation
– Loop over rows
– Loop over columns
– Compute each result element by multiplying one row of the first matrix across one column of the other
Simple enough, but how much work did we really just do?

    struct Matrix4x4 {
        double m[4][4];

        // Concatenate this matrix with iOther (this * iOther).
        Matrix4x4 operator*(const Matrix4x4 &iOther) const {
            Matrix4x4 result;
            for (int r = 0; r < 4; ++r) {
                for (int c = 0; c < 4; ++c) {
                    double sum = 0.0;
                    for (int k = 0; k < 4; ++k) {
                        sum += m[r][k] * iOther.m[k][c];
                    }
                    result.m[r][c] = sum;
                }
            }
            return result;
        }
    };
21.
Expensive Matrix Concatenation
– 64 Multiplies (double precision)
– 48 Additions (double precision)

    // Member of Matrix4x4: the same multiply with all loops expanded.
    Matrix4x4 operator * (const Matrix4x4 &iOther) {
        Matrix4x4 result;
        result.m[0][0] = m[0][0]*iOther.m[0][0] + m[0][1]*iOther.m[1][0]
                       + m[0][2]*iOther.m[2][0] + m[0][3]*iOther.m[3][0];
        result.m[0][1] = m[0][0]*iOther.m[0][1] + m[0][1]*iOther.m[1][1]
                       + m[0][2]*iOther.m[2][1] + m[0][3]*iOther.m[3][1];
        result.m[0][2] = m[0][0]*iOther.m[0][2] + m[0][1]*iOther.m[1][2]
                       + m[0][2]*iOther.m[2][2] + m[0][3]*iOther.m[3][2];
        result.m[0][3] = m[0][0]*iOther.m[0][3] + m[0][1]*iOther.m[1][3]
                       + m[0][2]*iOther.m[2][3] + m[0][3]*iOther.m[3][3];
        result.m[1][0] = m[1][0]*iOther.m[0][0] + m[1][1]*iOther.m[1][0]
                       + m[1][2]*iOther.m[2][0] + m[1][3]*iOther.m[3][0];
        result.m[1][1] = m[1][0]*iOther.m[0][1] + m[1][1]*iOther.m[1][1]
                       + m[1][2]*iOther.m[2][1] + m[1][3]*iOther.m[3][1];
        result.m[1][2] = m[1][0]*iOther.m[0][2] + m[1][1]*iOther.m[1][2]
                       + m[1][2]*iOther.m[2][2] + m[1][3]*iOther.m[3][2];
        result.m[1][3] = m[1][0]*iOther.m[0][3] + m[1][1]*iOther.m[1][3]
                       + m[1][2]*iOther.m[2][3] + m[1][3]*iOther.m[3][3];
        result.m[2][0] = m[2][0]*iOther.m[0][0] + m[2][1]*iOther.m[1][0]
                       + m[2][2]*iOther.m[2][0] + m[2][3]*iOther.m[3][0];
        result.m[2][1] = m[2][0]*iOther.m[0][1] + m[2][1]*iOther.m[1][1]
                       + m[2][2]*iOther.m[2][1] + m[2][3]*iOther.m[3][1];
        result.m[2][2] = m[2][0]*iOther.m[0][2] + m[2][1]*iOther.m[1][2]
                       + m[2][2]*iOther.m[2][2] + m[2][3]*iOther.m[3][2];
        result.m[2][3] = m[2][0]*iOther.m[0][3] + m[2][1]*iOther.m[1][3]
                       + m[2][2]*iOther.m[2][3] + m[2][3]*iOther.m[3][3];
        result.m[3][0] = m[3][0]*iOther.m[0][0] + m[3][1]*iOther.m[1][0]
                       + m[3][2]*iOther.m[2][0] + m[3][3]*iOther.m[3][0];
        result.m[3][1] = m[3][0]*iOther.m[0][1] + m[3][1]*iOther.m[1][1]
                       + m[3][2]*iOther.m[2][1] + m[3][3]*iOther.m[3][1];
        result.m[3][2] = m[3][0]*iOther.m[0][2] + m[3][1]*iOther.m[1][2]
                       + m[3][2]*iOther.m[2][2] + m[3][3]*iOther.m[3][2];
        result.m[3][3] = m[3][0]*iOther.m[0][3] + m[3][1]*iOther.m[1][3]
                       + m[3][2]*iOther.m[2][3] + m[3][3]*iOther.m[3][3];
        return result;
    }
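The operation counts quoted on this slide follow from simple arithmetic: each of the 16 result elements is a four-term row-by-column product, which takes four multiplies and three additions. The last two lines extend the same count to a chain of five concatenations, matching the totals the deck quotes later for a six-transform chain.

```latex
\begin{aligned}
\text{multiplies per concatenation} &= 16 \times 4 = 64 \\
\text{additions per concatenation}  &= 16 \times 3 = 48 \\
\text{multiplies for 5 concatenations} &= 5 \times 64 = 320 \\
\text{additions for 5 concatenations}  &= 5 \times 48 = 240
\end{aligned}
```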
22.
Are Any of Those 16 Matrix Values Known At Compile Time?
Good news! YES!
If you knew the exact transform a 4x4 matrix was representing, you would know quite a few 0 and 1 values at compile time.

Identity
    [1][0][0][0]
    [0][1][0][0]
    [0][0][1][0]
    [0][0][0][1]

Translation(x,y,z)
    [1][0][0][0]
    [0][1][0][0]
    [0][0][1][0]
    [x][y][z][1]

Shear(x,y,z)
    [1][0][0][0]
    [x][1][0][0]
    [y][z][1][0]
    [0][0][0][1]

Scale(x,y,z)
    [x][0][0][0]
    [0][y][0][0]
    [0][0][z][0]
    [0][0][0][1]
23.
What About Rotations?
Building rotation matrices is more expensive because of the need to call sine and cosine on the angle.
Rotations also have 0 and 1 values.

let s = sine(angle)
let c = cosine(angle)

Rotate X axis(angle)
    [1][0][0][0]
    [0][c][s][0]
    [0][-s][c][0]
    [0][0][0][1]

Rotate Y axis(angle)
    [c][0][-s][0]
    [0][1][0][0]
    [s][0][c][0]
    [0][0][0][1]

Rotate Z axis(angle)
    [c][s][0][0]
    [-s][c][0][0]
    [0][0][1][0]
    [0][0][0][1]
24.
Huge Optimization Potential!
Unfortunately, the matrix multiply method doesn’t know that the 4x4 Matrix it was passed has any 0 or 1 values
– So it cannot avoid performing math operations.
Even if we had separate classes to represent the different transformations and multiple versions of the matrix multiply method for each
– The result becomes a general 4x4 matrix.
– Chains of multiplication would only benefit on the 1st multiply operation.
25.
Can we expand the math by hand?
Pseudo algorithm to compute a Joint’s World Space
– 10 4x4 matrix multiplications
– 1 matrix inversion (very expensive) in the middle

    JointWorldSpace = Scale*Shear*
                      ParentScale*ParentShear*
                      RotZ*RotY*RotX*
                      ((ParentScale*ParentShear).inverse())*
                      Translate*
                      ParentWorldSpace;

YES… But you won’t even want to try.
Good luck getting the expanded math right.
26.
Ideal Transform Behavior
Must keep high level representation of algorithm
Perform the absolute minimum required number of math operations
– It must track known values
– Continue tracking values through matrix multiplications
Utilize known information to provide a cheaper alternative to full matrix inversions
Interface/Adapt to existing 4x4 Matrix data types
27.
Introducing Xform Building Blocks (XBB)
C++ library to enable composition of 3d transforms
Instead of a general purpose 4x4 matrix, it provides specific types for different transforms.
Tracks known values through multiplication chains
Deferred Evaluation
Only localized source code changes are required to take advantage of it
28.
XBB Scale, Shear3, & Translation
Before:

    ref::Matrix4x4 S;
    S.makeScale(scaleX, scaleY, scaleZ);
    ref::Matrix4x4 SH;
    SH.makeShear3(shearX, shearY, shearZ);
    ref::Matrix4x4 T;
    T.makeTranslation(transX, transY, transZ);

– 128 Bytes of Stack Used Per 4x4 Matrix
– Overhead to initialize to Identity(), then overwrite elements

After XBB:

    xbb::Scale S(scaleX, scaleY, scaleZ);
    xbb::Shear3 SH(shearX, shearY, shearZ);
    xbb::Translation T(transX, transY, transZ);

– 24 Bytes of Stack
– No overhead to initialize 4x4 elements that are known to be 0 or 1 for each type of transform
29.
XBB Transform Representation
Stores only non-constant data needed to represent a 4x4 matrix of the transform type
Provides methods for element level access to a 4x4 matrix
– Return known constant values

Translation(x,y,z)
    [1][0][0][0]
    [0][1][0][0]
    [0][0][1][0]
    [x][y][z][1]

    struct Translation {
        double x;
        double y;
        double z;
        …
        double e00() const { return 1.0; }
        double e01() const { return 0.0; }
        double e02() const { return 0.0; }
        double e03() const { return 0.0; }
        double e10() const { return 0.0; }
        double e11() const { return 1.0; }
        double e12() const { return 0.0; }
        double e13() const { return 0.0; }
        double e20() const { return 0.0; }
        double e21() const { return 0.0; }
        double e22() const { return 1.0; }
        double e23() const { return 0.0; }
        double e30() const { return x; }
        double e31() const { return y; }
        double e32() const { return z; }
        double e33() const { return 1.0; }
    };
30.
XBB Transform Constancy
Each transform identifies whether each 4x4 matrix element is a constant 0, a constant 1, or Not Constant
Constancy is suitable as a template parameter
– Matrix Multiply will make use of it

    enum Constancy {
        ConstantZero,
        ConstantOne,
        NotConstant
    };

Translation(x,y,z)
    [1][0][0][0]
    [0][1][0][0]
    [0][0][1][0]
    [x][y][z][1]

    static const Constancy c00 = ConstantOne;
    static const Constancy c01 = ConstantZero;
    static const Constancy c02 = ConstantZero;
    static const Constancy c03 = ConstantZero;
    static const Constancy c10 = ConstantZero;
    static const Constancy c11 = ConstantOne;
    static const Constancy c12 = ConstantZero;
    static const Constancy c13 = ConstantZero;
    static const Constancy c20 = ConstantZero;
    static const Constancy c21 = ConstantZero;
    static const Constancy c22 = ConstantOne;
    static const Constancy c23 = ConstantZero;
    static const Constancy c30 = NotConstant;
    static const Constancy c31 = NotConstant;
    static const Constancy c32 = NotConstant;
    static const Constancy c33 = ConstantOne;
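The pairing of element values with per-element Constancy can be condensed into a small compilable sketch. It is not XBB's actual interface: instead of the slides' sixteen `eRC()` accessors and `cRC` constants, it uses two hypothetical indexed functions, `element(r, c)` and `constancy(r, c)`, purely to keep the example short. `countNotConstant` is likewise an illustrative helper.

```cpp
#include <cassert>

enum Constancy { ConstantZero, ConstantOne, NotConstant };

// Hypothetical compact variant of the slides' Translation transform.
struct Translation {
    double x, y, z;  // the only stored data: 3 doubles

    constexpr Translation(double ix, double iy, double iz) : x(ix), y(iy), z(iz) {}

    // Constancy of logical 4x4 element (r, c), usable at compile time.
    static constexpr Constancy constancy(int r, int c) {
        return (r == 3 && c < 3) ? NotConstant
             : (r == c)          ? ConstantOne
                                 : ConstantZero;
    }

    // Value of logical 4x4 element (r, c); constants are returned directly.
    constexpr double element(int r, int c) const {
        return (r == 3 && c == 0) ? x
             : (r == 3 && c == 1) ? y
             : (r == 3 && c == 2) ? z
             : (r == c)           ? 1.0
                                  : 0.0;
    }
};

// Counts how many of the 16 logical elements actually need storage.
constexpr int countNotConstant(int index = 0) {
    return index == 16
        ? 0
        : (Translation::constancy(index / 4, index % 4) == NotConstant ? 1 : 0)
              + countNotConstant(index + 1);
}

static_assert(sizeof(Translation) == 3 * sizeof(double),
              "only the non-constant elements are stored");
static_assert(countNotConstant() == 3, "a translation has 3 non-constant elements");
static_assert(Translation::constancy(3, 3) == ConstantOne, "corner element is 1");
```

Because `constancy` is a constant expression, its results can feed template parameters, which is the property the slide relies on for the multiply.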
31.
XBB Rotations
Before:

    ref::Matrix4x4 Rx;
    Rx.makeRotationX(rotX);
    ref::Matrix4x4 Ry;
    Ry.makeRotationY(rotY);
    ref::Matrix4x4 Rz;
    Rz.makeRotationZ(rotZ);

– 128 Bytes of Stack Used Per 4x4 Matrix
– Overhead to initialize to Identity(), then overwrite elements

After XBB:

    xbb::RotationX Rx(rotX);
    xbb::RotationY Ry(rotY);
    xbb::RotationZ Rz(rotZ);

– 16 Bytes of Stack
– No overhead to initialize 4x4 elements that are known to be 0 or 1 for each type of transform
(Diagram labels: sine(angle), cosine(angle))
32.
XBB Rotation Representation
Stores the sine and cosine of the angle, not the angle itself.
Provides methods for element level access to a 4x4 matrix
– Return known constant values

Rotate X axis(angle)
    [1][0][0][0]
    [0][c][s][0]
    [0][-s][c][0]
    [0][0][0][1]

    struct RotationX {
        double cosineOfAngle;
        double sineOfAngle;
        …
        double e00() const { return 1.0; }
        double e01() const { return 0.0; }
        double e02() const { return 0.0; }
        double e03() const { return 0.0; }
        double e10() const { return 0.0; }
        double e11() const { return cosineOfAngle; }
        double e12() const { return sineOfAngle; }
        double e13() const { return 0.0; }
        double e20() const { return 0.0; }
        double e21() const { return -sineOfAngle; }
        double e22() const { return cosineOfAngle; }
        double e23() const { return 0.0; }
        double e30() const { return 0.0; }
        double e31() const { return 0.0; }
        double e32() const { return 0.0; }
        double e33() const { return 1.0; }
    };
33.
XBB Multiply
Before:

    ref::Matrix4x4 SxSH;
    SxSH = S*SH;

After XBB:

    auto SxSH = S*SH;
    xbb::Matrix4x3 SxSH_Matrix;
    SxSH.to(SxSH_Matrix);

– No Math is performed. Instead, a new type Multiply<Scale, Shear3> is returned.
– Math is deferred until you explicitly export to a general purpose matrix.
– XBB’s Multiply uses the Constancy of its template parameters to define its own Constancy values.
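The deferral itself can be illustrated with a minimal expression-template sketch. Everything here lives in a hypothetical `sketch` namespace and only mirrors the names on the slide (`Scale`, `Shear3`, `Multiply`, `to`); the runtime `e(r, c)` accessor is an assumption made to keep the example short, and the sketch deliberately omits the Constancy tracking that lets XBB drop unnecessary math.

```cpp
#include <cassert>

namespace sketch {

struct Matrix4x4 {
    double m[4][4];
};

struct Scale {
    double x, y, z;
    Scale(double ix, double iy, double iz) : x(ix), y(iy), z(iz) {}
    // Logical 4x4 element: diagonal (x, y, z, 1), zero elsewhere.
    double e(int r, int c) const {
        if (r != c) return 0.0;
        return r == 0 ? x : r == 1 ? y : r == 2 ? z : 1.0;
    }
};

struct Shear3 {
    double x, y, z;
    Shear3(double ix, double iy, double iz) : x(ix), y(iy), z(iz) {}
    // Logical 4x4 element, laid out as in the slides' Shear(x,y,z) matrix.
    double e(int r, int c) const {
        if (r == c) return 1.0;
        if (r == 1 && c == 0) return x;
        if (r == 2 && c == 0) return y;
        if (r == 2 && c == 1) return z;
        return 0.0;
    }
};

// Represents "left * right" without computing anything yet.
template <class L, class R>
struct Multiply {
    L left;
    R right;
    Multiply(const L& l, const R& r) : left(l), right(r) {}

    // Element (r, c) of the product, computed on demand.
    double e(int r, int c) const {
        double sum = 0.0;
        for (int k = 0; k < 4; ++k) sum += left.e(r, k) * right.e(k, c);
        return sum;
    }

    // The math only happens when exporting to a general purpose matrix.
    void to(Matrix4x4& out) const {
        for (int r = 0; r < 4; ++r)
            for (int c = 0; c < 4; ++c) out.m[r][c] = e(r, c);
    }
};

// operator* only builds the Multiply type; found via argument-dependent lookup.
template <class L, class R>
Multiply<L, R> operator*(const L& l, const R& r) {
    return Multiply<L, R>(l, r);
}

}  // namespace sketch
```

Note that this naive on-demand `e()` re-evaluates nested products for every element of a long chain; the `reduce()` step shown later in the deck evaluates each sub-product once.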
34.
Multiplication Chains
Before:

    ref::Matrix4x4 jointLocalSpace;
    jointLocalSpace = S*SH*Rz*Ry*Rx*T;

– 5 matrix multiplications: 320 multiplications, 240 adds

After XBB:

    xbb::Matrix4x3 jointLocalSpace;
    (S*SH*Rz*Ry*Rx*T).to(jointLocalSpace);

– The chain has the type Multiply<Multiply<Multiply<Multiply<Multiply<Scale, Shear3>, RotationZ>, RotationY>, RotationX>, Translation>
– Confirmed assembly has minimum math operations
– Speedup 2.45x
35.
Deferred Evaluation (reduce)
ReducedMatrix based on a transform’s Constancy.
– Only has data members for NotConstant matrix elements

    typedef ReducedMatrix <
        c00, c01, c02, c03,
        c10, c11, c12, c13,
        c20, c21, c22, c23,
        c30, c31, c32, c33
    > ReducedType;

Multiply’s reduce recursively expands its left and right operands
– Expands out entire multiplication chain
4x4 elements setByMatrixMultiply
– Actually multiplies a column by row
– Knows Constancy of the elements from reduced left and right transforms
Using template specialization based on the Constancy
– Only exact terms necessary are accessed
– Emits only necessary multiplications & additions

    ReducedType Multiply::reduce() const
    {
        const auto tl = left.reduce();
        const auto tr = right.reduce();
        ReducedType r;
        r.setByMatrixMultiply<0,0>(tl,tr);
        r.setByMatrixMultiply<0,1>(tl,tr);
        ...
        r.setByMatrixMultiply<3,3>(tl,tr);
        return r;
    }
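One way to picture the specialization step is to classify each term of a row-by-column product from the Constancy of its two factors. The sketch below is an assumption-laden illustration, not XBB's `setByMatrixMultiply`: `TermKind`, `classify`, `Term`, and `Dot4` are hypothetical names, and the sketch only selects the per-term expression and counts multiplies; it does not structurally remove the additions of zero terms the way the slide describes.

```cpp
#include <cassert>

enum Constancy { ConstantZero, ConstantOne, NotConstant };

// What a single term a*b reduces to, given the Constancy of a and b.
enum TermKind { TermZero, TermOne, TermLeft, TermRight, TermProduct };

constexpr TermKind classify(Constancy a, Constancy b) {
    return (a == ConstantZero || b == ConstantZero) ? TermZero
         : (a == ConstantOne && b == ConstantOne)   ? TermOne
         : (a == ConstantOne)                       ? TermRight
         : (b == ConstantOne)                       ? TermLeft
                                                    : TermProduct;
}

// Template specialization picks the cheapest expression for each kind.
template <TermKind K> struct Term;

template <> struct Term<TermZero> {
    static const int multiplies = 0;
    static double eval(double, double) { return 0.0; }
};
template <> struct Term<TermOne> {
    static const int multiplies = 0;
    static double eval(double, double) { return 1.0; }
};
template <> struct Term<TermLeft> {
    static const int multiplies = 0;
    static double eval(double a, double) { return a; }
};
template <> struct Term<TermRight> {
    static const int multiplies = 0;
    static double eval(double, double b) { return b; }
};
template <> struct Term<TermProduct> {
    static const int multiplies = 1;
    static double eval(double a, double b) { return a * b; }
};

// Four-term row-by-column product with per-term Constancy (A* left, B* right).
template <Constancy A0, Constancy B0, Constancy A1, Constancy B1,
          Constancy A2, Constancy B2, Constancy A3, Constancy B3>
struct Dot4 {
    static const int multiplies =
        Term<classify(A0, B0)>::multiplies + Term<classify(A1, B1)>::multiplies +
        Term<classify(A2, B2)>::multiplies + Term<classify(A3, B3)>::multiplies;

    static double eval(const double a[4], const double b[4]) {
        return Term<classify(A0, B0)>::eval(a[0], b[0]) +
               Term<classify(A1, B1)>::eval(a[1], b[1]) +
               Term<classify(A2, B2)>::eval(a[2], b[2]) +
               Term<classify(A3, B3)>::eval(a[3], b[3]);
    }
};

// Element (3,0) of Translation * Translation: row [x,y,z,1] times column [1,0,0,x].
typedef Dot4<NotConstant, ConstantOne, NotConstant, ConstantZero,
             NotConstant, ConstantZero, ConstantOne, NotConstant> TranslationDot30;

// The same element for two general matrices.
typedef Dot4<NotConstant, NotConstant, NotConstant, NotConstant,
             NotConstant, NotConstant, NotConstant, NotConstant> GeneralDot;

static_assert(TranslationDot30::multiplies == 0, "no multiplies are needed");
static_assert(GeneralDot::multiplies == 4, "general case needs all four");
```

For two translations, the (3,0) element collapses to the sum of the two x components, so the multiply count drops from four to zero.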
36.
XBB: Cached Rotations
Many Hierarchy operations change only Translation of a Joint.
– If we could cache the Rotation transforms, then many expensive sin/cos calls could be avoided.
– Matrix4x4 is too big (128 bytes) to cache one for each Rotation X, Y, and Z.
XBB rotations are only 16 bytes each
– Small enough to cache inside the Joint object

    (S*SH*cached.Rz*cached.Ry*cached.Rx*T).to(jointLocalSpace);

Use Cached Sin/Cos of Angles: Speedup 12.71x
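A minimal sketch of the caching idea follows, assuming a hypothetical `JointSketch` that keeps one `RotationX` alongside its translation. The `trigEvaluations` counter is instrumentation added for the example and is not part of XBB; only the two-double layout of `RotationX` comes from the slides.

```cpp
#include <cassert>
#include <cmath>

// Example-only instrumentation: counts how often sin/cos are evaluated.
int trigEvaluations = 0;

// Two doubles: the sine and cosine of the angle, not the angle itself.
struct RotationX {
    double cosineOfAngle;
    double sineOfAngle;

    explicit RotationX(double angle)
        : cosineOfAngle(std::cos(angle)), sineOfAngle(std::sin(angle)) {
        ++trigEvaluations;
    }
};

static_assert(sizeof(RotationX) == 2 * sizeof(double),
              "small enough to cache inside a joint");

// Hypothetical joint that caches its X rotation next to its translation.
struct JointSketch {
    double tx, ty, tz;
    RotationX cachedRx;

    explicit JointSketch(double rotX) : tx(0.0), ty(0.0), tz(0.0), cachedRx(rotX) {}

    // Only an actual rotation change pays for sin/cos again.
    void setRotationX(double rotX) { cachedRx = RotationX(rotX); }

    // Translation changes leave the cached rotation untouched.
    void setTranslation(double x, double y, double z) {
        tx = x;
        ty = y;
        tz = z;
    }
};
```

In this sketch, any number of translation updates costs no trigonometry, which is the effect the slide attributes to caching the XBB rotations in the Joint.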
37.
XBB Identity & Transpose
Identity is free in any multiplication chain
– Optimized out entirely
– Only 1 byte of stack space (empty struct)

    Identity id;
    (S*SH*id*R*T).to(result);

Transpose is free in any multiplication chain
– Deferred evaluation pulls results out in different order
– No additional math or data movement

    (S*SH*R*T).transpose().to(result);
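Both properties can be mimicked in a few lines, under the assumption of a runtime `e(r, c)` accessor (a hypothetical simplification, not XBB's interface). `Identity` holds no data, and a `Transposed` wrapper simply swaps the indices when an element is read, so no values are moved or recomputed.

```cpp
#include <cassert>

// Empty struct: all 16 logical elements are compile-time constants.
struct Identity {
    double e(int r, int c) const { return r == c ? 1.0 : 0.0; }
};

static_assert(sizeof(Identity) == 1, "an empty struct occupies one byte on common ABIs");

struct Translation {
    double x, y, z;
    // Logical 4x4 element with the translation stored in the last row.
    double e(int r, int c) const {
        if (r == 3 && c == 0) return x;
        if (r == 3 && c == 1) return y;
        if (r == 3 && c == 2) return z;
        return r == c ? 1.0 : 0.0;
    }
};

// Hypothetical view that reads element (r, c) from (c, r) of the wrapped transform.
template <class T>
struct Transposed {
    T inner;
    explicit Transposed(const T& t) : inner(t) {}
    double e(int r, int c) const { return inner.e(c, r); }
};

template <class T>
Transposed<T> transpose(const T& t) {
    return Transposed<T>(t);
}
```

The wrapper is the same size as the transform it wraps, which matches the slide's point that transposing needs no additional data movement.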
38.
Before: Inverse of (Scale*Shear)

    inverseOfSxSH = (S*SH).inverse();

Inverse is very expensive
– Determinant
– Cofactor
– Transpose
– Division
– Scalar matrix multiply
39.
After XBB: Inverse of (Scale*Shear)

    (S*SH).inverse().to(inverseOfSxSH);

MAGIC happens
– Inverse becomes part of deferred evaluation!
Because we have a representation of the multiplication chain
– we can move the inverse inside the multiplication chain and reverse its order

    (SH.inverse()*S.inverse()).to(inverseOfSxSH);

Inverse of most transform primitives is free
– except Scale, which costs 3 divisions
During deferred evaluation
– the logical 4x4 matrix values are reordered and their signs flipped where needed to represent the inverse
Speedup 6.43x
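The order-reversal identity behind this slide, (A*B)⁻¹ = B⁻¹*A⁻¹, can be checked with a small sketch. To keep the primitive inverses obvious, the example uses Scale and Translation rather than the slide's Scale*Shear pair: the Translation inverse is a sign flip and the Scale inverse costs three divisions. The free functions `inverse`, `toMatrix`, `multiply`, and `isIdentity` are hypothetical helpers written for this example, not XBB calls.

```cpp
#include <cassert>

struct Matrix4x4 { double m[4][4]; };
struct Scale { double x, y, z; };
struct Translation { double x, y, z; };

// Inverse of a Scale: three divisions.
inline Scale inverse(const Scale& s) {
    Scale r = {1.0 / s.x, 1.0 / s.y, 1.0 / s.z};
    return r;
}

// Inverse of a Translation: sign flips only.
inline Translation inverse(const Translation& t) {
    Translation r = {-t.x, -t.y, -t.z};
    return r;
}

inline Matrix4x4 identity() {
    Matrix4x4 r = {{{1.0, 0.0, 0.0, 0.0},
                    {0.0, 1.0, 0.0, 0.0},
                    {0.0, 0.0, 1.0, 0.0},
                    {0.0, 0.0, 0.0, 1.0}}};
    return r;
}

inline Matrix4x4 toMatrix(const Scale& s) {
    Matrix4x4 r = identity();
    r.m[0][0] = s.x;
    r.m[1][1] = s.y;
    r.m[2][2] = s.z;
    return r;
}

// Row-vector convention, as in the slides: translation in the last row.
inline Matrix4x4 toMatrix(const Translation& t) {
    Matrix4x4 r = identity();
    r.m[3][0] = t.x;
    r.m[3][1] = t.y;
    r.m[3][2] = t.z;
    return r;
}

inline Matrix4x4 multiply(const Matrix4x4& a, const Matrix4x4& b) {
    Matrix4x4 r;
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            double sum = 0.0;
            for (int k = 0; k < 4; ++k) sum += a.m[i][k] * b.m[k][j];
            r.m[i][j] = sum;
        }
    }
    return r;
}

// Exact comparison; adequate here because the example values are powers of two.
inline bool isIdentity(const Matrix4x4& a) {
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            if (a.m[r][c] != (r == c ? 1.0 : 0.0)) return false;
    return true;
}
```

Multiplying `S*T` by `inverse(T)*inverse(S)` yields the identity, without ever running a general 4x4 inversion (determinant, cofactors, and so on).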
40.
XBB Integration to DWA Motion System
Provide template specializations for adapters to map between DWA math classes and XBB’s.
– Allows XBB deferred evaluation directly into DWA matrix types
In many scenarios, the transforms could have been Identity based on logic inside the Joint.
– To take full advantage of XBB, we needed to know the exact types of the transforms involved.
Templatized the Hierarchy algorithm, making conditional logic controlled by template parameters, e.g.
– Order of Rotations
– Scale Propagation Mode
Specialized templates based on parameters to
– Use the correct type of XBB transform, Identity whenever possible
– Multiply the Rotations in the correct order
41.
XBB Integration to DWA Motion System (continued…)
Built a jump table with instances of the algorithm for all the different combinations of options and rotation orders.
– Used enums as indexes into a multi-dimensional array of function pointers to the corresponding algorithm instance to execute.
Used XBB for decomposing World Space Matrix4x4 into individual Joint attributes.
Rewrote expensive “hier_apply_fk_around_pivot” with XBB directly vs. going through Hierarchy object
– Avoids the high overhead of building a Hierarchy on the fly
Performed non XBB related optimizations
– Reduced dynamic memory allocation by replacing local std::vector<T> with stack based array when possible
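The jump-table technique can be sketched as below. The enums, the integer `Vec3`, and the quarter-turn "rotations" are all stand-ins invented for this example so that the results are exact and order-dependent; the real table covers DWA's actual rotation orders and options, which are not reproduced here.

```cpp
#include <cassert>

// Hypothetical runtime options that select a compile-time algorithm instance.
enum RotationOrder { OrderXY, OrderYX, NumRotationOrders };
enum ScaleMode { ScaleOff, ScaleOn, NumScaleModes };

struct Vec3 { int x, y, z; };

// Exact quarter turns (row-vector convention), used as non-commuting stand-ins.
inline Vec3 quarterTurnX(Vec3 v) { Vec3 r = {v.x, -v.z, v.y}; return r; }
inline Vec3 quarterTurnY(Vec3 v) { Vec3 r = {v.z, v.y, -v.x}; return r; }

// Rotation order resolved at compile time through specialization.
template <RotationOrder O> struct Rotate;
template <> struct Rotate<OrderXY> {
    static Vec3 apply(Vec3 v) { return quarterTurnY(quarterTurnX(v)); }
};
template <> struct Rotate<OrderYX> {
    static Vec3 apply(Vec3 v) { return quarterTurnX(quarterTurnY(v)); }
};

// Optional scale step, also resolved at compile time.
template <ScaleMode M> struct MaybeScale;
template <> struct MaybeScale<ScaleOff> {
    static Vec3 apply(Vec3 v) { return v; }
};
template <> struct MaybeScale<ScaleOn> {
    static Vec3 apply(Vec3 v) { Vec3 r = {2 * v.x, 2 * v.y, 2 * v.z}; return r; }
};

// One algorithm instance per combination; no branches inside the instance.
template <RotationOrder O, ScaleMode M>
Vec3 algorithm(Vec3 v) {
    return Rotate<O>::apply(MaybeScale<M>::apply(v));
}

typedef Vec3 (*AlgorithmFn)(Vec3);

// Enums index a multi-dimensional array of function pointers.
static const AlgorithmFn kJumpTable[NumRotationOrders][NumScaleModes] = {
    {&algorithm<OrderXY, ScaleOff>, &algorithm<OrderXY, ScaleOn>},
    {&algorithm<OrderYX, ScaleOff>, &algorithm<OrderYX, ScaleOn>},
};

inline Vec3 dispatch(RotationOrder order, ScaleMode mode, Vec3 v) {
    return kJumpTable[order][mode](v);
}
```

The runtime cost of choosing a variant is one indexed indirect call, after which the selected instance runs with its options fixed at compile time.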
42.
DWA Motion System Results
(Chart comparing Before and After XBB)
– hier_apply_fk_around_pivot Speedup: 2.8x
– Motion System Speedup: 1.6x
– Overall Speedup: 1.2x
43.
XBB DWA Motion System Scaling
Reducing the Critical Path helped Thread Scaling.
Reached goal of 30 fps on a single Avoton cartridge
44.
Call to Action
A good way to improve the impact of vectorization or threading is to reduce the amount of work being done outside those data parallel regions.
– Ideally do less work in the first place.
Complex optimization problems can be represented in C++ and presented back to the compiler in a form it can excel at optimizing.
– Expanding math by hand is untenable.
You can do much more with C++11/14 to encapsulate problems while retaining the original high level algorithm
– Look for optimization problems that might be representable at a higher level.
45.
Future Work
XBB has exactly the features required to support the DWA Motion System.
For general purpose use, more transformations and math operations might be required, e.g.
– Inverse of general 4x4 matrix
– Single precision version or template based data type
XBB can be licensed or potentially open sourced upon request.
– Could be of use to CAD, Animation Tools, and Gaming.
Contact Alex Wells (alex.m.wells@intel.com)