SlideShare a Scribd company logo
etnaviv: status update
Christian Gmeiner
2023-10-19
1
Who Am I?
Long term etnaviv developer
Doing consulting work for Igalia since June 2023
2
git log --since="1 year ago" --until="now"
3
Kernel
NPU support (Tomeu)
new HWDB entries
GC520 r5341 c204 (2D GPU found on the i.MX8MP
SoC)
GC7000 r6203 (NXP i.MX8MN SoC)
VIP8000 r7120 (Amlogic A311D)
VIP8000 Nano r8002 (NPU found on the NXP
i.MX8MP SoC)
Small fixes and improvments
4
Userspace
MSAA work by Lucas and me
(ETNA_MESA_DEBUG=msaa)
TS buffer sharing by Lucas
(ETNA_MESA_DEBUG=shared_ts)
Performance warnings (ETNA_MESA_DEBUG=perf)
Compiler work
Support negative float inline immediates
Improved lowerings
Improved linking (glsl-routing)
Some new extensions
5
CI
Has proven its value
6
CI
baremetal based
zero maintenance needed
7
CI - current devices
6x GC2000 based devices (imx6q)
2x GC3000 based devices (imx6qp)
4x GC7000L based device (imx8mq)
Only a handful are available as public runners.
8
CI - more boardfarms
David Heidelberg: librem5 devboards
Sam Ravnborg: i.MX6 Solo with GC880
Igalia HQ
With ci-tron everybody can jump in and provide some
HW.
9
The road torwards GLES3
With some roadblocks.
10
The road torwards GLES3
dEQP-GLES3.functional.shaders.texture_functions.*
Multiple Render Targets
11
The road towards GLES3
But then ...
nir: Transition away from nir_register and abs/neg/sat mo
happened and the problems started and I shifted my
focus.
12
The road towards GLES3
The master plan to use nir_legacy failed
Took me quite some time to get it working but
shader-db is sad
total instructions in shared programs: 228098 -> 249817 (
instructions in affected programs: 97892 -> 119611 (22.19
total temps in shared programs: 84665 -> 86158 (1.76%)
temps in affected programs: 6257 -> 7750 (23.86%)
total immediates in shared programs: 152272 -> 152148 (-0
immediates in affected programs: 2304 -> 2180 (-5.38%)
13
The road towards GLES3
Cloned nir_lower_to_source_mods with the following
differences:
we store the source mods in pass_flags
we do not try to saturate the destination
total instructions in shared programs: 234974 -> 235376 (
instructions in affected programs: 11481 -> 11883 (3.50%)
total temps in shared programs: 84891 -> 84891 (0.00%)
temps in affected programs: 0 -> 0
total immediates in shared programs: 154776 -> 154776 (0.
total immediates in shared programs: 154776 -> 154776 (0.
14
The road torwards GLES3
15
Digression: current compiler
uses NIR as backend IR
does some 64bit nir_const_value pollution
#define CONST_VAL(a, b) (nir_const_value) 
{.u64 = (uint64_t)(a) << 32 | (uint64_t)(b)}
#define CONST(x) CONST_VAL(ETNA_UNIFORM_CONSTANT, x
#define UNIFORM(x) CONST_VAL(ETNA_UNIFORM_UNIFORM, x
16
Digression: current compiler
blind flight late in the process
/* some custom NIR transformations */
...
...
/* call directly to avoid validation */
nir_convert_from_ssa(shader, true);
nir_trivialize_registers(shader);
17
Digression: current compiler
assembler is not aware of different instruction
encodings
case nir_intrinsic_load_global_etna: {
struct etna_inst inst = {
.opcode = INST_OPCODE_LOAD,
.type = inst_type_from_bitsize(c, ...),
.tex = {
.amode = INST_AMODE_ADD_A_W,
},
};
if (nir_src_is_const(intr->src[1])) {
inst.src[1].amode = INST_AMODE_ADD_A_Y;
inst.tex.swiz = 128;
}
18
Digression: current compiler
miss compilations
quite fragile
19
etnaviv backend compiler
20
etnaviv backend compiler
not a single line of rust
no SSA based register allocator (vec4 ISA)
uses isaspec
uses etnaviv's register allocator
has a backend IR and an optimizer
supports control flow
21
etnaviv backend compiler
cp src/asahi/compiler src/etnaviv/compiler -r
with some vec4 sugar on top
22
etnaviv backend compiler
kernel void rotate(global int* out, global int* in0, global int*
out[get_global_id(0)] =
rotate(in0[get_global_id(0)], in1[get_global_id(0)]);
}
23
etnaviv backend compiler
block0 {
3.x = add.s32 *12.x, u0.y
5.x = imullo0.s32 *3.x, u0.z
7.x = load.u32 u0.w [base address], 5.x [offset]
9.x = load.u32 u1.x [base address], 5.x [offset]
10.x = rotate.u32 *7.x, *9.x
store.u32 u1.y [base address], *5.x [offset], *10.x [v
}
24
etnaviv backend compiler
000 add.s32.pack t0.x___, t0.xyzw, void, u0.yyy
001 imullo0.s32.pack t0.x___, t0.xyzw, u0.zzzz, voi
002 load.denorm.u32 t0._y__, u0.wwww, t0.xyzw, voi
003 load.denorm.u32 t0.__z_, u1.xxxx, t0.xyzw, voi
004 rotate.u32.pack t0._y__, t0.yyyy, void, t0.zzz
005 store.denorm.u32 mem.x___, u1.yyyy, t0.xyzw, t0
25
Legacy love
Emulation with shaders
26
Legacy model love
etnaviv does support a wide range of GPU modules
found in different SoCs
availability of HW features fluctuates a lot
binary blob uses some emulation tricks to get the job
done
27
Legacy model love
Make use of more lowerings NIR provides (logicops, ..)
Provide own lowerings in NIR (bordercolor)
28
etnaviv is more alive then
ever
29
Q&A
30
31
Etnaviv: Status update     –    XDC 2023

More Related Content

Similar to Etnaviv: Status update –  XDC 2023

Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Gavin Guo
 
RISC-V and OpenPOWER open-ISA and open-HW - a swiss army knife for HPC
RISC-V  and OpenPOWER open-ISA and open-HW - a swiss army knife for HPCRISC-V  and OpenPOWER open-ISA and open-HW - a swiss army knife for HPC
RISC-V and OpenPOWER open-ISA and open-HW - a swiss army knife for HPC
Ganesan Narayanasamy
 
Certification
CertificationCertification
Certification
subhransu mishra
 
Sprint 140
Sprint 140Sprint 140
Sprint 140
ManageIQ
 
node-rpi-ws281x
node-rpi-ws281xnode-rpi-ws281x
node-rpi-ws281x
Martin Schuhfuß
 
Multicloud connectivity using OpenNHRP
Multicloud connectivity using OpenNHRPMulticloud connectivity using OpenNHRP
Multicloud connectivity using OpenNHRP
Bob Melander
 
Application Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systemsApplication Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systems
Ganesan Narayanasamy
 
Sprint 142
Sprint 142Sprint 142
Sprint 142
ManageIQ
 
vAccel: Interoperable Application Hardware Acceleration
vAccel: Interoperable Application Hardware AccelerationvAccel: Interoperable Application Hardware Acceleration
vAccel: Interoperable Application Hardware Acceleration
Carlos Reaño González
 
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Akihiro Hayashi
 
Ipv6 test plan for opnfv poc v2.2 spirent-vctlab
Ipv6 test plan for opnfv poc v2.2 spirent-vctlabIpv6 test plan for opnfv poc v2.2 spirent-vctlab
Ipv6 test plan for opnfv poc v2.2 spirent-vctlab
Iben Rodriguez
 
Debugging linux issues with eBPF
Debugging linux issues with eBPFDebugging linux issues with eBPF
Debugging linux issues with eBPF
Ivan Babrou
 
Linux kernel debugging(PDF format)
Linux kernel debugging(PDF format)Linux kernel debugging(PDF format)
Linux kernel debugging(PDF format)
yang firo
 
Linux kernel debugging(ODP format)
Linux kernel debugging(ODP format)Linux kernel debugging(ODP format)
Linux kernel debugging(ODP format)
yang firo
 
Sprint 145
Sprint 145Sprint 145
Sprint 145
ManageIQ
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVM
John Lee
 
Cryptography and secure systems
Cryptography and secure systemsCryptography and secure systems
Cryptography and secure systems
Vsevolod Stakhov
 
Introducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationIntroducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task Automation
ScyllaDB
 
SR-IOV ixgbe Driver Limitations and Improvement
SR-IOV ixgbe Driver Limitations and ImprovementSR-IOV ixgbe Driver Limitations and Improvement
SR-IOV ixgbe Driver Limitations and Improvement
LF Events
 
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Takahiro Harada
 

Similar to Etnaviv: Status update –  XDC 2023 (20)

Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
Spectre(v1%2 fv2%2fv4) v.s. meltdown(v3)
 
RISC-V and OpenPOWER open-ISA and open-HW - a swiss army knife for HPC
RISC-V  and OpenPOWER open-ISA and open-HW - a swiss army knife for HPCRISC-V  and OpenPOWER open-ISA and open-HW - a swiss army knife for HPC
RISC-V and OpenPOWER open-ISA and open-HW - a swiss army knife for HPC
 
Certification
CertificationCertification
Certification
 
Sprint 140
Sprint 140Sprint 140
Sprint 140
 
node-rpi-ws281x
node-rpi-ws281xnode-rpi-ws281x
node-rpi-ws281x
 
Multicloud connectivity using OpenNHRP
Multicloud connectivity using OpenNHRPMulticloud connectivity using OpenNHRP
Multicloud connectivity using OpenNHRP
 
Application Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systemsApplication Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systems
 
Sprint 142
Sprint 142Sprint 142
Sprint 142
 
vAccel: Interoperable Application Hardware Acceleration
vAccel: Interoperable Application Hardware AccelerationvAccel: Interoperable Application Hardware Acceleration
vAccel: Interoperable Application Hardware Acceleration
 
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator...
 
Ipv6 test plan for opnfv poc v2.2 spirent-vctlab
Ipv6 test plan for opnfv poc v2.2 spirent-vctlabIpv6 test plan for opnfv poc v2.2 spirent-vctlab
Ipv6 test plan for opnfv poc v2.2 spirent-vctlab
 
Debugging linux issues with eBPF
Debugging linux issues with eBPFDebugging linux issues with eBPF
Debugging linux issues with eBPF
 
Linux kernel debugging(PDF format)
Linux kernel debugging(PDF format)Linux kernel debugging(PDF format)
Linux kernel debugging(PDF format)
 
Linux kernel debugging(ODP format)
Linux kernel debugging(ODP format)Linux kernel debugging(ODP format)
Linux kernel debugging(ODP format)
 
Sprint 145
Sprint 145Sprint 145
Sprint 145
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVM
 
Cryptography and secure systems
Cryptography and secure systemsCryptography and secure systems
Cryptography and secure systems
 
Introducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationIntroducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task Automation
 
SR-IOV ixgbe Driver Limitations and Improvement
SR-IOV ixgbe Driver Limitations and ImprovementSR-IOV ixgbe Driver Limitations and Improvement
SR-IOV ixgbe Driver Limitations and Improvement
 
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
Introduction to Monte Carlo Ray Tracing, OpenCL Implementation (CEDEC 2014)
 

More from Igalia

The journey towards stabilizing Chromium’s Wayland support
The journey towards stabilizing Chromium’s Wayland supportThe journey towards stabilizing Chromium’s Wayland support
The journey towards stabilizing Chromium’s Wayland support
Igalia
 
Sustainable Futures: Funding the Web Ecosystem
Sustainable Futures: Funding the Web EcosystemSustainable Futures: Funding the Web Ecosystem
Sustainable Futures: Funding the Web Ecosystem
Igalia
 
Status of the Layer-Based SVG Engine in WebKit
Status of the Layer-Based SVG Engine in WebKitStatus of the Layer-Based SVG Engine in WebKit
Status of the Layer-Based SVG Engine in WebKit
Igalia
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
Igalia
 
Building End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPEBuilding End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPE
Igalia
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Igalia
 
Automated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded DevicesAutomated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded Devices
Igalia
 
Embedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to MaintenanceEmbedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to Maintenance
Igalia
 
Optimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdfOptimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdf
Igalia
 
Running JS via WASM faster with JIT
Running JS via WASM      faster with JITRunning JS via WASM      faster with JIT
Running JS via WASM faster with JIT
Igalia
 
To crash or not to crash: if you do, at least recover fast!
To crash or not to crash: if you do, at least recover fast!To crash or not to crash: if you do, at least recover fast!
To crash or not to crash: if you do, at least recover fast!
Igalia
 
Implementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamerImplementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamer
Igalia
 
8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa
Igalia
 
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por IgaliaIntroducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Igalia
 
2023 in Chimera Linux
2023 in Chimera                    Linux2023 in Chimera                    Linux
2023 in Chimera Linux
Igalia
 
Building a Linux distro with LLVM
Building a Linux distro        with LLVMBuilding a Linux distro        with LLVM
Building a Linux distro with LLVM
Igalia
 
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUsturnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
Igalia
 
Graphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devicesGraphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devices
Igalia
 
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOSDelegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Igalia
 
MessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the webMessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the web
Igalia
 

More from Igalia (20)

The journey towards stabilizing Chromium’s Wayland support
The journey towards stabilizing Chromium’s Wayland supportThe journey towards stabilizing Chromium’s Wayland support
The journey towards stabilizing Chromium’s Wayland support
 
Sustainable Futures: Funding the Web Ecosystem
Sustainable Futures: Funding the Web EcosystemSustainable Futures: Funding the Web Ecosystem
Sustainable Futures: Funding the Web Ecosystem
 
Status of the Layer-Based SVG Engine in WebKit
Status of the Layer-Based SVG Engine in WebKitStatus of the Layer-Based SVG Engine in WebKit
Status of the Layer-Based SVG Engine in WebKit
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Building End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPEBuilding End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPE
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Automated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded DevicesAutomated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded Devices
 
Embedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to MaintenanceEmbedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to Maintenance
 
Optimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdfOptimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdf
 
Running JS via WASM faster with JIT
Running JS via WASM      faster with JITRunning JS via WASM      faster with JIT
Running JS via WASM faster with JIT
 
To crash or not to crash: if you do, at least recover fast!
To crash or not to crash: if you do, at least recover fast!To crash or not to crash: if you do, at least recover fast!
To crash or not to crash: if you do, at least recover fast!
 
Implementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamerImplementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamer
 
8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa
 
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por IgaliaIntroducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
 
2023 in Chimera Linux
2023 in Chimera                    Linux2023 in Chimera                    Linux
2023 in Chimera Linux
 
Building a Linux distro with LLVM
Building a Linux distro        with LLVMBuilding a Linux distro        with LLVM
Building a Linux distro with LLVM
 
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUsturnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
 
Graphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devicesGraphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devices
 
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOSDelegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
 
MessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the webMessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the web
 

Recently uploaded

"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
Fwdays
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
 
From Natural Language to Structured Solr Queries using LLMs
From Natural Language to Structured Solr Queries using LLMsFrom Natural Language to Structured Solr Queries using LLMs
From Natural Language to Structured Solr Queries using LLMs
Sease
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
QA or the Highway - Component Testing: Bridging the gap between frontend appl...
QA or the Highway - Component Testing: Bridging the gap between frontend appl...QA or the Highway - Component Testing: Bridging the gap between frontend appl...
QA or the Highway - Component Testing: Bridging the gap between frontend appl...
zjhamm304
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
Ajin Abraham
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
ScyllaDB
 
What is an RPA CoE? Session 2 – CoE Roles
What is an RPA CoE?  Session 2 – CoE RolesWhat is an RPA CoE?  Session 2 – CoE Roles
What is an RPA CoE? Session 2 – CoE Roles
DianaGray10
 
High performance Serverless Java on AWS- GoTo Amsterdam 2024
High performance Serverless Java on AWS- GoTo Amsterdam 2024High performance Serverless Java on AWS- GoTo Amsterdam 2024
High performance Serverless Java on AWS- GoTo Amsterdam 2024
Vadym Kazulkin
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
christinelarrosa
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
c5vrf27qcz
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
inQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
inQuba Webinar Mastering Customer Journey Management with Dr Graham HillinQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
inQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
LizaNolte
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
Fwdays
 
Must Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during MigrationMust Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during Migration
Mydbops
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
Enterprise Knowledge
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
christinelarrosa
 

Recently uploaded (20)

"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
From Natural Language to Structured Solr Queries using LLMs
From Natural Language to Structured Solr Queries using LLMsFrom Natural Language to Structured Solr Queries using LLMs
From Natural Language to Structured Solr Queries using LLMs
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
QA or the Highway - Component Testing: Bridging the gap between frontend appl...
QA or the Highway - Component Testing: Bridging the gap between frontend appl...QA or the Highway - Component Testing: Bridging the gap between frontend appl...
QA or the Highway - Component Testing: Bridging the gap between frontend appl...
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
 
What is an RPA CoE? Session 2 – CoE Roles
What is an RPA CoE?  Session 2 – CoE RolesWhat is an RPA CoE?  Session 2 – CoE Roles
What is an RPA CoE? Session 2 – CoE Roles
 
High performance Serverless Java on AWS- GoTo Amsterdam 2024
High performance Serverless Java on AWS- GoTo Amsterdam 2024High performance Serverless Java on AWS- GoTo Amsterdam 2024
High performance Serverless Java on AWS- GoTo Amsterdam 2024
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
inQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
inQuba Webinar Mastering Customer Journey Management with Dr Graham HillinQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
inQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
 
Must Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during MigrationMust Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during Migration
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
 

Etnaviv: Status update –  XDC 2023

  • 1. etnaviv: status update Christian Gmeiner 2023-10-19 1
  • 2. Who Am I? Long term etnaviv developer Doing consulting work for Igalia since June 2023 2
  • 3. git log --since="1 year ago" --until="now" 3
  • 4. Kernel NPU support (Tomeu) new HWDB entries GC520 r5341 c204 (2D GPU found on the i.MX8MP SoC) GC7000 r6203 (NXP i.MX8MN SoC) VIP8000 r7120 (Amlogic A311D) VIP8000 Nano r8002 (NPU found on the NXP i.MX8MP SoC) Small fixes and improvments 4
  • 5. Userspace MSAA work by Lucas and me (ETNA_MESA_DEBUG=msaa) TS buffer sharing by Lucas (ETNA_MESA_DEBUG=shared_ts) Performance warnings (ETNA_MESA_DEBUG=perf) Compiler work Support negative float inline immediates Improved lowerings Improved linking (glsl-routing) Some new extensions 5
  • 8. CI - current devices 6x GC2000 based devices (imx6q) 2x GC3000 based devices (imx6qp) 4x GC7000L based device (imx8mq) Only a handful are available as public runners. 8
  • 9. CI - more boardfarms David Heidelberg: librem5 devboards Sam Ravnborg: i.MX6 Solo with GC880 Igalia HQ With ci-tron everybody can jump in and provide some HW. 9
  • 10. The road torwards GLES3 With some roadblocks. 10
  • 11. The road torwards GLES3 dEQP-GLES3.functional.shaders.texture_functions.* Multiple Render Targets 11
  • 12. The road towards GLES3 But then ... nir: Transition away from nir_register and abs/neg/sat mo happened and the problems started and I shifted my focus. 12
  • 13. The road towards GLES3 The master plan to use nir_legacy failed Took me quite some time to get it working but shader-db is sad total instructions in shared programs: 228098 -> 249817 ( instructions in affected programs: 97892 -> 119611 (22.19 total temps in shared programs: 84665 -> 86158 (1.76%) temps in affected programs: 6257 -> 7750 (23.86%) total immediates in shared programs: 152272 -> 152148 (-0 immediates in affected programs: 2304 -> 2180 (-5.38%) 13
  • 14. The road towards GLES3 Cloned nir_lower_to_source_mods with the following differences: we store the source mods in pass_flags we do not try to saturate the destination total instructions in shared programs: 234974 -> 235376 ( instructions in affected programs: 11481 -> 11883 (3.50%) total temps in shared programs: 84891 -> 84891 (0.00%) temps in affected programs: 0 -> 0 total immediates in shared programs: 154776 -> 154776 (0. total immediates in shared programs: 154776 -> 154776 (0. 14
  • 15. The road torwards GLES3 15
  • 16. Digression: current compiler uses NIR as backend IR does some 64bit nir_const_value pollution #define CONST_VAL(a, b) (nir_const_value) {.u64 = (uint64_t)(a) << 32 | (uint64_t)(b)} #define CONST(x) CONST_VAL(ETNA_UNIFORM_CONSTANT, x #define UNIFORM(x) CONST_VAL(ETNA_UNIFORM_UNIFORM, x 16
  • 17. Digression: current compiler blind flight late in the process /* some custom NIR transformations */ ... ... /* call directly to avoid validation */ nir_convert_from_ssa(shader, true); nir_trivialize_registers(shader); 17
  • 18. Digression: current compiler assembler is not aware of different instruction encodings case nir_intrinsic_load_global_etna: { struct etna_inst inst = { .opcode = INST_OPCODE_LOAD, .type = inst_type_from_bitsize(c, ...), .tex = { .amode = INST_AMODE_ADD_A_W, }, }; if (nir_src_is_const(intr->src[1])) { inst.src[1].amode = INST_AMODE_ADD_A_Y; inst.tex.swiz = 128; } 18
  • 19. Digression: current compiler miss compilations quite fragile 19
  • 21. etnaviv backend compiler not a single line of rust no SSA based register allocator (vec4 ISA) uses isaspec uses etnaviv's register allocator has a backend IR and an optimizer supports control flow 21
  • 22. etnaviv backend compiler cp src/asahi/compiler src/etnaviv/compiler -r with some vec4 sugar on top 22
  • 23. etnaviv backend compiler kernel void rotate(global int* out, global int* in0, global int* out[get_global_id(0)] = rotate(in0[get_global_id(0)], in1[get_global_id(0)]); } 23
  • 24. etnaviv backend compiler block0 { 3.x = add.s32 *12.x, u0.y 5.x = imullo0.s32 *3.x, u0.z 7.x = load.u32 u0.w [base address], 5.x [offset] 9.x = load.u32 u1.x [base address], 5.x [offset] 10.x = rotate.u32 *7.x, *9.x store.u32 u1.y [base address], *5.x [offset], *10.x [v } 24
  • 25. etnaviv backend compiler 000 add.s32.pack t0.x___, t0.xyzw, void, u0.yyy 001 imullo0.s32.pack t0.x___, t0.xyzw, u0.zzzz, voi 002 load.denorm.u32 t0._y__, u0.wwww, t0.xyzw, voi 003 load.denorm.u32 t0.__z_, u1.xxxx, t0.xyzw, voi 004 rotate.u32.pack t0._y__, t0.yyyy, void, t0.zzz 005 store.denorm.u32 mem.x___, u1.yyyy, t0.xyzw, t0 25
  • 27. Legacy model love etnaviv does support a wide range of GPU modules found in different SoCs availability of HW features fluctuates a lot binary blob uses some emulation tricks to get the job done 27
  • 28. Legacy model love Make use of more lowerings NIR provides (logicops, ..) Provide own lowerings in NIR (bordercolor) 28
  • 29. etnaviv is more alive then ever 29
  • 31. 31