 Kalray SA. Confidential - All Rights Reserved. 1
www.kalrayinc.com
IF-CONVERSION FOR A PARTIALLY
PREDICATED VLIW ARCHITECTURE
Benoît Dupont de Dinechin, CTO
GNU Cauldron 2022
AGENDA
1. Kalray MPPA Processor and KVX Core
2. KVX Code Generation Features
3. GCC IF-Conversion Framework
4. Extending GCC IF-Conversion
5. First Results and Outlook
KALRAY DPU-BASED ACCELERATION CARD
MPPA® COOLIDGE V1
Block Diagram & Feature List
80 VLIW Application Cores
• 64-bit/32-bit 6-issue VLIW core
• From 600 MHz to 1.2 GHz
• 16 KB I/D cache with MMU
• IEEE 754 FP16, FP32, FP64 FPU
• Up to 256 bits per cycle Load/Store
80 Tensor Co-processors (per core)
• INT8.32, INT16.64, FP16.32
• Up to 128 MAC equivalent per cycle
Compute Clusters (5)
• +1 Management/Security Core
• 4 MB of Memory / L2 Cache
• 600GB/s bandwidth
2x100GbE Ethernet Interface & Mger
• 8x1/8x10/8x25/2x40/4x50/2x100 GbE
• Jumbo Frame Support (9.6KB)
• Support for PTP/IEEE 1588v2
• Priority Flow Control (PFC), IEEE 802.1Qbb
• Checksum offload Header & Payload
• Hash & Round-robin dispatcher
Security
• Secure boot with authentication & encryption
• TRNG, RSA, Diffie-Hellman, DSA, ECC, EC-DSA and EC-DH acceleration
PCIe Gen4 Interface
• 16-lane PCIe Gen4 Endpoint (EP) or Root Complex (RC)
• Bifurcation up to 8 downstream ports in RC mode
• SR-IOV up to 8 PF / 248 VF
• Address translation and protection
• Up to 2048 MSI-X & 64 MSI
• Support for Hot Plug
• Up to 512 DMAs for multi queues / kernel bypass drivers
• Direct PCIe-to-clusters and PCIe-to-DDR transfers
LPDDR4/DDR4 Interface
• 64-bit DDR4/LPDDR4-3200 channels with sideband/inline ECC
• Up to two ranks per DDR4 channel
• 2 DDR channels (up to 32 GB) with channel interleaving
Cryptography Accelerators (optional)
• AES-128/192/256 (ECB/CBC/ICM/CTR/GCM/GMAC/CCM), AES-XTS, MD5/SHA-1, SHA-2, SHA-3, Kasumi/Snow 3G, ZUC
MPPA® PROCESSORS
Product Family
ANDEY (2012, prototyping)
• Process: 28 nm
• Performance: 1 TOPS
• Use cases / market: Prototyping
• Consumption: 25 W
BOSTAN (2015, production)
• Process: 28 nm
• Performance: 1.3 TOPS
• Use cases / market: 40G Data Center, Auto Prototypes
• Consumption: 25 W
COOLIDGE v1 (2020, available)
• Process: 16 nm
• Performance: 20 TOPS (1), 4 TFLOPS (2), 190 KDMIPS
• Use cases / market: Data Center / Edge, Automotive (proto)
• Consumption: 20 W (4)
COOLIDGE v2 (1H 2023, under development)
• Process: 16 nm
• Performance: 50 TOPS (1), 25 TFLOPS (2), 190 KDMIPS
• Use cases / market: Data Center / Edge, Automotive (proto)
• Consumption: 20 W (4)
DOLOMITES (3) (2025, under specification)
• Process: 6/5 nm
• Performance: 200 TOPS (1), 100 TFLOPS (2), 380 KDMIPS
• Use cases / market: Data Center, Edge Computing, 5G
• Consumption: 20 W (4)
(1) INT8.32
(2) FP16.32
(3) Initial target, may change
(4) 50 W maximum compute workload
VERY LONG INSTRUCTION WORD (VLIW) ARCHITECTURES
Compiler-driven instruction-level parallel execution
Simple, energy-efficient, time-predictable implementations
CLASSIC VLIW ARCHITECTURE (J. A. FISHER)
Key architecture features
• SELECT operation on Boolean value
• Conditional load/store/FPU operations
• Dismissible loads (non-trapping)
• [Multi-way conditional branches]
Key compiler techniques
• Trace scheduling (global instruction scheduling)
• Partial predication (S. Freudenberger if-conversion)
Main examples
• Multiflow TRACE processors
• HP Labs Lx « Embedded Computing: a VLIW Approach »
• STMicroelectronics ST200 (media processor based on Lx)
EPIC VLIW ARCHITECTURE (B. R. RAU)
Key architecture features
• Fully predicated ISA
• Speculative loads (control speculation)
• Advanced loads (data speculation)
• Rotating registers
Key compiler techniques
• Modulo scheduling (software pipelining)
• Full predication (R-K algorithm, J. Fang algorithm)
Main examples
• Cydrome Cydra-5
• HP-Intel IA-64
• TI C6x DSPs
MPPA COOLIDGE 64-BIT VLIW CORE
VLIW CORE PIPELINE
Kalray VLIW (KVX) architecture is co-designed to
appear as an in-order superscalar to compilers
• Every scheduler parallel instruction group is a valid bundle
• No need for vertical or horizontal no-op padding
Vector-scalar ISA
• 64x 64-bit general-purpose registers
• Operands can be single registers, register pairs (128-bit) or register quadruples (256-bit)
• 128-bit/256-bit SIMD instructions by dual-issuing/quad-issuing 64-bit instructions on the ALUs or by using the FPU data-path
DSP capabilities
• Counted or while hardware loops with early exits
• Non-temporal loads (L1 cache bypass / preload)
• Non-trapping memory loads (faulting bytes return 0)
CPU capabilities
• 4 privilege levels (rings), MMU (runs Linux kernel)
• Recursive ISA virtualization (Popek & Goldberg)
ACCESSCORE® SOFTWARE DEVELOPMENT KIT
A Complete Toolchain & Standard Libraries
[Diagram: AccessCore® for a seamless integration]
• Standard Programming Environment (C/C++/OpenCL)
• Operating Systems & Libraries (Linux / ClusterOS)
• Deep Learning, Mathematics, Computer Vision
• KAF™: software framework for offloading numerical, signal and image processing
• KaNN™: CNN inference code generator compatible with standard CNN frameworks (Caffe, …)
• 3rd Party OS: RTOS
• 3rd Party Tools: Model-Based Development
• AccessCore® SDK, AccessCore® Runtime, Optimized Libraries
• Compiler, Simulator, Debugger & System Trace
• POSIX threads, OpenMP
• OpenCL, Eclipse
• GCC, GDB, LLVM, QEMU
• SLEEF, SIMDe
• OpenCV, BLAS, LAPACK
• CNN Inference Code Gen.
• Exokernel, Open Source POSIX RTOS, Linux, Communication Libs
C/C++ COMPILER SUPPORT OF KVX CORE
GCC 10 for lightweight POSIX OS and for Linux (all TLS models)
• Mapping of hardware loops using GCC doloop patterns
• Sched2 does instruction scheduling and instruction bundling
• Most high-gain optimizations apply (such as auto-vectorization)
• On-going developments in software pipelining (derived from C6x)
CACHE BYPASS LOADS AND NON-TRAPPING LOADS
Reuse of the GCC named address spaces (not available in C++)
EXPLOITATION OF THE VECTOR-SCALAR ARCHITECTURE
128-bit and 256-bit vectors are operated and passed as 64-bit register pairs and quadruples
• Full support of the GCC vector syntax extensions
• Align vectors on register pair/quad on ABI boundaries
• SIMD lane splatting and shuffling rely on the BMM8
(8x8 Bit-Matrix Multiply) operations, exposed as
SBMM8 instructions (swapped operands)
• To improve register allocation, vector instructions are
kept as machine instruction pairs or quadruples until
after register allocation
• Use of partial instruction bundles in output templates,
with suitable scheduling type
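As a rough illustration of the GCC vector syntax extensions mentioned above, the sketch below declares a 128-bit vector type and operates on it with ordinary C operators. The type name v4si and both helper functions are illustrative choices, not taken from the KVX port; on KVX, per the slide, such a value would be held in a 64-bit register pair.

```c
#include <stdint.h>

/* 128-bit vector of four 32-bit lanes (GCC/Clang vector extension).
   The name v4si is an illustrative choice. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Lane-wise a * b + c, written with plain C operators. */
v4si v4si_madd(v4si a, v4si b, v4si c)
{
    return a * b + c;
}

/* Horizontal sum of the four lanes, using subscript access. */
int32_t v4si_hsum(v4si v)
{
    return v[0] + v[1] + v[2] + v[3];
}
```

Calling v4si_madd on {1, 2, 3, 4}, {10, 20, 30, 40} and {1, 1, 1, 1} gives the lanes {11, 41, 91, 161}. How the compiler maps these operations to machine instructions is target-specific and not shown here.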
SIMDE EMULATION OF X86 BUILTINS
SIMDe translates the x86 builtin functions into native calls on x86 (SIMDE_X86_SSSE3_NATIVE) and plain C code on other architectures (SIMDE_VECTORIZE)
The Kalray port of SIMDe provides an optimized translation on KVX using the GCC/LLVM KVX builtin functions (SIMDE_KVX_NATIVE)
KVX CONDITIONAL
BRANCH TEMPLATES
KVX condition codes live in the
general-purpose registers
The "cstore<m>" standard pattern
produces 0 or 1
(STORE_FLAG_VALUE)
The "*cbsi" pattern matches a
conditional branch depending on the
comparison (EQ, NE, LE, LT, GE,
GT) of a source register to zero
The KVX can compare two integer or
floating-point values and instruction
variants negate the result (0 or -1)
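The two result conventions mentioned above, a 0/1 flag and a 0/-1 mask, can be contrasted in plain C. This is only a source-level sketch: the function names are hypothetical and nothing here is KVX code or a GCC pattern.

```c
#include <stdint.h>

/* Flag convention (0 or 1), matching a STORE_FLAG_VALUE of 1.
   C comparison operators already yield this. */
int64_t flag_lt(int64_t a, int64_t b)
{
    return a < b;
}

/* Mask convention (0 or -1): negating the flag sets all bits when true. */
int64_t mask_lt(int64_t a, int64_t b)
{
    return -(int64_t)(a < b);
}

/* A mask can drive a branch-free select: x where the mask is all ones, else y. */
int64_t select_by_mask(int64_t mask, int64_t x, int64_t y)
{
    return (x & mask) | (y & ~mask);
}
```

Since the condition codes live in general-purpose registers, either form is an ordinary integer value that later instructions can test against zero.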
KVX CONDITIONAL MOVE
TEMPLATES
Conditional moves can be produced
in two ways
The "*cmovsi.df" SET source is an
IF_THEN_ELSE that relies on a
zero_comparison_operator
Genconfig outputs in insn-config.h
#define HAVE_conditional_move
The "*cond_exec_movedf" pattern is
a COND_EXEC wrapping of a simple
SET expression
Genconfig outputs in insn-config.h
#define HAVE_conditional_execution
KVX CONDITIONAL LOAD
AND STORE TEMPLATES
Load instructions format sub-64-bit values with zero or sign extension
Plain load and store addressing modes include [reg], offset[reg], reg[reg], reg*size[reg]
Conditional load and store addressing modes are restricted to [reg], offset[reg]
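The practical effect of this restriction can be sketched at the C level: an indexed access has to be split into an unconditional address computation followed by a guarded access through a simple address. The function names below are hypothetical, and the ternary expression only stands in for a conditional load instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain load: the scaled-index mode reg*size[reg] can encode base[index] directly. */
int32_t load_plain(const int32_t *base, size_t index)
{
    return base[index];
}

/* Guarded load limited to the [reg] mode: the address is first formed by
   ordinary arithmetic into a scratch pointer, then only the access is guarded.
   Assumes index is within the array whenever this is called, because C
   (unlike the hardware) does not allow forming an out-of-range pointer. */
int32_t load_if(int cond, const int32_t *base, size_t index, int32_t otherwise)
{
    const int32_t *scratch = base + index; /* unconditional address computation */
    return cond ? *scratch : otherwise;    /* stands in for a conditional [reg] load */
}
```

The scratch pointer plays the role of the extra register that a conditional access needs when the original addressing mode is not one of the two simple forms.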
GCC AUTOMATED
COND_EXEC TEMPLATES
GCC can automate the writing of
COND_EXEC instruction templates
with the (define_cond_exec) template
and the "predicable" attribute
The (define_insn) templates with
(eq_attr "predicable" "yes") have their
RTL template wrapped into a
COND_EXEC with the condition
supplied by the (define_cond_exec)
The output template of the resulting
instructions is prefixed by the output
template of the (define_cond_exec)
Custom output may use the
current_insn_predicate RTX
GCC IF-CONVERSION
OVERVIEW (1)
Enabled with -fif-conversion and
-fif-conversion2
Three passes:
• CE1 before combine
• CE2 after combine
• CE3 after reload
Information about the if-conversion
region is passed with a ce_if_block
structure
Top level (if_convert) iterates over
if-conversion region header blocks
by calling (find_if_header)
GCC IF-CONVERSION
OVERVIEW (2)
(find_if_header)
• Fill the ce_if_block structure
• Call IFCVT_MACHDEP_INIT
• Before reload (CE1 and CE2),
call (noce_find_if_block)
• After reload (CE3) and if target
has conditional execution, call
(cond_exec_find_if_block)
Default target hook for TARGET_
HAVE_CONDITIONAL_EXECUTION
returns HAVE_conditional_execution
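The dispatch described above can be summarized by a small toy model. It is a simplification written for this note, not an excerpt from GCC: the enum and function names are hypothetical, and FIND_OTHER lumps together everything the slide does not cover.

```c
#include <stdbool.h>

enum ce_pass   { CE1, CE2, CE3 };
enum ce_finder { FIND_NOCE, FIND_COND_EXEC, FIND_OTHER };

/* Toy model of which block finder (find_if_header) tries, following the
   slide: CE1 and CE2 run before reload, CE3 runs after reload. */
enum ce_finder select_finder(enum ce_pass pass, bool have_cond_exec)
{
    bool after_reload = (pass == CE3);

    if (!after_reload)
        return FIND_NOCE;       /* (noce_find_if_block) */
    if (have_cond_exec)
        return FIND_COND_EXEC;  /* (cond_exec_find_if_block) */
    return FIND_OTHER;          /* cases not described on the slide */
}
```

In this model the have_cond_exec argument corresponds to the value returned by the TARGET_HAVE_CONDITIONAL_EXECUTION hook.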
GCC IF-CONVERSION
OVERVIEW (3)
(noce_find_if_block)
• Determine the if-conversion
region: IF-THEN-ELSE-JOIN or
IF-THEN-JOIN or IF-ELSE-JOIN
• First try without, then with, using
conditional moves
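At the source level, the region shapes named above and their branch-free counterparts look roughly as follows. This is a hand-written sketch with hypothetical function names; unsigned arithmetic is used so that executing both arms unconditionally stays well defined in C.

```c
/* IF-THEN-ELSE-JOIN: both arms assign x before the join. */
unsigned then_else_join(int c, unsigned a, unsigned b)
{
    unsigned x;
    if (c)
        x = a + 1;
    else
        x = b - 1;
    return x;
}

/* IF-THEN-JOIN: only the THEN arm assigns x. */
unsigned then_join(int c, unsigned a, unsigned x)
{
    if (c)
        x = a + 1;
    return x;
}

/* Branch-free form of IF-THEN-ELSE-JOIN: both arms computed, one selected. */
unsigned then_else_join_converted(int c, unsigned a, unsigned b)
{
    unsigned t = a + 1;
    unsigned e = b - 1;
    return c ? t : e;
}

/* Branch-free form of IF-THEN-JOIN: the old value of x is the fallback. */
unsigned then_join_converted(int c, unsigned a, unsigned x)
{
    unsigned t = a + 1;
    return c ? t : x;
}
```

The IF-ELSE-JOIN shape is the mirror image of IF-THEN-JOIN, with the assignment on the ELSE arm.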
GCC IF-CONVERSION
OVERVIEW (4)
(cond_exec_find_if_block)
• Identify cases of && tests (jump to ELSE block) or || tests (jump to THEN block)
• In case of && or || tests, try to combine them into the conditional expression
• If there are no multiple tests, or combining them failed, process IF-THEN-ELSE-JOIN etc.
(cond_exec_process_if_block)
• Find common head or tail
sequences in IF-THEN-ELSE-
JOIN
• Dispatch to
(cond_exec_process_insns)
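The handling of && and || tests can be pictured at the source level as folding two tests into a single condition value. The sketch below uses hypothetical function names and assumes, as in this example, that the second operand is a plain value whose evaluation has no side effects and cannot trap; otherwise evaluating it unconditionally would not be safe.

```c
/* Source with an && test: the second test runs only if the first passes. */
unsigned and_test(int p, int q, unsigned a, unsigned x)
{
    if (p && q)
        x = a;
    return x;
}

/* Both tests folded into one condition value that guards the assignment. */
unsigned and_test_combined(int p, int q, unsigned a, unsigned x)
{
    int c = (p != 0) & (q != 0);
    return c ? a : x;
}

/* Source with an || test: the second test runs only if the first fails. */
unsigned or_test(int p, int q, unsigned a, unsigned x)
{
    if (p || q)
        x = a;
    return x;
}

/* Same folding for ||. */
unsigned or_test_combined(int p, int q, unsigned a, unsigned x)
{
    int c = (p != 0) | (q != 0);
    return c ? a : x;
}
```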
GCC IF-CONVERSION
OVERVIEW (5)
(cond_exec_process_insns)
• Process instructions from START
to END, as there can be matching
head and tail sequences in the
THEN and ELSE blocks
• If instruction pattern code is
already COND_EXEC, build a
new condition by ANDing with the
block condition
• Generate COND_EXEC pattern
• Call IFCVT_MODIFY_INSN
which can modify the pattern or
abort if-conversion
KVX IF-CONVERSION OBJECTIVES AND CONSTRAINTS
PREDICATION OF LOADS AND STORES
Extend GCC conditional execution to the KVX predicated load and store instructions that have addressing mode restrictions
PSEUDO-PREDICATION OF INSTRUCTIONS
Unconditionally compute the original result into a scratch register, then conditionally move the result to the original destination register
SPECULATIVE EXECUTION OF INSTRUCTIONS
Eliminate the need for computing into a scratch register and conditional move if the destination is only locally used in the THEN or ELSE block
Focus on scalar instructions, as the GIMPLE auto-vectorization takes care of generating masked vector operations
Complement the if-conversion provided by the standard patterns for conditional operations: mov<m>cc, add<m>cc, neg<m>cc, not<m>cc
No changes to the target-independent GCC code
Can only expose the predicated instructions after register allocation
Unconditional assignments to scratch registers must not clobber registers in use
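Pseudo-predication and speculative execution can be illustrated with source-level equivalents. The functions below are a hand-written sketch with hypothetical names, not compiler output; unsigned arithmetic keeps the unconditional computation well defined in C, and the remaining if statement in the last function stands in for a predicated store.

```c
#include <stdint.h>

/* Original THEN block: x is still needed after the join. */
uint64_t arith_original(uint64_t c, uint64_t a, uint64_t b, uint64_t x)
{
    if (c != 0)
        x = a * b;
    return x;
}

/* Pseudo-predication: compute into a scratch value unconditionally,
   then conditionally move it to the original destination. */
uint64_t arith_pseudo_predicated(uint64_t c, uint64_t a, uint64_t b, uint64_t x)
{
    uint64_t scratch = a * b;
    x = (c != 0) ? scratch : x;
    return x;
}

/* Original THEN block whose temporary t is only used locally. */
void store_original(uint64_t c, uint64_t a, uint64_t b, uint64_t *out)
{
    if (c != 0) {
        uint64_t t = a + b;
        *out = t;
    }
}

/* Speculation: t is dead outside the block, so it is computed
   unconditionally with no scratch copy and no conditional move. */
void store_speculated(uint64_t c, uint64_t a, uint64_t b, uint64_t *out)
{
    uint64_t t = a + b; /* speculated arithmetic */
    if (c != 0)
        *out = t;       /* stands in for a predicated store */
}
```

The speculated form saves one instruction per arithmetic operation compared with pseudo-predication, which is why the destination's liveness matters.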
KVX IF-CONVERSION
OVERVIEW (1)
Implemented with four target hooks
• MAX_CONDITIONAL_EXECUTE
• IFCVT_MACHDEP_INIT (kvx.h)
called in CE1, CE2, CE3 from
(find_if_header)
• IFCVT_MODIFY_INSN (kvx.h)
called in CE3 from
(cond_exec_process_insns)
• TARGET_HAVE_CONDITIONAL_EXECUTION (kvx.c) called in CE1, CE2, CE3
In combination with COND_EXEC
patterns and helper patterns in the
kvx .md files
KVX IF-CONVERSION
OVERVIEW (2)
(kvx_ifcvt_machdep_init)
• Let CE1 and CE2 do if-conversion
without conditional execution
• In CE2, prepare for CE3, focusing
on IF-THEN-ELSE-JOIN, IF-
THEN-JOIN, IF-ELSE-JOIN
regions identified with same logic
as in (noce_find_if_block)
The idea is to insert USEs and
pseudo-DEFs in CE2 so that the
CE3 if-conversion will have the spare
hard registers it needs for pseudo-
predication and speculation
KVX IF-CONVERSION
FIND CANDIDATES (1)
(kvx_ifcvt_ce2_candidate_ce3)
• Scan the non-jump instructions
• Bail out on complex instructions or instructions with side effects
• Try conditional moves (with COND_EXEC); if this fails, the instruction gets a second chance as arithmetic
• Try conditional memory accesses (irrespective of addressing mode)
• Try to speculate the non-trapping arithmetic instructions
• Try to pseudo-predicate the non-trapping arithmetic instructions
KVX IF-CONVERSION
FIND CANDIDATES (2)
(kvx_ifcvt_ce2_cond_mem_ce3)
• If a scratch register is needed to compute the address, reserve it by wrapping the original pattern and a USE inside a PARALLEL
• Success if the original pattern or the wrapped pattern is recognized inside a COND_EXEC
(kvx_ifcvt_ce2_cond_arith_ce3)
• Similar, except that the pattern is always wrapped with a USE of a scratch register that has the mode of the destination
(kvx_ifcvt_ce2_spec_arith_ce3)
• If the destination register is only locally used (not live-out), the instruction may be speculatively executed unchanged
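The candidate search described on these two slides can be condensed into a toy classifier. It is a deliberately simplified model written for this note: the types, field names and plan names are hypothetical, the real code works on RTL patterns, and the "second chance as arithmetic" for failed conditional moves is not modeled.

```c
#include <stdbool.h>

enum insn_kind { INSN_MOVE, INSN_LOAD, INSN_STORE, INSN_ARITH, INSN_COMPLEX };
enum ce3_plan  { PLAN_REJECT, PLAN_COND_MOVE, PLAN_COND_MEM,
                 PLAN_SPEC_ARITH, PLAN_COND_ARITH };

/* Hypothetical summary of one non-jump instruction in a THEN or ELSE block. */
struct insn_info {
    enum insn_kind kind;
    bool has_side_effects;
    bool may_trap;
    bool dest_live_out; /* destination still needed after the block */
};

/* Decide how CE3 could handle the instruction, in the order of the slides. */
enum ce3_plan classify_insn(const struct insn_info *insn)
{
    if (insn->kind == INSN_COMPLEX || insn->has_side_effects)
        return PLAN_REJECT;
    switch (insn->kind) {
    case INSN_MOVE:
        return PLAN_COND_MOVE;
    case INSN_LOAD:
    case INSN_STORE:
        return PLAN_COND_MEM; /* any addressing mode; simplified later */
    case INSN_ARITH:
        if (insn->may_trap)
            return PLAN_REJECT;
        /* Locally used results can be speculated; others need a scratch. */
        return insn->dest_live_out ? PLAN_COND_ARITH : PLAN_SPEC_ARITH;
    default:
        return PLAN_REJECT;
    }
}
```

In this model a region would be a candidate only if none of its instructions is rejected.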
KVX PREPARE FOR CE3
IF-CONVERSION (1)
<prepare for CE3 if-conversion> in
(kvx_ifcvt_machdep_init)
• Extend the live-range of tested
register by inserting its USE at
end of THEN and ELSE blocks
• Update pattern of the pseudo-predicated memory and arithmetic insns to the one recognized in (kvx_ifcvt_ce2_cond_mem_ce3) or (kvx_ifcvt_ce2_cond_arith_ce3)
KVX PREPARE FOR CE3
IF-CONVERSION (2)
<prepare for CE3 if-conversion> in
(kvx_ifcvt_machdep_init)
• Flag the speculated instructions
with REG_NONNEG note (hack,
unused otherwise in this port)
• Insert USE of speculated
destination register in JOIN block
• Insert DEFs of scratch registers in
TEST block and USEs of scratch
registers in JOIN block
These DEFs and USEs prevent the
allocation of the same hard registers
to the scratch registers in one path
and live variables on the other path
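Why the scratch register must not share a hard register with a variable that is live on the other path can be shown with a toy register file. This is an illustrative model only: the register names and the function are hypothetical, and the array simply stands in for hard registers.

```c
#include <stdint.h>

/* Toy register file; R_SPARE is a register nothing else uses. */
enum { R_COND, R_A, R_B, R_X, R_Y, R_SPARE, R_COUNT };

/* Models "if (cond) x = a + b; else x = y;" after pseudo-predication,
   with the scratch value placed in register number `scratch`. */
uint64_t run_pseudo_predicated(uint64_t regs[R_COUNT], int scratch)
{
    regs[scratch] = regs[R_A] + regs[R_B];                 /* unconditional DEF */
    regs[R_X] = regs[R_COND] ? regs[scratch] : regs[R_Y];  /* conditional moves */
    return regs[R_X];
}
```

With cond = 0 and y = 42, choosing R_SPARE as the scratch returns 42, whereas choosing R_Y returns a + b because the unconditional definition has overwritten y. The DEFs and USEs inserted in CE2 are what rule out the second allocation.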
KVX FINALIZE IF-
CONVERSION
(kvx_ifcvt_modify_insn)
• Implements target hook
IFCVT_MODIFY_INSN (CE3)
• Undo the COND_EXEC of pattern
by (cond_exec_process_insns) in
case of the inserted pseudo-DEFs
• Undo the COND_EXEC of pattern
by (cond_exec_process_insns) in
case of speculated instructions
COND_EXEC OF MEMORY
LOADS
(cond_exec_process_insns) tries to
COND_EXEC instruction patterns
• COND_EXEC of loads with the
"memsimple" operand predicate
must appear first (shown earlier)
• COND_EXEC of loads with a
"memory" operand predicate not
"memsimple" is not valid, so use
a (define_insn_and_split) to
simplify the addressing mode.
• In case CE3 fails, provide another
(define_insn_and_split) to undo
the PARALLEL wrapping done by
(kvx_ifcvt_ce2_cond_mem_ce3)
Similar patterns for loading with
zero/sign extension
COND_EXEC OF MEMORY
STORES
Similar to the COND_EXEC of
memory loads
• COND_EXEC of stores with the
"memsimple" operand predicate
must appear first (shown earlier)
• Use a (define_insn_and_split) to
simplify the addressing mode in
case of a "memory" operand
predicate which is not
"memsimple"
• In case CE3 fails, provide another
(define_insn_and_split) to undo
the wrapping with PARALLEL
done in CE2 by
(kvx_ifcvt_ce2_cond_mem_ce3)
COND_EXEC OF NON-
PREDICABLE ARITHMETIC
(cond_exec_process_insns) tries to
COND_EXEC the instruction patterns
• As there are no predicated
arithmetic instructions in the KVX
ISA, pseudo-predicate them
• Use a (define_insn_and_split) to
compute into scratch register,
then conditionally move it to the
original destination
• In case CE3 fails, provide
another (define_insn_and_split)
to undo the wrapping with
PARALLEL done in CE2 by
(kvx_ifcvt_ce2_cond_arith_ce3)
Similar patterns are needed for
most of the scalar ISA subset
CLEANUPS OF NON IF-
CONVERTED REGIONS
Undo the PARALLEL wrapping done
by (kvx_ifcvt_ce2_cond_arith_ce3)
with the (define_insn_and_split)
patterns previously shown
Also deactivate the previously
inserted UNSPEC_DEFs by splitting
them into USE
As CE3 is not always run, set the kvx_ifcvt_ce_level to enable splitting of the UNSPEC_DEFs and the unwrapping of the pseudo-predicated instructions
This is done in the machine reorg
pass, which requires that all splitting
be done before doloop finalization
and sched2
EXAMPLE OF KVX IF-CONVERSION (BEFORE)
COND_MEM (# 14)
COND_MOVE (# 19)
COND_MEM (# 20)
SPEC_ARITH (# 21)
COND_MEM (# 22)
EXAMPLE OF KVX IF-CONVERSION (AFTER)
COND_MEM (# 14)
COND_MOVE (# 19)
COND_MEM (# 20)
SPEC_ARITH (# 21)
COND_MEM (# 22)
MORE EXAMPLES OF KVX IF-CONVERSION (1)
COND_ARITH (# 26)
MORE EXAMPLES OF KVX IF-CONVERSION (2)
COND_MEM (# 26)
MORE EXAMPLES OF KVX IF-CONVERSION (3)
COND_MEM (# 26)
COND_ARITH (# 27)
SUMMARY AND OUTLOOK
Implemented scalar if-conversion in GCC for the partially predicated KVX architecture
Relies on the IFCVT framework, but activates it before (CE1, CE2) and after (CE3) reload
EXISTING SCALAR IF-CONVERSION APPROACHES
• On the SSA form: requires extensions such as Psi-SSA
• Before register allocation
o CMOVE only: GEM compiler
o Fully predicated: R-K and J. Fang algorithms (IA64)
o Partially predicated: S. Freudenberger (TRACE)
• After register allocation: GCC ports IA64, C6x, FRV
KEY FEATURES OF THE KVX SCALAR IF-CONVERSION IN GCC
• Apply pseudo-predication and local speculation after register allocation
• The scratch registers that will be unconditionally defined are reserved before register allocation with UNSPEC_DEFs and USEs
• The GCC FRV port looks for unused hard registers after reload instead
NEXT STEPS
• Cannot reuse the existing GCC (define_cond_exec) machinery, as it may only generate (define_insn) patterns
• Automate generation of the (define_insn_and_split) patterns that enable pseudo-predication
• Performance tuning for scalar (while) loop pipelining
www.kalrayinc.com
THANK YOU

Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts servicevipmodelshub1
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)Damian Radcliffe
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024APNIC
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxellan12
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGAPNIC
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Servicesexy call girls service in goa
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersDamian Radcliffe
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Delhi Call girls
 
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service PuneVIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service PuneCall girls in Ahmedabad High profile
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...Diya Sharma
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...SofiyaSharma5
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝soniya singh
 

Recently uploaded (20)

Russian Call Girls Thane Swara 8617697112 Independent Escort Service Thane
Russian Call Girls Thane Swara 8617697112 Independent Escort Service ThaneRussian Call Girls Thane Swara 8617697112 Independent Escort Service Thane
Russian Call Girls Thane Swara 8617697112 Independent Escort Service Thane
 
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on DeliveryCall Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
 
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With RoomVIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Low Rate Call Girls Kolkata Avani 🤌 8250192130 🚀 Vip Call Girls Kolkata
Low Rate Call Girls Kolkata Avani 🤌  8250192130 🚀 Vip Call Girls KolkataLow Rate Call Girls Kolkata Avani 🤌  8250192130 🚀 Vip Call Girls Kolkata
Low Rate Call Girls Kolkata Avani 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOG
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICECall Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
 
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service PuneVIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
 

2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdf

  • 1. IF-CONVERSION FOR A PARTIALLY PREDICATED VLIW ARCHITECTURE
    Benoît Dupont de Dinechin, CTO
    GNU Cauldron 2022
    www.kalrayinc.com
    Kalray SA. Confidential - All Rights Reserved.
  • 2. AGENDA
      1. Kalray MPPA Processor and KVX Core
      2. KVX Code Generation Features
      3. GCC IF-Conversion Framework
      4. Extending GCC IF-Conversion
      5. First Results and Outlook
  • 3. KALRAY DPU-BASED ACCELERATION CARD
  • 4. MPPA® COOLIDGE V1: Block Diagram & Feature List
    80 VLIW Application Cores
      • 64-bit/32-bit 6-issue VLIW core
      • From 600 MHz to 1.2 GHz
      • 16 KB I/D cache with MMU
      • IEEE 754 FP16, FP32, FP64 FPU
      • Up to 256 bits per cycle Load/Store
    80 Tensor Co-processors (per core)
      • INT8.32, INT16.64, FP16.32
      • Up to 128 MAC equivalent per cycle
    Compute Clusters (5)
      • +1 Management/Security Core
      • 4 MB of Memory / L2 Cache
      • 600 GB/s bandwidth
    2x100GbE Ethernet Interface & Mger
      • 8x1/8x10/8x25/2x40/4x50/2x100 GbE
      • Jumbo Frame Support (9.6 KB)
      • Support for PTP/IEEE 1588v2
      • Priority Flow Control (PFC), IEEE 802.1Qbb
      • Checksum offload Header & Payload
      • Hash & Round-robin dispatcher
    Security
      • Secure boot with authentication & encryption
      • TRNG, RSA, Diffie-Hellman, DSA, ECC, EC-DSA and EC-DH acceleration
    PCIe Gen4 Interface
      • 16-lane PCIe Gen4 Endpoint (EP) or Root Complex (RC)
      • Bifurcation up to 8 downstream ports in RC mode
      • SR-IOV up to 8 PF / 248 VF
      • Address translation and protection
      • Up to 2048 MSI-X & 64 MSI
      • Support for Hot Plug
      • Up to 512 DMAs for multi queues / kernel bypass drivers
      • Direct PCIe-to-clusters and PCIe-to-DDR transfers
    LPDDR4/DDR4 Interface
      • 64-bit DDR4/LPDDR4-3200 channels with sideband/inline ECC
      • Up to two ranks per DDR4 channel
      • 2 DDR channels (up to 32 GB) with channel interleaving
    Cryptography Accelerators (optional)
      • AES-128/192/256 (ECB/CBC/ICM/CTR/GCM/GMAC/CCM), AES-XTS, MD5/SHA-1, SHA-2, SHA-3, KASUMI/SNOW 3G, ZUC
  • 5. MPPA® PROCESSORS: Product Family
    Processor: ANDEY | BOSTAN | COOLIDGE v1 | COOLIDGE v2 | DOLOMITES (3)
    Year: 2012 | 2015 | 2020 | 1H 2023 | 2025
    Process: 28 nm | 28 nm | 16 nm | 16 nm | 6/5 nm
    Performance: 1 TOPS | 1.3 TOPS | 20 TOPS (1), 4 TFLOPS (2), 190 KDMIPS | 50 TOPS (1), 25 TFLOPS (2), 190 KDMIPS | 200 TOPS (1), 100 TFLOPS (2), 380 KDMIPS
    Use cases / market: Prototyping | 40G Data Center, Auto Prototypes | Data Center / Edge, Automotive (proto) | Data Center / Edge, Automotive (proto) | Data Center, Edge Computing, 5G
    Consumption (watts): 25 W | 25 W | 20 W (4) | 20 W (4) | 20 W (4)
    Status: Prototyping | Production | Available | Under development | Under specification
    (1) INT8.32 (2) FP16.32 (3) Initial target, may change (4) 50 W maximum compute workload
  • 6. VERY LONG INSTRUCTION WORD (VLIW) ARCHITECTURES
    Compiler-driven instruction-level parallel execution. Simple, energy-efficient, time-predictable implementations.
    CLASSIC VLIW ARCHITECTURE (J. A. FISHER)
      Key architecture features
        • SELECT operation on Boolean value
        • Conditional load/store/FPU operations
        • Dismissible loads (non-trapping)
        • [Multi-way conditional branches]
      Key compiler techniques
        • Trace scheduling (global instruction scheduling)
        • Partial predication (S. Freudenberger if-conversion)
      Main examples
        • Multiflow TRACE processors
        • HP Labs Lx (« Embedded Computing: a VLIW Approach »)
        • STMicroelectronics ST200 (media processor based on Lx)
    EPIC VLIW ARCHITECTURE (B. R. RAU)
      Key architecture features
        • Fully predicated ISA
        • Speculative loads (control speculation)
        • Advanced loads (data speculation)
        • Rotating registers
      Key compiler techniques
        • Modulo scheduling (software pipelining)
        • Full predication (R-K algorithm, J. Fang algorithm)
      Main examples
        • Cydrome Cydra-5
        • HP-Intel IA64
        • TI C6x DSPs
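
As a rough illustration of the partial-predication style listed for the classic VLIW family, the sketch below contrasts a branching function with an if-converted one that computes both arms and then selects a value. It is plain C rather than KVX or TRACE code; the function names are hypothetical, and the ternary expression only stands in for a SELECT operation.

```c
#include <assert.h>

/* Branching form: only the taken arm is evaluated. */
unsigned abs_diff_branch(unsigned a, unsigned b)
{
    unsigned r;
    if (a > b)
        r = a - b;
    else
        r = b - a;
    return r;
}

/* If-converted form: both arms are computed unconditionally (unsigned
   arithmetic wraps and cannot trap), then one value is selected. */
unsigned abs_diff_select(unsigned a, unsigned b)
{
    unsigned then_val = a - b;
    unsigned else_val = b - a;
    int cond = a > b;                   /* Boolean kept in an ordinary variable */
    return cond ? then_val : else_val;  /* stands in for a SELECT operation */
}

/* Hypothetical helper: checks that both forms agree for one input pair. */
void check_abs_diff(unsigned a, unsigned b)
{
    assert(abs_diff_branch(a, b) == abs_diff_select(a, b));
}
```

The select form is only safe because neither arm can fault; that is the reason the slide pairs SELECT with dismissible (non-trapping) loads.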
  • 7. MPPA COOLIDGE 64-BIT VLIW CORE
    [Figure: VLIW CORE PIPELINE]
    The Kalray VLIW (KVX) architecture is co-designed to appear as an in-order superscalar to compilers
      • Every scheduler parallel instruction group is a valid bundle
      • No need for vertical or horizontal no-op padding
    Vector-scalar ISA
      • 64x 64-bit general-purpose registers
      • Operands can be single registers, register pairs (128-bit) or register quadruples (256-bit)
      • 128-bit/256-bit SIMD instructions by dual-issuing/quad-issuing 64-bit instructions on the ALUs or by using the FPU data-path
    DSP capabilities
      • Counted or while hardware loops with early exits
      • Non-temporal loads (L1 cache bypass / preload)
      • Non-trapping memory loads (faulting bytes return 0)
    CPU capabilities
      • 4 privilege levels (rings), MMU (runs Linux kernel)
      • Recursive ISA virtualization (Popek & Goldberg)
  • 8. AGENDA
      1. Kalray MPPA Processor and KVX Core
      2. KVX Code Generation Features
      3. GCC IF-Conversion Framework
      4. Extending GCC IF-Conversion
      5. First Results and Outlook
  • 9. ACCESSCORE® SOFTWARE DEVELOPMENT KIT: A Complete Toolchain & Standard Libraries
    [Diagram of the AccessCore® SDK stack; recoverable labels:]
      • Standard Programming Environment (C/C++/OpenCL)
      • Operating Systems & Libraries (Linux / ClusterOS)
      • Deep Learning, Mathematics, Computer Vision
      • KAF™: software framework for offloading numerical, signal and image processing
      • KaNN™: CNN inference code generator compatible with standard CNN frameworks (Caffe, …)
      • 3rd Party OS, RTOS, 3rd Party Tools, Model-Based Development
      • POSIX threads, OpenMP, OpenCL, Eclipse
      • GCC, GDB, LLVM, QEMU
      • SLEEF, SIMDe, OpenCV, BLAS, LAPACK
      • CNN Inference Code Gen.
      • Exokernel, Open Source POSIX RTOS, Linux, Communication Libs
      • AccessCore® SDK, AccessCore® Runtime, Optimized Libraries
      • Compiler, Simulator, Debugger & System Trace
    AccessCore® for a seamless integration
  • 10. C/C++ COMPILER SUPPORT OF KVX CORE
    GCC 10 for lightweight POSIX OS and for Linux (all TLS models)
      • Mapping of hardware loops using GCC doloop patterns
      • Sched2 does instruction scheduling and instruction bundling
      • Most high-gain optimizations apply (such as auto-vectorization)
      • Ongoing developments in software pipelining (derived from C6x)
  • 11. CACHE BYPASS LOADS AND NON-TRAPPING LOADS
    Reuse of the GCC named address spaces (not available in C++)
  • 12. EXPLOITATION OF THE VECTOR-SCALAR ARCHITECTURE
    128-bit and 256-bit vectors are operated on and passed as 64-bit register pairs and quadruples
      • Full support of the GCC vector syntax extensions
      • Align vectors on register pair/quad on ABI boundaries
      • SIMD lane splatting and shuffling rely on the BMM8 (8x8 Bit-Matrix Multiply) operations, exposed as SBMM8 instructions (swapped operands)
      • To improve register allocation, vector instructions are kept as machine instruction pairs or quadruples until after register allocation
      • Use of partial instruction bundles in output templates, with suitable scheduling type
  • 13. SIMDE EMULATION OF X86 BUILTINS
    SIMDe translates the x86 builtin functions into native calls on x86 (SIMDE_X86_SSSE3_NATIVE) and into plain C code on other architectures (SIMDE_VECTORIZE)
    The Kalray port of SIMDe provides an optimized translation on KVX using the GCC/LLVM KVX builtin functions (SIMDE_KVX_NATIVE)
  • 14. KVX CONDITIONAL BRANCH TEMPLATES
    KVX condition codes live in the general-purpose registers
    The "cstore<m>" standard pattern produces 0 or 1 (STORE_FLAG_VALUE)
    The "*cbsi" pattern matches a conditional branch depending on the comparison (EQ, NE, LE, LT, GE, GT) of a source register to zero
    The KVX can compare two integer or floating-point values, and instruction variants negate the result (0 or -1)
  • 15. KVX CONDITIONAL MOVE TEMPLATES
    Conditional moves can be produced in two ways
      • The "*cmovsi.df" SET source is an IF_THEN_ELSE that relies on a zero_comparison_operator. Genconfig outputs in insn-config.h: #define HAVE_conditional_move
      • The "*cond_exec_movedf" pattern is a COND_EXEC wrapping of a simple SET expression. Genconfig outputs in insn-config.h: #define HAVE_conditional_execution
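
The semantics of a conditional move that tests a register against zero, as described on the two template slides above, can be modeled in a few lines of C. This is a behavioral sketch only, not KVX code or a GCC pattern; the `zero_cmp` enum and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The six comparisons of a register to zero named on the branch slide. */
typedef enum { CMP_EQ, CMP_NE, CMP_LE, CMP_LT, CMP_GE, CMP_GT } zero_cmp;

/* Evaluate "reg <op> 0". */
static bool test_against_zero(zero_cmp op, int64_t reg)
{
    switch (op) {
    case CMP_EQ: return reg == 0;
    case CMP_NE: return reg != 0;
    case CMP_LE: return reg <= 0;
    case CMP_LT: return reg < 0;
    case CMP_GE: return reg >= 0;
    case CMP_GT: return reg > 0;
    }
    assert(!"unknown comparison");
    return false;
}

/* Model of a conditional move: the destination keeps its old value
   unless the condition register passes the comparison to zero. */
int64_t cond_move(zero_cmp op, int64_t cond_reg, int64_t src, int64_t dest)
{
    return test_against_zero(op, cond_reg) ? src : dest;
}
```

Because the condition is an ordinary general-purpose register, any earlier integer result can drive the move without a separate flag-setting step.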
  • 16. KVX CONDITIONAL LOAD AND STORE TEMPLATES
    Load instructions format sub-64-bit values with zero or sign extension
    Plain load and store addressing modes include [reg], offset[reg], reg[reg], reg*size[reg]
    Conditional load and store addressing modes are restricted to [reg], offset[reg]
  • 17. AGENDA
      1. Kalray MPPA Processor and KVX Core
      2. KVX Code Generation Features
      3. GCC IF-Conversion Framework
      4. Extending GCC IF-Conversion
      5. First Results and Outlook
  • 18. GCC AUTOMATED COND_EXEC TEMPLATES
    GCC can automate the writing of COND_EXEC instruction templates with the (define_cond_exec) template and the "predicable" attribute
    The (define_insn) templates with (eq_attr "predicable" "yes") have their RTL template wrapped into a COND_EXEC with the condition supplied by the (define_cond_exec)
    The output template of the resulting instructions is prefixed by the output template of the (define_cond_exec)
    Custom output may use the current_insn_predicate RTX
  • 19. GCC IF-CONVERSION OVERVIEW (1)
    Enabled with -fif-conversion and -fif-conversion2
    Three passes:
      • CE1 before combine
      • CE2 after combine
      • CE3 after reload
    Information about the if-conversion region is passed with a ce_if_block structure
    Top level (if_convert) iterates over if-conversion region header blocks by calling (find_if_header)
  • 20. GCC IF-CONVERSION OVERVIEW (2)
    (find_if_header)
      • Fill the ce_if_block structure
      • Call IFCVT_MACHDEP_INIT
      • Before reload (CE1 and CE2), call (noce_find_if_block)
      • After reload (CE3) and if the target has conditional execution, call (cond_exec_find_if_block)
    The default target hook for TARGET_HAVE_CONDITIONAL_EXECUTION returns HAVE_conditional_execution
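
The dispatch described for (find_if_header) can be summarized as a small decision function. The sketch below is a toy model of the slide text only, not GCC source: the enum names and `choose_finder` are hypothetical, and the real function does more work than this slide lists.

```c
#include <assert.h>
#include <stdbool.h>

/* The three if-conversion passes named on the overview slides. */
typedef enum { PASS_CE1, PASS_CE2, PASS_CE3 } ce_pass;

/* Which region finder the slide says is called. */
typedef enum { FINDER_NOCE, FINDER_COND_EXEC, FINDER_NEITHER } finder;

/* Toy model: before reload use the no-conditional-execution finder,
   after reload use the conditional-execution finder if the target
   reports conditional execution. */
finder choose_finder(ce_pass pass, bool target_has_cond_exec)
{
    assert(pass == PASS_CE1 || pass == PASS_CE2 || pass == PASS_CE3);
    bool after_reload = (pass == PASS_CE3);  /* CE3 runs after reload */
    if (!after_reload)
        return FINDER_NOCE;
    if (target_has_cond_exec)
        return FINDER_COND_EXEC;
    return FINDER_NEITHER;  /* neither of the two finders on this slide */
}
```

In this model the target hook is simply the Boolean argument, which is how a port can steer the passes without touching target-independent code.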
  • 21. GCC IF-CONVERSION OVERVIEW (3)
    (noce_find_if_block)
      • Determine the if-conversion region: IF-THEN-ELSE-JOIN or IF-THEN-JOIN or IF-ELSE-JOIN
      • First try without conditional moves, then with them
  • 22. GCC IF-CONVERSION OVERVIEW (4)
    (cond_exec_find_if_block)
      • Identify cases of && tests (jump to ELSE block) or || tests (jump to THEN block)
      • In case of && or || tests, try to combine them into the conditional expression
      • If there is no multiple-test region, or combining failed, process IF-THEN-ELSE-JOIN etc.
    (cond_exec_process_if_block)
      • Find common head or tail sequences in IF-THEN-ELSE-JOIN
      • Dispatch to (cond_exec_process_insns)
  • 23. GCC IF-CONVERSION OVERVIEW (5)
    (cond_exec_process_insns)
      • Process instructions from START to END, as there can be matching head and tail sequences in the THEN and ELSE blocks
      • If the instruction pattern code is already COND_EXEC, build a new condition by ANDing with the block condition
      • Generate the COND_EXEC pattern
      • Call IFCVT_MODIFY_INSN, which can modify the pattern or abort if-conversion
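
The step where an already predicated instruction receives the block condition as well can be pictured as extending a conjunction. The following is a toy model and not GCC's RTL representation: `predicate`, `predicate_and`, `predicate_holds` and the `MAX_TERMS` limit are all hypothetical, and each term simply means "this register is non-zero".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TERMS 4  /* hypothetical capacity of the toy predicate */

/* A predicate is a conjunction of "register != 0" terms;
   zero terms means the instruction is unconditional. */
typedef struct {
    size_t n_terms;
    size_t cond_reg[MAX_TERMS];
} predicate;

/* AND the block condition into an existing predicate.
   Returns false when the predicate cannot be extended, which
   models the case where if-conversion has to be abandoned. */
bool predicate_and(predicate *p, size_t block_cond_reg)
{
    assert(p != NULL);
    if (p->n_terms == MAX_TERMS)
        return false;
    p->cond_reg[p->n_terms++] = block_cond_reg;
    return true;
}

/* Evaluate the conjunction against a register file. */
bool predicate_holds(const predicate *p, const long *regs)
{
    for (size_t i = 0; i < p->n_terms; i++)
        if (regs[p->cond_reg[i]] == 0)
            return false;
    return true;
}
```

An instruction predicated this way executes only when its original condition and the enclosing block condition both hold, which matches the nesting of the source-level tests.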
  • 24. AGENDA
      1. Kalray MPPA Processor and KVX Core
      2. KVX Code Generation Features
      3. GCC IF-Conversion Framework
      4. Extending GCC IF-Conversion
      5. First Results and Outlook
  • 25. KVX IF-CONVERSION OBJECTIVES AND CONSTRAINTS
    PREDICATION OF LOADS AND STORES
      Extend GCC conditional execution to the KVX predicated load and store instructions, which have addressing mode restrictions
    PSEUDO-PREDICATION OF INSTRUCTIONS
      Unconditionally compute the original result into a scratch register, then conditionally move the result to the original destination register
    SPECULATIVE EXECUTION OF INSTRUCTIONS
      Eliminate the need for computing into a scratch register and for the conditional move if the destination is only locally used in the THEN or ELSE block
    Constraints
      • Focus on scalar instructions, as the GIMPLE auto-vectorization takes care of generating masked vector operations
      • Complement the if-conversion provided by the standard patterns for conditional operations: mov<m>cc, add<m>cc, neg<m>cc, not<m>cc
      • No changes to the target-independent GCC code
      • Can only expose the predicated instructions after register allocation
      • Unconditional assignments to scratch registers must not clobber registers in use
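
Pseudo-predication and speculation, as stated in the objectives above, can be compared at source level. The sketch below is plain C with hypothetical function names; unsigned arithmetic is used so the unconditional computation cannot trap, and the `if` around the store only stands in for a predicated store instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original: the addition only happens when the condition holds. */
uint64_t assign_branch(uint64_t cond, uint64_t a, uint64_t b, uint64_t x)
{
    if (cond != 0)
        x = a + b;
    return x;
}

/* Pseudo-predication: compute into a scratch value unconditionally,
   then conditionally move it into the original destination. */
uint64_t assign_pseudo_predicated(uint64_t cond, uint64_t a, uint64_t b, uint64_t x)
{
    uint64_t scratch = a + b;
    x = (cond != 0) ? scratch : x;
    return x;
}

/* Original: t is only used inside the THEN block. */
void store_branch(uint64_t cond, uint64_t a, uint64_t b, uint64_t *out)
{
    assert(out != NULL);
    if (cond != 0) {
        uint64_t t = a + b;
        *out = t;
    }
}

/* Speculation: since t is not live after the block, it can be computed
   unconditionally with no scratch value and no conditional move. */
void store_speculated(uint64_t cond, uint64_t a, uint64_t b, uint64_t *out)
{
    assert(out != NULL);
    uint64_t t = a + b;  /* speculated */
    if (cond != 0)
        *out = t;        /* stands in for a predicated store */
}
```

The scratch value in the pseudo-predicated form is what creates the register pressure that the constraint about not clobbering registers in use refers to.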
  • 26. KVX IF-CONVERSION OVERVIEW (1)
    Implemented with four target hooks
      • MAX_CONDITIONAL_EXECUTE
      • IFCVT_MACHDEP_INIT (kvx.h), called in CE1, CE2, CE3 from (find_if_header)
      • IFCVT_MODIFY_INSN (kvx.h), called in CE3 from (cond_exec_process_insns)
      • TARGET_HAVE_CONDITIONAL_EXECUTION (kvx.c), called in CE1, CE2, CE3
    In combination with COND_EXEC patterns and helper patterns in the kvx .md files
  • 27. KVX IF-CONVERSION OVERVIEW (2)
    (kvx_ifcvt_machdep_init)
      • Let CE1 and CE2 do if-conversion without conditional execution
      • In CE2, prepare for CE3, focusing on IF-THEN-ELSE-JOIN, IF-THEN-JOIN, IF-ELSE-JOIN regions identified with the same logic as in (noce_find_if_block)
    The idea is to insert USEs and pseudo-DEFs in CE2 so that the CE3 if-conversion will have the spare hard registers it needs for pseudo-predication and speculation
  • 28. KVX IF-CONVERSION FIND CANDIDATES (1)
    (kvx_ifcvt_ce2_candidate_ce3)
      • Scan the non-jump instructions
      • Bail out on a complex instruction or an instruction with side effects
      • Try conditional moves (with COND_EXEC); on failure, the instruction gets a second chance as arithmetic
      • Try conditional memory accesses (irrespective of addressing mode)
      • Try to speculate the non-trapping arithmetic instructions
      • Try to pseudo-predicate the non-trapping arithmetic instructions
  • 29. KVX IF-CONVERSION FIND CANDIDATES (2)
    (kvx_ifcvt_ce2_cond_mem_ce3)
      • If a scratch register is needed to compute the address, reserve it by wrapping the original pattern and a USE inside a PARALLEL
      • Success if the original pattern or the wrapped pattern is recognized inside a COND_EXEC
    (kvx_ifcvt_ce2_cond_arith_ce3)
      • Similar, except that it always wraps with a USE of a scratch register that has the mode of the destination
    (kvx_ifcvt_ce2_spec_arith_ce3)
      • If the destination register is only locally used (not live-out), the instruction may be speculatively executed unchanged
  • 30. KVX PREPARE FOR CE3 IF-CONVERSION (1)
    <prepare for CE3 if-conversion> in (kvx_ifcvt_machdep_init)
      • Extend the live range of the tested register by inserting its USE at the end of the THEN and ELSE blocks
      • Update the pattern of the pseudo-predicated memory and arithmetic insns to the one recognized in (kvx_ifcvt_ce2_cond_mem_ce3) or (kvx_ifcvt_ce2_cond_arith_ce3)
  • 31. KVX PREPARE FOR CE3 IF-CONVERSION (2)
    <prepare for CE3 if-conversion> in (kvx_ifcvt_machdep_init)
      • Flag the speculated instructions with a REG_NONNEG note (hack, unused otherwise in this port)
      • Insert a USE of the speculated destination register in the JOIN block
      • Insert DEFs of scratch registers in the TEST block and USEs of scratch registers in the JOIN block
    These DEFs and USEs prevent the allocation of the same hard registers to the scratch registers in one path and live variables on the other path
  • 32. KVX FINALIZE IF-CONVERSION
    (kvx_ifcvt_modify_insn)
      • Implements target hook IFCVT_MODIFY_INSN (CE3)
      • Undo the COND_EXEC of the pattern by (cond_exec_process_insns) in case of the inserted pseudo-DEFs
      • Undo the COND_EXEC of the pattern by (cond_exec_process_insns) in case of speculated instructions
  • 33. COND_EXEC OF MEMORY LOADS
    (cond_exec_process_insns) tries to COND_EXEC the instruction patterns
      • COND_EXEC of loads with the "memsimple" operand predicate must appear first (shown earlier)
      • COND_EXEC of loads with a "memory" operand predicate that is not "memsimple" is not valid, so use a (define_insn_and_split) to simplify the addressing mode
      • In case CE3 fails, provide another (define_insn_and_split) to undo the PARALLEL wrapping done by (kvx_ifcvt_ce2_cond_mem_ce3)
    Similar patterns for loading with zero/sign extension
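
The effect of simplifying the addressing mode before predicating a load can be sketched in C: the scaled-index address is first folded into a scratch value, and only the final access through a simple [reg] address is conditional. This is an illustration with a hypothetical function name, not the KVX split pattern itself; the integer-to-pointer round trip assumes a conventional flat address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Conditional load of base[index] into dest, keeping the old dest
   when the condition register is zero. */
int32_t cond_load_indexed(int64_t cond, const int32_t *base, size_t index, int32_t dest)
{
    assert(base != NULL);
    /* Step 1 (unconditional): fold base + index*size into a scratch
       address; this is plain integer arithmetic and cannot fault. */
    uintptr_t scratch = (uintptr_t)base + index * sizeof(int32_t);
    /* Step 2 (conditional): access memory through the simple [reg]
       form only when the condition holds. */
    if (cond != 0)
        dest = *(const int32_t *)scratch;
    return dest;
}
```

Note that the address arithmetic is harmless even for an out-of-range index as long as the condition is false, since no memory access takes place; this is why the scratch register for the address must be reserved ahead of time.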
  • 34. COND_EXEC OF MEMORY STORES
    Similar to the COND_EXEC of memory loads
      • COND_EXEC of stores with the "memsimple" operand predicate must appear first (shown earlier)
      • Use a (define_insn_and_split) to simplify the addressing mode in case of a "memory" operand predicate which is not "memsimple"
      • In case CE3 fails, provide another (define_insn_and_split) to undo the wrapping with PARALLEL done in CE2 by (kvx_ifcvt_ce2_cond_mem_ce3)
  • 35. COND_EXEC OF NON-PREDICABLE ARITHMETIC
    (cond_exec_process_insns) tries to COND_EXEC the instruction patterns
      • As there are no predicated arithmetic instructions in the KVX ISA, pseudo-predicate them
      • Use a (define_insn_and_split) to compute into a scratch register, then conditionally move the result to the original destination
      • In case CE3 fails, provide another (define_insn_and_split) to undo the wrapping with PARALLEL done in CE2 by (kvx_ifcvt_ce2_cond_arith_ce3)
    Similar patterns are needed for most of the scalar ISA subset
  • 36. CLEANUPS OF NON IF-CONVERTED REGIONS
    Undo the PARALLEL wrapping done by (kvx_ifcvt_ce2_cond_arith_ce3) with the (define_insn_and_split) patterns previously shown
    Also deactivate the previously inserted UNSPEC_DEFs by splitting them into USE
    As CE3 is not always run, set the kvx_ifcvt_ce_level to enable splitting of the UNSPEC_DEFs and the unwrapping of the pseudo-predicated instructions
    This is done in the machine reorg pass, which requires that all splitting be done before doloop finalization and sched2
  • 37. AGENDA
      1. Kalray MPPA Processor and KVX Core
      2. KVX Code Generation Features
      3. GCC IF-Conversion Framework
      4. Extending GCC IF-Conversion
      5. First Results and Outlook
  • 38. EXAMPLE OF KVX IF-CONVERSION (BEFORE)
    Annotations: COND_MEM (# 14), COND_MOVE (# 19), COND_MEM (# 20), SPEC_ARITH (# 21), COND_MEM (# 22)
  • 39. EXAMPLE OF KVX IF-CONVERSION (AFTER)
    Annotations: COND_MEM (# 14), COND_MOVE (# 19), COND_MEM (# 20), SPEC_ARITH (# 21), COND_MEM (# 22)
  • 40. MORE EXAMPLES OF KVX IF-CONVERSION (1)
    Annotations: COND_ARITH (# 26)
  • 41. MORE EXAMPLES OF KVX IF-CONVERSION (2)
    Annotations: COND_MEM (# 26)
  • 42. MORE EXAMPLES OF KVX IF-CONVERSION (3)
    Annotations: COND_MEM (# 26), COND_ARITH (# 27)
  • 43. SUMMARY AND OUTLOOK
    Implemented scalar if-conversion in GCC for the partially predicated KVX architecture
    Relies on the IFCVT framework, but activates it before (CE1, CE2) and after (CE3) reload
    EXISTING SCALAR IF-CONVERSION APPROACHES
      • On the SSA form: requires extensions such as Psi-SSA
      • Before register allocation
          o CMOVE only: GEM compiler
          o Fully predicated: R-K and J. Fang algorithms (IA64)
          o Partially predicated: S. Freudenberger (TRACE)
      • After register allocation: GCC ports IA64, C6x, FRV
    KEY FEATURES OF THE KVX SCALAR IF-CONVERSION IN GCC
      • Apply pseudo-predication and local speculation after register allocation
      • The scratch registers that will be unconditionally defined are reserved before register allocation with UNSPEC_DEFs and USEs
      • The GCC FRV port looks for unused hard registers after reload instead
    NEXT STEPS
      • Cannot reuse the existing GCC (define_cond_exec) machinery, as it may only generate (define_insn) patterns
      • Automate generation of the (define_insn_and_split) patterns that enable pseudo-predication
      • Performance tuning for scalar (while) loop pipelining
  • 44. THANK YOU
    www.kalrayinc.com