Testing Framework to Port and Optimize SIMD Libraries for OpenPOWER Systems
Daisuke Oka
daisukeokahassou@gmail.com
10/29/2021
Current Problem
• There are many Intel x86 SIMD intrinsic functions
• Intel intrinsic functions run only on x86 processors, not on OpenPOWER Systems
• So we need to port Intel x86 intrinsic code to equivalent OpenPOWER intrinsic code so that applications run with good performance on OpenPOWER Systems
• But porting Intel x86 intrinsics to OpenPOWER intrinsics is technically challenging
Obstacles in This Project
• Intel x86 intrinsics and OpenPOWER intrinsics do not correspond one to one
• The Power ISA vector facilities (VMX and VSX) are extensive but do not always provide a direct or obvious equivalent to the Intel intrinsics
• The port must be correct: an error in the port may lead to a fatal bug in the application
• If possible, we also want to measure latency and throughput
Structure of the Porting Wrapper: the Case of _mm256_add_pd
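The wrapper idea for _mm256_add_pd can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the names akari_v2df, akari_m256d, akari_mm256_setr_pd, akari_mm256_add_pd and akari_mm256_get_pd are hypothetical, and the GCC/Clang vector extension stands in for a 128-bit VSX register so that the sketch compiles on any machine. The point it shows is that one 256-bit AVX value can be modelled as two 128-bit halves, each added separately.

```c
#include <assert.h>

/* Hypothetical stand-in for one 128-bit VSX register holding two doubles.
   On a real OpenPOWER build this would likely be `__vector double` from
   <altivec.h>; the GCC/Clang vector extension is used here (an assumption)
   so that the sketch compiles on any architecture. */
typedef double akari_v2df __attribute__((vector_size(16)));

/* Hypothetical 256-bit type: AVX's __m256d modelled as two 128-bit halves. */
typedef struct {
    akari_v2df lo; /* elements 0 and 1 */
    akari_v2df hi; /* elements 2 and 3 */
} akari_m256d;

/* Build a vector from four doubles, e0 being the lowest element
   (the same element order as _mm256_setr_pd). */
static inline akari_m256d akari_mm256_setr_pd(double e0, double e1,
                                              double e2, double e3)
{
    akari_m256d r;
    r.lo = (akari_v2df){e0, e1};
    r.hi = (akari_v2df){e2, e3};
    return r;
}

/* Wrapper with the semantics of _mm256_add_pd: element-wise addition of
   four doubles, done as two 128-bit additions. */
static inline akari_m256d akari_mm256_add_pd(akari_m256d a, akari_m256d b)
{
    akari_m256d r;
    r.lo = a.lo + b.lo; /* with VSX this could be vec_add(a.lo, b.lo) */
    r.hi = a.hi + b.hi; /* with VSX this could be vec_add(a.hi, b.hi) */
    return r;
}

/* Read element i (0..3) back as a scalar, for checking results. */
static inline double akari_mm256_get_pd(akari_m256d v, int i)
{
    return i < 2 ? v.lo[i] : v.hi[i - 2];
}
```

Application code written against _mm256_add_pd would then call the wrapper instead, and the testing framework checks that both give the same result.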
Why We Need a Testing Framework
• The results of an Intel x86 intrinsic and of its OpenPOWER ISA port must be equal
• If an error or exception occurs, it must occur in the same way on Intel x86 and on OpenPOWER ISA
• If the results differ, unexpected and unpredictable bugs may occur
• Bugs that cannot be reproduced may lead to fatal results
• So we must automatically test each Intel intrinsic against its OpenPOWER ISA counterpart
• So we need a testing framework
Testing Framework for _mm256_add_pd
• Pass the same random __m256d values to the intrinsic on Intel x86 and on OpenPOWER Systems
• Random values must be generated by the CommonRandom__m256d() function
• Compare the __m256d values returned on Intel x86 and on OpenPOWER Systems
• Return values must be compared by CommonAssert(__m256d value)
• The Intel and OpenPOWER systems must be connected by a network module written in Python
• The testing module will be written as a C extension for Python
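The two helper functions named above could look roughly like the sketch below. The signatures are assumptions, since the slides give only the names: here CommonRandom__m256d takes a hypothetical pointer to generator state, and CommonAssert takes an expected and an actual value. The type akari_vec4d is a hypothetical portable stand-in for __m256d, and splitmix64 is a swapped-in generator, chosen because it uses only integer arithmetic and so yields the same sequence on both machines when they start from the same seed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable stand-in for __m256d: four doubles, element 0 first. */
typedef struct { double e[4]; } akari_vec4d;

/* splitmix64: a small integer-only generator (an assumption, not from the
   slides). The same seed gives the same sequence on x86 and on Power. */
static uint64_t splitmix64(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Assumed signature: fills four elements with doubles in [0, 1).
   The top 53 bits are scaled by 2^-53, which is exact in IEEE 754 double. */
static akari_vec4d CommonRandom__m256d(uint64_t *state)
{
    akari_vec4d v;
    for (int i = 0; i < 4; i++)
        v.e[i] = (double)(splitmix64(state) >> 11) * (1.0 / 9007199254740992.0);
    return v;
}

/* Assumed signature: bit-for-bit comparison of two results.
   Returns -1 if all four elements match, else the index of the first
   mismatch. Comparing bits means +0.0 and -0.0 count as different, and
   NaNs with identical bit patterns count as equal. */
static int CommonAssert(akari_vec4d expected, akari_vec4d actual)
{
    for (int i = 0; i < 4; i++) {
        if (memcmp(&expected.e[i], &actual.e[i], sizeof(double)) != 0)
            return i;
    }
    return -1;
}
```

In a test run, both machines would seed the generator with the same value, apply their own version of the intrinsic to the generated inputs, and one side would call CommonAssert on the two results after they have been exchanged over the network.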
The Testing Framework May Be Named "Akari"
• "Akari" is an ordinary Japanese girl's name, "あかり"
• It means "light"
Porting to Other Architectures with Akari
• Akari can also be used to port Intel intrinsics to other architectures such as ARM
• ARM also has its own, different intrinsics
• Only a few modifications should be needed (I think)
• Python needs to run on the target system
References
• Linux on Power Porting Guide: Vector Intrinsics
• https://openpowerfoundation.org/?resource_lib=linux-power-porting-guide-vector-intrinsics
• SLEEF: A Portable Vectorized Library of C Standard Mathematical Functions (Naoki Shibata, Member, IEEE, and Francesco Petrogalli, 2020)
• https://sites.uclouvain.be/SystInfo/usr/include/avxintrin.h.html
We Want to Build a Study Group and an OSS Community!
• A study group for understanding the OpenPOWER ISA and Intel intrinsics
• A community around the testing framework
Thank you for watching!!
