SIMD Instructions
outside and inside
Oracle 12c
Laurent Léturgez – 2015
ABOUT ME
´ Oracle Consultant since 2001
´ Former developer (C, Java, perl, PL/SQL)
´ Blogger since 2004
´ http://laurent.leturgez.free.fr (in French, discontinued)
´ http://laurent-leturgez.com
´ Twitter : @lleturgez
´ Paris Oracle Meetup Organizer: @ParisOracle
´ OCM 11g
Agenda
´ SIMD Instructions, outside Oracle 12c
´ What is a SIMD instruction ?
´ Will my application use SIMD ?
´ Raw Performance
´ SIMD Instructions, inside Oracle 12c
´ How SIMD instructions are used inside Oracle 12c
´ Tracing SIMD in Oracle 12c
Caveats
´ Most of the topics are from
´ My own research
´ My past life as a developer
´ Some of the topics are about internals, so:
´ Analysis and conclusion may be incomplete
´ Future versions of Oracle may change the features
´ Tests have been done with Oracle 12.1.0.2, Oracle
Enterprise Linux 7.1, VMWare Fusion 7 (And
VirtualBox)
Before we start …
´ Some fundamentals (from Dennis Yurichev’s book)
´ CPU register : […]The easiest way to understand a register is
to think of it as an untyped temporary variable. Imagine if
you were working with a high-level PL (programming language) and could only use
eight 32-bit (or 64-bit) variables. Yet a lot can be done using
just these!
´ Instruction : A primitive CPU command. The simplest
examples include: moving data between registers, working
with memory and arithmetic primitives. As a rule, each CPU
has its own instruction set architecture (ISA).
´ Assembly language : Mnemonic code and some extensions
like macros which are intended to make a programmer’s life
easier.
http://beginners.re/Reverse_Engineering_for_Beginners-en.pdf
SIMD instructions … outside
Oracle 12c
´ SIMD stands for Single Instruction Multiple Data
´ Process multiple data
´ In one CPU instruction
´ Based on
´ Specific registers
´ Specific CPU instructions and sets of instructions
´ Not Oracle specific
´ CPU Architecture specific
´ Intel
´ IBM
´ Sparc
´ This presentation is mainly about Intel architecture
SIMD instructions … outside
Oracle 12c
´ What is a SIMD register ?
´ It’s a CPU register
´ Wider than traditional registers (RDI, RSI, R8, R9 etc.)
´ 128 up to 512 bits wide
´ Holds several data elements at once
SIMD instructions … outside
Oracle 12c
´ Scalar operation
´ an array of 4 integers {1,2,3,4}
´ add 1 to each value
[Animation: RAM holds In = {1,2,3,4}; the CPU uses Reg1, Reg2 and Reg3]
´ Each value is processed on its own: LOAD, ADD, SAVE
´ Out = {2,3,4,5} after 4 LOAD, 4 ADD and 4 SAVE
SIMD instructions … outside
Oracle 12c
´ SIMD operation
´ an array of 4 integers {1,2,3,4}
´ add 1 to each value
[Animation: RAM holds In = {1,2,3,4}; the CPU uses SIMD Reg1, SIMD Reg2 and SIMD Reg3]
´ SIMD Reg1 is loaded with {1,2,3,4}, SIMD Reg2 holds {1,1,1,1}
´ SIMD Reg3 receives {2,3,4,5}, which is saved to Out
´ The whole array needs a single LOAD, ADD and SAVE
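The two animations above can be sketched in C with the Intel intrinsics. This is a minimal illustration, not the author's sample code: the function names are made up for this example, and it assumes an x86 compiler that ships `emmintrin.h` (SSE2, available on every x86-64 CPU).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Scalar version: one load, one add and one store per element. */
void add_one_scalar(int *values, size_t count)
{
    for (size_t i = 0; i < count; i++)
        values[i] += 1;
}

/* SSE2 version: four 32-bit integers per load, add and store. */
void add_one_sse2(int *values, size_t count)
{
    const __m128i ones = _mm_set1_epi32(1);   /* {1,1,1,1} */
    size_t i = 0;

    for (; i + 4 <= count; i += 4) {
        /* Unaligned load, so the array needs no special alignment. */
        __m128i v = _mm_loadu_si128((const __m128i *)(values + i));
        v = _mm_add_epi32(v, ones);
        _mm_storeu_si128((__m128i *)(values + i), v);
    }
    /* Scalar tail for the last count % 4 elements. */
    for (; i < count; i++)
        values[i] += 1;
}
```

Calling either function on {1,2,3,4} leaves {2,3,4,5} in the array; the SSE2 version gets there with one load, one add and one store.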
SIMD instructions … outside
Oracle 12c
´ MMX: MultiMedia eXtensions (Pentium MMX, then Pentium II)
´ 64 bits registers
´ 8 registers (MM0 to MM7)
´ SSE: Streaming SIMD Extensions: (Pentium III)
´ 128 bits registers
´ 8 registers (XMM0 to XMM7)
´ Only four 32 bits single precision floating point numbers
´ SSE2 (Pentium IV), SSE3 (Pentium IV Prescott, Xeon Nocona), SSSE3
(Xeon 5100, Core 2), SSE4.1 (Penryn), SSE4.2 (Nehalem)
´ 128 bits registers
´ 16 registers (XMM0 to XMM15) in 64-bit mode, 8 in 32-bit mode
´ Usage expansion (two 64 bits double precision, four 32 bits
integers, down to sixteen 8 bits bytes)
´ New instructions
SIMD instructions … outside
Oracle 12c
´ AVX: Advanced Vector eXtension (Sandy Bridge processors)
´ XMM registers are extended to 256 bits
´ 16 AVX registers named YMM0 to YMM15
´ Three operand instructions (non destructive) : C = A + B rather than
A = A + B
´ Some alignment requirements are relaxed
´ AVX2 (Introduced with Haswell processors)
´ 256 bits registers
´ New instructions (shifting, value broadcasting etc…)
´ AVX-512 or AVX3 (Xeon Phi Knights Landing, Skylake server processors)
´ 512 bits registers
´ 32 registers named ZMM0 to ZMM31
´ AVX-1024 … maybe the future (speculative, not an announced extension)
´ 1024 bits registers
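As a sketch of what the wider YMM registers bring, the function below adds 1 to eight 32-bit integers per instruction. It assumes GCC or Clang: the `target("avx2")` attribute and `__builtin_cpu_supports` are compiler extensions, and the function names are hypothetical. Note that 256-bit integer addition needs AVX2; the first AVX generation only widened floating-point operations to 256 bits.

```c
#include <immintrin.h>  /* AVX/AVX2 intrinsics */
#include <stddef.h>

/* Built for AVX2 whatever the global -m switches are (GCC/Clang extension). */
__attribute__((target("avx2")))
static void add_one_avx2(int *values, size_t count)
{
    const __m256i ones = _mm256_set1_epi32(1);
    size_t i = 0;

    for (; i + 8 <= count; i += 8) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(values + i));
        /* Three-operand form: r = v + ones, v itself is left untouched. */
        __m256i r = _mm256_add_epi32(v, ones);
        _mm256_storeu_si256((__m256i *)(values + i), r);
    }
    for (; i < count; i++)   /* scalar tail */
        values[i] += 1;
}

/* Uses the AVX2 path only when the running CPU reports AVX2. */
void add_one_wide(int *values, size_t count)
{
    if (__builtin_cpu_supports("avx2")) {
        add_one_avx2(values, count);
        return;
    }
    for (size_t i = 0; i < count; i++)
        values[i] += 1;
}
```

The runtime check keeps the program from faulting with an illegal instruction on hardware or hypervisors that do not expose AVX2.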
SIMD instructions … outside
Oracle 12c
´ SIMD instructions
´ Reduce number of CPU cycles and memory pressure
´ Process data in parallel without any contention
´ Need a programming method (vector programming) with some
constraints (data alignments etc.)
´ Size matters
´ Wider registers, more data loaded (but wider register files
increase CPU power consumption → Challenge)
´ One instruction processes the whole register at once (its latency in
cycles depends on the instruction and on the CPU)
´ More registers
´ Use cases
´ Data Filtering
´ Graphics
´ Bioinformatics …
SIMD instructions … outside
Oracle 12c
´ Intel API (C/C++) : Intel Intrinsics Guide
https://software.intel.com/sites/landingpage/IntrinsicsGuide/
´ Sample codes:
https://app.box.com/simdSampleC-2015
Will my application use SIMD registers
and instructions ?
´ It depends on :
´ Hardware
´ Consult processor datasheets to see which instruction set
extensions are supported (if any)
´ http://ark.intel.com/#@Processors
´ Hypervisor
´ Some (old) hypervisors do not support modern extensions
´ VirtualBox versions <5.0 don’t support SSE4, AVX and AVX2
´ Hyper-V on W2008R2-SP1 needs patch for specific processors
to support AVX
´ It depends on the Operating System
´ AVX (256 bits) is supported from
´ Linux Kernel >= 2.6.30 (AVX needs the xsave support added in that kernel)
´ Redhat EL5 : 2.6.18
´ Oracle EL5 w/UEK : 2.6.32
´ Solaris 10 upd 10 and Solaris 11
´ Windows 2008 R2 SP1
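Since hardware, hypervisor and operating system all have a say, a program can ask at run time what is actually usable. The sketch below relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports` (x86 only); the function name is an assumption made for this example.

```c
/* Returns the name of the widest SIMD level the builtins report as usable. */
const char *best_simd_level(void)
{
    __builtin_cpu_init();   /* harmless if the runtime already initialised it */

    if (__builtin_cpu_supports("avx2"))
        return "avx2";
    if (__builtin_cpu_supports("avx"))
        return "avx";
    if (__builtin_cpu_supports("sse4.2"))
        return "sse4.2";
    return "baseline";
}
```

Running this inside an old VirtualBox guest and on the bare-metal host is a quick way to see the hypervisor effect described above.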
Will my application use SIMD registers
and instructions ?
´ It depends on the compiler
´ GCC
´ > 4.6 for AVX support
´ Use of specific switches (-msse2, -msse4.1, -msse4.2,
-mavx, -mavx2 …)
´ Intel C/C++ Compiler (ICC)
´ > 11.1 for AVX Support and > 13.0 for AVX2 support
´ Use of specific switches (-xsse4.2, -xavx, -xcore-avx2 …)
´ Beware of optimization switches (-O1,-O2, -O3)
´ More … disassemble (if you are allowed to :) )
´ Registers
´ Assembler instructions
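The effect of the compiler switches can be checked from the code itself, because GCC, Clang and ICC predefine macros such as `__SSE2__`, `__SSE4_2__`, `__AVX__` and `__AVX2__` when the matching extension is enabled. A minimal sketch (the function name is hypothetical):

```c
/* Names the widest x86 SIMD extension that was enabled at compile time. */
const char *compiled_simd_level(void)
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none";
#endif
}
```

Built with `gcc -mavx2` this returns "AVX2"; built without any switch on x86-64 it returns "SSE2". Whether the generated code really uses the wide registers is still best confirmed by disassembling and looking for xmm or ymm operands.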
´ Based on a C program
´ Used CPU: Haswell microarchitecture (Core
i7-4960HQ). AVX/AVX2 enabled
´ 3 tests : No SIMD, SSE4, AVX
´ Input: one array containing 1 million values.
´ Goal: add 1 to each value, with the pass over the
1 million values repeated 4k, 8k, 16k and 32k times
´ CPU Time(s) = f(#rows)
“Quick and Dirty” Sample code available here:
https://app.box.com/s/ibmnbblpho4xtbeq2x8ir60nrk37208v
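The measurement described above can be reproduced with a small harness like the one below. It is a sketch written for this text, not the "quick and dirty" code behind the link: the names are made up, and only a plain C kernel is shown. An SSE or AVX kernel with the same signature can be passed in its place.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

typedef void (*add_fn)(int *values, size_t count);

/* Plain C kernel ("No SIMD" if the compiler does not auto-vectorise it). */
void add_one_plain(int *values, size_t count)
{
    for (size_t i = 0; i < count; i++)
        values[i] += 1;
}

/*
 * Runs `kernel` `repeats` times over an array of `count` zeros and returns
 * the CPU time in seconds, or -1.0 if the allocation fails.
 * `*first_out` receives values[0] after the run so the result can be checked.
 */
double time_kernel(add_fn kernel, size_t count, int repeats, int *first_out)
{
    int *values = calloc(count, sizeof *values);
    if (values == NULL)
        return -1.0;

    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        kernel(values, count);
    clock_t stop = clock();

    *first_out = values[0];
    free(values);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

A call such as `time_kernel(add_one_plain, 1000000, 4096, &first)` corresponds to the first measurement point. Keep in mind that at -O2 or -O3 the compiler may vectorise the plain loop on its own, which blurs the "No SIMD" baseline.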
Raw performance
RAW Performance (CPU) for SIMD Instructions, CPU time (sec)

                       4096 M. rows   8192 M. rows   16384 M. rows   32768 M. rows
NO SIMD                10.35          20.46          42.35           85.64
SSE4 (XMM registers)   3.3            6.81           13.73           25.58
AVX (YMM registers)    1.96           3.51           7.23            15.15
SIMD instructions … inside
Oracle 12c
´ In Memory Data Structure
´ In Memory Compression Unit :
IMCU
´ IMCU is the unit of column store
allocation
´ Target size is 1M rows
(controlled by _inmemory_imcu_target_rows)
´ One IMCU can contain more
than one column
´ Each column in one IMCU is a
column unit (CU)
SIMD instructions … inside
Oracle 12c
´ In memory column store storage indexes
´ For each column unit, min and max values are
maintained in a storage index
´ Storage Indexes provide CU pruning
´ Information about CU available in GV$IM_COL_CU
(Undocumented. See BugID 19361690)
[Figure: IMCU pruning]
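The pruning idea can be sketched in a few lines of C. This only illustrates min/max pruning in general; the structure and function names are hypothetical and say nothing about Oracle's internal layout.

```c
#include <stddef.h>

/* Hypothetical per-column-unit summary: smallest and largest stored value. */
struct cu_summary {
    int min_value;
    int max_value;
};

/*
 * Writes into `to_scan` the indexes of the column units whose [min, max]
 * range may contain `wanted`, and returns how many were kept.
 * Every other unit is pruned without reading its data.
 */
size_t prune_column_units(const struct cu_summary *units, size_t unit_count,
                          int wanted, size_t *to_scan)
{
    size_t kept = 0;

    for (size_t i = 0; i < unit_count; i++) {
        if (wanted >= units[i].min_value && wanted <= units[i].max_value)
            to_scan[kept++] = i;
    }
    return kept;
}
```

With units covering [1,9], [10,19] and [15,30], a search for 17 keeps the last two. When the data is sorted on the column the ranges stop overlapping and a single unit survives, which is why the sort order matters for pruning.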
SIMD instructions … inside
Oracle 12c
´ The way your data is sorted matters for best IMCU pruning
SIMD instructions … inside
Oracle 12c
´ SIMD extensions are used with In Memory storage
indexes for efficient filtering
1. IM Storage Indexes do IMCU pruning
2. SIMD instructions apply filter predicates efficiently
[Figure: after IMCU pruning, the Prod-id values 10, 10, 14, 14, 10 of the remaining column unit are filtered with SIMD]
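An equality filter of this kind can be sketched with SSE2 intrinsics: four values are compared with the searched value in one instruction and the result is collapsed into a bitmap. The function name and the 64-row limit are assumptions made to keep the example short; this is not how the kdzk functions are written.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/*
 * Returns a bitmap in which bit i is set when column[i] == wanted.
 * Only the first 64 rows are examined, since the bitmap is one 64-bit word.
 */
uint64_t filter_equal_sse2(const int32_t *column, size_t rows, int32_t wanted)
{
    const __m128i needle = _mm_set1_epi32(wanted);
    uint64_t matches = 0;
    size_t i = 0;

    if (rows > 64)
        rows = 64;

    for (; i + 4 <= rows; i += 4) {
        __m128i v  = _mm_loadu_si128((const __m128i *)(column + i));
        __m128i eq = _mm_cmpeq_epi32(v, needle);            /* all-ones lane on match */
        int mask   = _mm_movemask_ps(_mm_castsi128_ps(eq)); /* one bit per lane */
        matches |= (uint64_t)mask << i;
    }
    for (; i < rows; i++)                                   /* scalar tail */
        if (column[i] == wanted)
            matches |= (uint64_t)1 << i;

    return matches;
}
```

For the Prod-id values 10, 10, 14, 14, 10 and the predicate prod_id = 14, bits 2 and 3 are set in the result.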
SIMD instructions … inside
Oracle 12c
´ Oracle 12c uses specific libraries for SIMD (and
compression)
´ Located in $ORACLE_HOME/lib
´ libshpksse4212.so for SSE4.2 extensions
Compiled with ICC v12 with the -xsse4.2 switch
´ libshpkavx12.so for AVX extensions
Compiled with ICC v12 with the -xavx switch
´ libshpkavx212.so for AVX2 extensions
Not fully implemented yet (only 8 functions implemented)
No ICC AVX2 switch used because ICC v12 doesn’t support AVX2
´ Thanks Tanel Pöder
SIMD instructions … inside
Oracle 12c
´ Oracle SIMD related functions
´ Located in kdzk kernel module (HPK)
´ Part of Advanced Compression library (ADVCMP)
´ Easily tracked with systemtap
SIMD instructions … inside
Oracle 12c
´ How does Oracle use SIMD extensions ?
It depends on many parameters
´ OS Level : /proc/cpuinfo
´ AVX and AVX2 support
´ SSE4 Support only
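The same check can be scripted. The sketch below looks for whole-word flags in the "flags" line of /proc/cpuinfo (Linux only). The function names are hypothetical; the flag names themselves (`sse4_2`, `avx`, `avx2`) are the ones the Linux kernel prints.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True when `flag` appears as a whole word in `flags_line`. */
bool has_cpu_flag(const char *flags_line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags_line;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts_word = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        bool ends_word   = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
        if (starts_word && ends_word)
            return true;
        p += len;   /* e.g. "avx" found inside "avx2": keep searching */
    }
    return false;
}

/* Copies the first "flags" line of /proc/cpuinfo into buf; false if absent. */
bool read_cpu_flags(char *buf, size_t size)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    bool found = false;

    if (f == NULL)
        return false;
    while (fgets(buf, (int)size, f) != NULL) {
        if (strncmp(buf, "flags", 5) == 0) {
            found = true;
            break;
        }
    }
    fclose(f);
    return found;
}
```

The flags line is long on recent CPUs, so a buffer of several kilobytes is a safer choice for `read_cpu_flags` than a typical 256-byte line buffer.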
SIMD instructions … inside
Oracle 12c
´ Which library am I using ?
´ pmap
´ AVX support
´ SSE4 support
SIMD instructions … inside
Oracle 12c
´ Which compiler options have been used ?
´ Read “comment” section in ELF
´ Read the corresponding compiler documentation
[oracle@oel7 conf]$ readelf -p .comment $ORACLE_HOME/lib/libshpkavx12.so |
> egrep -i 'intel|gcc' | egrep 'xavx|mavx'
[ 2c] -?comment:Intel(R) C Intel(R) 64 Compiler XE for applications running on
Intel(R) 64, Version 12.0 Build 20120731
…/…
-DNTEV_USE_EPOLL -DNET_USE_LDAP -xavx
SIMD instructions … inside
Oracle 12c
´ How are SIMD registers used by Oracle ?
´ GDB
´ To get the call stack (backtrace)
´ To set breakpoints on interesting functions
´ To view register contents (traditional and SIMD)
´ “info registers” for traditional registers
´ “info all-registers” for all registers (SIMD registers included)
´ (gdb) print $ymmX.<format>
Format can be v8_float, v4_double, v32_int8, v16_int16, v8_int32,
v4_int64, or v2_int128
SIMD instructions … inside
Oracle 12c
[Screenshot: gdb register dump. In red, register content that has been modified. In blue, the upper half (128 bits) of the SIMD registers, which is empty.]
SIMD instructions … inside
Oracle 12c
´ Oracle IM can use AVX or SSE4 extensions for SIMD
operations
´ When AVX is used
It uses only 128 bits out of 256 bits wide registers
• AVX adds new register-state through the 256-bit wide
YMM register file
• Explicit operating system support is required to properly
save and restore AVX's expanded registers
between context switches
• Without this, only AVX 128-bit is supported
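Whether the operating system has enabled the extended register state can be tested from user space. The sketch below follows the usual detection sequence: CPUID leaf 1 reports the OSXSAVE and AVX bits, then XGETBV reads the XCR0 register. It assumes GCC or Clang on x86 (`cpuid.h` and inline assembly); the function names are made up for this example.

```c
#include <cpuid.h>    /* __get_cpuid (GCC/Clang) */
#include <stdbool.h>
#include <stdint.h>

/* Reads XCR0. Only call this when CPUID reports OSXSAVE. */
static uint64_t read_xcr0(void)
{
    uint32_t eax, edx;

    __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
    return ((uint64_t)edx << 32) | eax;
}

/* True when the CPU has AVX and the OS saves and restores XMM and YMM state. */
bool os_enables_avx_state(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;

    bool osxsave = (ecx >> 27) & 1u;   /* OS uses XSAVE/XRSTOR */
    bool avx     = (ecx >> 28) & 1u;   /* CPU implements AVX */
    if (!osxsave || !avx)
        return false;

    /* XCR0 bit 1 = SSE (XMM) state, bit 2 = AVX (upper YMM halves) state. */
    return (read_xcr0() & 0x6) == 0x6;
}
```

On a kernel without xsave support the OSXSAVE bit stays clear and the function returns false even though the processor itself implements AVX.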
SIMD instructions … inside
Oracle 12c
´ The culprit
´ Oracle 12.1.0.2 is supported from EL5 onwards
´ EL5 Redhat Kernel is 2.6.18 and this flag
(xsave) is supported from 2.6.30 kernels
´ For compatibility reasons, Oracle has to
compile its code on 2.6.18 kernels
SIMD instructions … inside
Oracle 12c
´ Or maybe …
´ Oracle needs to work on packed values that are
narrower than 32 bits
Tracing SIMD in Oracle 12c
´ Oradebug has 2 components related to IM
Tracing SIMD in Oracle 12c
´ Interesting components to trace for SIMD
and/or IMCU Pruning are :
´ IM_optimizer
´ Gives information about CBO calculation
related to IM
´ ADVCMP_DECOMP.*
´ ADVCMP_DECOMP_HPK : SIMD functions
´ ADVCMP_DECOMP_PCODE : Portable Code
Machine (usually comparison functions and
results)
Tracing SIMD in Oracle 12c
´ IM_optimizer
´ Information available in trace file
´ IMCU Pruning ratio
´ CU decompression costing (per IMCU)
´ Predicate evaluation costing (per row)
´ Statement has to be parsed to get results
Tracing SIMD in Oracle 12c
select prod_id,cust_id,time_id from laurent.s_capa_high where amount_sold=20;
Tracing SIMD in Oracle 12c
´ This information is available in CBO trace file (10053 or
SQL_costing event)
Tracing SIMD in Oracle 12c
´ ADVCMP_DECOMP
´ ADVCMP_DECOMP_HPK
´ Information is available in the trace file (for each IMCU
processed)
´ Used library and function
´ Number of rows and counting algorithm
´ Processing rate (comparison and decompression if relevant)
´ But nothing on the results of the processing :(
Tracing SIMD in Oracle 12c
´ ADVCMP_DECOMP
´ ADVCMP_DECOMP_HPK
´ Gives information about SIMD function usage and filtering
(after IMCU pruning)
´ Example: inmemory table with NO MEMCOMPRESS or DML
compression
Tracing SIMD in Oracle 12c
´ ADVCMP_DECOMP
´ ADVCMP_DECOMP_HPK
´ Example: inmemory compressed table
´ SIMD instructions are used only in the kdzk_eq_dict functions
Tracing SIMD in Oracle 12c
´ My thoughts about compression/decompression
´ NO MEMCOMPRESS / COMPRESS FOR DML
´ kdzk*dynp* functions (ex: kdzk_eq_dynp_16bit,
kdzk_le_dynp_32bit etc.)
´ FOR QUERY LOW / QUERY HIGH
´ Dictionary Encoding (LZW ?) : kdzk_*dict* functions (ex:
kdzk_eq_dict_7bit, kdzk_le_dict_4bit etc.)
´ Run Length Encoding: kdzk_burst_rle* functions (ex:
kdzk_burst_rle_8bit, kdzk_burst_rle_16bit …)
´ Bit packing compression: kdzk*fixed* functions (ex:
kdzk_ge_lt_fixed_32bit, kdzk_lt_fixed_8bit …)
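To make the dictionary case concrete, here is a minimal sketch of an equality predicate evaluated on dictionary codes instead of on the values themselves. It assumes at most 256 distinct values so that each code fits in one byte. The names are hypothetical and the code only illustrates the general technique, not the kdzk_*dict* functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the code of `value` in the dictionary, or -1 if it is absent. */
int dict_code_of(const int32_t *dict, size_t dict_size, int32_t value)
{
    for (size_t code = 0; code < dict_size; code++)
        if (dict[code] == value)
            return (int)code;
    return -1;
}

/*
 * Counts the rows equal to `value`. The value is translated to its code
 * once, then only the 8-bit codes are compared, never the original values.
 */
size_t count_equal_dict(const uint8_t *codes, size_t rows,
                        const int32_t *dict, size_t dict_size, int32_t value)
{
    int wanted = dict_code_of(dict, dict_size, value);
    size_t hits = 0;

    if (wanted < 0)
        return 0;   /* value not in the dictionary: no row can match */

    for (size_t i = 0; i < rows; i++)
        hits += (codes[i] == (uint8_t)wanted);
    return hits;
}
```

Small codes are what makes this layout SIMD friendly: a 128-bit register holds sixteen 8-bit codes against four 32-bit values, which may be why the function names carry a bit width.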
Tracing SIMD in Oracle 12c
´ My thoughts about compression/decompression
´ FOR CAPACITY LOW
´ FOR QUERY LOW + additional proprietary compression (OZIP)
´ Functions: ozip_decode_dict*, kdzk_ozip_decode* (Ex:
kdzk_ozip_decode_dydi, ozip_decode_dict_9_bit etc.)
´ FOR CAPACITY HIGH
´ FOR QUERY HIGH + heavyweight compression algorithm
´ Compression/decompression method depends on:
´ Datatype
´ Column Compression Unit size
´ Column contents
leturgezl@gmail.com
http://laurent-leturgez.com
@lleturgez

XpertSolvers: Your Partner in Building Innovative Software Solutions
 
buds n tech IT solutions
buds n  tech IT                solutionsbuds n  tech IT                solutions
buds n tech IT solutions
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 

SIMD inside and outside Oracle 12c In Memory

  • 1. SIMD Instructions outside and inside Oracle 12c Laurent Léturgez – 2015
  • 2. ABOUT ME ´ Oracle Consultant since 2001 ´ Former developer (C, Java, Perl, PL/SQL) ´ Blogger since 2004 ´ http://laurent.leturgez.free.fr (in French, discontinued) ´ http://laurent-leturgez.com ´ Twitter : @lleturgez ´ Paris Oracle Meetup Organizer: @ParisOracle ´ OCM 11g
  • 3. Agenda ´ SIMD Instructions, outside Oracle 12c ´ What is a SIMD instruction ? ´ Will my application use SIMD ? ´ Raw Performance ´ SIMD Instructions, inside Oracle 12c ´ How SIMD instructions are used inside Oracle 12c ´ Tracing SIMD in Oracle 12c
  • 4. Caveats ´ Most of the topics are from ´ My own research ´ My past life as a developer ´ Some of the topics are about internals, so: ´ Analysis and conclusions may be incomplete ´ Future versions of Oracle may change the features ´ Tests have been done with Oracle 12.1.0.2, Oracle Enterprise Linux 7.1, VMware Fusion 7 (and VirtualBox)
  • 5. Before we start … ´ Some fundamentals (from Dennis Yurichev’s book) ´ CPU register : […]The easiest way to understand a register is to think of it as an untyped temporary variable. Imagine if you were working with a high-level PL (programming language) and could only use eight 32-bit (or 64-bit) variables. Yet a lot can be done using just these! ´ Instruction : A primitive CPU command. The simplest examples include: moving data between registers, working with memory and arithmetic primitives. As a rule, each CPU has its own instruction set architecture (ISA). ´ Assembly language : Mnemonic code and some extensions like macros which are intended to make a programmer’s life easier. http://beginners.re/Reverse_Engineering_for_Beginners-en.pdf
  • 6. Agenda ´ SIMD Instructions, outside Oracle 12c ´ What is a SIMD instruction ? ´ Will my application use SIMD ? ´ Raw Performance ´ SIMD Instructions, inside Oracle 12c ´ How SIMD instructions are used inside Oracle 12c ´ Tracing SIMD in Oracle 12c
  • 7. SIMD instructions … outside Oracle 12c ´ SIMD stands for Single Instruction Multiple Data ´ Process multiple data ´ In one CPU instruction ´ Based on ´ Specific registers ´ Specific CPU instructions and sets of instructions ´ Not Oracle specific ´ CPU Architecture specific ´ Intel ´ IBM ´ Sparc ´ This presentation is mainly about Intel architecture
  • 8. SIMD instructions … outside Oracle 12c ´ What is a SIMD register ? ´ It’s a CPU register ´ Wider than traditional registers (RDI, RSI, R8, R9 etc.) ´ 128 up to 512 bits wide ´ Holds several data elements at once
  • 9. SIMD instructions … outside Oracle 12c ´ Scalar operation ´ an array of 4 integers {1,2,3,4} ´ add 1 to each value ´ [Diagram: each element is loaded from RAM into a register, incremented and saved back one at a time, giving {2,3,4,5} after 4 LOAD, 4 ADD and 4 SAVE operations]
  • 10. SIMD instructions … outside Oracle 12c ´ SIMD operation ´ an array of 4 integers {1,2,3,4} ´ add 1 to each value ´ [Diagram: SIMD Reg1 is loaded with {1,2,3,4}, SIMD Reg2 holds {1,1,1,1}, a single ADD puts {2,3,4,5} in SIMD Reg3, which is saved back to RAM: 1 LOAD, 1 ADD and 1 SAVE]
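The scalar and SIMD diagrams above can be sketched in C with SSE2 intrinsics. This is a minimal illustration, not the author's sample code: the function names `add_one_scalar` and `add_one_sse2` are made up for this example, and SSE2 is chosen only because it is available on every x86-64 CPU without extra compiler switches.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Scalar version: one LOAD, one ADD and one SAVE per element. */
static void add_one_scalar(int32_t *v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        v[i] += 1;
}

/* SSE2 version: each LOAD, ADD and SAVE handles four 32-bit integers. */
static void add_one_sse2(int32_t *v, size_t n)
{
    const __m128i ones = _mm_set1_epi32(1);          /* {1,1,1,1} */
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i x = _mm_loadu_si128((const __m128i *)(v + i));  /* LOAD */
        x = _mm_add_epi32(x, ones);                             /* ADD  */
        _mm_storeu_si128((__m128i *)(v + i), x);                /* SAVE */
    }
    for (; i < n; i++)   /* scalar tail for the last n % 4 elements */
        v[i] += 1;
}
```

Called on `{1,2,3,4}`, both functions leave `{2,3,4,5}` in the array; the SSE2 version gets there with a single pass through the loop body.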
  • 11. SIMD instructions … outside Oracle 12c ´ MMX: MultiMedia eXtensions (introduced with the Pentium MMX) ´ 64-bit registers ´ 8 registers (MM0 to MM7) ´ SSE: Streaming SIMD Extensions (Pentium III) ´ 128-bit registers ´ 8 registers (XMM0 to XMM7) ´ Only four 32-bit single precision floating point numbers ´ SSE2 (Pentium IV), SSE3 (Pentium IV Prescott, Xeon Nocona), SSSE3 (Xeon 5100, Core 2), SSE4.1 (Penryn), SSE4.2 (Nehalem) ´ 128-bit registers ´ 16 registers in 64-bit mode (XMM0 to XMM15) ´ Usage expansion (two 64-bit double precision numbers, four 32-bit integers, up to sixteen 8-bit bytes) ´ New instructions
  • 12. SIMD instructions … outside Oracle 12c ´ AVX: Advanced Vector eXtensions (Sandy Bridge processors) ´ XMM registers are extended to 256 bits ´ 16 AVX registers named YMM0 to YMM15 ´ Three-operand instructions (non-destructive) : A+B=C rather than A=A+B ´ Some alignment requirements are relaxed ´ AVX2 (introduced with Haswell processors) ´ 256-bit registers ´ New instructions (shifting, value broadcasting etc.) ´ AVX-512 or AVX3 (Skylake processors) ´ 512-bit registers ´ 32 registers named ZMM0 to ZMM31 ´ AVX-1024 … the future ´ 1024-bit registers
  • 13. SIMD instructions … outside Oracle 12c ´ SIMD instructions ´ Reduce number of CPU cycles and memory pressure ´ Process data in parallel without any contention ´ Need a programming method (vector programming) with some constraints (data alignment etc.) ´ Size matters ´ Wider registers, more data loaded (but wider register files increase CPU power consumption → challenge) ´ Processing is always done as a single CPU instruction ´ More registers ´ Use cases ´ Data Filtering ´ Graphics ´ Bioinformatics …
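The data alignment constraint mentioned above can be illustrated with a short C sketch. It assumes that `_mm_malloc` and `_mm_free` are reachable through `<emmintrin.h>`, which is the case with GCC, Clang and ICC as far as I know; the helper names `alloc_aligned_i32` and `add_one_aligned` are hypothetical. The aligned load and store intrinsics used here require 16-byte aligned addresses, unlike their `loadu`/`storeu` counterparts.

```c
#include <emmintrin.h>  /* SSE2; assumed to also provide _mm_malloc/_mm_free */
#include <stddef.h>
#include <stdint.h>

/* Allocates room for n 32-bit integers on a 16-byte boundary, rounding n up
   to a multiple of 4 and zero-filling the buffer. The padded element count
   is returned through padded_n. Release the buffer with _mm_free(). */
static int32_t *alloc_aligned_i32(size_t n, size_t *padded_n)
{
    size_t padded = (n + 3) & ~(size_t)3;
    int32_t *p = (int32_t *)_mm_malloc(padded * sizeof(int32_t), 16);
    if (p != NULL)
        for (size_t i = 0; i < padded; i++)
            p[i] = 0;
    *padded_n = padded;
    return p;
}

/* Adds 1 to every element with aligned loads and stores.
   v must be 16-byte aligned and padded_n a multiple of 4. */
static void add_one_aligned(int32_t *v, size_t padded_n)
{
    const __m128i ones = _mm_set1_epi32(1);
    for (size_t i = 0; i < padded_n; i += 4) {
        __m128i x = _mm_load_si128((const __m128i *)(v + i));
        _mm_store_si128((__m128i *)(v + i), _mm_add_epi32(x, ones));
    }
}
```

Padding the buffer to a whole number of vectors removes the scalar tail loop, at the cost of a few unused elements.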
  • 14. SIMD instructions … outside Oracle 12c ´ Intel API (C/C++) : Intel Intrinsics Guide https://software.intel.com/sites/landingpage/IntrinsicsGuide/ ´ Sample codes: https://app.box.com/simdSampleC-2015
  • 15. SIMD instructions … outside Oracle 12c
  • 16. Agenda ´ SIMD Instructions, outside Oracle 12c ´ What is a SIMD instruction ? ´ Will my application use SIMD ? ´ Raw Performance ´ SIMD Instructions, inside Oracle 12c ´ How SIMD instructions are used inside Oracle 12c ´ Tracing SIMD in Oracle 12c
  • 17. Will my application use SIMD registers and instructions ? ´ It depends on : ´ Hardware ´ Consult processor datasheets to see which instruction set extensions are supported (if any) ´ http://ark.intel.com/#@Processors ´ Hypervisor ´ Some (old) hypervisors do not support modern extensions ´ VirtualBox versions <5.0 don’t support SSE4, AVX and AVX2 ´ Hyper-V on W2008R2-SP1 needs a patch for specific processors to support AVX
  • 18. Will my application use SIMD registers and instructions ? ´ It depends on the Operating System ´ AVX (256 bits) is supported from ´ Linux Kernel >= 2.6.30 ´ Redhat EL5 : 2.6.18 ´ Oracle EL5 w/UEK : 2.6.32 ´ AVX needs xsave kernel parameter ´ Solaris 10 upd 10 and Solaris 11 ´ Windows 2008 R2 SP1
  • 19. Will my application use SIMD registers and instructions ? ´ It depends on the compiler ´ GCC ´ > 4.6 for AVX support ´ Use of specific switches (-msse2, -msse4.1, -msse4.2, -mavx, -mavx2 …) ´ Intel C/C++ Compiler (ICC) ´ > 11.1 for AVX support and > 13.0 for AVX2 support ´ Use of specific switches (-xsse4.2, -xavx, -xcore-avx2 …) ´ Beware of optimization switches (-O1, -O2, -O3) ´ More … disassemble (if you are allowed to :) ) ´ Registers ´ Assembler instructions
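One way to see what the hardware, hypervisor and operating system actually expose to a program is to ask at run time. The sketch below relies on the GCC/Clang built-ins `__builtin_cpu_init` and `__builtin_cpu_supports`, which exist on x86 targets only; the function name `best_simd_level` is made up for this example, and this is not how Oracle itself selects its libraries.

```c
#include <stdio.h>

/* Returns the name of the widest SIMD extension (among those tested here)
   that the running CPU reports. x86 only, GCC or Clang. */
static const char *best_simd_level(void)
{
    __builtin_cpu_init();   /* harmless if already initialised */
    if (__builtin_cpu_supports("avx2"))   return "AVX2";
    if (__builtin_cpu_supports("avx"))    return "AVX";
    if (__builtin_cpu_supports("sse4.2")) return "SSE4.2";
    if (__builtin_cpu_supports("sse2"))   return "SSE2";
    return "none";
}
```

A typical use is `printf("%s\n", best_simd_level());` inside a guest VM, to compare with what the host reports.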
  • 20. Agenda ´ SIMD Instructions, outside Oracle 12c ´ What is a SIMD instruction ? ´ Will my application use SIMD ? ´ Raw Performance ´ SIMD Instructions, inside Oracle 12c ´ How SIMD instructions are used inside Oracle 12c ´ Tracing SIMD in Oracle 12c
  • 21. Raw performance ´ Based on a C program ´ Used CPU: Haswell microarchitecture (Core i7-4960HQ). AVX/AVX2 enabled ´ 3 tests : No SIMD, SSE4, AVX ´ Input: one array containing 1 million values ´ Goal: add 1 to each value, with the pass over the 1 million values repeated 4k, 8k, 16k and 32k times ´ CPU Time(s) = f(#rows) ´ “Quick and Dirty” sample code available here: https://app.box.com/s/ibmnbblpho4xtbeq2x8ir60nrk37208v
  • 22. Raw performance ´ [Chart: RAW Performance (CPU) for SIMD Instructions, CPU time in seconds for 4096, 8192, 16384 and 32768 M. rows] ´ NO SIMD: 10.35, 20.46, 42.35, 85.64 ´ SSE4 (XMM Registers): 3.3, 6.81, 13.73, 25.58 ´ AVX (YMM Registers): 1.96, 3.51, 7.23, 15.15
  • 23. Agenda ´ SIMD Instructions, outside Oracle 12c ´ What is a SIMD instruction ? ´ Will my application use SIMD ? ´ Raw Performance ´ SIMD Instructions, inside Oracle 12c ´ How SIMD instructions are used inside Oracle 12c ´ Tracing SIMD in Oracle 12c
  • 24. SIMD instructions … inside Oracle 12c ´ In Memory Data Structure ´ In Memory Compression Unit : IMCU ´ IMCU is the unit of column store allocation ´ Target size is 1M rows (controlled by _inmemory_imcu_target_rows) ´ One IMCU can contain more than one column ´ Each column in one IMCU is a column unit (CU)
  • 25. SIMD instructions … inside Oracle 12c ´ In memory column store storage indexes ´ For each column unit, min and max values are maintained in a storage index ´ Storage Indexes provide CU pruning ´ Information about CU available in GV$IM_COL_CU (Undocumented. See BugID 19361690) ´ [Figure: IMCU pruning]
  • 26. SIMD instructions … inside Oracle 12c ´ The way your data is sorted matters for best IMCU pruning
  • 27. SIMD instructions … inside Oracle 12c ´ SIMD extensions are used with In Memory storage indexes for efficient filtering ´ 1. IM Storage Indexes do IMCU pruning ´ 2. SIMD instructions efficiently apply filter predicates ´ [Figure: IMCU pruning, then filtering with SIMD on a Prod-id column holding 10, 10, 14, 14, 10]
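The two steps above (prune on min/max, then filter with SIMD) can be sketched in C. This only illustrates the idea and is not Oracle's implementation: the `column_unit` structure and the `count_eq` function are hypothetical, and the filter uses plain SSE2 on uncompressed 32-bit values.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical column unit: the values plus the min/max that a storage
   index would keep for them. */
typedef struct {
    const int32_t *values;
    size_t n;
    int32_t min, max;
} column_unit;

/* Counts the rows of one column unit whose value equals key. */
static size_t count_eq(const column_unit *cu, int32_t key)
{
    /* Step 1, pruning: if key is outside [min, max], skip the whole unit. */
    if (key < cu->min || key > cu->max)
        return 0;

    /* Step 2, SIMD filtering: compare four values per instruction. */
    const __m128i k = _mm_set1_epi32(key);
    size_t hits = 0, i = 0;
    for (; i + 4 <= cu->n; i += 4) {
        __m128i x  = _mm_loadu_si128((const __m128i *)(cu->values + i));
        __m128i eq = _mm_cmpeq_epi32(x, k);                /* all-ones lane on match */
        int mask   = _mm_movemask_ps(_mm_castsi128_ps(eq)); /* one bit per lane */
        hits += (size_t)__builtin_popcount((unsigned)mask);
    }
    for (; i < cu->n; i++)   /* scalar tail */
        hits += (cu->values[i] == key);
    return hits;
}
```

With the Prod-id values from the figure (10, 10, 14, 14, 10; min 10, max 14), a predicate `prod_id = 20` is answered by the min/max test alone and never touches the values.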
  • 28. SIMD instructions … inside Oracle 12c ´ Oracle 12c uses specific libraries for SIMD (and compression) ´ Located in $ORACLE_HOME/lib ´ libshpksse4212.so for SSE4.2 extensions ´ Compiled with ICC v12 with the specific -xsse4.2 switch ´ libshpkavx12.so for AVX extensions ´ Compiled with ICC v12 with the specific -xavx switch ´ libshpkavx212.so for AVX2 extensions ´ Not yet fully implemented (only 8 functions implemented) ´ No ICC AVX2 switch used because ICC v12 doesn’t support AVX2 ´ Thanks Tanel Põder
  • 29. SIMD instructions … inside Oracle 12c ´ Oracle SIMD related functions ´ Located in kdzk kernel module (HPK) ´ Part of Advanced Compression library (ADVCMP) ´ Easily tracked with systemtap
  • 30. SIMD instructions … inside Oracle 12c ´ How does Oracle use SIMD extensions ? It depends on many parameters ´ OS Level : /proc/cpuinfo ´ AVX and AVX2 support ´ SSE4 support only
  • 31. SIMD instructions … inside Oracle 12c ´ Which library am I using ? ´ pmap ´ AVX support ´ SSE4 support
  • 32. SIMD instructions … inside Oracle 12c ´ Which compiler options have been used ? ´ Read the “.comment” section of the ELF file ´ Read the corresponding compiler documentation [oracle@oel7 conf]$ readelf -p .comment $ORACLE_HOME/lib/libshpkavx12.so | egrep -i 'intel|gcc' | egrep 'xavx|mavx' [ 2c] -?comment:Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.0 Build 20120731 …/… -DNTEV_USE_EPOLL -DNET_USE_LDAP -xavx
  • 33. SIMD instructions … inside Oracle 12c ´ How are SIMD registers used by Oracle ? ´ GDB ´ To get the call stack (backtrace) ´ To set breakpoints on interesting functions ´ To view register contents (traditional and SIMD) ´ “info registers” for traditional registers ´ “info all-registers” for all registers (SIMD registers included) ´ (gdb) print $ymmX.<format> ´ Format can be v8_float, v4_double, v32_int8, v16_int16, v8_int32, v4_int64, or v2_int128
  • 34. SIMD instructions … inside Oracle 12c ´ In red: register content that has been modified ´ In blue: the second half (128 bits) of the SIMD registers is empty
  • 35. SIMD instructions … inside Oracle 12c ´ Oracle IM can use AVX or SSE4 extensions for SIMD operations ´ When AVX is used, only 128 bits of the 256-bit wide registers are used ´ AVX adds new register state through the 256-bit wide YMM register file ´ Explicit operating system support is required to properly save and restore AVX's expanded registers between context switches ´ Without this, only AVX 128-bit is supported
  • 36. SIMD instructions … inside Oracle 12c ´The culprit ´ Oracle 12.1.0.2 is supported from EL5 onwards ´ EL5 Redhat Kernel is 2.6.18 and this flag (xsave) is supported from 2.6.30 kernels ´ For compatibility reasons, Oracle has to compile its code on 2.6.18 kernels
  • 37. SIMD instructions … inside Oracle 12c ´ Or maybe … ´ Oracle needs to work on packed values narrower than 32 bits
  • 38. Agenda ´ SIMD Instructions, outside Oracle 12c ´ What is a SIMD instruction ? ´ Will my application use SIMD ? ´ Raw Performance ´ SIMD Instructions, inside Oracle 12c ´ How SIMD instructions are used inside Oracle 12c ´ Tracing SIMD in Oracle 12c
  • 39. Tracing SIMD in Oracle 12c ´ Oradebug has 2 components related to IM
  • 40. Tracing SIMD in Oracle 12c ´ Interesting components to trace for SIMD and/or IMCU Pruning are : ´ IM_optimizer ´Gives information about CBO calculation related to IM ´ ADVCMP_DECOMP.* ´ADVCMP_DECOMP_HPK : SIMD functions ´ADVCMP_DECOMP_PCODE : Portable Code Machine (usually comparison functions and results)
  • 41. Tracing SIMD in Oracle 12c ´ IM_optimizer ´ Information available in trace file ´ IMCU Pruning ratio ´ CU decompression costing (per IMCU) ´ Predicate evaluation costing (per row) ´ Statement has to be parsed to get results
  • 42. Tracing SIMD in Oracle 12c select prod_id,cust_id,time_id from laurent.s_capa_high where amount_sold=20;
  • 43. Tracing SIMD in Oracle 12c ´ This information is available in CBO trace file (10053 or SQL_costing event)
  • 44. Tracing SIMD in Oracle 12c ´ ADVCMP_DECOMP ´ ADVCMP_DECOMP_HPK ´ Information is available in the trace file (for each IMCU processed) ´ Used library and function ´ Number of rows and counting algorithm ´ Processing rate (comparison and decompression if relevant) ´ But nothing on the results of the processing :(
  • 45. Tracing SIMD in Oracle 12c ´ ADVCMP_DECOMP ´ ADVCMP_DECOMP_HPK ´ Gives information about SIMD function usage and filtering (after IMCU pruning) ´ Example: inmemory table with NO MEMCOMPRESS or DML compression
  • 46. Tracing SIMD in Oracle 12c ´ ADVCMP_DECOMP ´ ADVCMP_DECOMP_HPK ´ Example: inmemory compressed table ´ SIMD instructions are used only in the kdzk_eq_dict functions
  • 47. Tracing SIMD in Oracle 12c ´ My thoughts about compression/decompression ´ NO MEMCOMPRESS / COMPRESS FOR DML ´ kdzk*dynp* functions (ex: kdzk_eq_dynp_16bit, kdzk_le_dynp_32bit etc.) ´ FOR QUERY LOW / QUERY HIGH ´ Dictionary Encoding (LZW ?) : kdzk_*dict* functions (ex: kdzk_eq_dict_7bit, kdzk_le_dict_4bit etc.) ´ Run Length Encoding: kdzk_burst_rle* functions (ex: kdzk_burst_rle_8bit, kdzk_burst_rle_16bit …) ´ Bit packing compression: kdzk*fixed* functions (ex: kdzk_ge_lt_fixed_32bit, kdzk_lt_fixed_8bit …)
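To make the dictionary encoding idea above concrete, here is a rough C sketch of an equality filter on a dictionary-encoded column. It is an assumption-laden illustration and not a reconstruction of the kdzk_*dict* functions: the `dict_column` layout, the 8-bit code width (at most 256 distinct values) and the `dict_count_eq` name are all invented for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dictionary-encoded column: each row stores an 8-bit code
   that indexes dict[]. Assumes dict_size <= 256. */
typedef struct {
    const int32_t *dict;
    size_t dict_size;
    const uint8_t *codes;
    size_t n;
} dict_column;

/* Counts rows whose decoded value equals key, without decoding any row. */
static size_t dict_count_eq(const dict_column *c, int32_t key)
{
    /* Translate the predicate value into its dictionary code once. */
    size_t code = 0;
    while (code < c->dict_size && c->dict[code] != key)
        code++;
    if (code == c->dict_size)
        return 0;   /* key is not in the dictionary: no row can match */

    /* Compare sixteen 8-bit codes per instruction. */
    const __m128i k = _mm_set1_epi8((char)code);
    size_t hits = 0, i = 0;
    for (; i + 16 <= c->n; i += 16) {
        __m128i x = _mm_loadu_si128((const __m128i *)(c->codes + i));
        int mask  = _mm_movemask_epi8(_mm_cmpeq_epi8(x, k)); /* one bit per code */
        hits += (size_t)__builtin_popcount((unsigned)mask);
    }
    for (; i < c->n; i++)   /* scalar tail */
        hits += (c->codes[i] == code);
    return hits;
}
```

The narrower the codes, the more rows fit in one SIMD register, which may be one reason the function names quoted above carry a bit width.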
  • 48. Tracing SIMD in Oracle 12c ´ My thoughts about compression/decompression ´ FOR CAPACITY LOW ´ FOR QUERY LOW + additional proprietary compression (OZIP) ´ Functions: ozip_decode_dict*, kdzk_ozip_decode* (Ex: kdzk_ozip_decode_dydi, ozip_decode_dict_9_bit etc.) ´ FOR CAPACITY HIGH ´ FOR QUERY HIGH + heavyweight compression algorithm ´ Compression/decompression method depends on: ´ Datatype ´ Column Compression Unit size ´ Column contents