Hudson Mendes

Lead Java Software Engineer @ AIQUDO

twitter.com/hudsonmendes

linkedin.com/in/hudsonmendes

medium.com/@hudsonmendes
SIMD (VECTORISATION)
SINGLE INSTRUCTION
MULTIPLE DATA
SOUJAVA & BELFASTJUG
WHAT IS AN IMAGE FOR NEURAL NETS / AI?
▸ WHAT ARE IMAGES FOR NEURAL NETS / AI?
▸ WHY DOES IT MATTER FOR US / FOR JAVA?
▸ CERN’S COLT, JNI AND PANAMA
SIMD, SINGLE INSTRUCTION MULTIPLE DATA (VECTORISATION)
WHAT IS AN IMAGE FOR NEURAL NETS / AI?
Hudson Mendes, SouJava & Belfast JUG
[Figure: an image shown as three stacked matrices of pixel intensities labelled RED, GREEN and BLUE; sample rows read 255 255 255 244 230 … and 230 210 200 130 130 …]
▸ Humans: colours & other "features"
▸ AI: numbers
▸ Which numbers? RGB matrices
[Figure: the RED, GREEN and BLUE matrices unrolled one after another into a single column vector: 255, 255, 230, 230, 210, 200, 130, 130, 130, 120, … 89, 60, 65, 50, 20, 20, 20, … 20, 0, 12, 12, 0]
▸ 64 x 64 x 3 = 12,288
▸ 1 Image = 1 Feature Vector
[Figure: a 64 x 64 pixel image split into RED, GREEN and BLUE matrices, then unrolled into one column vector of 64 x 64 x 3 = 12,288 values]
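A minimal sketch of the flattening step, assuming each colour plane arrives as an `int[][]` and that the vector stores all red values first, then green, then blue, each plane read row by row. The class name `ImageFlattener` and that ordering are illustrative assumptions, not something the slides specify.

```java
// Hypothetical helper: turns three 2-D colour planes into one feature vector.
final class ImageFlattener {

    /** Returns red values, then green, then blue, each plane read row by row (assumed order). */
    static double[] flatten(int[][] red, int[][] green, int[][] blue) {
        int height = red.length;
        int width = red[0].length;
        double[] features = new double[height * width * 3];
        int next = 0;
        for (int[][] plane : new int[][][] {red, green, blue}) {
            for (int row = 0; row < height; row++) {
                for (int col = 0; col < width; col++) {
                    features[next++] = plane[row][col];
                }
            }
        }
        return features;
    }

    public static void main(String[] args) {
        int[][] plane = new int[64][64];
        double[] features = flatten(plane, plane, plane);
        System.out.println(features.length); // 64 * 64 * 3 = 12288
    }
}
```

Any fixed ordering works, as long as every image is flattened the same way.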
▸ 1000 Images = 1000 Feature Vectors
64X64 IMAGE = ARRAY OF 12,288
1024X1024 IMAGE = ARRAY OF 3,145,728
CNN, or Convolutional Neural Nets
σ(X * W.T + b)
σ(x) => 1 / (1 + Math.pow(Math.E, -x))
X => feature vector of the image
X, W, b: not numbers, but LARGE VECTORS
=> Vectorisation <=
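As a rough illustration of why this formula is expensive without vectorisation, here is a plain-loop sketch of a weighted sum passed through a sigmoid. The class `DenseLayer`, the method names and the array layout (one row of weights per output) are assumptions made for this example.

```java
// Hypothetical scalar implementation of sigmoid(X * W.T + b).
final class DenseLayer {

    /** Logistic sigmoid: 1 / (1 + e^-z). */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /**
     * Computes sigmoid(x . w[j] + b[j]) for every output j.
     * x holds n features, w holds one row of n weights per output, b one bias per output.
     */
    static double[] forward(double[] x, double[][] w, double[] b) {
        double[] out = new double[w.length];
        for (int j = 0; j < w.length; j++) {
            double z = b[j];
            for (int i = 0; i < x.length; i++) {
                z += x[i] * w[j][i]; // one multiply-add per element: the scalar way
            }
            out[j] = sigmoid(z);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] out = forward(new double[] {1, 2}, new double[][] {{0, 0}}, new double[] {0});
        System.out.println(out[0]); // sigmoid(0) = 0.5
    }
}
```

With a 12,288-element feature vector, the inner loop runs 12,288 times per output, which is the work SIMD aims to batch.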
SIMD
Single Instruction, Multiple Data
How much faster?
EXAMPLE WITH NUMPY
[Chart: elapsed time in ms (0 to 500), SiSD vs SiMD, for inputs of 500 x 5, 5000 x 5 and 50000 x 5]
WELL, I DON’T DO AI OR NNS. DOES IT MATTER TO ME?
SIMD
Single “Instruction”?
Add, Subtract, Multiply, Divide
but also Log, Exp, Sqrt, etc
SIMD
Multiple “Data”?
Numerical data: Int, Long, Float, Double, etc
SISD IMPLEMENTATIONS: SUM STREAM OF DOUBLES
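The code shown on this slide did not survive extraction. The sketch below is a stand-in, not the original: two scalar ways of summing doubles, a plain loop and `DoubleStream.sum()`. The class name `SisdSum` is hypothetical.

```java
import java.util.stream.DoubleStream;

// Hypothetical stand-in for the lost slide code: scalar summation of doubles.
final class SisdSum {

    /** Adds one element per loop iteration. */
    static double sumLoop(double[] values) {
        double total = 0.0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    /** Same result through the streams API; still processed element by element in source terms. */
    static double sumStream(double[] values) {
        return DoubleStream.of(values).sum();
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.5};
        System.out.println(sumLoop(values));   // 6.5
        System.out.println(sumStream(values)); // 6.5
    }
}
```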
SISD IMPLEMENTATIONS: ADDING ELEMENTS OF 2 ARRAYS
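The slide's code image is missing from this transcript, so the following is only an assumed equivalent: an element-wise addition of two arrays written as an ordinary loop. `SisdAdd` is a made-up name.

```java
// Hypothetical stand-in for the lost slide code: element-wise addition, one pair at a time.
final class SisdAdd {

    static double[] add(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] + b[i]; // one addition per iteration
        }
        return result;
    }

    public static void main(String[] args) {
        double[] sum = add(new double[] {1, 2, 3}, new double[] {10, 20, 30});
        System.out.println(java.util.Arrays.toString(sum)); // [11.0, 22.0, 33.0]
    }
}
```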
SISD IMPLEMENTATIONS: EUCLIDEAN DISTANCE BETWEEN DOUBLES IN 2 ARRAYS
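The original listing is not in the transcript. A plausible scalar version, offered as an assumption rather than the speaker's code, squares each difference, accumulates, and takes one square root at the end. `SisdDistance` is a hypothetical name.

```java
// Hypothetical stand-in for the lost slide code: scalar Euclidean distance.
final class SisdDistance {

    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sumOfSquares += diff * diff; // subtract, multiply, add for every element
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        System.out.println(euclidean(new double[] {0, 0}, new double[] {3, 4})); // 5.0
    }
}
```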
SISD IMPLEMENTATIONS: LOGS OF MATRICES
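Again the slide's code is absent here. Assuming "logs of matrices" means the natural logarithm of every cell, a scalar sketch could look like this (`SisdLog` is an invented name):

```java
// Hypothetical stand-in for the lost slide code: natural log of every matrix cell.
final class SisdLog {

    static double[][] log(double[][] matrix) {
        double[][] result = new double[matrix.length][];
        for (int row = 0; row < matrix.length; row++) {
            result[row] = new double[matrix[row].length];
            for (int col = 0; col < matrix[row].length; col++) {
                result[row][col] = Math.log(matrix[row][col]); // one call per cell
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] logs = log(new double[][] {{1.0, Math.E}});
        System.out.println(logs[0][0]); // 0.0
    }
}
```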
SISD (OR SINGLE INSTRUCTION SINGLE DATA) IN BYTECODE
▸ Given a vector of size N
▸ Each element N[i] receives its own instruction (Single Instruction, Single Data)
▸ SiSD: about 2n operations > SiMD: about n operations
[Chart: elapsed time in ms (0 to 500), SiSD vs SiMD, for inputs of 500 x 5, 1000 x 5, 5000 x 5, 10000 x 5, 50000 x 5 and 100000 x 5]
HOW TO DO SIMD IN JAVA THEN?
LET’S ASK THE DATA SCIENTISTS?
CLASSIC SCIENTIFIC COMPUTING REFERENCES POINT TO…
CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
1 GFLOPS = 1,000,000,000 floating-point operations per second
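To make the unit concrete, here is a small worked conversion from an operation count and an elapsed time to GFLOPS. The class and method names are hypothetical, and the figures in `main` are made-up inputs, not measurements from the talk.

```java
// Hypothetical helper: converts a measured run into GFLOPS.
final class Gflops {

    /** Operations per nanosecond equals billions of operations per second. */
    static double gflops(long floatingPointOps, long elapsedNanos) {
        return (double) floatingPointOps / elapsedNanos;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 2 billion operations in 1 second.
        System.out.println(gflops(2_000_000_000L, 1_000_000_000L)); // 2.0
    }
}
```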
That looks pretty fast!
CERN’s Colt, Java SIMD Library?
Not Faster than Serial, but WHY?
Digging Source Code:
Not SIMD!
CERN’s Colt, Java SIMD Library? Fast, but not SIMD
DEEP LEARNING DATA SCIENTISTS SAY…
DEEP LEARNING DATA SCIENTIST: DEEPLEARNING4J
Faster than all of THEM!
Digging Source Code:
Yes, SIMD! Done with JNI
SO, IS JNI (JAVA NATIVE INTERFACE) THE ONLY WAY TO DO SIMD?
WAYS TO DO SIMD?
JAVA
1 + 2 = 3
BYTECODE
iconst_1
iconst_2
iadd
ASSEMBLER
MOV AX,@DATA
MOV DS,AX
MOV AX,OPR1
MOV BX,OPR2
CLC
ADD AX,BX
MOV DI,OFFSET RESULT
MOV [DI], AX
MOV AH,09H
MOV DX,OFFSET RESULT
INT 21H
MOV AH,4CH
INT 21H
END
JAVA
VECTOR.SUM()
BYTECODE (hypothetical: the JVM defines no vector instructions such as these)
ivector_1
vector_add
ASSEMBLER
EXPORT XCORR_KERNEL
xcorr_kernel PROC
VMOV.I32 q0, #0
CMP r3, #0
BLE xcorr_kernel_done
VLD1.16 {d3}, [r2]!
SUBS r3, r3, #4
BLE xcorr_kernel_process4_done
(…)
SO, YES - AT THE MINUTE JNI IS PRETTY MUCH THE ONLY WAY…
IS THERE ANYTHING BETTER THAN JNI IN THE NEAR FUTURE?
YES!
JAVA9+ SUPERWORD
Source http://prestodb.rocks/code/simd/
"Exploiting Superword Level Parallelism with Multimedia Instruction Sets", LARSEN Samuel and AMARASINGHE Saman, from MIT
HTTP://GROUPS.CSAIL.MIT.EDU/CAG/SLP/SLP-PLDI-2000.PDF
ON BY DEFAULT ON J9+
FOR SUPERWORD, MUST NOT HAVE:
• AN OR CONDITION AS THE LOOP CONDITION
• A NON-INLINED METHOD INSIDE THE LOOP
• AN ARBITRARY METHOD AS THE LOOP CONDITION
• MANUAL UNROLLING OF THE LOOP
• A LONG AS THE LOOP VARIABLE
• MULTIPLE EXIT POINTS
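A loop shaped to respect the restrictions listed above might look like the sketch below. Whether HotSpot actually emits SIMD instructions for it depends on the JVM version, the CPU and the JIT's own decisions, so treat this as a candidate for auto-vectorisation rather than a guarantee. The class name `SuperwordFriendly` is invented for the example.

```java
// Hypothetical example of a loop that avoids the patterns listed above.
final class SuperwordFriendly {

    /** Scales every element in place. */
    static void scale(float[] data, float factor) {
        // int loop variable, one simple bound check, no breaks or early returns
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i] * factor; // straight-line arithmetic, no method calls
        }
    }

    public static void main(String[] args) {
        float[] data = {1f, 2f, 3f};
        scale(data, 2f);
        System.out.println(java.util.Arrays.toString(data)); // [2.0, 4.0, 6.0]
    }
}
```

One way to check the effect is to benchmark the same method with HotSpot's `-XX:-UseSuperWord` flag, which switches the optimisation off, and compare timings.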
PROJECT PANAMA
Source http://openjdk.java.net/projects/panama/
BETTER NATIVE API THAN JNI
THANK YOU!
JMH CODE AVAILABLE AT HTTPS://GITHUB.COM/HUDSONMENDES/BELFASTJUG-SAMPLE-3
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 

Belfast JUG, SIMD (Vectorial) Operations

  • 1. Hudson Mendes
 Lead Java Software Engineer @ AIQUDO
 twitter.com/hudsonmendes
 linkedin.com/in/hudsonmendes
 medium.com/@hudsonmendes SIMD (VECTORISATION) SINGLE INSTRUCTION MULTIPLE DATA SOUJAVA & BELFASTJUG
  • 2. WHAT IS AN IMAGE FOR NEURAL NETS / AI?
 ▸ WHAT ARE IMAGES FOR NEURAL NETS / AI?
 ▸ WHY DOES IT MATTER FOR US / FOR JAVA?
 ▸ CERN’S COLT, JNI AND PANAMA
  • 3. SIMD, SINGLE INSTRUCTION MULTIPLE DATA (VECTORISATION) WHAT IS AN IMAGE FOR NEURAL NETS / AI? Hudson Mendes, SouJava & Belfast JUG
  • 4. WHAT IS AN IMAGE FOR NEURAL NETS / AI?
  • 5. WHAT IS AN IMAGE FOR NEURAL NETS / AI?
 [Figure: an image decomposed into three matrices of pixel values, labelled RED, GREEN and BLUE; the values shown range from 0 to 255]
  • 6. WHAT IS AN IMAGE FOR NEURAL NETS / AI?
 ▸ Humans: colours & other "features"
 ▸ AI: numbers
 ▸ Which numbers? RGB matrices
 [Figure: the RED, GREEN and BLUE matrices unrolled into a single vector: 255 255 230 230 210 200 130 130 130 120 … 89 60 65 50 20 20 20 … 20 0 12 12 0]
  • 7. WHAT IS AN IMAGE FOR NEURAL NETS / AI?
 ▸ Humans: colours & other "features"
 ▸ AI: numbers
 ▸ Which numbers? RGB matrices
 ▸ 64 x 64 x 3 = 12,288
 ▸ 1 Image = 1 Feature Vector
 [Figure: a 64 x 64 pixel image whose RED, GREEN and BLUE matrices are unrolled into one feature vector of 64 x 64 x 3 = 12,288 values]
  • 8. WHAT IS AN IMAGE FOR NEURAL NETS / AI?
 ▸ Humans: colours & other "features"
 ▸ AI: numbers
 ▸ Which numbers? RGB matrices
 ▸ 64 x 64 x 3 = 12,288
 ▸ 1000 Images = 1000 Feature Vectors
 [Figure: a 64 x 64 pixel image whose RED, GREEN and BLUE matrices are unrolled into one feature vector of 64 x 64 x 3 = 12,288 values]
  • 9. 64X64 IMAGE = ARRAY OF 12,288
 
 1024X1024 =
 ARRAY OF 3,145,728
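
The arithmetic on the slides above can be checked with a minimal Java sketch. It assumes the three channel matrices are unrolled channel by channel (all reds, then all greens, then all blues); the slides do not state the ordering, and the class and method names are hypothetical.

```java
/** Hypothetical helper, not from the talk: unrolls RGB matrices into one feature vector. */
final class ImageFlattener {

    /** Number of values in the feature vector of a width x height image with 3 channels. */
    static int featureLength(int width, int height) {
        return width * height * 3;
    }

    /**
     * Concatenates the red, green and blue matrices (each height x width)
     * row by row into a single array. The channel-by-channel ordering is an
     * assumption made for this sketch.
     */
    static int[] flatten(int[][] red, int[][] green, int[][] blue) {
        int height = red.length;
        int width = red[0].length;
        int[] features = new int[featureLength(width, height)];
        int next = 0;
        for (int[][] channel : new int[][][] {red, green, blue}) {
            for (int row = 0; row < height; row++) {
                for (int col = 0; col < width; col++) {
                    features[next++] = channel[row][col];
                }
            }
        }
        return features;
    }

    public static void main(String[] args) {
        System.out.println(featureLength(64, 64));     // 12288
        System.out.println(featureLength(1024, 1024)); // 3145728
    }
}
```

Every extra pixel adds three values to the vector, which is why the array grows from 12,288 to 3,145,728 entries between the two image sizes.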
  • 10. CNN, or Convolutional Neural Nets
 ς(X * W.T + b)
 ς(x) => 1 / (1 + Math.pow(Math.E, -x))
  • 11. CNN, or Convolutional Neural Nets
 ς(X * W.T + b)
 X => feature vector of the image
  • 12. CNN, or Convolutional Neural Nets
 ς(X * W.T + b)
 X, W, b: not numbers, but vectors
  • 13. CNN, or Convolutional Neural Nets
 ς(X * W.T + b)
 X, W, b: LARGE VECTORS
  • 14. CNN, or Convolutional Neural Nets
 => Vectorisation <=
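
As a rough illustration of why this expression calls for vectorisation, here is a minimal scalar sketch of ς(X * W.T + b) for a single unit. It assumes one feature vector `x`, one weight vector `w` of the same length and a scalar bias `b`; the slides treat W and b as larger structures, and the names here are hypothetical.

```java
/** Hypothetical sketch, not from the talk: one sigmoid unit computed element by element. */
final class SigmoidUnit {

    /** Logistic sigmoid: 1 / (1 + e^(-z)). */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Computes sigmoid(x . w + b), visiting one pair of elements per loop iteration. */
    static double activate(double[] x, double[] w, double b) {
        if (x.length != w.length) {
            throw new IllegalArgumentException("x and w must have the same length");
        }
        double z = b;
        for (int i = 0; i < x.length; i++) {
            z += x[i] * w[i]; // one multiply and one add per element
        }
        return sigmoid(z);
    }
}
```

For a 64 x 64 image the loop body runs 12,288 times per unit, once for every element of the feature vector.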
  • 15. SIMD
 Single Instruction, Multiple Data
  • 16. SIMD
 How much faster?
  • 17. EXAMPLE WITH NUMPY
 [Chart: elapsed time in ms (axis from 0 to 500), SiSD vs SiMD, for inputs of 500 x 5, 5000 x 5 and 50000 x 5]
  • 18. WELL, I DON’T DO AI OR NNS. DOES IT MATTER TO ME?
 ▸ WHAT ARE IMAGES FOR NEURAL NETS / AI?
 ▸ WHY DOES IT MATTER FOR US / FOR JAVA?
 ▸ CERN’S COLT, JNI AND PANAMA
  • 19. SIMD
 Single "Instruction"?
  • 20. SIMD
 Single "Instruction"?
 Add, Subtract, Multiply, Divide
 but also Log, Exp, Sqrt, etc.
  • 21. SIMD
 Multiple "Data"?
  • 22. SIMD
 Multiple "Data"?
 Numerical data:
 Int, Long, Float, Double, etc.
  • 23. SISD IMPLEMENTATIONS: SUM STREAM OF DOUBLES
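
A minimal sketch of what a SISD sum of doubles might look like, shown both as a plain loop and with the standard `DoubleStream` API. This is an illustration under the assumption of a `double[]` input, not the code from the talk; the class and method names are hypothetical.

```java
/** Hypothetical sketch, not from the talk: summing doubles one element at a time. */
final class SisdSum {

    /** Adds the values one by one with a plain loop. */
    static double sumLoop(double[] values) {
        double total = 0.0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    /** Same result expressed with the standard stream API. */
    static double sumStream(double[] values) {
        return java.util.Arrays.stream(values).sum();
    }
}
```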
  • 24. SISD IMPLEMENTATIONS: ADDING ELEMENTS OF 2 ARRAYS
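
A minimal sketch of an element-wise addition of two arrays written the SISD way, assuming both inputs are `double[]` of equal length. It is an illustration only, with hypothetical names, and not the code from the talk.

```java
/** Hypothetical sketch, not from the talk: element-wise addition, one pair per iteration. */
final class SisdArrayAdd {

    /** Returns a new array where result[i] = a[i] + b[i]. */
    static double[] add(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] + b[i]; // one addition per pair of elements
        }
        return result;
    }
}
```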
  • 25. SISD IMPLEMENTATIONS: EUCLIDEAN DISTANCE BETWEEN DOUBLES IN 2 ARRAYS
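
A minimal sketch of a SISD Euclidean distance, assuming the two arrays hold the coordinates of two points with the same number of dimensions. The names are hypothetical and this is not the code from the talk.

```java
/** Hypothetical sketch, not from the talk: Euclidean distance computed element by element. */
final class SisdEuclidean {

    /** Returns sqrt(sum over i of (a[i] - b[i])^2). */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```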
  • 26. SISD IMPLEMENTATIONS: LOGS OF MATRICES
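
A minimal sketch of an element-wise natural logarithm over a matrix, assuming the matrix is stored as a `double[][]`. The names are hypothetical and this is not the code from the talk.

```java
/** Hypothetical sketch, not from the talk: natural log applied to each matrix cell in turn. */
final class SisdMatrixLog {

    /** Returns a new matrix where result[r][c] = ln(matrix[r][c]). */
    static double[][] log(double[][] matrix) {
        double[][] result = new double[matrix.length][];
        for (int row = 0; row < matrix.length; row++) {
            result[row] = new double[matrix[row].length];
            for (int col = 0; col < matrix[row].length; col++) {
                result[row][col] = Math.log(matrix[row][col]); // one call per cell
            }
        }
        return result;
    }
}
```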
  • 27. SISD (OR SINGLE INSTRUCTION SINGLE DATA) IN BYTECODE
  • 28. SISD (OR SINGLE INSTRUCTION SINGLE DATA) IN BYTECODE
 ▸ Given a vector of size N
 ▸ Each element N[i] receives its own instruction (Single Instruction, Single Data)
 ▸ SiSd O(n) = 2n x > SiMd O(n) = n
 [Chart: elapsed time in ms (axis from 0 to 500), SiSD vs SiMD, for inputs of 500 x 5, 1000 x 5, 5000 x 5, 10000 x 5, 50000 x 5 and 100000 x 5]
  • 29. HOW TO DO SIMD IN JAVA THEN?
 ▸ WHAT ARE IMAGES FOR NEURAL NETS / AI?
 ▸ WHY DOES IT MATTER FOR US / FOR JAVA?
 ▸ CERN’S COLT, JNI AND PANAMA
  • 30. LET’S ASK THE DATA SCIENTISTS?
  • 31. CLASSIC SCIENTIFIC COMPUTING REFERENCES POINT TO…
  • 32. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
 1 GFLOPS = 1,000,000,000 floating-point operations per second
  • 33. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
 1 GFLOPS = 1,000,000,000 floating-point operations per second
 That looks pretty fast!
  • 34. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
 1 GFLOPS = 1,000,000,000 floating-point operations per second
 That looks pretty fast!
 CERN’s Colt, Java SIMD Library?
  • 35. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
  • 36. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
  • 37. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
 Not Faster than Serial, but WHY?
  • 38. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
  • 39. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
 Digging Source Code: Not SIMD!
  • 40. CLASSIC SCIENTIFIC COMPUTING: CERN’S COLT
 1 GFLOPS = 1,000,000,000 floating-point operations per second
 CERN’s Colt, Java SIMD Library?
 Fast, but not SIMD
  • 41. DEEP LEARNING DATA SCIENTISTS SAY…
  • 42. DEEP LEARNING DATA SCIENTIST: DEEPLEARNING4J
  • 43. DEEP LEARNING DATA SCIENTIST: DEEPLEARNING4J
  • 44. DEEP LEARNING DATA SCIENTIST: DEEPLEARNING4J
 Faster than all of THEM!
  • 45. DEEP LEARNING DATA SCIENTIST: DEEPLEARNING4J
 Digging Source Code:
  • 46. DEEP LEARNING DATA SCIENTIST: DEEPLEARNING4J
 Digging Source Code:
  • 47. DEEP LEARNING DATA SCIENTIST: DEEPLEARNING4J
 Digging Source Code: Yes, SIMD! Done with JNI
  • 48. SO, IS JNI (JAVA NATIVE INTERFACE) THE ONLY WAY TO DO SIMD?
  • 49. WAYS TO DO SIMD?
 JAVA: 1 + 2 = 3
 BYTECODE: iconst_1 iconst_2 iadd
 ASSEMBLER: MOV AX,@DATA MOV DS,AX MOV AX,OPR1 MOV BX,OPR2 CLC ADD AX,BX MOV DI,OFFSET RESULT MOV [DI], AX MOV AH,09H MOV DX,OFFSET RESULT INT 21H MOV AH,4CH INT 21H END
  • 50. WAYS TO DO SIMD?
 JAVA: 1 + 2 = 3
 BYTECODE: iconst_1 iconst_2 iadd
 ASSEMBLER: MOV AX,@DATA MOV DS,AX MOV AX,OPR1 MOV BX,OPR2 CLC ADD AX,BX MOV DI,OFFSET RESULT MOV [DI], AX MOV AH,09H MOV DX,OFFSET RESULT INT 21H MOV AH,4CH INT 21H END
 JAVA: VECTOR.SUM()
 BYTECODE: ivector_1 vector_add
 ASSEMBLER: EXPORT XCORR_KERNEL xcorr_kernel PROC VMOV.I32 q0, #0 CMP r3, #0 BLE xcorr_kernel_done VLD1.16 {d3}, [r2]! SUBS r3, r3, #4 BLE xcorr_kernel_process4_done (…)
  • 51. WAYS TO DO SIMD?
 JAVA: VECTOR.SUM()
 BYTECODE: ivector_1 vector_add
 ASSEMBLER: EXPORT XCORR_KERNEL xcorr_kernel PROC VMOV.I32 q0, #0 CMP r3, #0 BLE xcorr_kernel_done VLD1.16 {d3}, [r2]! SUBS r3, r3, #4 BLE xcorr_kernel_process4_done (…)
 SO, YES - AT THE MINUTE JNI IS PRETTY MUCH THE ONLY WAY…
  • 52. IS THERE ANYTHING BETTER THAN JNI IN THE NEAR FUTURE?
  • 53. IS THERE ANYTHING BETTER THAN JNI IN THE NEAR FUTURE?
 YES!
  • 54. JAVA 9+ SUPERWORD
 Source http://prestodb.rocks/code/simd/
  • 55. JAVA 9+ SUPERWORD
 "Exploiting Superword Level Parallelism with Multimedia Instruction Sets", LARSEN Samuel and AMARASINGHE Saman, from MIT
 HTTP://GROUPS.CSAIL.MIT.EDU/CAG/SLP/SLP-PLDI-2000.PDF
  • 56. JAVA 9+ SUPERWORD
 FOR SUPERWORD, MUST NOT HAVE:
 ▸ AN OR CONDITION AS THE LOOP CONDITION
 ▸ A NON-INLINED METHOD INSIDE THE LOOP
 ▸ AN ARBITRARY METHOD AS THE LOOP CONDITION
 ▸ MANUAL UNROLLING OF THE LOOP
 ▸ A LONG AS THE LOOP VARIABLE
 ▸ MULTIPLE EXIT POINTS
  • 57. JAVA 9+ SUPERWORD
 ON BY DEFAULT ON JAVA 9+
 FOR SUPERWORD, MUST NOT HAVE:
 ▸ AN OR CONDITION AS THE LOOP CONDITION
 ▸ A NON-INLINED METHOD INSIDE THE LOOP
 ▸ AN ARBITRARY METHOD AS THE LOOP CONDITION
 ▸ MANUAL UNROLLING OF THE LOOP
 ▸ A LONG AS THE LOOP VARIABLE
 ▸ MULTIPLE EXIT POINTS
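
The restrictions listed above can be read as a description of a loop shape. A minimal sketch of a loop that respects all six is given below; whether the JIT actually emits SIMD instructions for it depends on the JVM version, its flags and the CPU, none of which is verified here. The class and method names are hypothetical.

```java
/** Hypothetical example, not from the talk or its sources: a loop shaped to avoid the listed pitfalls. */
final class SuperwordFriendlyLoop {

    /** Writes a[i] + b[i] into out[i] for every index. */
    static void addInto(float[] a, float[] b, float[] out) {
        if (a.length != out.length || b.length != out.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int n = out.length;            // plain bound, no method call in the loop condition
        for (int i = 0; i < n; i++) {  // int counter, a single condition, a single exit point
            out[i] = a[i] + b[i];      // straight-line body: no method calls, no manual unrolling
        }
    }
}
```

The length check sits before the loop so that the loop itself keeps a single condition and a single exit.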
  • 58. PROJECT PANAMA
 Source http://openjdk.java.net/projects/panama/
  • 59. PROJECT PANAMA
 BETTER NATIVE API THAN JNI
  • 60. Hudson Mendes
 Lead Java Software Engineer @ AIQUDO
 twitter.com/hudsonmendes
 linkedin.com/in/hudsonmendes
 medium.com/@hudsonmendes
 THANK YOU! JMH CODE AVAILABLE AT
 HTTPS://GITHUB.COM/HUDSONMENDES/BELFASTJUG-SAMPLE-3