SlideShare a Scribd company logo
OverOps JVM Performance
Magic Tricks
Adam Feldscher
CS I699 Performance Design Patterns II
Reordering Optimization
StringBuilder sb = new StringBuilder("Ingredients: ");
for (int i = 0; i < ingredients.length; i++) {
if (i > 0) {
sb.append(", ");
}
sb.append(ingredients[i]);
}
return sb.toString();
StringBuilder sb = new StringBuilder("Ingredients: ");
if (ingredients.length > 0) {
sb.append(ingredients[0]);
for (int i = 1; i < ingredients.length; i++) {
sb.append(", ");
sb.append(ingredients[i]);
}
}
return sb.toString();
Original Optimized
• Removed branch within loop
Null Check
public static String l33tify(String phrase) {
if (phrase == null) {
throw new IllegalArgumentException("phrase mus
t not be null");
}
return phrase.replace('e', '3');
}
public static String l33tify(String phrase) {
return phrase.replace('e', '3');
}
Original Optimized
• Removed null check, after many iterations of no nulls
• Removes a branch
• If a null shows up, a segfault will catch it. JVM will decompile the code and throw the
exception
Polymorphic Method Dispatch
public interface Song {
void sing();
}
public class GangnamStyle implements Song {
@Override
public void sing() {
System.out.println("Oppan gangnam style!");
}
}
public class Baby implements Song {
@Override
public void sing() {
System.out.println("And I was like baby, baby, baby, oh
");
}
}
public class Main {
public static void perform(Song s) {
s.sing();
}
}
public static void perform(Song s) {
if (s fastnativeinstanceof GangnamStyle) {
System.out.println("Oppan gangnam style!");
} else {
s.sing();
}
}
Original Optimized
• If GangnamStyle is shown to be
passed in 95% of the time
• Inline virtual call
Making the obvious code fast
By Jack Mott
The Test
 32 million 64bit floating point values
 Calculate the sum of their squares
C – 17 milliseconds
double sum = 0.0;
for (int i = 0; i < COUNT; i++)
{
double v = values[i] * values[i];
sum += v;
}
C – SIMD – 17 milliseconds
__m256d vsum = _mm256_setzero_pd();
for(int i = 0; i < COUNT/4; i=i+1) {
__m256d v = values[i];
vsum = _mm256_add_pd(vsum,_mm256_mul_pd(v,v));
}
double *tsum = &vsum;
double sum = tsum[0]+tsum[1]+tsum[2]+tsum[3];
C Compiler converted base version to
SIMD
double sum = 0.0;
for (int i = 0; i < COUNT; i++) {
00007FF7085C1120 vmovupd ymm0,ymmword ptr [rcx]
00007FF7085C1124 lea rcx,[rcx+40h]
double v = values[i] * values[i]; //square em
00007FF7085C1128 vmulpd ymm2,ymm0,ymm0
00007FF7085C112C vmovupd ymm0,ymmword ptr [rcx-20h]
00007FF7085C1131 vaddpd ymm4,ymm2,ymm4
00007FF7085C1135 vmulpd ymm2,ymm0,ymm0
00007FF7085C1139 vaddpd ymm3,ymm2,ymm5
00007FF7085C113D vmovupd ymm5,ymm3
00007FF7085C1141 sub rdx,1
00007FF7085C1145 jne imperative+80h (07FF7085C1120h)
sum += v;
}
C#
var sum = values.Sum(x => x * x);
Linq Select Sum - 260 milliseconds
Linq Aggregate - 280 milliseconds
var sum = values.Aggregate(0.0,(acc, x) => acc + x * x);
for loop - 34 milliseconds
double sum = 0.0;
foreach (var v in values)
{
double square = v * v;
sum += square;
}
JIT decides its not worth it to do SIMD
C# SIMD Explicit - 17 milliseconds
Vector<double> vsum = new Vector<double>(0.0);
for (int i = 0; i < COUNT; i += Vector<double>.Count)
{
var value = new Vector<double>(values, i);
vsum = vsum + (value * value);
}
double sum = 0;
for(int i = 0; i < Vector<double>.Count;i++)
{
sum += vsum[i];
}
Java
double sum = Arrays.stream(values)
.map(x -> x*x)
.sum();
Streams Map Sum 138 milliseconds
Streams Reduce 34 milliseconds
double sum = Arrays.stream(values)
.reduce(0,(acc,x) -> acc+x*x);
Java Streams Map Reduce 34 milliseconds
double sum = Arrays.stream(values)
.map(x -> x*x)
.reduce(0,(acc,x) -> acc+x);
.sum is doing something fancy,
use reduce instead
Java won’t do SIMD
JavaScript
var sum = values.map(x => x*x)
.reduce( (total,num,index,array)
=> total+num,0.0);
map reduce (node.js) 10,000ms
reduce (node.js) 800 and then 300 milliseconds
var sum = values.reduce( (total,num,index,array)
=> total+num*num,0.0)
foreach (node.js) 800 and then 300 milliseconds
var sum = 0.0;
array.forEach( (element,index,array)
=> sum += element*element )
JIT kicks in after 3 or 4 iterations, drops to 300
imperative (node.js) 37 milliseconds
var sum = 0.0;
for (var i = 0; i < values.length;i++){
var x = values[i];
sum += x*x;
}

More Related Content

What's hot

Extend GraphQL with directives
Extend GraphQL with directivesExtend GraphQL with directives
Extend GraphQL with directives
Greg Bergé
 
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
OdessaJS Conf
 
Microsoft Word Hw#1
Microsoft Word   Hw#1Microsoft Word   Hw#1
Microsoft Word Hw#1
kkkseld
 
Formatted Console I/O Operations in C++
Formatted Console I/O Operations in C++Formatted Console I/O Operations in C++
Formatted Console I/O Operations in C++
sujathavvv
 
PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...
Andrey Karpov
 
How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJIT
Egor Bogatov
 
Whats new in_csharp4
Whats new in_csharp4Whats new in_csharp4
Whats new in_csharp4Abed Bukhari
 
Oops lab manual
Oops lab manualOops lab manual
Oops lab manual
Vivek Kumar Sinha
 
C++ TUTORIAL 5
C++ TUTORIAL 5C++ TUTORIAL 5
C++ TUTORIAL 5
Farhan Ab Rahman
 
Multi qubit entanglement
Multi qubit entanglementMulti qubit entanglement
Multi qubit entanglement
Vijayananda Mohire
 
C++ Programs
C++ ProgramsC++ Programs
C++ Programs
NarayanlalMenariya
 
OPL best practices - Doing more with less easier
OPL best practices - Doing more with less easierOPL best practices - Doing more with less easier
OPL best practices - Doing more with less easier
Alex Fleischer
 
A gremlin in my graph confoo 2014
A gremlin in my graph confoo 2014A gremlin in my graph confoo 2014
A gremlin in my graph confoo 2014
Damien Seguy
 
LLVM Backend の紹介
LLVM Backend の紹介LLVM Backend の紹介
LLVM Backend の紹介
Akira Maruoka
 
Macro and Preprocessor in c programming
 Macro and  Preprocessor in c programming Macro and  Preprocessor in c programming
Macro and Preprocessor in c programming
ProfSonaliGholveDoif
 
C++ TUTORIAL 1
C++ TUTORIAL 1C++ TUTORIAL 1
C++ TUTORIAL 1
Farhan Ab Rahman
 
Lyntale: MS Code Contracts
Lyntale: MS Code ContractsLyntale: MS Code Contracts
Lyntale: MS Code Contracts
Einar Høst
 
Bank management system project in c++ with graphics
Bank management system project in c++ with graphicsBank management system project in c++ with graphics
Bank management system project in c++ with graphics
Vtech Academy of Computers
 

What's hot (20)

Extend GraphQL with directives
Extend GraphQL with directivesExtend GraphQL with directives
Extend GraphQL with directives
 
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
Timur Shemsedinov "Пишу на колбеках, а что... (Асинхронное программирование)"
 
Microsoft Word Hw#1
Microsoft Word   Hw#1Microsoft Word   Hw#1
Microsoft Word Hw#1
 
Lecture 4
Lecture 4Lecture 4
Lecture 4
 
Formatted Console I/O Operations in C++
Formatted Console I/O Operations in C++Formatted Console I/O Operations in C++
Formatted Console I/O Operations in C++
 
Oop1
Oop1Oop1
Oop1
 
PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...
 
How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJIT
 
Whats new in_csharp4
Whats new in_csharp4Whats new in_csharp4
Whats new in_csharp4
 
Oops lab manual
Oops lab manualOops lab manual
Oops lab manual
 
C++ TUTORIAL 5
C++ TUTORIAL 5C++ TUTORIAL 5
C++ TUTORIAL 5
 
Multi qubit entanglement
Multi qubit entanglementMulti qubit entanglement
Multi qubit entanglement
 
C++ Programs
C++ ProgramsC++ Programs
C++ Programs
 
OPL best practices - Doing more with less easier
OPL best practices - Doing more with less easierOPL best practices - Doing more with less easier
OPL best practices - Doing more with less easier
 
A gremlin in my graph confoo 2014
A gremlin in my graph confoo 2014A gremlin in my graph confoo 2014
A gremlin in my graph confoo 2014
 
LLVM Backend の紹介
LLVM Backend の紹介LLVM Backend の紹介
LLVM Backend の紹介
 
Macro and Preprocessor in c programming
 Macro and  Preprocessor in c programming Macro and  Preprocessor in c programming
Macro and Preprocessor in c programming
 
C++ TUTORIAL 1
C++ TUTORIAL 1C++ TUTORIAL 1
C++ TUTORIAL 1
 
Lyntale: MS Code Contracts
Lyntale: MS Code ContractsLyntale: MS Code Contracts
Lyntale: MS Code Contracts
 
Bank management system project in c++ with graphics
Bank management system project in c++ with graphicsBank management system project in c++ with graphics
Bank management system project in c++ with graphics
 

Similar to Java JIT Optimization Research

Operator overloading2
Operator overloading2Operator overloading2
Operator overloading2
zindadili
 
Pads lab manual final
Pads lab manual finalPads lab manual final
Pads lab manual final
AhalyaR
 
Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реа...
Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реа...Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реа...
Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реа...
Provectus
 
C++ lectures all chapters in one slide.pptx
C++ lectures all chapters in one slide.pptxC++ lectures all chapters in one slide.pptx
C++ lectures all chapters in one slide.pptx
ssuser3cbb4c
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
Andrey Karpov
 
OOP 2012 - Hint: Dynamic allocation in c++
OOP 2012 - Hint: Dynamic allocation in c++OOP 2012 - Hint: Dynamic allocation in c++
OOP 2012 - Hint: Dynamic allocation in c++Allan Sun
 
Lecture2.ppt
Lecture2.pptLecture2.ppt
Lecture2.ppt
TarekHemdan3
 
C-Sharp Arithmatic Expression Calculator
C-Sharp Arithmatic Expression CalculatorC-Sharp Arithmatic Expression Calculator
C-Sharp Arithmatic Expression CalculatorNeeraj Kaushik
 
Rich and Snappy Apps (No Scaling Required)
Rich and Snappy Apps (No Scaling Required)Rich and Snappy Apps (No Scaling Required)
Rich and Snappy Apps (No Scaling Required)
Thomas Fuchs
 
Imugi: Compiler made with Python
Imugi: Compiler made with PythonImugi: Compiler made with Python
Imugi: Compiler made with Python
Han Lee
 
CBSE Class XI Programming in C++
CBSE Class XI Programming in C++CBSE Class XI Programming in C++
CBSE Class XI Programming in C++
Pranav Ghildiyal
 
Translate the following CC++ code into MIPS Assembly Codevoid ch.pdf
Translate the following CC++ code into MIPS Assembly Codevoid ch.pdfTranslate the following CC++ code into MIPS Assembly Codevoid ch.pdf
Translate the following CC++ code into MIPS Assembly Codevoid ch.pdf
fcsondhiindia
 
oodp elab.pdf
oodp elab.pdfoodp elab.pdf
oodp elab.pdf
SWATIKUMARIRA2111030
 
Final DAA_prints.pdf
Final DAA_prints.pdfFinal DAA_prints.pdf
Final DAA_prints.pdf
Yashpatel821746
 
C aptitude scribd
C aptitude scribdC aptitude scribd
C aptitude scribdAmit Kapoor
 
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
DevGAMM Conference
 
Oops lab manual2
Oops lab manual2Oops lab manual2
Oops lab manual2
Mouna Guru
 

Similar to Java JIT Optimization Research (20)

Operator overloading2
Operator overloading2Operator overloading2
Operator overloading2
 
Pads lab manual final
Pads lab manual finalPads lab manual final
Pads lab manual final
 
C++ file
C++ fileC++ file
C++ file
 
C++ file
C++ fileC++ file
C++ file
 
Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реа...
Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реа...Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реа...
Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реа...
 
C++ lectures all chapters in one slide.pptx
C++ lectures all chapters in one slide.pptxC++ lectures all chapters in one slide.pptx
C++ lectures all chapters in one slide.pptx
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
 
OOP 2012 - Hint: Dynamic allocation in c++
OOP 2012 - Hint: Dynamic allocation in c++OOP 2012 - Hint: Dynamic allocation in c++
OOP 2012 - Hint: Dynamic allocation in c++
 
Lecture2.ppt
Lecture2.pptLecture2.ppt
Lecture2.ppt
 
C-Sharp Arithmatic Expression Calculator
C-Sharp Arithmatic Expression CalculatorC-Sharp Arithmatic Expression Calculator
C-Sharp Arithmatic Expression Calculator
 
Rich and Snappy Apps (No Scaling Required)
Rich and Snappy Apps (No Scaling Required)Rich and Snappy Apps (No Scaling Required)
Rich and Snappy Apps (No Scaling Required)
 
Imugi: Compiler made with Python
Imugi: Compiler made with PythonImugi: Compiler made with Python
Imugi: Compiler made with Python
 
CBSE Class XI Programming in C++
CBSE Class XI Programming in C++CBSE Class XI Programming in C++
CBSE Class XI Programming in C++
 
Translate the following CC++ code into MIPS Assembly Codevoid ch.pdf
Translate the following CC++ code into MIPS Assembly Codevoid ch.pdfTranslate the following CC++ code into MIPS Assembly Codevoid ch.pdf
Translate the following CC++ code into MIPS Assembly Codevoid ch.pdf
 
oodp elab.pdf
oodp elab.pdfoodp elab.pdf
oodp elab.pdf
 
Final DAA_prints.pdf
Final DAA_prints.pdfFinal DAA_prints.pdf
Final DAA_prints.pdf
 
7720
77207720
7720
 
C aptitude scribd
C aptitude scribdC aptitude scribd
C aptitude scribd
 
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
Самые вкусные баги из игрового кода: как ошибаются наши коллеги-программисты ...
 
Oops lab manual2
Oops lab manual2Oops lab manual2
Oops lab manual2
 

More from Adam Feldscher

Java JIT Performance Testing and Results
Java JIT Performance Testing and ResultsJava JIT Performance Testing and Results
Java JIT Performance Testing and Results
Adam Feldscher
 
Optimizing Java Notes
Optimizing Java NotesOptimizing Java Notes
Optimizing Java Notes
Adam Feldscher
 
Java JIT Improvements Research
Java JIT Improvements ResearchJava JIT Improvements Research
Java JIT Improvements Research
Adam Feldscher
 
C++ & Java JIT Optimizations: Finding Prime Numbers
C++ & Java JIT Optimizations: Finding Prime NumbersC++ & Java JIT Optimizations: Finding Prime Numbers
C++ & Java JIT Optimizations: Finding Prime Numbers
Adam Feldscher
 
C vs Java: Finding Prime Numbers
C vs Java: Finding Prime NumbersC vs Java: Finding Prime Numbers
C vs Java: Finding Prime Numbers
Adam Feldscher
 
Paper summary
Paper summaryPaper summary
Paper summary
Adam Feldscher
 
Optimizing Java
Optimizing JavaOptimizing Java
Optimizing Java
Adam Feldscher
 
Performance Design Patterns 3
Performance Design Patterns 3Performance Design Patterns 3
Performance Design Patterns 3
Adam Feldscher
 

More from Adam Feldscher (8)

Java JIT Performance Testing and Results
Java JIT Performance Testing and ResultsJava JIT Performance Testing and Results
Java JIT Performance Testing and Results
 
Optimizing Java Notes
Optimizing Java NotesOptimizing Java Notes
Optimizing Java Notes
 
Java JIT Improvements Research
Java JIT Improvements ResearchJava JIT Improvements Research
Java JIT Improvements Research
 
C++ & Java JIT Optimizations: Finding Prime Numbers
C++ & Java JIT Optimizations: Finding Prime NumbersC++ & Java JIT Optimizations: Finding Prime Numbers
C++ & Java JIT Optimizations: Finding Prime Numbers
 
C vs Java: Finding Prime Numbers
C vs Java: Finding Prime NumbersC vs Java: Finding Prime Numbers
C vs Java: Finding Prime Numbers
 
Paper summary
Paper summaryPaper summary
Paper summary
 
Optimizing Java
Optimizing JavaOptimizing Java
Optimizing Java
 
Performance Design Patterns 3
Performance Design Patterns 3Performance Design Patterns 3
Performance Design Patterns 3
 

Recently uploaded

CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBCSTRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
kimdan468
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
JosvitaDsouza2
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
Scholarhat
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
Best Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDABest Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDA
deeptiverma2406
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
Nguyen Thanh Tu Collection
 

Recently uploaded (20)

CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBCSTRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Best Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDABest Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDA
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
 

Java JIT Optimization Research

  • 1. OverOps JVM Performance Magic Tricks Adam Feldscher CS I699 Performance Design Patterns II
  • 2. Reordering Optimization StringBuilder sb = new StringBuilder("Ingredients: "); for (int i = 0; i < ingredients.length; i++) { if (i > 0) { sb.append(", "); } sb.append(ingredients[i]); } return sb.toString(); StringBuilder sb = new StringBuilder("Ingredients: "); if (ingredients.length > 0) { sb.append(ingredients[0]); for (int i = 1; i < ingredients.length; i++) { sb.append(", "); sb.append(ingredients[i]); } } return sb.toString(); Original Optimized • Removed branch within loop
  • 3. Null Check public static String l33tify(String phrase) { if (phrase == null) { throw new IllegalArgumentException("phrase mus t not be null"); } return phrase.replace('e', '3'); } public static String l33tify(String phrase) { return phrase.replace('e', '3'); } Original Optimized • Removed null check, after many iterations of no nulls • Removes a branch • If a null shows up, a segfault will catch it. JVM will decompile the code and throw the exception
  • 4. Polymorphic Method Dispatch public interface Song { void sing(); } public class GangnamStyle implements Song { @Override public void sing() { System.out.println("Oppan gangnam style!"); } } public class Baby implements Song { @Override public void sing() { System.out.println("And I was like baby, baby, baby, oh "); } } public class Main { public static void perform(Song s) { s.sing(); } } public static void perform(Song s) { if (s fastnativeinstanceof GangnamStyle) { System.out.println("Oppan gangnam style!"); } else { s.sing(); } } Original Optimized • If GangnamStyle is shown to be passed in 95% of the time • Inline virtual call
  • 5. Making the obvious code fast By Jack Mott
  • 6. The Test  32 million 64bit floating point values  Calculate the sum of their squares
  • 7. C – 17 milliseconds double sum = 0.0; for (int i = 0; i < COUNT; i++) { double v = values[i] * values[i]; sum += v; }
  • 8. C – SIMD – 17 milliseconds __m256d vsum = _mm256_setzero_pd(); for(int i = 0; i < COUNT/4; i=i+1) { __m256d v = values[i]; vsum = _mm256_add_pd(vsum,_mm256_mul_pd(v,v)); } double *tsum = &vsum; double sum = tsum[0]+tsum[1]+tsum[2]+tsum[3];
  • 9. C Compiler converted base version to SIMD double sum = 0.0; for (int i = 0; i < COUNT; i++) { 00007FF7085C1120 vmovupd ymm0,ymmword ptr [rcx] 00007FF7085C1124 lea rcx,[rcx+40h] double v = values[i] * values[i]; //square em 00007FF7085C1128 vmulpd ymm2,ymm0,ymm0 00007FF7085C112C vmovupd ymm0,ymmword ptr [rcx-20h] 00007FF7085C1131 vaddpd ymm4,ymm2,ymm4 00007FF7085C1135 vmulpd ymm2,ymm0,ymm0 00007FF7085C1139 vaddpd ymm3,ymm2,ymm5 00007FF7085C113D vmovupd ymm5,ymm3 00007FF7085C1141 sub rdx,1 00007FF7085C1145 jne imperative+80h (07FF7085C1120h) sum += v; }
  • 10. C# var sum = values.Sum(x => x * x); Linq Select Sum - 260 milliseconds Linq Aggregate - 280 milliseconds var sum = values.Aggregate(0.0,(acc, x) => acc + x * x); for loop - 34 milliseconds double sum = 0.0; foreach (var v in values) { double square = v * v; sum += square; } JIT decides its not worth it to do SIMD C# SIMD Explicit - 17 milliseconds Vector<double> vsum = new Vector<double>(0.0); for (int i = 0; i < COUNT; i += Vector<double>.Count) { var value = new Vector<double>(values, i); vsum = vsum + (value * value); } double sum = 0; for(int i = 0; i < Vector<double>.Count;i++) { sum += vsum[i]; }
  • 11. Java double sum = Arrays.stream(values) .map(x -> x*x) .sum(); Streams Map Sum 138 milliseconds Streams Reduce 34 milliseconds double sum = Arrays.stream(values) .reduce(0,(acc,x) -> acc+x*x); Java Streams Map Reduce 34 milliseconds double sum = Arrays.stream(values) .map(x -> x*x) .reduce(0,(acc,x) -> acc+x); .sum is doing something fancy, use reduce instead Java won’t do SIMD
  • 12. JavaScript var sum = values.map(x => x*x) .reduce( (total,num,index,array) => total+num,0.0); map reduce (node.js) 10,000ms reduce (node.js) 800 and then 300 milliseconds var sum = values.reduce( (total,num,index,array) => total+num*num,0.0) foreach (node.js) 800 and then 300 milliseconds var sum = 0.0; array.forEach( (element,index,array) => sum += element*element ) JIT kicks in after 3 or 4 iterations, drops to 300 imperative (node.js) 37 milliseconds var sum = 0.0; for (var i = 0; i < values.length;i++){ var x = values[i]; sum += x*x; }