Presentation Outline
• Project Goals
• Tools for Benchmarking:
• Performance counters, PAPI,
• HPC Toolkit, Phoronix Test Suite,
• Power Measurement
• How testing was accomplished
• List of additional data points for application to processor affinity
• A simple continuation of Ryan’s test work

• Results from Memory/Cache Interference Testing for multiple applications
run simultaneously pinned to specific cores
Project Goals
• Benchmarking Processors
– Monitor both performance counters and the system's power
usage
– Gather more data on application affinity, i.e. how well an
application performs on a particular processor architecture
• Memory Intensive Applications
• CPU Intensive Applications
– Analyze the Interaction/Interference of multiple applications run
simultaneously on different cores of the same processor
• This data collection is intermediate work for future unspecified
projects
Performance Counters and PAPI
• Performance counters
– Counters built into processor hardware that record the number
of occurrences of user specified events in hardware
• PAPI – Performance Application Programming Interface
– PAPI was developed in the hope of identifying bottlenecks in
current architectural development of high performance
computing
– A standardized list of performance counters available for most
processors
– PAPI makes it easier to have consistent tests across multiple
processor architectures
What do the Performance Counter
Measurements mean?
• The meaning depends on which counters are being monitored.
Examples:
– PAPI_L1_DCA - Level 1 data cache accesses
– PAPI_FAD_INS - Floating point add instructions

– PAPI_L2_DCM - Level 2 data cache misses
• The raw count data provided by the Performance Counter will need
to be meaningfully interpreted by the user
Matching Performance counters to Processor
Architectures
• Performance counters used for these tests:
– PAPI_TOT_INS – Total instructions executed
– PAPI_L2_TCM – Level 2 total cache misses (data and instruction)
• These should be available on most processor architectures
• Future inclusion of other tests may require other Performance
Counters, but available Performance Counters vary greatly between
processor architectures…
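As a minimal sketch of how the two raw counts above might be combined into something easier to compare across runs, the snippet below normalizes L2 misses by instruction count (misses per thousand instructions). The function name and the sample numbers are illustrative assumptions, not values or code from these tests.

```python
def misses_per_kilo_instruction(l2_misses: int, total_instructions: int) -> float:
    """Return L2 cache misses per 1,000 executed instructions."""
    if total_instructions <= 0:
        raise ValueError("total_instructions must be positive")
    return 1000.0 * l2_misses / total_instructions


# Hypothetical raw counts, not measurements from this presentation.
sample_l2_misses = 54_000           # would come from PAPI_L2_TCM
sample_instructions = 27_000_000    # would come from PAPI_TOT_INS

print(misses_per_kilo_instruction(sample_l2_misses, sample_instructions))
```

Normalizing by instruction count is one possible choice; it keeps a long-running program from looking worse than a short one simply because it executed more instructions.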
HPC Toolkit
• “An Integrated suite of tools for measurement and analysis of
program performance”
• Essentially
– HPC Toolkit makes it easier to interface with the local machine's
performance counters
– Makes collecting program performance data easier
Phoronix Test Suite
• Phoronix provides many test applications capable of testing many
aspects of processor performance

– Phoronix tests are responsible for all of the benchmarking data
gathered for this presentation
• However, many other groups also write application suites useful for
benchmarking
– SPEC CPU2000 / 2006
– PARSEC
• Several resources such as “OpenBenchmarking.org” provide a
substantial amount of results from tests run from these suites on
many processor architectures
– This could prove to be a useful resource, but the results do not
include information about power usage
Applications used for testing cross-core cache interference
• C-Ray
– A Ray Tracing Program
– CPU Intensive
– Many Floating Point Calculation Operations
– Relatively Little Memory Access

• Ramspeed
– Integer and Floating Point Writes and Reads to memory
– Memory Intensive
– More interaction with the caches
Monitoring Power Usage
• “Watts Up? PRO” power meter
– Measures power consumption from a single standard power
outlet
– Has a USB port to interface with a computer and dump recorded
power measurements
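The charts later in the deck label their power series as energy. A minimal sketch of how a log of power samples could be turned into an energy figure is shown below. It assumes evenly spaced samples in watts and a 1-second sampling interval; the meter's actual log format and interval are not described in the slides, so both are assumptions.

```python
from typing import Sequence


def energy_joules(power_watts: Sequence[float], interval_s: float = 1.0) -> float:
    """Integrate evenly spaced power samples with the trapezoidal rule."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if len(power_watts) < 2:
        return 0.0  # a single sample spans no time, so no energy is accumulated
    total = 0.0
    for previous, current in zip(power_watts, power_watts[1:]):
        total += 0.5 * (previous + current) * interval_s
    return total


# Hypothetical samples (watts), one per second; not data from these tests.
samples = [30.0, 32.0, 31.0, 30.0]
print(energy_joules(samples))
```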
How tests were run
• Minimalist Ubuntu Operating System allows the processor's
attention to be dedicated to the test applications
– Terminal Based User Interface
– Unnecessary background processes not included in the
operating system
• Power usage and selected performance counters are recorded and
saved while the various test applications are run.
• For Testing Interference between programs:
– “taskset” was used to pin the applications to specific processor
cores
– The applications were run concurrently, while performance
counter results were measured
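The pinning step can be sketched as follows. This is an illustration rather than the scripts used for these tests: the helper names and the two program paths are placeholders, and only the `taskset -c <cpu-list> <command>` form of the tool is relied on.

```python
import shlex
import subprocess
from typing import List, Sequence


def taskset_command(core: int, argv: Sequence[str]) -> List[str]:
    """Build a command line that runs argv restricted to one CPU core."""
    if core < 0:
        raise ValueError("core index must be non-negative")
    if not argv:
        raise ValueError("argv must name a program to run")
    # "taskset -c <cpu-list> <command>" sets the CPU affinity of the command.
    return ["taskset", "-c", str(core), *argv]


def run_concurrently(commands: Sequence[Sequence[str]]) -> List[int]:
    """Start every command at once, wait for all of them, return exit codes."""
    processes = [subprocess.Popen(list(command)) for command in commands]
    return [process.wait() for process in processes]


# Placeholder program names; substitute the real benchmark launch commands.
cpu_job = taskset_command(0, ["./cpu_benchmark"])
memory_job = taskset_command(1, ["./memory_benchmark"])
print(shlex.join(cpu_job))
print(shlex.join(memory_job))
# run_concurrently([cpu_job, memory_job])  # needs taskset and both programs
```

Building the command lists separately from launching them makes it possible to log exactly what was run alongside the counter and power data.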
Measuring Memory Interference
between Applications
• How this is tested:
– Simultaneously pin different types of applications to run only on specific cores in the processor
– Then use performance counters and the power meter to measure the interference
• Interference could be defined as:
– An increase in application execution time
– An increase in the number of cache misses
– Possibly an increase in power consumption
• Test plan:
– Tests were run:
• First on an AMD Turion II Dual-Core M520 processor (2 cores, 5 P-states)
• Later also on an Intel Pentium Dual Core CPU (2 cores, 4 P-states)
– Run control tests with each application running alone (pinned to a single core)
– Run the tests together and analyze the differences
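One way to make the three candidate definitions of interference concrete is to express each as a relative increase over the control run. The sketch below does that; the `RunResult` type, its field names, and the sample numbers are hypothetical, and the slides do not state the units of the recorded values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RunResult:
    """Measurements from one run of one application (units as recorded)."""
    execution_time: float
    l2_cache_misses: float
    energy: float


def relative_increase(control: float, interfered: float) -> float:
    """Fractional change from the control value (0.10 means 10% higher)."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return (interfered - control) / control


def interference(control: RunResult, interfered: RunResult) -> Dict[str, float]:
    """Compare a co-scheduled run against the run-alone control."""
    return {
        "execution_time": relative_increase(control.execution_time, interfered.execution_time),
        "l2_cache_misses": relative_increase(control.l2_cache_misses, interfered.l2_cache_misses),
        "energy": relative_increase(control.energy, interfered.energy),
    }


# Hypothetical numbers, not results from these tests.
alone = RunResult(execution_time=100.0, l2_cache_misses=50_000, energy=2_000.0)
together = RunResult(execution_time=110.0, l2_cache_misses=60_000, energy=2_500.0)
for metric, change in interference(alone, together).items():
    print(f"{metric}: {change:+.1%}")
```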
Control Results:
Intel Pentium dual CPU T2330
[Charts, x-axis ticks 0 to 3:]
• C-Ray L2 Cache Miss Control Results (series: CPU Control Test)
• C-Ray Execution Time Control Results (series: CPU Control Test)
• C-Ray Power Usage Control Results (series: CPU Control Energy)
• Ramspeed L2 Cache Miss Control Results (series: Memory Control Test)
• Ramspeed Execution Time Control Results (series: Memory Control Test)
• Ramspeed Power Usage Control Results (series: Memory Control Energy)
Control Results:
AMD Turion II Dual Core Mobile M520
[Charts, x-axis ticks 0 to 4:]
• C-Ray Execution Time Control Results (series: CPU Control Test)
• C-Ray L2 Cache Miss Control Results (series: CPU Control Test)
• C-Ray Power Usage Control Results (series: CPU Control Energy)
• Ramspeed Execution Time Control Results (series: Memory Control Test)
• Ramspeed L2 Cache Miss Control Results (series: Memory Control Test)
• Ramspeed Power Usage Control Results (series: Memory Control Energy)
Taking a Closer Look at the AMD
Control Results from the previous slide:
• It seems suspect that the control test should produce the same execution time across all P-states.
This result for the C-Ray execution control test was consistent over multiple runs on the AMD Turion II
processor, but a test run on a second processor, an Intel Pentium Dual Core, produced results that were
closer to what seems realistic:

[Charts:]
• C-Ray Execution Time (AMD First Run), x-axis ticks 0 to 4 (series: Control Test, Interference Test)
• C-Ray Execution Time (AMD Second Run), x-axis ticks 0 to 4 (series: CPU Control Test, CPU Interference Test)
• C-Ray Execution Time (Intel Run), x-axis ticks 0 to 3 (series: CPU Control Test, CPU Interference Test)
Interference Results
(Joint Pinning Results on C-Ray):
Intel Pentium dual CPU T2330
• The third column of data represents adjusted interference results

[Charts, x-axis ticks 0 to 3:]
• C-Ray Execution Time Interference (Ramspeed test on second core)
(series: CPU Control Test, Original CPU Interference Test, Adjusted CPU Interference Test)
• C-Ray L2 Cache Misses Interference (Ramspeed test on second core)
(series: CPU Control Test, Original CPU Interference Test, Adjusted CPU Interference Test)
• Power usage for C-Ray and Ramspeed tests run together
(series: CPU Control Energy, 1 CPU and 1 Memory Interference Test Energy)
Interference Results
(Joint Pinning Results on Ramspeed):
Intel Pentium dual CPU T2330

[Charts, x-axis ticks 0 to 3:]
• Ramspeed Execution Time Interference (C-Ray test on second core)
(series: Memory Control Test, Memory Interference Test)
• Ramspeed L2 Cache Misses Interference (C-Ray test on second core)
(series: Memory Control Test, Memory Interference Test)
Interference Results
(2 CPU Intensive Application Pinning Results):
Intel Pentium dual CPU T2330

[Charts, x-axis ticks 0 to 3:]
• C-Ray Execution Time Interference (C-Ray test on second core)
(series: CPU Control Test, two CPU Interference Test series)
• C-Ray L2 Cache Misses Interference (C-Ray test on second core)
(series: CPU Control Test, two CPU Interference Test series)
• Power usage for 2 C-Ray tests running on separate cores
(series: CPU Control Energy, 2 CPU Interference Test Energy)
Interference Results
(2 Memory Intensive Application
Pinning Results):
Intel Pentium dual CPU T2330

[Charts, x-axis ticks 0 to 3:]
• Ramspeed Execution Time Interference (Ramspeed test on second core)
(series: Memory Control Test, two Memory Interference Test series)
• Ramspeed L2 Cache Misses Interference (Ramspeed test on second core)
(series: Memory Control Test, two Memory Interference Test series)
• Power usage for 2 Ramspeed tests running on separate cores
(series: Memory Control Energy, 2 Memory Interference Test Energy)
Interference between
simultaneous applications:
Future Tests

The foundation scripts have been written, so in the future it will
be easy to add support for:
– Testing interference of 1 type of application pinned to N cores on a
processor with more than 2 cores
– Testing interference from 2 CPU-intensive or 2 memory-intensive test
applications
– Measuring memory interference with M applications mapped to N cores
(N > 2)
– Testing a larger sample size, which might produce more interesting results
– Finding which application-to-core mappings provide the best
performance for specific architectures/cache sizes
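For the application-to-core mapping search, a first step would be to enumerate the candidate mappings to test. The sketch below lists every way to give each application its own core; the function name and the 3-core example are assumptions for illustration, not part of the existing scripts.

```python
from itertools import permutations
from typing import Dict, List, Sequence


def one_to_one_mappings(apps: Sequence[str], cores: Sequence[int]) -> List[Dict[str, int]]:
    """List every assignment that gives each application its own core."""
    if len(apps) > len(cores):
        raise ValueError("need at least as many cores as applications")
    # Each ordered choice of len(apps) distinct cores is one mapping.
    return [dict(zip(apps, chosen)) for chosen in permutations(cores, len(apps))]


# Hypothetical 3-core processor with the two test applications from this deck.
for mapping in one_to_one_mappings(["c-ray", "ramspeed"], [0, 1, 2]):
    print(mapping)
```

The number of mappings grows quickly with the core count, so on larger processors it may be worth grouping cores that share a cache and testing one representative mapping per group.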
Presentation Outline
• Project Goals
• Tools for Benchmarking:
• Performance counters, PAPI,
• HPC Toolkit, Phoronix Test Suite,
• Power Measurement
• How testing was accomplished
• List of additional data points for application to processor affinity
• A simple continuation of Ryan’s test work

• Results from Interference Testing for applications pinned to specific cores
Thank You For Your Attention

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 

Daniel Dauwe ECE 561 Benchmarking Results

  • 1.
  • 2. Presentation Outline • Project Goals • Tools for Benchmarking: • Performance counters, PAPI • HPC Toolkit, Phoronix Test Suite • Power Measurement • How testing was accomplished • List of additional data points for application to processor affinity • A simple continuation of Ryan’s test work • Results from Memory/Cache Interference Testing for multiple applications run simultaneously pinned to specific cores
  • 3. Project Goals • Benchmarking Processors – Monitor both performance counters and the system's power usage – Gathering more data for looking at application affinity for performance on a particular processor architecture • Memory Intensive Applications • CPU Intensive Applications – Analyze the Interaction/Interference of multiple applications run simultaneously on different cores of the same processor • This data collection is intermediate work for future unspecified projects
  • 4. Performance Counters and PAPI • Performance counters – Counters built into processor hardware that record the number of occurrences of user specified events in hardware • PAPI – Performance Application Programming Interface – PAPI was developed in the hope of identifying bottlenecks in current architectural development of high performance computing – A standardized list of performance counters available for most processors – PAPI makes it easier to have consistent tests across multiple processor architectures
  • 5. What do the Performance Counter Measurements mean? • Can mean different things based on which counters are being monitored Ex: – PAPI_L1_DCA - Level 1 data cache accesses – PAPI_FAD_INS - Floating point add instructions – PAPI_L2_DCM - Level 2 data cache misses • The raw count data provided by the Performance Counter will need to be meaningfully interpreted by the user
  • 6. Matching Performance Counters to Processor Architectures • Performance Counters used for these tests: – PAPI_TOT_INS – Total Instructions Executed – PAPI_L2_TCM – Data and Instruction Level 2 Cache Misses • These should be available on nearly all processor architectures • Future inclusion of other tests may require other Performance Counters, but available Performance Counters vary greatly between processor architectures…
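The two counters named on this slide can be combined into one normalized figure. The sketch below computes L2 misses per thousand instructions (MPKI), a common way to compare cache behaviour across runs of different length. The function name and the sample counts are hypothetical; they are not measurements from the presentation.

```python
def misses_per_kilo_instruction(l2_misses: int, total_instructions: int) -> float:
    """Return L2 cache misses per 1,000 executed instructions (MPKI)."""
    if total_instructions <= 0:
        raise ValueError("total_instructions must be positive")
    return 1000.0 * l2_misses / total_instructions


# Hypothetical raw counts keyed by PAPI preset name (illustrative values only).
sample = {"PAPI_TOT_INS": 2_000_000, "PAPI_L2_TCM": 5_000}

mpki = misses_per_kilo_instruction(sample["PAPI_L2_TCM"], sample["PAPI_TOT_INS"])
print(f"L2 MPKI: {mpki:.2f}")
```

Normalizing by instruction count keeps a long-running benchmark from looking worse than a short one simply because it executed more instructions.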
  • 7. HPC Toolkit • “An Integrated suite of tools for measurement and analysis of program performance” • Essentially – HPC Toolkit makes it easier to interface with the local machine's performance counters – Makes collecting program performance data easier
  • 8. Phoronix Test Suite • Phoronix provides many test applications capable of testing many aspects of processor performance – Phoronix tests are responsible for all of the benchmarking data gathered for this presentation • However, many other groups write application suites useful for benchmarking – SPEC CPU2000 / 2006 – PARSEC • Several resources such as “OpenBenchmarking.org” provide a substantial amount of results from tests run from these suites on many processor architectures – This could prove to be a useful resource, but the published results do not include information about power usage
  • 9. Applications used for testing CrossCore cache interference • C-Ray – A Ray Tracing Program – CPU Intensive – Many Floating Point Calculation Operations – Relatively Little Memory Access • Ramspeed – Integer and Floating Point Writes and Reads to memory – Memory Intensive – More interaction with the caches
  • 10. Monitoring Power Usage • “Watts Up? PRO” power meter – Measures power consumption from a single standard power outlet – Has a USB port to interface with a computer and dump recorded power measurements
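The later result slides report energy, while the meter records power. One plausible way to get from one to the other is to integrate the power samples over time. This is a minimal sketch, assuming evenly spaced readings in watts; the sampling interval, function name, and sample values are assumptions, not details from the presentation.

```python
from typing import Sequence


def energy_joules(power_watts: Sequence[float], interval_s: float = 1.0) -> float:
    """Integrate evenly spaced power samples (W) with the trapezoidal rule.

    interval_s is the assumed time between consecutive samples, in seconds.
    """
    if len(power_watts) < 2:
        return 0.0  # a single sample spans no time, so no energy is accumulated
    total = 0.0
    for first, second in zip(power_watts, power_watts[1:]):
        total += 0.5 * (first + second) * interval_s
    return total


# Hypothetical readings taken one second apart (illustrative values only).
readings = [30.0, 32.0, 34.0, 32.0]
print(energy_joules(readings))
```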
  • 11. How tests were run • Minimalist Ubuntu Operating System allows the processor's attention to be dedicated to the test applications – Terminal Based User Interface – Unnecessary background processes not included in the operating system • Power usage and selected performance counters are recorded and saved while the various test applications are run. • For Testing Interference between programs: – “taskset” was used to pin the applications to specific processor cores – The applications were run concurrently, while performance counter results were measured
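The pinning step can be sketched as follows. The code prefixes each command with `taskset -c <core>` and starts the commands together. The binary paths `./c-ray` and `./ramspeed` and both helper names are hypothetical; the presentation's actual scripts are not shown, so this is only one way the step might look.

```python
import subprocess
from typing import List, Sequence


def pinned_command(core: int, command: Sequence[str]) -> List[str]:
    """Prefix a command with `taskset -c <core>` so it runs only on that core."""
    if core < 0:
        raise ValueError("core index must be non-negative")
    return ["taskset", "-c", str(core), *command]


def run_concurrently(commands: Sequence[Sequence[str]]) -> List[int]:
    """Start every command before waiting on any, then return the exit codes."""
    procs = [subprocess.Popen(list(cmd)) for cmd in commands]
    return [proc.wait() for proc in procs]


# Hypothetical benchmark binaries, one per core of a dual-core processor.
jobs = [pinned_command(0, ["./c-ray"]), pinned_command(1, ["./ramspeed"])]
print(jobs)
# run_concurrently(jobs)  # needs Linux with taskset and the binaries present
```

Starting all processes before waiting on any of them is what makes the runs overlap; launching them one after another would measure them in isolation instead.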
  • 12. Measuring Memory Interference between Applications • How this is tested: – Simultaneously pin different types of applications to run only on specific cores in the processor – Then use performance counters and the power meter to measure the interference • Interference could be defined as: – An increase in application execution time – An increase in the number of cache misses – Possibly an increase in power consumption • Test plan: – Tests were run first on an AMD Turion II Dual-Core M520 Processor (2 cores, 5 P-states) – Later also on an Intel Pentium Dual Core CPU (2 cores, 4 P-states) – Run control tests with each application running alone (pinned to a single core) – Run the tests together and analyze the differences
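The three candidate definitions of interference all reduce to comparing a joint run against its control run. A minimal sketch, assuming each run is summarized by one number per metric; the metric names and values below are hypothetical, not results from the presentation.

```python
from typing import Dict


def percent_increase(control: float, interference: float) -> float:
    """Relative change of the joint run over the control run, in percent."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (interference - control) / control


def interference_report(control: Dict[str, float],
                        together: Dict[str, float]) -> Dict[str, float]:
    """Compute the percent increase for every metric present in the control run."""
    return {name: percent_increase(control[name], together[name]) for name in control}


# Hypothetical summaries of a control run and a joint run (illustrative only).
control_run = {"exec_time_s": 200.0, "l2_misses": 1000.0, "energy_j": 8000.0}
joint_run = {"exec_time_s": 250.0, "l2_misses": 1500.0, "energy_j": 8800.0}

report = interference_report(control_run, joint_run)
print(report)
```

A positive value means the application did worse when sharing the processor; a value near zero suggests the co-running application caused little measurable interference on that metric.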
  • 13. Control Results: Intel Pentium dual CPU T2330 • [Figure: six control-test plots for the Intel Pentium Dual Core, each with x-axis ticks 0 to 3. C-Ray: Execution Time, L2 Cache Miss, and Power Usage Control Results (series: CPU Control Test, CPU Control Energy). Ramspeed: Execution Time, L2 Cache Miss, and Power Usage Control Results (series: Memory Control Test, Memory Control Energy). Plotted values are not recoverable from the transcript.]
  • 14. Control Results: AMD Turion II Dual Core Mobile M520 • [Figure: six control-test plots for the AMD Turion II DualCore, each with x-axis ticks 0 to 4. C-ray: Execution Time, L2 Cache Miss, and Power Usage control Results (series: CPU Control Test, CPU Control Energy). Ramspeed: Execution Time, L2 Cache Miss, and Power Usage control Results (series: Memory Control Test, Memory Control Energy). Plotted values are not recoverable from the transcript.]
  • 15. Taking a Closer Look at the AMD Control Results from the previous slide: • It seems suspect that the control test should produce the same execution time across all P-states. Although this result for the C-Ray execution control test was consistent over multiple runs on the AMD Turion II processor, a test execution on a secondary Intel Pentium Dual Core processor produced results that were closer to what seems realistic. • [Figure: three plots comparing control and interference tests: C-Ray Execution Time (AMD First Run), C-ray Execution Time (AMD Second Run), and C-ray Execution Time (Intel Run). Series: CPU Control Test, CPU Interference Test. Plotted values are not recoverable from the transcript.]
  • 16. Interference Results (Joint Pinning Results on C-Ray): Intel Pentium dual CPU T2330 • The third column of data represents adjusted interference results • [Figure: three plots, each with x-axis ticks 0 to 3: C-ray Execution Time Interference (Ramspeed test on second core); C-ray L2 Cache Misses Interference (Ramspeed test on second core); Power usage for C-ray and Ramspeed tests run together. Series: CPU Control Test, Original CPU Interference Test, Adjusted CPU Interference Test; CPU Control Energy, 1 CPU and 1 Memory Interference Test Energy. Plotted values are not recoverable from the transcript.]
  • 17. Interference Results (Joint Pinning Results on Ramspeed): Intel Pentium dual CPU T2330 • [Figure: two plots, each with x-axis ticks 0 to 3: Ramspeed Execution Time Interference (C-ray test on second core); Ramspeed L2 Cache Misses Interference (C-ray test on second core). Series: Memory Control Test, Memory Interference Test. Plotted values are not recoverable from the transcript.]
  • 18. Interference Results (2 CPU Intensive Application Pinning Results): Intel Pentium dual CPU T2330 • [Figure: three plots, each with x-axis ticks 0 to 3: C-ray Execution Time Interference (C-ray test on second core); C-ray L2 Cache Misses Interference (C-ray test on second core); Power usage for 2 C-ray tests running on separate cores. Series: CPU Control Test and two CPU Interference Test series; CPU Control Energy, 2 CPU Interference Test Energy. Plotted values are not recoverable from the transcript.]
  • 19. Interference Results (2 Memory Intensive Application Pinning Results): Intel Pentium dual CPU T2330 • [Figure: three plots, each with x-axis ticks 0 to 3: Ramspeed Execution Time Interference (Ramspeed test on second core); Ramspeed L2 Cache Misses Interference (Ramspeed test on second core); Power usage for 2 Ramspeed tests running on separate cores. Series: Memory Control Test and two Memory Interference Test series; Memory Control Energy, 2 Memory Interference Test Energy. Plotted values are not recoverable from the transcript.]
  • 20. Interference between simultaneous applications: Future Tests • The foundation scripts have been written, so in the future it will be easy to add support for testing: – Interference of 1 type of application pinned to N cores for a processor with a substantial number of cores (i.e., N > 2) – Interference from 2 CPU intensive or 2 Memory intensive test applications – Memory interference with M applications mapped to N cores (requires N > 2) – A larger sample size, which might produce more interesting results – Which application-to-core mappings provide the best performance for specific architectures/cache sizes
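Searching for the best application-to-core mapping starts with listing the candidate mappings. The sketch below enumerates every assignment of M applications to N cores, allowing several applications to share a core. The application names and the helper name are hypothetical; the foundation scripts mentioned on the slide are not shown in the presentation.

```python
from itertools import product
from typing import Dict, List, Sequence


def core_mappings(apps: Sequence[str], num_cores: int) -> List[Dict[str, int]]:
    """List every assignment of each application to one of num_cores cores."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    return [dict(zip(apps, cores))
            for cores in product(range(num_cores), repeat=len(apps))]


# Hypothetical application names on a dual-core processor.
mappings = core_mappings(["c-ray", "ramspeed"], 2)
for mapping in mappings:
    print(mapping)
```

The number of mappings grows as N to the power M, so an exhaustive sweep is only practical for small M and N; a larger study would need to sample or prune the candidates.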
  • 21. Presentation Outline • Project Goals • Tools for Benchmarking: • Performance counters, PAPI • HPC Toolkit, Phoronix Test Suite • Power Measurement • How testing was accomplished • List of additional data points for application to processor affinity • A simple continuation of Ryan’s test work • Results from Interference Testing for applications pinned to specific cores
  • 22. Thank You For Your Attention