Benchmarking Techniques
Michael Hope <michael.hope@linaro.org>
Q2.12, 30-05-2012
bzr branch lp:~michaelh1/+junk/benchmarking-techniques (r6)
Why benchmark?
The issues are:
Relevance
Accuracy
Repeatability
Picking relevant benchmarks:
Profile
Workload
Features
We use SPEC 2000 and EEMBC
We'd like shareable benchmarks
Test Platform
Build / test / benchmark via Linux / web / commodity hardware
Measuring
Realtime timers
See 'man clock_gettime'
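For example, the same clocks are exposed in Python 3.3+ on Linux; a minimal sketch (not our harness) using CLOCK_MONOTONIC_RAW, which is not slewed by NTP:

    import time

    def measure(workload, *args):
        # Time one run of `workload` with a monotonic raw clock.
        # See 'man clock_gettime' for the trade-offs between clocks.
        start = time.clock_gettime(time.CLOCK_MONOTONIC_RAW)
        workload(*args)
        end = time.clock_gettime(time.CLOCK_MONOTONIC_RAW)
        return end - start

    # Example: time a trivial workload.
    print("%.6f s" % measure(sum, range(1000000)))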
<comparison of different timers>
See https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/TimerAc
External influences
Other features to be wary of:
scheduler
governor (a pre-run check sketch follows this list)
cpuidle, power management, thermal limiting
SMP
bugs, like core lockdown
NEON startup
See "Understanding the Linux Kernel" ch10: http://oreilly.com/catalog/linuxkernel/chapter/ch10.html
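To guard against several of these, a sketch of the kind of pre-run checks involved, assuming a Linux sysfs layout that varies by kernel and platform (illustrative, not our tooling):

    import os
    import pathlib

    # Pin this process to a single core so the scheduler cannot
    # migrate it mid-run (Linux-only API).
    os.sched_setaffinity(0, {0})

    # Refuse to run unless core 0 uses the 'performance' governor,
    # so frequency scaling does not distort timings.  The sysfs
    # path below is an assumption; adjust for your platform.
    gov = pathlib.Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")
    if gov.exists() and gov.read_text().strip() != "performance":
        raise SystemExit("set the cpufreq governor to 'performance' first")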
Putting things together and running
We use:
timer built into the app
run five times
collect everything
post-process (a minimal runner sketch follows)
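A sketch of such a runner; './mybench' and its output format are hypothetical stand-ins for the real app and its built-in timer:

    import statistics
    import subprocess

    RUNS = 5

    def run_once(cmd):
        # Run the benchmark once; assume its built-in timer prints
        # the elapsed seconds on stdout.
        out = subprocess.run(cmd, capture_output=True, text=True,
                             check=True).stdout
        return float(out.strip())

    samples = [run_once(["./mybench"]) for _ in range(RUNS)]

    # Collect everything now; post-process separately.
    print("samples:", samples)
    print("mean:", statistics.mean(samples))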
Statistics
Standard deviation
Dispersion / coefficient of variation (CV; example below)
t-scores
t-test
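For example, a sketch with Python's standard library and made-up timings; the coefficient of variation (std / mean) normalises dispersion so runs of different absolute lengths can be compared:

    import statistics

    # Hypothetical timings from five runs, in seconds.
    samples = [1.002, 1.001, 1.003, 1.000, 1.002]

    mean = statistics.mean(samples)
    std = statistics.stdev(samples)   # sample standard deviation
    cv = std / mean                   # coefficient of variation

    print("mean=%.4fs std=%.6fs CV=%.6f" % (mean, std, cv))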
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}
See http://en.wikipedia.org/wiki/Welch%27s_t-test
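In practice an existing implementation does the arithmetic; a sketch using scipy with made-up samples (equal_var=False selects Welch's test, matching the formula above):

    from scipy import stats

    # Hypothetical normalised run times for two variants.
    plain = [1.0002, 0.9998, 1.0001, 1.0000, 0.9999]
    patched = [1.0104, 1.0098, 1.0101, 1.0099, 1.0102]

    t, p = stats.ttest_ind(plain, patched, equal_var=False)
    print("t=%.2f p=%.4f" % (t, p))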
Compiler                 Mean   Std     CV
gcc-4.6.2                1.00   134u    134u
gcc-linaro-4.6-2011.11   4.25   5035u   1178u

t = 30,300
Variant     Mean   Std    CV
Plain       1.00   320u   320u
With SMS    1.01   430u   426u

t = 118
Variant           Mean    Std    CV
Plain             1.00    320u   320u
With vectoriser   1.001   404u   404u

t = 8.27 (significant)
Our tools
perf
difftest
betters
tabulate
Python
LibreOffice!
Other statistical tools like scipy.stats, R, Judge, and ministat
http://help.libreoffice.org/Calc/Applying_AutoFilter
www.linaro.org / wiki.linaro.org
people.linaro.org/~michaelh/presentations