IntelON 2021 Processor Benchmarking

A short summary of processor benchmarking by Brendan Gregg: a case study of misleading results, and methodologies to do accurate benchmarking.


  1. Processor Benchmarking. Brendan Gregg, Senior Performance Engineer. IntelON, Oct 2021.
  2. Case Study (2021): New processor. Popular CPU benchmark: 2.6x faster than Intel. What would you do?
  3. ~100% of benchmarks are wrong
  4. Active Benchmarking: Low-level analysis of the benchmark while it is still running. Not just statistical analysis of the results.
  5. Flame Graphs: Showed CPU time was in a single function. Flame Graphs are now in Intel VTune!
  6. Instruction-Level Profiling...
  7. Instruction-level profile of the running benchmark:

         linux$ perf top -e cycles:ppp -p 18641
         Samples: 274K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 61489970617
                │      for(l = 2; l <= t; l++)
           0.02 │20290:   comisd   %xmm2,%xmm1
           0.05 │20294: ↑ jb       20270 <cpu_execute_event+0x30>
                │      if (c % l == 0)
           0.15 │20296:   test     $0x1,%bl
           0.15 │20299: ↑ je       20270 <cpu_execute_event+0x30>
                │      for(l = 2; l <= t; l++)
                │2029b:   mov      $0x2,%ecx
                │202a0: ↓ jmp      202c4 <cpu_execute_event+0x84>
                │202a2:   nopw     0x0(%rax,%rax,1)
           3.57 │202a8:   pxor     %xmm0,%xmm0
           0.21 │202ac:   cvtsi2sd %rcx,%xmm0
           0.26 │202b1:   comisd   %xmm0,%xmm1
           3.51 │202b5: ↑ jb       20270 <cpu_execute_event+0x30>
                │      if (c % l == 0)
           0.09 │202b7:   mov      %rbx,%rax
           0.02 │202ba:   xor      %edx,%edx
          85.00 │202bc:   div      %rcx
           0.12 │202bf:   test     %rdx,%rdx

  8. The same perf top output as slide 7, annotated: 85% of cycles in the div instruction
  9. Instruction-level Analysis
     ● Determined it’s really a div benchmark
     ● Other processor has a faster div
  10. Netflix Cloud
     ● <1% div cycles
     ● Therefore, perf win should be <1% (not 2.6x!)
  11. Challenges
     ● This benchmark is widely used
     ● Cycle analysis is nearly impossible in the cloud
        ○ Under hypervisors: Limited PMCs; no PEBS
     ● Accurate benchmarking needs senior engineers
  12. ~100% of benchmarks are wrong
  13. My Benchmarking Checklist
     1. Why not double?
     2. Was it tuned?
     3. Did it break limits?
     4. Did it error?
     5. Does it reproduce?
     6. Does it matter?
     7. Did it even happen?
     https://www.brendangregg.com/blog/2018-06-30/benchmarking-checklist.html
  14. An Exciting New Era of Processor Innovation: Vertical stacking, new capabilities. More processors & competition.
  15. But also a Challenging New Era of Processor Benchmarking: Increased demand. Hard to debug in the cloud. Popular benchmarks can be wrong.
  16. Good benchmarking drives innovation
  17. Thank you. Brendan Gregg @brendangregg
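
The reasoning on slides 8 to 10 (85% of benchmark cycles in div, but under 1% of div cycles in the production workload) can be illustrated with Amdahl's law. The sketch below is not from the slides. It assumes, purely for illustration, that the whole 2.6x result comes from a faster div and that every other instruction runs at the same speed on both processors. It also treats the share of sampled cycles as the share of run time. The function names `overall_speedup` and `implied_local_speedup` are hypothetical.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only `fraction` of the
    run time is accelerated by a factor of `local_speedup`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def implied_local_speedup(fraction: float, overall: float) -> float:
    """Invert Amdahl's law: how much faster must the accelerated part be
    to produce the observed `overall` speedup?"""
    remaining = 1.0 / overall - (1.0 - fraction)
    if remaining <= 0.0:
        raise ValueError("overall speedup is not reachable with this fraction")
    return fraction / remaining


if __name__ == "__main__":
    # Assumption: the benchmark's 2.6x is explained entirely by div,
    # which accounts for 85% of its cycles (slide 8).
    div_gain = implied_local_speedup(fraction=0.85, overall=2.6)
    print(f"implied div speedup: {div_gain:.2f}x")

    # Slide 10: div is under 1% of cycles, so 0.01 is an upper bound.
    prod = overall_speedup(fraction=0.01, local_speedup=div_gain)
    print(f"expected production gain: {(prod - 1.0) * 100:.2f}%")

    # Even an arbitrarily fast div cannot gain much more than 1%.
    ceiling = overall_speedup(fraction=0.01, local_speedup=1e12)
    print(f"ceiling with near-instant div: {(ceiling - 1.0) * 100:.2f}%")
```

Under these assumptions the implied div speedup is about 3.6x, and the same gain applied to a workload with 1% div cycles is worth well under 1% overall, which matches the slide's conclusion that the expected win is <1% rather than 2.6x.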
