SlideShare a Scribd company logo
1 of 12
Download to read offline
Evaluation of ARM CPUs
for IceCube
available through
Google Kubernetes Engine
Igor Sfiligoi – UCSD
in collaboration with David Schultz & Benedikt Riedel - University of Wisconsin Madison
CMS O&C Meeting, 9/28/22
Benchmark setup
• Google offered some credits to IceCube to be used on the
Google Kubernetes Engine (GKE)
• A managed Kubernetes setup on the Google Compute Cloud (GCP)
• Google offers several generations of CPUs
• x86 from both Intel and AMD, in both Hyper-threaded and not mode
• ARM CPUs from Ampere Computing (no hyper-threading)
• The main CPU-heavy application is CORSIKA
• Cosmic air shower simulator
• x86 binaries available on CVMFS
• ARM binaries compiled in GKE
CMS O&C Meeting, 9/28/22
CPUs used
• Being a Cloud platform, a lot of details are hidden
• But this is what I can see:
• t2a - one vCPU per core
• Ampere Altra Arm Q64-30 processor with all-core frequency of 3.0 GHz
• t2d - one vCPU per core
• AMD EPYC 7B13 - 2.45 base 2.8 effective 3.5 max boost
• c2d - two vCPU per core
• AMD EPYC 7B13 - 2.45 base 2.8 effective 3.5 max boost
• c2 - two vCPUs per core
• Intel® Xeon® Gold 6253CL - 3.1 base 3.8 all core turbo 3.9 single core
Same HW, different setup
CMS O&C Meeting, 9/28/22
Benchmark results
ARM CPUs faster than all tested x86 CPUs
Hyper-threading actually
quite effective
ARM
CMS O&C Meeting, 9/28/22
Cost effectiveness
But AMD pricing definitely
not in line with performance
ARM CPUs more cost-effective than x86 CPUs
ARM
CMS O&C Meeting, 9/28/22
Porting experience
• x86 binaries already available, so a no brainer
• We did not have ARM binaries, so we had to compile from source
• Likely the first time it was done for most of the IceCube SW stack
• The process was mostly painless
• Some hardcoded x86-specific compiler parameters
• Easy to replace with ARM-equivalents (but took a few passes to get them all)
• CORSIKA itself built without a hitch
• But we did need a bunch of helper binaries for the jobs to run
• OS and the like came out of the box for both platforms
• As standard docker images (we used k8s, remember?)
CMS O&C Meeting, 9/28/22
This is it for concrete data
• The previous slides contain pretty much all that we have done
• In the next few slides I will show a few reference data
about power consumption I pulled from the Web
• Take with a grain of salt!
CMS O&C Meeting, 9/28/22
CPU comparison
• Ampere Altra Q64-30
• 64 cores at 3.0 GHz
• 180W TDP
• 8x 3200 DDR4 RAM
• AMD EPYC 7B13
• CPU family 25, Stepping 1
• 60 cores, base freq 2.45 Ghz
• 8x 3200 DDR4 RAM
• Closest public TDP data is for
EPYC 7713 at 225W TDP
https://www.anandtech.com/show/16979/the-ampere-
altra-max-review-pushing-it-to-128-cores-per-socket
https://www.amd.com/en/products/cpu/amd-epyc-7713
https://cloud.google.com/compute/docs/general-purpose-
machines#t2d_machines
Intel Xeon 6253CL supposedly has 18 cores at 200W TDP
CMS O&C Meeting, 9/28/22
CPU comparison
• Ampere Altra Q64-30
• 64 cores at 3.0 GHz
• 180W TDP
• 8x 3200 DDR4 RAM
• AMD EPYC 7B13
• CPU family 25, Stepping 1
• 60 cores, base freq 2.45 Ghz
• 8x 3200 DDR4 RAM
• Closest public TDP data is for
EPYC 7713 at 225W TDP
https://www.anandtech.com/show/16979/the-ampere-
altra-max-review-pushing-it-to-128-cores-per-socket
https://www.amd.com/en/products/cpu/amd-epyc-7713
https://cloud.google.com/compute/docs/general-purpose-
machines#t2d_machines
Intel Xeon 6253CL supposedly has 18 cores at 200W TDP
CPU power
consumption is just a
fraction of the total
node power usage, of
course!
CMS O&C Meeting, 9/28/22
CPU power effectiveness
AMD EPYC in HT mode likely beats ARM CPU in power effectiveness.
But ARM CPU more power efficient when non-HT mode is desired.
Hyper-threading great
for power effectiveness
Disclaimer:
Power usage
based on
estimated TDP
CMS O&C Meeting, 9/28/22
In summary
• ARM CPUs now easily available in the commercial Clouds
• Porting code to ARM not hard if you have the source code
• In GKE, using IceCube CORSIKA simulation
• ARM cores about as fast as the x86 cores (even a bit faster)
• Priced cheaper, too
• No hard data on power usage, but extrapolated data imply
• ARM CPUs only modestly more power efficient that x86 competition
CMS O&C Meeting, 9/28/22
Acknowledgement
• This work was partially funded by the U.S. National Science
Foundation (NSF) under grants OAC-1826967, OAC-1541349,
OPP- 2042807, CNS-1925001 and OAC- 2103963.
• All Google Kubernetes Engine costs have been covered by Google-
issued credits.
CMS O&C Meeting, 9/28/22

More Related Content

Similar to O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kubernetes Engine

RMOUG 2013 - Where did my CPU go?
RMOUG 2013 - Where did my CPU go?RMOUG 2013 - Where did my CPU go?
RMOUG 2013 - Where did my CPU go?Kristofferson A
 
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...Cesar Maciel
 
FPGAs in the cloud? (October 2017)
FPGAs in the cloud? (October 2017)FPGAs in the cloud? (October 2017)
FPGAs in the cloud? (October 2017)Julien SIMON
 
Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVMAchieving the ultimate performance with KVM
Achieving the ultimate performance with KVMStorPool Storage
 
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsChoose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsDatabricks
 
AWS Summit Bogotá Track Avanzado: EC2 avanzado
AWS Summit Bogotá Track Avanzado: EC2 avanzadoAWS Summit Bogotá Track Avanzado: EC2 avanzado
AWS Summit Bogotá Track Avanzado: EC2 avanzadoAmazon Web Services
 
Making the most out of Heterogeneous Chips with CPU, GPU and FPGA
Making the most out of Heterogeneous Chips with CPU, GPU and FPGAMaking the most out of Heterogeneous Chips with CPU, GPU and FPGA
Making the most out of Heterogeneous Chips with CPU, GPU and FPGAFacultad de Informática UCM
 
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseTackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseDatabricks
 
”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016Kuniyasu Suzaki
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.J On The Beach
 
Arm Neoverse market update_05122020.pdf
Arm Neoverse market update_05122020.pdfArm Neoverse market update_05122020.pdf
Arm Neoverse market update_05122020.pdfPaul Yang
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computingbakers84
 
QEMU and Raspberry Pi. Instant Embedded Development
QEMU and Raspberry Pi. Instant Embedded DevelopmentQEMU and Raspberry Pi. Instant Embedded Development
QEMU and Raspberry Pi. Instant Embedded DevelopmentGlobalLogic Ukraine
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceAmazon Web Services
 
Computação acelerada – a era das ap us roberto brandão, ciência
Computação acelerada – a era das ap us   roberto brandão,  ciênciaComputação acelerada – a era das ap us   roberto brandão,  ciência
Computação acelerada – a era das ap us roberto brandão, ciênciaCampus Party Brasil
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMDevOps.com
 

Similar to O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kubernetes Engine (20)

RMOUG 2013 - Where did my CPU go?
RMOUG 2013 - Where did my CPU go?RMOUG 2013 - Where did my CPU go?
RMOUG 2013 - Where did my CPU go?
 
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
 
FPGAs in the cloud? (October 2017)
FPGAs in the cloud? (October 2017)FPGAs in the cloud? (October 2017)
FPGAs in the cloud? (October 2017)
 
Achieving the ultimate performance with KVM
Achieving the ultimate performance with KVMAchieving the ultimate performance with KVM
Achieving the ultimate performance with KVM
 
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsChoose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
 
AWS Summit Bogotá Track Avanzado: EC2 avanzado
AWS Summit Bogotá Track Avanzado: EC2 avanzadoAWS Summit Bogotá Track Avanzado: EC2 avanzado
AWS Summit Bogotá Track Avanzado: EC2 avanzado
 
Making the most out of Heterogeneous Chips with CPU, GPU and FPGA
Making the most out of Heterogeneous Chips with CPU, GPU and FPGAMaking the most out of Heterogeneous Chips with CPU, GPU and FPGA
Making the most out of Heterogeneous Chips with CPU, GPU and FPGA
 
Programming Models for Heterogeneous Chips
Programming Models for  Heterogeneous ChipsProgramming Models for  Heterogeneous Chips
Programming Models for Heterogeneous Chips
 
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseTackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
 
”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016”Bare-Metal Container" presented at HPCC2016
”Bare-Metal Container" presented at HPCC2016
 
AMD K6
AMD K6AMD K6
AMD K6
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.
 
Arm Neoverse market update_05122020.pdf
Arm Neoverse market update_05122020.pdfArm Neoverse market update_05122020.pdf
Arm Neoverse market update_05122020.pdf
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
 
Pgopencl
PgopenclPgopencl
Pgopencl
 
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
 
QEMU and Raspberry Pi. Instant Embedded Development
QEMU and Raspberry Pi. Instant Embedded DevelopmentQEMU and Raspberry Pi. Instant Embedded Development
QEMU and Raspberry Pi. Instant Embedded Development
 
Deep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance PerformanceDeep Dive on Delivering Amazon EC2 Instance Performance
Deep Dive on Delivering Amazon EC2 Instance Performance
 
Computação acelerada – a era das ap us roberto brandão, ciência
Computação acelerada – a era das ap us   roberto brandão,  ciênciaComputação acelerada – a era das ap us   roberto brandão,  ciência
Computação acelerada – a era das ap us roberto brandão, ciência
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVM
 

More from Igor Sfiligoi

Preparing Fusion codes for Perlmutter - CGYRO
Preparing Fusion codes for Perlmutter - CGYROPreparing Fusion codes for Perlmutter - CGYRO
Preparing Fusion codes for Perlmutter - CGYROIgor Sfiligoi
 
Comparing single-node and multi-node performance of an important fusion HPC c...
Comparing single-node and multi-node performance of an important fusion HPC c...Comparing single-node and multi-node performance of an important fusion HPC c...
Comparing single-node and multi-node performance of an important fusion HPC c...Igor Sfiligoi
 
The anachronism of whole-GPU accounting
The anachronism of whole-GPU accountingThe anachronism of whole-GPU accounting
The anachronism of whole-GPU accountingIgor Sfiligoi
 
Auto-scaling HTCondor pools using Kubernetes compute resources
Auto-scaling HTCondor pools using Kubernetes compute resourcesAuto-scaling HTCondor pools using Kubernetes compute resources
Auto-scaling HTCondor pools using Kubernetes compute resourcesIgor Sfiligoi
 
Speeding up bowtie2 by improving cache-hit rate
Speeding up bowtie2 by improving cache-hit rateSpeeding up bowtie2 by improving cache-hit rate
Speeding up bowtie2 by improving cache-hit rateIgor Sfiligoi
 
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence SimulationsPerformance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence SimulationsIgor Sfiligoi
 
Comparing GPU effectiveness for Unifrac distance compute
Comparing GPU effectiveness for Unifrac distance computeComparing GPU effectiveness for Unifrac distance compute
Comparing GPU effectiveness for Unifrac distance computeIgor Sfiligoi
 
Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Igor Sfiligoi
 
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory AccessAccelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory AccessIgor Sfiligoi
 
Using A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific OutputUsing A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific OutputIgor Sfiligoi
 
Using commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobsUsing commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobsIgor Sfiligoi
 
Modest scale HPC on Azure using CGYRO
Modest scale HPC on Azure using CGYROModest scale HPC on Azure using CGYRO
Modest scale HPC on Azure using CGYROIgor Sfiligoi
 
Data-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud BurstData-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud BurstIgor Sfiligoi
 
Scheduling a Kubernetes Federation with Admiralty
Scheduling a Kubernetes Federation with AdmiraltyScheduling a Kubernetes Federation with Admiralty
Scheduling a Kubernetes Federation with AdmiraltyIgor Sfiligoi
 
Accelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCIgor Sfiligoi
 
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...Igor Sfiligoi
 
Porting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUsPorting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUsIgor Sfiligoi
 
Demonstrating 100 Gbps in and out of the public Clouds
Demonstrating 100 Gbps in and out of the public CloudsDemonstrating 100 Gbps in and out of the public Clouds
Demonstrating 100 Gbps in and out of the public CloudsIgor Sfiligoi
 
TransAtlantic Networking using Cloud links
TransAtlantic Networking using Cloud linksTransAtlantic Networking using Cloud links
TransAtlantic Networking using Cloud linksIgor Sfiligoi
 
Bursting into the public Cloud - Sharing my experience doing it at large scal...
Bursting into the public Cloud - Sharing my experience doing it at large scal...Bursting into the public Cloud - Sharing my experience doing it at large scal...
Bursting into the public Cloud - Sharing my experience doing it at large scal...Igor Sfiligoi
 

More from Igor Sfiligoi (20)

Preparing Fusion codes for Perlmutter - CGYRO
Preparing Fusion codes for Perlmutter - CGYROPreparing Fusion codes for Perlmutter - CGYRO
Preparing Fusion codes for Perlmutter - CGYRO
 
Comparing single-node and multi-node performance of an important fusion HPC c...
Comparing single-node and multi-node performance of an important fusion HPC c...Comparing single-node and multi-node performance of an important fusion HPC c...
Comparing single-node and multi-node performance of an important fusion HPC c...
 
The anachronism of whole-GPU accounting
The anachronism of whole-GPU accountingThe anachronism of whole-GPU accounting
The anachronism of whole-GPU accounting
 
Auto-scaling HTCondor pools using Kubernetes compute resources
Auto-scaling HTCondor pools using Kubernetes compute resourcesAuto-scaling HTCondor pools using Kubernetes compute resources
Auto-scaling HTCondor pools using Kubernetes compute resources
 
Speeding up bowtie2 by improving cache-hit rate
Speeding up bowtie2 by improving cache-hit rateSpeeding up bowtie2 by improving cache-hit rate
Speeding up bowtie2 by improving cache-hit rate
 
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence SimulationsPerformance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
 
Comparing GPU effectiveness for Unifrac distance compute
Comparing GPU effectiveness for Unifrac distance computeComparing GPU effectiveness for Unifrac distance compute
Comparing GPU effectiveness for Unifrac distance compute
 
Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...Managing Cloud networking costs for data-intensive applications by provisioni...
Managing Cloud networking costs for data-intensive applications by provisioni...
 
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory AccessAccelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
Accelerating Key Bioinformatics Tasks 100-fold by Improving Memory Access
 
Using A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific OutputUsing A100 MIG to Scale Astronomy Scientific Output
Using A100 MIG to Scale Astronomy Scientific Output
 
Using commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobsUsing commercial Clouds to process IceCube jobs
Using commercial Clouds to process IceCube jobs
 
Modest scale HPC on Azure using CGYRO
Modest scale HPC on Azure using CGYROModest scale HPC on Azure using CGYRO
Modest scale HPC on Azure using CGYRO
 
Data-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud BurstData-intensive IceCube Cloud Burst
Data-intensive IceCube Cloud Burst
 
Scheduling a Kubernetes Federation with Admiralty
Scheduling a Kubernetes Federation with AdmiraltyScheduling a Kubernetes Federation with Admiralty
Scheduling a Kubernetes Federation with Admiralty
 
Accelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACC
 
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scie...
 
Porting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUsPorting and optimizing UniFrac for GPUs
Porting and optimizing UniFrac for GPUs
 
Demonstrating 100 Gbps in and out of the public Clouds
Demonstrating 100 Gbps in and out of the public CloudsDemonstrating 100 Gbps in and out of the public Clouds
Demonstrating 100 Gbps in and out of the public Clouds
 
TransAtlantic Networking using Cloud links
TransAtlantic Networking using Cloud linksTransAtlantic Networking using Cloud links
TransAtlantic Networking using Cloud links
 
Bursting into the public Cloud - Sharing my experience doing it at large scal...
Bursting into the public Cloud - Sharing my experience doing it at large scal...Bursting into the public Cloud - Sharing my experience doing it at large scal...
Bursting into the public Cloud - Sharing my experience doing it at large scal...
 

Recently uploaded

Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Recently uploaded (20)

Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

O&C Meeting - Evaluation of ARM CPUs for IceCube available through Google Kubernetes Engine

  • 1. Evaluation of ARM CPUs for IceCube available through Google Kubernetes Engine Igor Sfiligoi – UCSD in collaboration with David Schultz & Benedikt Riedel - University of Wisconsin Madison CMS O&C Meeting, 9/28/22
  • 2. Benchmark setup • Google offered some credits to IceCube to be used on the Google Kubernetes Engine (GKE) • A managed Kubernetes setup on the Google Compute Cloud (GCP) • Google offers several generations of CPUs • x86 from both Intel and AMD, in both Hyper-threaded and not mode • ARM CPUs from Ampere Computing (no hyper-threading) • The main CPU-heavy application is CORSIKA • Cosmic air shower simulator • x86 binaries available on CVMFS • ARM binaries compiled in GKE CMS O&C Meeting, 9/28/22
  • 3. CPUs used • Being a Cloud platform, a lot of details are hidden • But this is what I can see: • t2a - one vCPU per core • Ampere Altra Arm Q64-30 processor with all-core frequency of 3.0 GHz • t2d - one vCPU per core • AMD EPYC 7B13 - 2.45 base 2.8 effective 3.5 max boost • c2d - two vCPU per core • AMD EPYC 7B13 - 2.45 base 2.8 effective 3.5 max boost • c2 - two vCPUs per core • Intel® Xeon® Gold 6253CL - 3.1 base 3.8 all core turbo 3.9 single core Same HW, different setup CMS O&C Meeting, 9/28/22
  • 4. Benchmark results ARM CPUs faster than all tested x86 CPUs Hyper-threading actually quite effective ARM CMS O&C Meeting, 9/28/22
  • 5. Cost effectiveness But AMD pricing definitely not in line with performance ARM CPUs more cost-effective than x86 CPUs ARM CMS O&C Meeting, 9/28/22
  • 6. Porting experience • x86 binaries already available, so a no brainer • We did not have ARM binaries, so we had to compile from source • Likely the first time it was done for most of the IceCube SW stack • The process was mostly painless • Some hardcoded x86-specific compiler parameters • Easy to replace with ARM-equivalents (but took a few passes to get them all) • CORSIKA itself built without a hitch • But we did need a bunch of helper binaries for the jobs to run • OS and the like came out of the box for both platforms • As standard docker images (we used k8s, remember?) CMS O&C Meeting, 9/28/22
  • 7. This is it for concrete data • The previous slides contain pretty much all that we have done • In the next few slides I will show a few reference data about power consumption I pulled from the Web • Take with a grain of salt! CMS O&C Meeting, 9/28/22
  • 8. CPU comparison • Ampere Altra Q64-30 • 64 cores at 3.0 GHz • 180W TDP • 8x 3200 DDR4 RAM • AMD EPYC 7B13 • CPU family 25, Stepping 1 • 60 cores, base freq 2.45 Ghz • 8x 3200 DDR4 RAM • Closest public TDP data is for EPYC 7713 at 225W TDP https://www.anandtech.com/show/16979/the-ampere- altra-max-review-pushing-it-to-128-cores-per-socket https://www.amd.com/en/products/cpu/amd-epyc-7713 https://cloud.google.com/compute/docs/general-purpose- machines#t2d_machines Intel Xeon 6253CL supposedly has 18 cores at 200W TDP CMS O&C Meeting, 9/28/22
  • 9. CPU comparison • Ampere Altra Q64-30 • 64 cores at 3.0 GHz • 180W TDP • 8x 3200 DDR4 RAM • AMD EPYC 7B13 • CPU family 25, Stepping 1 • 60 cores, base freq 2.45 Ghz • 8x 3200 DDR4 RAM • Closest public TDP data is for EPYC 7713 at 225W TDP https://www.anandtech.com/show/16979/the-ampere- altra-max-review-pushing-it-to-128-cores-per-socket https://www.amd.com/en/products/cpu/amd-epyc-7713 https://cloud.google.com/compute/docs/general-purpose- machines#t2d_machines Intel Xeon 6253CL supposedly has 18 cores at 200W TDP CPU power consumption is just a fraction of the total node power usage, of course! CMS O&C Meeting, 9/28/22
  • 10. CPU power effectiveness AMD EPYC in HT mode likely beats ARM CPU in power effectiveness. But ARM CPU more power efficient when non-HT mode is desired. Hyper-threading great for power effectiveness Disclaimer: Power usage based on estimated TDP CMS O&C Meeting, 9/28/22
  • 11. In summary • ARM CPUs now easily available in the commercial Clouds • Porting code to ARM not hard if you have the source code • In GKE, using IceCube CORSIKA simulation • ARM cores about as fast as the x86 cores (even a bit faster) • Priced cheaper, too • No hard data on power usage, but extrapolated data imply • ARM CPUs only modestly more power efficient that x86 competition CMS O&C Meeting, 9/28/22
  • 12. Acknowledgement • This work was partially funded by the U.S. National Science Foundation (NSF) under grants OAC-1826967, OAC-1541349, OPP- 2042807, CNS-1925001 and OAC- 2103963. • All Google Kubernetes Engine costs have been covered by Google- issued credits. CMS O&C Meeting, 9/28/22