At a high level, the goal of Multi-Arch infrastructure is that workloads can run on the best hardware for their price/performance needs, without developers being concerned with the underlying architecture. That doesn’t mean it’s easy! Multi-Arch touches Infra As Code, CI/CD, packaging, binaries, images, Kubernetes upgrades, testing, scheduling, rollout, reproducible builds, performance testing and more. This talk looks at how early adopters handled the challenges so you are prepared for the road ahead.
1. Cheryl Hung, @oicheryl
Sr Director, Infra Ecosystem, Arm
Multi-Arch Infrastructure from the
Ground up
KubeCon CloudNativeCon EU 2023
Amsterdam, 19 Apr 2023
4. Arm Infra Ecosystem: Cloud, 5G, Telco, Networking
@oicheryl
1. Developer Outreach 2. SW/HW support 3. Standards
5. Objectives
1. Why is Multi-Architecture infrastructure tricky?
2. How do I do Multi-Arch with Kubernetes?
3. Case studies: FusionAuth, Honeycomb, Arm
10. Fruit computers
[Timeline graphic: Arm architecture versions from Armv1 through Armv6, Armv7, Armv8.4-A and Armv8.6]
Arm’s RISC architecture targets power efficiency and
performance and can be licensed for different use cases
11. Goals of Multi-Arch infrastructure
Workloads should run on the best hardware for their price/performance needs
Without developers being concerned with the underlying architecture
12. But Multi-Arch touches everything…
● Infra As Code
● CI/CD, reproducible builds
● Packaging, binaries, images, registries
● Testing, scheduling, rollout, performance testing
● Kubernetes upgrades
● …
17. 1. Inform
Inventory your software stack
● OS
● Container images
● Libraries, frameworks and runtimes
● Tools used to build, deploy and test
● Tools used to monitor and manage
and check each for Arm support (AArch64 in GCC, arm64 in the Linux kernel)
Identify hotspots
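One quick way to run this inventory check for container images is to inspect each image's manifest list and confirm an arm64 variant is published. A minimal sketch (the image name is only an example):

```shell
# Sketch: check whether an image publishes an arm64 variant.
# Requires Docker with manifest support; "nginx:latest" is a placeholder image.
docker manifest inspect nginx:latest | grep -B1 -A1 '"architecture"'
# A multi-arch image lists "architecture": "arm64" alongside "amd64".
```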
18. 2. Optimize
Provision test Arm64 environment
Upgrade container images and test
Performance testing
Update CI/CD for reproducible Arm64 builds
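Updating CI/CD for Arm64 builds often comes down to emitting one multi-arch image per release. A minimal sketch using Docker Buildx (the repository and tag are placeholders, not from the talk):

```shell
# Build and push a single image manifest containing amd64 and arm64 variants.
# "myrepo/myapp" is a placeholder; QEMU emulation covers the non-native arch.
docker buildx create --use
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag myrepo/myapp:latest \
  --push .
```

Kubernetes nodes then pull the variant matching their own architecture automatically.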
19. 3. Operate
Build K8s cluster
● Mixed control plane and worker nodes
● Cluster creation
● DaemonSets
Canary or blue-green deployment
● Node affinity, taints, tolerations
● Different limits and requests per architecture
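The scheduling controls above can be sketched in a single Deployment: a hypothetical arm64 canary pinned via the standard `kubernetes.io/arch` node label, tolerating an example taint on Arm nodes, with its own resource requests (all names and values are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-arm64-canary        # hypothetical canary deployment
spec:
  replicas: 1
  selector:
    matchLabels: {app: myapp, arch: arm64}
  template:
    metadata:
      labels: {app: myapp, arch: arm64}
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64  # well-known node label set by the kubelet
      tolerations:
      - key: arch                  # example taint applied to Arm nodes
        value: arm64
        effect: NoSchedule
      containers:
      - name: myapp
        image: myrepo/myapp:latest # multi-arch image; node pulls its variant
        resources:                 # requests/limits can differ per architecture
          requests: {cpu: "500m", memory: "512Mi"}
```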
21. “48 of the top 50 Amazon EC2 customers use AWS Graviton processors for their workloads”
- Danilo Poccia, Chief Evangelist (EMEA), AWS, Aug 2022
23. FusionAuth technical timeline
1. Finding a JVM that supported Arm, especially Arm-based Macs (Java 17 was the first to do so). Added Java 17/Arm support to the code base Dec 2021 - Feb 2022
2. Updating and testing install scripts to use the correct JVM
3. Updating Docker builds to target the Arm architecture with jlink and multi-arch builds
4. Checking for Arm support in public cloud regions when spinning up SaaS
5. Updating the application to expose the underlying architecture

Logins are especially CPU-intensive due to password hashing, so the team load tested 50k logins. Arm handled 26-49% more logins per second and cost 8-10% less than Intel on EC2.
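Step 5, exposing the underlying architecture, can be as simple as normalizing what the OS reports. A sketch (the mapping and output format are illustrative, not FusionAuth's actual code):

```shell
# Normalize the machine name from uname to the Docker/Go-style arch label.
arch="$(uname -m)"
case "$arch" in
  x86_64)        norm="amd64" ;;
  aarch64|arm64) norm="arm64" ;;
  *)             norm="$arch" ;;
esac
echo "running on $norm"
```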
24. “Because we run on Java, our lift was pretty small. We just had to find a JVM built for ARM, and then work out any remaining kinks”
- Daniel DeGroff, FusionAuth CTO
“I just switched a FusionAuth instance to arm64 and the transition was so smooth I couldn't even tell whether it's actually running the arm64 version”
- Hendy Irawan, BandungPermaculture.com CIO and FusionAuth user
25. Honeycomb: full stack observability enabling engineers to deeply understand and debug production software
● March 2020: First experiments with Graviton2
● May 2021: Ingest workers in production
● Nov 2021: Virtually all workloads and envs on Arm; 92% of vCPUs on Arm
● April 2022: Turned off last x86 EC2 instances; 99% Arm on Lambda
26. 1. Chose to migrate ingest workers first as they are stateless, performance critical and scale out horizontally. Written in Golang, so compiling for Arm was easy.
2. Deployed in the dogfood environment and observed positive results.
3. Initially no Kubernetes or container orchestration; everything was in Terraform and Chef, so they could switch to Arm Amazon Machine Images (AMIs) by enumerating all the dependencies to update.
4. Next up were their own workloads (easiest to recompile and highest compute spend), then Kafka. Last were ad-hoc one-off services and those difficult to migrate.
honeycomb.io/blog/present-future-arm-aws-graviton-honeycomb
27. “Graviton has enabled Honeycomb to scale up our product without increased operational toil, spend less on compute, and have a smaller environmental footprint.”
- Ian Smith, Engineering Manager, Honeycomb.io
“I personally approached it as an idle experiment with a few spare afternoons, and was surprised by how compelling the results were. Saving 40% on the EC2 instance bill for this service […] is well worth the investment”
- Liz Fong-Jones, Field CTO, Honeycomb.io
29. Dogfooding on Arm
In 2019, Arm moved EDA tools from on-prem x86 to Graviton
✅ 60% better performance
✅ 50% reduced cost
✅ >1 MW power saved/day
33. Thanks!
Slides at oicheryl.com
Takeaways
● Why multi-arch?
● How?
● Talk to me!
○ Technical assistance from Arm
○ Credits for CI/CD
○ Success stories