This talk was given at KCD Munich - July 17 2023
Abstract
“Kubernetes is a platform for building platforms. It’s a better place to start: not the endgame”, tweeted by Kelsey Hightower in November 2017. 6 years later the Cloud Native Community is faced with 159 different CNCF projects to choose from. Entering CNCF can be overwhelming!
Cloud Native Platform Engineering, with its white papers, best practices and reference architectures, is here to convert this dilemma into an opportunity. Internal Developer Platforms (IDPs) are being built as we speak, enabling organizations to harness the power of Kubernetes as a self-service platform.
Join this talk with Andreas Grabner, CNCF Ambassador, and get some insights on tooling, use cases and best practices so we can all fulfill the idea that Kelsey put out years ago.
KCD Munich - Cloud Native Platform Dilemma - Turning it into an Opportunity
1. The Cloud Native Platform Dilemma
Turning it into an Opportunity!
Andreas Grabner
CNCF Ambassador, DevRel @ CNCF Keptn
Global DevRel Lead @ Dynatrace
Keptn: https://www.keptn.sh
3. You Build it You Run it Doesn’t Scale!
“Organizations reaching 50-100 engineers run into expertise bottlenecks.
Every team can’t be an expert in 10 different tools
needed to build, test, deploy, operate & secure software they own!”
Luca Galante, Head of Product at Humanitec on “PurePerformance Podcast”
4. Hiring 10x Software Engineers is a Myth
“Leaders often feel proud of their high performers.
In reality, having 10x engineers in your software organization may be a sign of
an organization that is set out to fail in the long run.”
Ari-Pekka Koponen, Head of Platform at Swarmia in “Busting the 10x software engineering myth”
6. Numbers confirm: Platform Engineering impacts developer velocity!
What led to the creation of a platform team?
22% – Need to increase speed of delivery
18% – We need to scale up
10% – Engineers were taking on too much work
What are the benefits of platform engineering?
#1 – Improves System Reliability
#2 – Improves Efficiency & Productivity of my work
#3 – Speed up delivery time
[Chart legend: Increased – 66%, Decreased, Stayed the same, Don’t know]
Get the Report here: https://www.puppet.com/success/resources/state-of-platform-engineering
8. Before you start, put your product manager hat on …
Demand
Needs Wants
1: Understand your internal users! 2: Decide what to build vs buy!
9. Internal Developer Platform (IDP) is a PRODUCT that …
… enables software-engineering self-service …
… by providing “Golden Path” recommended ways
… for all relevant “use cases”: from developing, building, testing, securing, documenting, deploying, operating, supporting and retiring software
… matching the preferred abstraction and skillset of the organization
… resulting in increased productivity and happiness of app teams
10. Keep it Simple!
Start with an MVP
(Minimum Viable Product)
that may look like this …
11. Propose new Use Case
Self-Service Website Self-Service Serverless Function Self-Service Cloud Hosted App
12. As you add more use cases, the
platform may evolve to this …
13. 1: Start with template in Backstage
2: Add your code into Git
3: Commit and push
$ git commit -m "adding new feature"
$ git push
4: Argo/Flux/… does GitOps Magic
5: Observability into Deploy & App
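The “GitOps Magic” step can be pictured as a reconciliation loop: compare the desired state stored in Git with the actual state in the cluster and derive the actions needed to close the gap. The following is a minimal sketch of that idea only, not how Argo or Flux are implemented; the `Action` type, the `reconcile` function and the sample workload names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str     # "create", "update" or "delete"
    name: str     # workload name
    version: str  # target version (empty for delete)


def reconcile(desired: dict[str, str], actual: dict[str, str]) -> list[Action]:
    """Derive the actions that move the actual state towards the desired state."""
    actions = []
    for name, version in sorted(desired.items()):
        if name not in actual:
            actions.append(Action("create", name, version))
        elif actual[name] != version:
            actions.append(Action("update", name, version))
    for name in sorted(actual):
        if name not in desired:
            actions.append(Action("delete", name, ""))
    return actions


# Hypothetical example: what Git says vs. what is running right now.
desired = {"frontend": "2.0", "backend": "1.5"}
actual = {"frontend": "1.9", "legacy-job": "0.3"}

plan = reconcile(desired, actual)
for action in plan:
    print(action)
```

Once the plan is empty, the cluster matches what was committed, which is why a plain `git push` is enough to trigger a deployment in this model.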
15. Kubernetes is the core of platforms being built right now …
https://www.cncf.io/reports/cncf-annual-survey-2022/
16. Entering this community “fresh” can feel … overwhelming!
12,000 attendees
159 CNCF projects
56% first-time visitors
17. But you are not alone … join the existing communities …
https://platformengineering.org/
https://engineering.atspotify.com/
18. … join our efforts on defining reference architectures …
[Diagram: platform reference architecture; labels recovered from the slide]
Platform Engineering Team: Building Platform as a Product
Platform End-User Features:
Self-Service On-boarding: Templates, Catalog, Doc, Community
SRE: SLO, Auto-Scaling, Incident Response
Diagnostics & Insights: access to Observability
Secure Progressive Delivery: Blue/Green, Canary, Feature Flags
Layer labels: Infrastructure, Platform, DevEx, Platform Services, Delivery Services, Platform Interface
Infrastructure options: Public, Private or Hybrid Cloud; Managed or Self-hosted K8s
Components: Storage, Secret, Messaging, Service Mesh, Policy Mgmt., Scaling, Orchestration, Caching, Database, Container Registry, Dev Portal, Git, CI/CD, Testing, Delivery Control, Ticketing
Access to Observability, Security & Automation: Metrics, Logs, Traces, Events, Business
Other labels: X-as-Code, Observability, Infra, Config, Deploy, Observe, Automate, Existing PaaS, Security, Doc & Community, End-Users
Join the CNCF Platform Working Group
19. … and remember - you don’t need to start from scratch …
These and many more …
… could be a good
starting point
20. Whatever type of platform you build
You will be judged by its success …
21. Here are key KPIs to measure the Success of an IDP!
CNCF Platforms White Paper: https://tag-app-delivery.cncf.io/whitepapers/platforms
Product Delivery (DORA)
Deployment Frequency
How often an organization successfully releases to production
Lead Time for Changes
The amount of time it takes a commit to get into production
Change Failure Rate
The percentage of deployments causing a failure in production
Time to Restore Service
How long it takes an organization to recover from a failure in production
User Adoption & Productivity
Active users and retention
includes number of capabilities provisioned and user growth/churn
Net Promoter Score (NPS)
or other survey measuring user satisfaction with a product
Developer Productivity (SPACE Metrics)
Satisfaction, Performance, Activity, Collaboration, Efficiency
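The four DORA metrics listed above can be derived from a simple record of production deployments. The following is a minimal sketch under stated assumptions: the `Deployment` record and its field names are hypothetical, time to restore is measured from the failing deployment to the restore timestamp, and the sample data is made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    n = len(deployments)
    failures = [d for d in deployments if d.failed]
    lead = [hours(d.deployed_at - d.committed_at) for d in deployments]
    # Assumption: the failure starts at deployment time.
    restore = [hours(d.restored_at - d.deployed_at)
               for d in failures if d.restored_at is not None]
    return {
        "deployment_frequency_per_day": n / period_days,
        "lead_time_for_changes_hours": sum(lead) / n if n else 0.0,
        "change_failure_rate": len(failures) / n if n else 0.0,
        "time_to_restore_hours": sum(restore) / len(restore) if restore else 0.0,
    }


# Made-up sample: four deployments in a four-day window, one of which failed.
t0 = datetime(2023, 7, 10, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
sample = [
    Deployment(t0, t0 + 2 * hour),
    Deployment(t0 + day, t0 + day + 4 * hour, failed=True,
               restored_at=t0 + day + 5 * hour),
    Deployment(t0 + 2 * day, t0 + 2 * day + 6 * hour),
    Deployment(t0 + 3 * day, t0 + 3 * day + 4 * hour),
]

metrics = dora_metrics(sample, period_days=4)
for key, value in metrics.items():
    print(f"{key}: {value}")
```

In practice the deployment records would come from the delivery tooling or the observability backend rather than from a hand-written list.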
22. Make your Platform Observable
… in order to deliver and optimize those KPIs
23. Observability for Platform, Delivery & DevEx to become successful
[Diagram: the platform reference architecture annotated with KPIs; labels recovered from the slide]
Platform Engineering Team: Building Platform as a Product
Platform End-User Features:
Self-Service On-boarding: Templates, Catalog, Doc, Community
SRE: SLO, Auto-Scaling, Incident Response
Diagnostics & Insights: access to Observability
Secure Progressive Delivery: Blue/Green, Canary, Feature Flags
Layer labels: Infrastructure, Platform, DevEx, Platform Services, Delivery Services, Platform Interface
Infrastructure options: Public, Private or Hybrid Cloud; Managed or Self-hosted K8s
Access to Observability, Security & Automation; X-as-Code Observability
KPIs shown: Active Users, NPS, SPACE; Availability, Resiliency, Security; DORA; FinOps, Utilization
Success KPIs for Platform
SLAs for Platform End-Users
25. #1 – Availability, Resilience & Security for all Platform Services
Availability
Resiliency & Dependencies
Resource & Capacity
Security
26. #2 – SLAs, Adoption and Behavior for all Platform Services
SLAs User Behavior & Experience
Service Adoption Metrics & Usage Insights
27. #3 – Identify misconfiguration and educate users
Extract errors from Logs, Traces, Metrics …
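As an illustration of extracting errors from logs, the sketch below counts repeated error messages per service so that the platform team can spot common misconfigurations and follow up with the affected users. The log line format, the `top_errors` helper and the sample messages are all hypothetical; a real platform would query its observability backend instead.

```python
import re
from collections import Counter

# Assumed log format: LEVEL service=<name> msg="<text>"
LINE = re.compile(r'^(?P<level>[A-Z]+) service=(?P<service>\S+) msg="(?P<msg>[^"]*)"$')


def top_errors(lines):
    """Count ERROR messages per (service, message), most frequent first."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[(match.group("service"), match.group("msg"))] += 1
    return counts.most_common()


# Made-up sample log lines.
sample_logs = [
    'INFO service=checkout msg="order created"',
    'ERROR service=checkout msg="missing env var PAYMENT_URL"',
    'ERROR service=checkout msg="missing env var PAYMENT_URL"',
    'ERROR service=search msg="image pull failed"',
    'WARN service=search msg="slow query"',
]

errors = top_errors(sample_logs)
for (service, message), count in errors:
    print(f"{count}x {service}: {message}")
```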
28. #4 – Measure Application Deployment-Aware DORA
business-app:2.0
Frontend-Svc:2.0 (part-of: business-app)
Backend-Svc:1.5 (part-of: business-app)
Storage-Svc:1.0 (part-of: business-app)
Pre / Post around each service: Timespan & Result for each single deployment
Pre-App-Deployment / Post-App-Deployment: Timespan & Result for whole app deployment
Observe: Metrics (DORA) & Traces
Because Pod Deploy != (doesn’t measure) App Deploy!
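To illustrate the difference between a pod deployment and an app deployment, the sketch below rolls the timespans and results of the individual service deployments up into one result for the whole application. It is a simplified illustration, not Keptn's implementation: the `WorkloadDeployment` record, the `app_deployment_summary` function and all timestamps are hypothetical, and app-level pre/post phases are not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class WorkloadDeployment:
    name: str
    started_at: datetime   # start of the service's "Pre" phase
    finished_at: datetime  # end of the service's "Post" phase
    succeeded: bool


def app_deployment_summary(app, version, workloads):
    """Aggregate per-service deployments into one app-level timespan and result."""
    start = min(w.started_at for w in workloads)
    end = max(w.finished_at for w in workloads)
    slowest = max(workloads, key=lambda w: w.finished_at - w.started_at)
    return {
        "app": f"{app}:{version}",
        "duration_seconds": (end - start).total_seconds(),
        "succeeded": all(w.succeeded for w in workloads),
        "slowest_workload": slowest.name,
    }


# Made-up timings for the three services from the slide.
t0 = datetime(2023, 7, 17, 10, 0, 0)
sec = timedelta(seconds=1)
workloads = [
    WorkloadDeployment("Frontend-Svc", t0, t0 + 90 * sec, True),
    WorkloadDeployment("Backend-Svc", t0 + 10 * sec, t0 + 150 * sec, True),
    WorkloadDeployment("Storage-Svc", t0 + 5 * sec, t0 + 60 * sec, True),
]

summary = app_deployment_summary("business-app", "2.0", workloads)
print(summary)
```

The app-level duration spans from the first service starting to the last one finishing, and the app deployment only counts as successful if every service deployment succeeded.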
29. #5 – Observability for delivery checks, scaling, remediation …
$ kubectl apply -n prod -f payment-service.yaml
payment-service deployment changed
$ kubectl get pod -n prod
NAME              READY   STATUS    RESTARTS   AGE
payment-service   2/2     Running   0          15m
objectives:
  - keptnMetricRef:
      name: response-time
    evaluationTarget: "<100ms"
  - keptnMetricRef:
      name: request-failure-rate
    evaluationTarget: "<1%"
  - keptnMetricRef:
      name: availability-slo
    evaluationTarget: ">99.99%"
Because Pod Running != (doesn’t mean) App Healthy! [diagram label: Unhealthy]
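The objectives above boil down to comparing observed metric values against targets. The sketch below shows that comparison in isolation; it is not how Keptn evaluates objectives. The `meets_target` and `evaluate` helpers are hypothetical, the observed values are made up, and stripping the `ms` and `%` suffixes from the targets is an assumption of this sketch.

```python
import re

# Assumed target syntax: a comparison operator, a number, and an optional unit.
TARGET = re.compile(r"^\s*(<=|>=|<|>)\s*([0-9]*\.?[0-9]+)\s*(ms|%)?\s*$")


def meets_target(value: float, target: str) -> bool:
    """Check one observed value against a target such as "<100ms"."""
    match = TARGET.match(target)
    if match is None:
        raise ValueError(f"unsupported target: {target!r}")
    op, threshold = match.group(1), float(match.group(2))
    if op == "<":
        return value < threshold
    if op == "<=":
        return value <= threshold
    if op == ">":
        return value > threshold
    return value >= threshold


def evaluate(objectives, observed):
    """Return a pass/fail result per objective name."""
    return {name: meets_target(observed[name], target) for name, target in objectives}


objectives = [
    ("response-time", "<100ms"),
    ("request-failure-rate", "<1%"),
    ("availability-slo", ">99.99%"),
]
# Made-up measurements: the pod is Running, but the failure rate is too high.
observed = {"response-time": 84.0, "request-failure-rate": 2.5, "availability-slo": 99.995}

results = evaluate(objectives, observed)
app_healthy = all(results.values())
print(results, "healthy:", app_healthy)
```

With these sample values the pod would still report Running while the evaluation marks the app as unhealthy, which is the point of the slide.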
33. My 3 takeaways to be successful at Platform Engineering!
1: Start with a Minimum Viable Product
2: Seek the wisdom of the community
3: Make your Platform Observable
Find more:
https://www.keptn.sh/
https://lifecycle.keptn.sh/
https://github.com/keptn-sandbox/klt-on-k3s-with-argocd/
https://twitter.com/keptnProject
https://slack.keptn.sh
Title: The Cloud Native Platform Dilemma – Turning it into an Opportunity!
Time: 30min
Because now more than ever, organizations are trying to adopt DevOps & SRE practices but are failing to scale!
And we can’t hire 10x engineers and hope they solve all our problems …
Understand what your organization needs and where it is inefficient!
Pick existing platforms first before building something yourself!
When building your own platform, treat it as a product and design it to match the skill level and expectations of your users (= your internal developers)
Feedback from MarkT:
Slide 10 - make sure you speak to the idea that IDP is the natural evolution of a pipeline
there are still pipelines…but they are more expansive, robust, supportive of “how to develop a product” not just “build code and push it”
for me - I have 3 teams under me now (Observability, Performance and AI/ML)
for all 3 teams - our technical product is a pipeline, but our business product (e.g. value) is “a new or enhanced capability of our organization”
“new capabilities” are - stuff we just never did before and it’s greatly beneficial (e.g. having PurePath traces across all environments)
“enhanced capabilities” - for things we know are good practices, but we will reduce the toil, automate the painful stuff, do the practice faster and more often
for me - I read Slide 10 for “IDP” as connecting technical practices to organizational capabilities
Platform Engineering is how we simplify computing for every developer in our organizations by creating the environments for them to run their apps.
Those environments are the platform, and they must be observable to be successful