Service-Level Objective for
Serverless Applications
Hai Duc Nguyen, Aleksander Slominski, and Lionel Villard
Background
The business improvement loop:
- Business objectives: market growth, maximize user experience
- Solution deployments: functionality, latency, throughput
- Feedback: performance metrics, earnings, etc.
Encode business logic in terms of performance
indicators that can be measured and enforced by a proper
system design
- Service-Level Agreement (SLA): a performance-guarantee
agreement defined with the user and enforced
by the implementation
- Service-Level Objective (SLO): the key elements of an SLA;
the objectives the implementation must achieve to
satisfy the SLA
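As an illustration of what "measured and enforced" can mean in practice, here is a minimal sketch (ours, not from the talk) of checking a percentile-latency SLO against measured samples; `meets_latency_slo` and the nearest-rank percentile helper are hypothetical names.

```python
# Illustrative sketch: checking one SLO -- "99% of requests complete
# in under 2 seconds" -- against a batch of measured latencies.
def percentile(samples, p):
    """Return the p-th percentile (nearest-rank) of a list of numbers."""
    ordered = sorted(samples)
    rank = max(0, int(round(p / 100.0 * len(ordered))) - 1)
    return ordered[rank]

def meets_latency_slo(latencies_s, p=99, threshold_s=2.0):
    return percentile(latencies_s, p) < threshold_s

# 100 fast requests and 1 slow one: the 99th percentile is still < 2s
samples = [0.5] * 100 + [5.0]
print(meets_latency_slo(samples))  # True
```

The SLA is the user-facing agreement; checks like this one are how the SLO side is made measurable.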
2
SLO/SLA Example: Reefer Container Shipment
reference implementation*
*see https://ibm-cloud-architecture.github.io/refarch-kc/
• Business goal: customer
experience should not be
affected by delay
• Translate to SLA/SLO
- 99% of order placement < 2s
- No order placement lost
- Availability > 99.999%
- Capacity > 1M concurrent
orders
• Business logic changes over time
- Market growth
- Response to competitors’ improvements
3
Can we translate business goals into system
implementations quickly?
Serverless (industry perspective)
• According to Cloud Native Computing Foundation*: “Serverless computing refers to the concept of
building and running applications that do not require server management. It describes a finer-
grained deployment model where applications, bundled as one or more functions, are uploaded to
a platform and then executed, scaled, and billed in response to the exact demand needed at the
moment”
* https://www.cncf.io/blog/2018/02/14/cncf-takes-first-step-towards-serverless-computing/
[Figure: serverless resource allocation tracks the workload over time, so cost follows actual demand]
- Autoscaling: dynamically add/remove
resources according to the workload; scale to zero
- Truly pay-as-you-go: pay only for function
execution time, with fine-grained pricing (per
100ms)
- Detaches resource management from the
software development/maintenance process
- Speeds up software evolution
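The pay-as-you-go model above can be made concrete with a small sketch; the per-100ms rate below is illustrative, not any provider's actual price.

```python
import math

# Hypothetical per-100ms pay-as-you-go billing.
RATE_PER_100MS = 0.000002  # USD per 100 ms of execution (illustrative)

def invocation_cost(exec_ms):
    billed_units = math.ceil(exec_ms / 100)  # round up to the next 100 ms
    return billed_units * RATE_PER_100MS

# Scale-to-zero means idle time costs nothing: only invocations are billed.
print(invocation_cost(130))  # 4e-06 (billed as 200 ms)
print(invocation_cost(0))    # 0.0
```

Contrast with a provisioned server, which accrues cost for every second it is allocated, busy or idle.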
4
Serverless Implementation
Kubernetes is “an open-source system for automating
deployment, scaling, and management of containerized
applications”*
• It has become the de facto standard platform for container orchestration.
• Market adoption is strong: 80% of companies used Kubernetes in
some form in 2019, and that number is expected to increase***
Knative is “an open source community project which adds
components for deploying, running, and managing serverless,
cloud native applications to Kubernetes”**.
* https://kubernetes.io
** https://www.redhat.com/en/topics/microservices/what-is-knative
*** Kubernetes Ecosystem, https://thenewstack.io/ebooks/kubernetes/state-of-kubernetes-ecosystem-second-edition-2020/
5
Example: handling a new shipment order
[Pipeline: New order → Preprocess → Find container → Schedule shipment → Response]
[Diagram: each stage is a function (𝑓) running in a container on Knative over Kubernetes, connected through Kafka.
Labeled steps: 1. Submit, 2. Establish, 3. Subscribe, 4. Invoke, 5. Publish, 6. Trigger]
6
Why Serverless SLO?
• In serverless, it is an order of magnitude faster to translate between
business logic and system design
• Zero deployment/maintenance effort
• New way of utilizing resources
• Finer-grained resource management → truly pay-as-you-go
• High flexibility → more room for resource-management optimization
• Spatial: stateless
• Temporal: short execution times
→ A potential way to quickly define and enforce SLOs
7
Experiment setup for serverless sequence*
Workload (input):
- Periodic bursts
- Constant ramp-up: 4 orders/sec²
- Limited rate: up to 120 orders per sec
[Pipeline: New order → forward-1 (Preprocess, 100ms) → Kafka → forward-2 (Find container, 100ms) → Kafka → forward-3 (Schedule shipment, 100ms) → Response]
System specs:
- Knative v0.15.1 over Kubernetes v1.18.2 supported by KinD**
- Events are transmitted through Kafka v2.5.0
- 12 CPUs, 48 GB of memory, and a 256 GB SSD
*Script available at https://github.com/ngduchai/event-driven-apps
**KinD deploys Kubernetes clusters on a local machine, see https://kind.sigs.k8s.io
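The workload shape described above (ramp-up, rate limit, periodic bursts) can be sketched as a simple rate function; the burst period and magnitude below are made-up parameters, not the exact values in the experiment scripts.

```python
# Hypothetical sketch of the input workload: a constant ramp-up of
# 4 orders/sec^2, capped at 120 orders/sec, with periodic bursts on top.
def request_rate(t_sec, ramp=4.0, cap=120.0, burst_every=60, burst_rate=40.0):
    base = min(ramp * t_sec, cap)  # linear ramp-up, then the rate limit
    burst = burst_rate if int(t_sec) % burst_every == 0 else 0.0
    return base + burst

print(request_rate(10))  # 40.0: still ramping (4 * 10), no burst at t=10
print(request_rate(45))  # 120.0: ramp has hit the 120 orders/sec cap
```

Bursts and ramps are exactly the conditions that expose autoscaling lag, which the next slides measure.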
8
Invocation Overhead of forward-1
• Scaling delay: the serverless framework needs time to detect workload changes and
schedule resources accordingly
• Cold start: It takes time to allocate and initialize new function sandboxes
[Figures: demand vs. system capacity; latency distribution]
9
Autoscaler behaviors over the Sequence
[Figures: scale-up delay propagation; latency delay propagation]
- The serverless platform is not aware of the sequence topology
- The overhead is amplified as computation propagates along the topology
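A toy model (ours, not the paper's) of why the overhead is amplified: each stage's autoscaler reacts only after the previous stage starts emitting traffic, so per-stage delays accumulate along the sequence rather than overlapping.

```python
# Toy model of delay propagation along a serverless sequence.
def end_to_end_startup_delay(per_stage_scaling_s, per_stage_cold_start_s):
    # Stage i's autoscaler only sees load once stage i-1 is serving,
    # so scaling delays and cold starts add up end to end.
    return sum(per_stage_scaling_s) + sum(per_stage_cold_start_s)

# Three stages (forward-1..3), each with ~2 s scaling delay and ~1 s cold start
print(end_to_end_startup_delay([2, 2, 2], [1, 1, 1]))  # 9
```

A topology-aware platform could scale all three stages in parallel as soon as load hits the first one, paying roughly one stage's delay instead of the sum.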
10
Problem and Approach
• Can we leverage the serverless model to define and enforce a new set
of SLA/SLO to support fast translation from business logic to
application solutions?
[Diagram: Business + Application Logic → Application + SLO Description → SLO Enforcer, driven by the workload.
Steps: 1. Define, 2. Deploy, 3. Run, 4. Feedback with perf. metrics, 5. Configure, 6. Evaluate]
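The loop above (steps 3 through 6) can be sketched as a single control iteration; the `measure`/`config` interface below is hypothetical, not the enforcer's actual API.

```python
# Minimal sketch of one iteration of the SLO enforcement loop:
# measure latency, evaluate it against the SLO, and reconfigure
# (here, add a reserved pod) when the SLO is violated.
def enforce_once(measure, slo_p95_ms, config):
    p95 = measure()                   # feedback with performance metrics
    if p95 > slo_p95_ms:              # evaluate against the SLO
        config["reserved_pods"] += 1  # reconfigure the deployment
    return config

cfg = {"reserved_pods": 0}
cfg = enforce_once(lambda: 1800, slo_p95_ms=1500, config=cfg)  # violated
print(cfg)  # {'reserved_pods': 1}
```

In a real enforcer the reconfiguration step would adjust Knative deployment parameters rather than a plain dict.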
11
Application Description: Open Application
Model (OAM)
• “Open Application Model (OAM) is a runtime-agnostic specification for defining
cloud native applications and enables building application-centric platforms
naturally. Focused on application rather than container or orchestrator, Open
Application Model brings modular, extensible, and portable design for building
application centric platforms on any runtime systems like Kubernetes, cloud, or
IoT devices.” *
• Provides a vocabulary for translating business goals into application designs and descriptions,
including
• Computation components (e.g. Micro-services)
• Topology
• Deployment environment
• We extend OAM for defining and enforcing SLO by
• Adding serverless (Knative deployment) description support
• Adding SLO description
*OAM see https://github.com/oam-dev/spec
12
SLO Description
• As long as
  • input rate < 100 rps
  • ramp-up < 10 req/sec²
• Guarantee:
  • 95th-percentile latency < 1000ms
  • 400ms <= mean latency <= 600ms
  • latency stddev <= 200ms
Per-request pricing:
  End-to-end latency (ms)   Earning ($)
  < 600                     0.00002
  < 1000                    0.00001
  Otherwise                 0
Translate to SLO
Add to blueprint (slodesc.yaml)
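A sketch of evaluating this SLO and pricing table against measured end-to-end latencies; the function names and structure are ours, not the slodesc.yaml schema.

```python
import statistics

# Evaluate the SLO on this slide against measured latencies (ms):
# 95th percentile < 1000 ms, mean in [400, 600] ms, stddev <= 200 ms.
def meets_slo(latencies):
    ordered = sorted(latencies)
    p95 = ordered[max(0, int(round(0.95 * len(ordered))) - 1)]
    mean = statistics.mean(latencies)
    return (p95 < 1000
            and 400 <= mean <= 600
            and statistics.pstdev(latencies) <= 200)

# Price each request using the per-request pricing table above.
def earning(latency_ms):
    if latency_ms < 600:
        return 0.00002
    if latency_ms < 1000:
        return 0.00001
    return 0.0

lat = [450, 500, 550, 500, 480, 520, 700]
print(meets_slo(lat), sum(earning(l) for l in lat))
```

The input-rate and ramp-up conditions act as guards: the guarantee only has to hold while the workload stays within them.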
13
Control Invocation Overhead with Knative
• Cold start: multiplex
multiple invocations into big
pods (i.e. containers) to
reduce the number of cold starts
• Scaling delay: reserve extra
resources to buy time when
the workload surges
14
Note: 400m+1 = pod size of 400m with
a reservation of one 400m pod
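A toy model (ours) of how the two knobs interact: a larger pod handles more concurrent invocations, so fewer pods must cold-start, while reserved pods absorb part of a surge before new pods finish starting.

```python
import math

# How many pods must cold-start when a surge arrives, given the pod's
# per-pod concurrency (a stand-in for pod size) and the reservation.
def cold_starts(surge_invocations, per_pod_concurrency, reserved_pods):
    pods_needed = math.ceil(surge_invocations / per_pod_concurrency)
    return max(0, pods_needed - reserved_pods)  # pods that must cold-start

print(cold_starts(40, per_pod_concurrency=10, reserved_pods=1))  # 3
print(cold_starts(40, per_pod_concurrency=20, reserved_pods=1))  # 1
```

This is why "400m+1" in the note trades a small standing cost (one reserved 400m pod) for fewer cold starts on a surge.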
Topology Awareness Configuration with
Knative
• Select the right pod size
• Select the right reservation
Increasing pod size doesn’t always
improve concurrency significantly
Increasing workflow size increases the
per-component deployment cost
15
Demonstration
• Naïve serverless vs. serverless + SLO enforcement
• Reuse the previous workload, application logic, and system setup
• SLO: as long as the input rate < 120 orders per sec:
• 95th percentile of end-to-end latency < 1500ms
• Recorded Videos
• LINK TO BE ADDED
16
Highlights (forward-3)
17
[Figures: Serverless vs. Serverless + SLO enforcement]
Results: Latency distribution
Serverless deployment Serverless + SLO enforcer deployment
Extremely high latency due to scaling lag
and topology unawareness
18
The SLO Enforcer successfully meets the SLO
requirements by choosing the right pod size
and reservation
Results: Earning
• Cost is calculated based on IBM
container pricing*, at 0.000034
USD/second/core
* https://cloud.ibm.com/kubernetes/catalog/about
Per-request income:
  End-to-end latency (ms)   Income ($)
  < 500                     0.0000012
  < 1000                    0.0000011
  < 1500                    0.000001
  Otherwise                 0
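The earning computation on this slide can be sketched as income from the table minus resource cost at the quoted rate; the request mix and core-seconds below are made-up inputs for illustration.

```python
# Net earning = per-request income (from the table) minus resource cost
# at IBM's quoted rate of 0.000034 USD/second/core.
COST_PER_CORE_SECOND = 0.000034

def income(latency_ms):
    if latency_ms < 500:
        return 0.0000012
    if latency_ms < 1000:
        return 0.0000011
    if latency_ms < 1500:
        return 0.000001
    return 0.0

def net_earning(latencies_ms, core_seconds_used):
    total_income = sum(income(l) for l in latencies_ms)
    return total_income - core_seconds_used * COST_PER_CORE_SECOND

# 1000 fast requests vs. the core-seconds consumed to serve them
print(net_earning([400] * 1000, core_seconds_used=20))
```

Over-provisioning raises the cost term while latency-misses lower the income term, which is the trade-off the enforcer navigates.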
19
The SLO Enforcer meets the SLO at reasonably low
cost, thereby creating high earnings and satisfying
the business goal.
Related Work
• Handling invocation overhead
• Cold start: SAND (ATC ’18), SOCK (ATC ’18), Catalyzer (ASPLOS ’20)
• Scaling: Shahrad et al. (ATC ’20)
• Per-function optimization, no topology support
• Topology-aware Deployment for Serverless: IBM Composer, AWS Step
Function
• Simple topology (sequence + parallel), no performance guarantee
• Performance Guarantee for Serverless: Real-time Serverless (WoSC19)
• Rate guarantee but no topology support
20
Conclusion and Future Vision
• Serverless opens opportunities to quickly build and adjust software
solutions to match business goals, but many challenges arise
• Scaling overhead
• Lack of topology awareness
• We propose an SLO Serverless interface to describe and enforce
business goals in terms of SLOs
• Long-term vision
• Support more complicated workload topologies
• Efficient SLO enforcement (smarter metrics selection, ML approaches, etc.)
• Generic mechanism for all serverless platforms (not just Knative)
21
Coming soon…
• Available soon:
• Blog post
• Demonstration scripts
• … slides, demo recording, and talk recording will be posted after the
talk
22
Thank you!
23
Backup slides
24
[Diagram: supply chain from manufacturer to retailer. New orders flow through a chain of functions (𝑓):
Inspect → Valid? → (No: Return; Yes: Find voyages → Find container).
A Workload Generator submits Orders to a function that finds containers for each order,
emitting AllocatedOrder or RejectedOrder events consumed by a Workload Observer.]
The Reality
25
[Diagrams: experiment variants, from full to empty:
(1) Workload Generator → Order → 𝑓 (find containers for an order) → AllocatedOrder / RejectedOrder → Workload Observer;
(2) Workload Generator → Order → 𝑓;
(3) Workload Generator → 𝑓;
(4) Workload Generator only (empty).]
26