Self-adaptive container monitoring with performance-aware Load-Shedding policies, by Rolando Brondolin, PhD student in System Architecture at Politecnico di Milano
1. Self-adaptive container monitoring with performance-aware load-shedding policies
NECST Group Conference 2017 @ Sysdig
07/05/2017
Rolando Brondolin
rolando.brondolin@polimi.it
DEIB, Politecnico di Milano
2. Cloud trends
• 2017 State of the cloud [1]:
– 79% of workloads run in cloud (41% public, 38% private)
– Operations focusing on:
• moving more workloads to cloud
• optimizing existing cloud usage (cost reduction)
• Nowadays Docker is becoming the de facto standard for cloud deployments
– lightweight abstraction on system resources
– fast deployment, management and maintenance
– large deployments and automatic orchestration
[1] Kim Weins, “Cloud Computing Trends: 2017 State of the Cloud Survey”, RightScale
4. Infrastructure monitoring (1)
• Container complexity demands strong monitoring capabilities
– Systematic approach for monitoring and troubleshooting
– Trade-off between data granularity and resource consumption
[Figure: three sources of monitoring data side by side — application metrics (#requests/s, heap size, CPU usage), queue-model metrics (Q(t), λ(t), μ(t)), and syscall counters (#store/s, #load/s)]
• high visibility on system state, but at a non-negligible cost, vs. cheap monitoring with little information on system state
5. Infrastructure monitoring (2)
• Container complexity demands strong monitoring capabilities
– Systematic approach for monitoring and troubleshooting
– Trade-off between data granularity and resource consumption: little information on system state with cheap monitoring vs. high visibility on system state at a non-negligible cost

Approach | Granularity | Instrumentation | Metrics rate
#requests/s, heap size, CPU usage | High data granularity | Code instrumentation | Low metrics rate
Q(t), λ(t), μ(t) | Good data granularity | Code instrumentation | High metrics rate
#store/s, #load/s | High data granularity | No instrumentation | High metrics rate
6. Sysdig Cloud monitoring
http://www.sysdig.org
• Infrastructure for container monitoring
• Collects aggregated metrics and shows system state:
– “Drill-down” from cluster to single application metrics
– Dynamic network topology
– Alerting and anomaly detection
• Monitoring agent deployed on each machine in the cluster
– Traces system calls in a “streaming fashion”
– Aggregates data for threads, file descriptors (FDs), applications, containers and hosts
7. Problem definition
• The Sysdig Cloud agent can be modelled as a server with a finite queue
• characterized by its arrival rate λ(t) and its service rate μ(t)
• Subject to overloading conditions

Cause: events arrive at a very high frequency → Effect: queues grow indefinitely → Issues: high usage of system resources, uncontrolled loss of events, output quality degradation

[Figure: a streaming system with a queue Q feeding a server S; input flow Λ at arrival rate λ(t), service rate μ(t), output flow Φ at rate φ(t)]

A server S, fed by a queue Q, is in overloading when the arrival rate λ(t) is greater than the service rate μ(t). The stability condition

μ(t) ≥ λ(t)   (2.1)

is the necessary and sufficient condition to avoid overloading: a system experiencing overloading should discard part of the input so that the effective service rate matches the arrival rate λ(t).

The goal of this formalization is twofold, as we are interested not only in controlling the overload, but also in maximizing the accuracy of the estimated metrics. To this end we define x(t), which represents the input flow at a given time t, and x̃(t), which is the portion of the input flow considered in case of overloading at the same time t. A sketch of the queue dynamics follows.
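To make the overloading condition concrete, here is a minimal Python sketch (our illustration, not from the deck) of the queue recurrence Q(t) = Q(t−1) + λ(t) − μ(t):

```python
# Minimal illustration: queue length under constant arrival/service rates.
def simulate(arrival_rate, service_rate, steps):
    q, history = 0, []
    for _ in range(steps):
        served = min(service_rate, q + arrival_rate)  # can't serve what isn't there
        q = q + arrival_rate - served                 # Q(t) = Q(t-1) + lambda - mu
        history.append(q)
    return history

print(simulate(1_500_000, 1_200_000, 5))  # lambda > mu: queue grows without bound
print(simulate(1_000_000, 1_200_000, 5))  # lambda <= mu: queue stays empty
```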
8. Proposed solution: FFWD
Fast Forward With Degradation (FFWD) is a framework that tackles load peaks
in streaming applications via load-shedding techniques
– a general approach that leverages domain-specific details
• FFWD is built from three cooperating components plus a feedback path:
– Load Manager: decides *when* to shed load
– Policy wrapper: decides *how much* to shed, producing the shedding plan
– LS Filter: decides *where* to shed, applying the plan to the input stream
– dropped events are reported back as an aggregated metrics correction
• Goals:
– mitigate high usage of system resources
– avoid uncontrolled loss of events
– minimize output quality degradation
9. Utilization-based Load Manager
The system can be modeled by means of queuing theory: the application is a single server node fed by a queue, which provides the input jobs at a variable arrival rate λ(t); the application serves jobs at a service rate μ(t), both measured in events per second.

The simplest way to model the system behavior is Little's law (1), which states that the number of jobs inside a system equals the arrival rate times the system response time. The system can then be characterized by its utilization and its queue size:

N(t) = λ(t) · R(t)   (1)   (Little's law)
Q(t) = Q(t−1) + λ(t) − μ(t)   (2)   (queue size: residual events plus arrived events minus served events)
U(t) = λ(t)/μmax + Q(t)/μmax   (3)   (current utilization: arrival rate and queue size over the max theoretical throughput μmax)
Q(t) = μmax · U(t) − λ(t)   (4)

Control error (current utilization vs. target utilization Ū):

e(t) = U(t) − Ū   (5)

Requested throughput:

μ(t+1) = λ(t) + μmax · e(t)   (6)

The Load Manager formulation is composed of two contributions. On the one hand, when the contribution of the feedback error e(t) goes to zero, the stability condition λ(t) ≤ μ(t) is met; on the other hand, the term λ(t) ensures a fast actuation in case of a significant deviation from the actual system equilibrium.

During the lifetime of the system, the arrival rate λ(t) can vary unpredictably and can be greater than the system capacity μc(t), defined as the rate of events processed per second. Given the control action μ(t) (i.e., the throughput of the system) and the system capacity, we can define μd(t) as the dropping rate of the load shedder. Estimating the current system capacity as the number of events analyzed in the last time period, the requested service rate is the sum of the estimated capacity and the number of events we need to drop to achieve stability:

μ(t) = μc(t−1) + μd(t)   (7)

The requested throughput μ(t+1) is used by the load shedding policies to derive the LS probabilities (see the sketch below).
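As an illustration of the control law above, here is a minimal Python sketch of one Load Manager period; the function and parameter names (load_manager_step, u_target, ...) are ours, not from the FFWD implementation:

```python
# Illustrative sketch of the utilization-based Load Manager (names are ours).

def load_manager_step(lam, queue, mu_max, u_target, capacity_prev):
    """One control period.

    lam           -- arrival rate lambda(t), events/s
    queue         -- queue size Q(t), events
    mu_max        -- max theoretical throughput, events/s
    u_target      -- target utilization U-bar (the set point)
    capacity_prev -- estimated capacity mu_c(t-1): events served last period
    """
    u = (lam + queue) / mu_max                   # current utilization U(t)
    e = u - u_target                             # control error e(t) = U(t) - U-bar
    mu_next = lam + mu_max * e                   # requested throughput mu(t+1)
    mu_drop = max(0.0, mu_next - capacity_prev)  # dropping rate mu_d(t)
    return mu_next, mu_drop

# Example: system above its set point must drop part of the input
mu_next, mu_drop = load_manager_step(
    lam=1_500_000, queue=50_000, mu_max=100_000_000,
    u_target=0.011, capacity_prev=1_200_000)
```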
10. Policy wrapper and policies
• The policy wrapper provides access to statistics of processes,
the requested throughput μ(t+1) and the system capacity μc(t)
Fair policy
• Assign each process the “same” number of events
• Preserves the metrics of small processes while keeping accurate results on the big ones

Priority-based policy
• Assign a static priority to each process
• Compute a weighted priority to partition the system capacity
• Assign a partition to each process and compute the probabilities

Baseline policy
• Compute one LS probability for all processes (from μ(t+1) and μc(t))

The three policies are sketched below.
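An illustrative reconstruction of the three policies in Python (function and parameter names are ours; the real implementations live in the FFWD policy wrapper):

```python
# Illustrative reconstruction of the FFWD shedding policies (names are ours).

def baseline_policy(mu_next, mu_c):
    """One LS probability for all processes, from the requested
    throughput mu(t+1) and the estimated capacity mu_c(t)."""
    return max(0.0, (mu_next - mu_c) / mu_next) if mu_next > 0 else 0.0

def fair_policy(rates, capacity):
    """Same event budget per process; processes below the budget
    are untouched, so their metrics stay exact."""
    share = capacity / len(rates)
    return {p: 0.0 if r <= share else 1.0 - share / r
            for p, r in rates.items()}

def priority_policy(rates, prio, capacity):
    """Partition the capacity proportionally to static priorities."""
    total = sum(prio[p] for p in rates)
    plan = {}
    for p, r in rates.items():
        budget = capacity * prio[p] / total   # this process's share
        plan[p] = 0.0 if r <= budget else 1.0 - budget / r
    return plan

# Example with the event rates of tests A (nginx) and C (fio)
plan = priority_policy({"nginx": 800_000, "fio": 1_300_000},
                       {"nginx": 3, "fio": 4}, capacity=1_000_000)
# -> nginx budget ~428K evt/s, fio budget ~571K evt/s
```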
11. Load Shedding Filter
• The Load Shedding Filter applies the probabilities
computed by the policies to the input stream
• For each event:
• Look up the load-shedding probability for the event's input class
• If no entry is found, the event can be dropped
• Otherwise, apply the load-shedding probability computed by the policy (see the sketch below)
• The dropped events are reported to the application for metrics correction
[Figure: Event Capture feeds the Load Shedding Filter through per-class event buffers; the filter looks up the drop probability in the Shedding Plan and marks each event ok or ko]
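A minimal sketch of the per-event decision described above (illustrative; the real filter operates on syscall event buffers inside the agent):

```python
import random

def ls_filter(event_class, plan, dropped):
    """Apply the shedding plan to one event; returns True if the event passes.

    plan    -- dict: input class (e.g. a process) -> drop probability
    dropped -- dict: per-class count of shed events, for metrics correction
    """
    p_drop = plan.get(event_class, 1.0)   # no entry in the plan: droppable
    if random.random() < p_drop:
        dropped[event_class] = dropped.get(event_class, 0) + 1  # report the drop
        return False                      # ko: event shed
    return True                           # ok: event enters the pipeline
```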
12. Experimental setup
• We evaluated FFWD within Sysdig with 2 goals:
– system stability (slide 13)
– output quality (slides 14-17)
• Results compared with the reference filtering system of Sysdig
• Evaluation setup:
– 2x Xeon E5-2650 v3, 20 cores (40 w/ HT) @ 2.3 GHz
– 128 GB DDR4 RAM
• Syscall-intensive tests selected from the Phoronix test suite

Homogeneous benchmarks:
test ID | name | priority | # evts/s
A | nginx | 3 | 800K
B | postmark | 4 | 1.2M
C | fio | 4 | 1.3M
D | simplefile | 2 | 1.5M
E | apache | 2 | 1.9M

Heterogeneous benchmarks:
test ID | instances | # evts/s
F | 3x nginx, 1x fio | 1.3M
G | 1x nginx, 1x simplefile | 1.3M
H | 1x apache, 2x postmark, 1x fio | 1.8M
13. System stability
• We evaluated the Load Manager with all the tests (A-H)
• with 3 different set points (Ut = 1.0%, 1.1%, 1.2% w.r.t. system capacity)
• measuring the CPU load of the sysdig agent with:
– the reference implementation
– FFWD with the fair and priority policies
• We compared the actual CPU load with the QoS requirement (Ut)
• Error measured with MAPE (lower is better, see below), obtained by running each benchmark 20 times
• 3.51x average MAPE improvement; average MAPE below 5%

MAPE at Ut = 1.1%:
Test | reference | fair | priority
A | 7.12% | 1.78% | 3.78%
B | 34.06% | 4.37% | 4.46%
C | 28.03% | 2.27% | 2.24%
D | 11.52% | 1.41% | 1.54%
E | 26.02% | 8.51% | 8.99%
F | 22.67% | 8.11% | 3.74%
G | 16.42% | 3.37% | 2.73%
H | 19.92% | 8.41% | 8.01%
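For reference, MAPE here measures how far the agent's measured CPU load deviates from the set point; a minimal sketch of the metric:

```python
def mape(samples, target):
    """Mean absolute percentage error of measured samples w.r.t. a target."""
    return 100.0 * sum(abs(s - target) / target for s in samples) / len(samples)

# e.g. CPU-load samples (%) around the 1.1% set point
print(mape([1.15, 1.08, 1.12, 1.05], target=1.1))  # ~3.2%
```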
14. Output quality - heterogeneous
• We mixed the homogeneous tests to:
• simulate a co-located environment
• add OS scheduling uncertainty and noise
• QoS requirement Ut = 1.1%
• MAPE (lower is better) between exact and approximated metrics
• Compare metrics from reference, FFWD fair, FFWD priority
• Three tests with different syscall mix:
• Network based mid-throughput: 1x Fio, 3x Nginx, 1.3M evt/s
• Mixed mid-throughput: 1x Simplefile, 1x Nginx, 1.3M evt/s
• Mixed high-throughput: 1x Apache, 1x Fio, 2x Postmark, 1.8M evt/s
18. Conclusion
• We saw the main challenges of Load Shedding for container monitoring
– Low overhead monitoring
– High quality and granularity of metrics
• Fast Forward With Degradation (FFWD)
– Heuristic controller for bounded CPU usage
– Pluggable policies for domain-specific load shedding
– Accurate computation of output metrics
– Load Shedding Filter for fast drop of events
19. Questions?
Rolando Brondolin, rolando.brondolin@polimi.it
DEIB, Politecnico di Milano
NGC VIII 2017 @ SF
FFWD: Latency-aware event stream processing via domain-specific load-shedding policies. R. Brondolin, M. Ferroni, M. D. Santambrogio. In Proceedings of the 14th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2016).
24. Response time Load Manager

The system can be characterized by its response time and the jobs in the system.

[Equations lost in the export: Little's Law, the jobs in the system, a control error based on the old response time and the target response time, and a requested throughput computed from the arrival rate and the control error. Slides 25 and 26 repeat this slide, highlighting the individual terms.]

The requested throughput is used by the load shedding policies to derive the LS probabilities.
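Since the response-time equations did not survive the export, the following is only our guess at how such a controller could look, using Little's law and the surviving term labels (old response time, target response time, arrival rate, control error); it is not the deck's formulation:

```python
# Speculative sketch of a response-time Load Manager step (not from the deck).
def rt_load_manager_step(lam, r_old, r_target):
    """lam: arrival rate (events/s); r_old: last measured response time (s);
    r_target: target response time (s). Returns a requested throughput."""
    n = lam * r_old                       # jobs in the system (Little's law)
    e = (r_old - r_target) / r_target     # control error on the response time
    return lam * (1.0 + e)                # == n / r_target: drain toward target
```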
27. Case studies

System monitoring [2]
• Goal: distributed monitoring of systems and applications w/ syscalls
• Constraint: CPU utilization
• Based on: Sysdig monitoring agent
• Output: aggregated performance metrics for applications, containers, hosts
• FFWD ensures low CPU overhead
• policies based on the processes in the system

Sentiment analysis [1]
• Goal: perform real-time analysis on tweets
• Constraint: latency
• Based on: Stanford NLP toolkit
• Output: aggregated sentiment score for each keyword and hashtag
• FFWD keeps the response time bounded
• policies on tweet keywords and #hashtags

[1] http://nlp.stanford.edu [2] http://www.sysdig.org
29. Real-time sentiment analysis
• Real-time sentiment analysis makes it possible to:
– track the sentiment of a topic over time
– correlate real-world events with the related sentiment, e.g.
• Toyota crisis (2010) [1]
• 2012 US Presidential Election Cycle [2]
– track the online evolution of a company's reputation, derive social profiling and enable enhanced social-marketing strategies
[1] Bifet Figuerol, Albert Carles, et al. "Detecting sentiment change in Twitter streaming data." Journal of Machine Learning Research:
Workshop and Conference Proceedings Series. 2011.
[2] Wang, Hao, et al. "A system for real-time twitter sentiment analysis of 2012 us presidential election cycle." Proceedings of the ACL
2012 System Demonstrations.
30. Sentiment analysis: case study
• Simple Twitter streaming sentiment analyzer with Stanford NLP
• System components:
– Event producer
– RabbitMQ queue
– Event consumer
• Consumer components:
– Event Capture
– Sentiment Analyzer
– Sentiment Aggregator
• Real-time queue consumption; aggregated metrics emitted every second (keyword and hashtag sentiment)
31. FFWD: Sentiment analysis
• FFWD adds four components:
– Load shedding filter at the beginning of the pipeline
– Shedding plan used by the filter
– Domain-specific policy wrapper
– An application-level controller (the Load Manager) to detect load peaks
[Pipeline figure: input tweets flow from the Producer through the real-time queue (with a batch queue alongside) into the Load Shedding Filter, which reads the drop probability from the Shedding Plan and marks each tweet ok or ko; ok tweets pass through Event Capture, the Sentiment Analyzer and the Sentiment Aggregator to the output metrics; the Policy Wrapper updates the plan from stream stats and account metrics, while the Load Manager computes μ(t+1) from λ(t) and R(t); the ko count feeds the metrics correction]
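A toy end-to-end wiring of these components (entirely illustrative; `analyze` is a stand-in for the Stanford NLP sentiment call, and all names are ours):

```python
import random

def analyze(text):
    return 0  # stand-in for the Stanford NLP sentiment score (illustrative)

def consume(tweets, plan, aggregator):
    """tweets: iterable of (keyword, text); plan: keyword -> drop probability."""
    dropped = {}
    for keyword, text in tweets:
        if random.random() < plan.get(keyword, 1.0):   # shed before analysis
            dropped[keyword] = dropped.get(keyword, 0) + 1
            continue                                   # ko: tweet dropped
        aggregator.setdefault(keyword, []).append(analyze(text))  # ok
    return dropped  # reported each second for metrics correction
```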
32. Sentiment - experimental setup
• Separate tests to understand FFWD behavior:
– System stability
– Output quality
– Dataset: 900K tweets from the 35th week of the Premier League
• Performed tests:
– Controller: synthetic and real tweets at various λ(t)
– Policy: real tweets at various λ(t)
• Evaluation setup
– Intel Core i7-3770, 4 cores @ 3.4 GHz + HT, 8 MB LLC
– 8 GB RAM @ 1600 MHz