The document summarizes research on assessing the scalability of microservice architectures. It discusses how microservices introduce challenges for monitoring performance and reliability due to their decentralized nature. The researcher aims to develop approaches to identify bottlenecks, anomalies, and anti-patterns in microservices. The document outlines a framework called PPTAM that generates load tests to analyze the performance of different architectural configurations and identifies the most scalable option based on success rates under various workloads. Ongoing work also looks to recognize common anti-patterns that can degrade microservice performance.
The starting point
– "The term 'Microservice Architecture' has sprung up over the last few years to describe a particular way of designing software applications as suites of independently deployable services. While there is no precise definition of this architectural style, there are certain common characteristics around organization around business capability, automated deployment, intelligence in the endpoints, and decentralized control of languages and data."1

1 Martin Fowler 2014: Microservices, https://martinfowler.com/articles/microservices.html
Common representation of such architecture
[Figure: a front-end talks to a service registry and to four microservices (Microservice 1 to Microservice 4).]
Continuous delivery is key
– Automation is needed to operate such a "system of systems" (e.g., to use "blue/green deployment").
– Monitoring is crucial:
– failures in one service can cause degradation in another
– timing issues have a higher impact than in a monolith
– debugging becomes complicated
– Verifying and guaranteeing quality requirements become harder.
– Our research goal is to develop approaches to identify performance bottlenecks, detect anomalies, and recognize anti-patterns in a DevOps setting.
PPTAM: Overview (container diagram)2
[Figure: container diagram. PPTAM comprises a Driver, a Testbed, an Analysis container, a Repository, and a Dashboard, complemented by external APM Tools. The Driver reads the configuration to generate the test plan, deploys and queries the Testbed, and stores test results in the Repository; the Analysis retrieves them and records the pass/fail outcome of the performance tests, which the Dashboard visualizes. The Test Engineer records the performance tests' pass/fail outcome, the Product Manager performs quality assurance and release management, the Software Developer tries to isolate problems, and the Architect looks for architectural alternatives.]

2 https://github.com/pptam/pptam-tool
PPTAM: Overview (component diagram)
[Figure: component diagram of PPTAM, with interaction steps numbered 0 to 5. Within the Driver, a Test orchestrator uses the PPTAM configuration and a test-by-test configuration that describes the SUT; it configures the Testing framework, which uses a Load test template to drive the SUT deployed on a Docker swarm, while a Metrics collection component queries the SUT and stores the test results in the Repository. Within Analytics, an Extraction/Aggregation of SUT Data component reads system logs and APM Tools, a Test plan generator reads from the Repository to generate the test plan, and a Dashboarding component visualizes results for the Test Engineer, Product Manager, Software Developer, and Architect.]
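To make the orchestration flow concrete, below is a minimal sketch of the test-by-test loop the component diagram suggests. All names (TestConfiguration, deploy_sut, run_load_test, store_results) are hypothetical placeholders for illustration, not the actual pptam-tool API.

```python
# Minimal sketch of the orchestration loop suggested by the component
# diagram. All names are hypothetical placeholders, not pptam-tool API.
import json
from dataclasses import dataclass

@dataclass
class TestConfiguration:
    name: str          # architectural alternative under test
    workload: int      # number of concurrent users to simulate
    duration_s: int    # how long to apply the load

def deploy_sut(config: TestConfiguration) -> None:
    """Deploy the system under test (e.g., on a Docker swarm)."""
    print(f"deploying SUT for {config.name}")

def run_load_test(config: TestConfiguration) -> dict:
    """Drive the SUT via the testing framework and collect per-service metrics."""
    return {"service1": {"mean_rt": 0.015}, "service2": {"mean_rt": 2.009}}

def store_results(config: TestConfiguration, metrics: dict) -> None:
    """Persist the test results in the repository."""
    with open(f"{config.name}-{config.workload}.json", "w") as f:
        json.dump(metrics, f)

def orchestrate(configurations: list[TestConfiguration]) -> None:
    # Test-by-test loop: deploy, drive, measure, store.
    for config in configurations:
        deploy_sut(config)
        metrics = run_load_test(config)
        store_results(config, metrics)

orchestrate([TestConfiguration("alternative-1", workload=50, duration_s=300)])
```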
PPTAM: Process
[Figure: the process chains Analysis, Sampling, Experiment generation, and Experiment execution. First, workload situations are observed over time (e.g., the workload at 9 am) and condensed into an empirical distribution of workload situations (relative frequency per workload). A sample of workload situations is selected from this distribution and, together with the baseline requirements, the load test template, and the architectural alternatives, turned into a load test sequence. Finally, the test engine executes the sequence on the testbed and PPTAM records a metric per service, yielding results for each architectural alternative.]

The empirical distribution assigns each observed workload situation w a probability p(w), for example:

workload | p(w)
---------|-----
50       | .11
100      | .19
150      | .22
...      | ...
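To illustrate the sampling step, here is a minimal sketch that draws workload situations in proportion to their empirical probabilities. The distribution values come from the table above; the sample size of 10 is an arbitrary assumption.

```python
# Minimal sketch of the sampling step: draw load-test workloads from the
# empirical distribution of observed workload situations shown above.
import random

# Empirical distribution: workload situation -> relative frequency p(w)
# (remaining probability mass omitted on the slide).
distribution = {50: 0.11, 100: 0.19, 150: 0.22}

workloads = list(distribution.keys())
weights = list(distribution.values())  # random.choices normalizes the weights

# Draw a sample of workload situations to turn into a load test sequence.
sampled = random.choices(workloads, weights=weights, k=10)
print(sampled)  # e.g., [150, 100, 150, 50, ...]
```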
Analysis of the results
[Figure, built up incrementally across several slides: response time plotted against workload. The mean response time each service exhibits at the baseline workload defines its baseline (shown for service 1 and service 2). Following each service's response-time curve, the maximum tolerated workload for service 1 and for service 2 is the workload beyond which the service no longer meets the requirement derived from its baseline. The operating point marks the workload at which each service's measured mean response time is checked against that requirement.]
At the operating point, each service's measured mean response time x(lop) is checked against a requirement derived from its baseline; the numbers below are consistent with Req. = x(l0) + 3σ:

Variable  | Service 1 | Service 2
----------|-----------|----------
x(l0)     | 0.018     | 2.008
σ         | 0.008     | 0.003
Req.      | 0.042     | 2.017
x(lop)    | 0.015     | 2.009
Pass/fail | pass      | pass
Calls     | 20%       | 80%

Both services pass, so the success rate sums the call shares of all passing services:

Success rate = 20% + 80% = 100%
If service 1's mean response time at the operating point instead exceeds its requirement, that service fails, and only the calls of the passing services count:

Variable  | Service 1 | Service 2
----------|-----------|----------
x(l0)     | 0.018     | 2.008
σ         | 0.008     | 0.003
Req.      | 0.042     | 2.017
x(lop)    | 2.015     | 2.009
Pass/fail | fail      | pass
Calls     | 22%       | 78%

Success rate = 78%
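The pass/fail check and the success-rate computation can be sketched as follows, assuming the requirement is the baseline mean plus three standard deviations (which matches the Req. rows above). The data layout and function names are illustrative, not the pptam-tool API.

```python
# Sketch of the pass/fail analysis: a service passes if its mean response
# time at the operating point stays within the baseline requirement
# (baseline mean + 3 sigma, matching the "Req." rows above); the success
# rate is the share of calls covered by passing services.

def requirement(baseline_mean: float, sigma: float) -> float:
    return baseline_mean + 3 * sigma

def success_rate(services: dict) -> float:
    rate = 0.0
    for name, s in services.items():
        if s["x_lop"] <= requirement(s["x_l0"], s["sigma"]):
            rate += s["calls"]  # only passing services contribute
    return rate

# Numbers from the failing example above.
services = {
    "service1": {"x_l0": 0.018, "sigma": 0.008, "x_lop": 2.015, "calls": 0.22},
    "service2": {"x_l0": 2.008, "sigma": 0.003, "x_lop": 2.009, "calls": 0.78},
}
print(success_rate(services))  # 0.78: service 1 fails, only service 2's calls count
```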
Success rate for different workloads
[Figure: success rate (0.0 to 1.0) plotted against the sampled workload situations (50 to 200) for architectural alternative 1 and architectural alternative 2. Comparing the curves shows which alternative sustains a high success rate under heavier workloads, i.e., which is the more scalable option.]
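One plausible way to condense such curves into a single score per alternative is to weight each sampled workload situation's success rate by its empirical probability p(w). The sketch below uses the distribution from the process slide and made-up success rates; it is an illustration, not necessarily the exact aggregation PPTAM uses.

```python
# Sketch: condense a success-rate curve into one score per architectural
# alternative by weighting each workload situation by its empirical
# probability p(w). Success-rate values here are made up for illustration.

# p(w) from the empirical distribution (slide "PPTAM: Process").
p = {50: 0.11, 100: 0.19, 150: 0.22}

# Hypothetical success rates per workload for two alternatives.
alt1 = {50: 1.0, 100: 0.78, 150: 0.40}
alt2 = {50: 1.0, 100: 1.00, 150: 0.85}

def domain_score(success: dict, p: dict) -> float:
    # Probability-weighted success rate over the sampled workload situations.
    return sum(p[w] * success[w] for w in p)

print(domain_score(alt1, p))  # lower score: degrades earlier under load
print(domain_score(alt2, p))  # higher score: the more scalable alternative
```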
Ongoing work: identification of anti-patterns
– Application Hiccups: temporarily increased response times that later return to normal (see the detection sketch after this list).
– The Stifle: data is retrieved through many similar (or identical) database queries. Since each request causes considerable overhead, the large number of database requests leads to a performance problem.
– Traffic Jam: one problem causes a backlog of jobs, producing wide variability in response times that persists long after the problem itself has disappeared.
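As an illustration, the sketch below flags candidate Application Hiccups in a response-time series as contiguous runs above a threshold that later return to normal. The threshold and the series are assumptions, not the project's actual detector.

```python
# Illustrative sketch (not the project's detector): flag "application
# hiccups", i.e., windows where response times temporarily exceed a
# threshold and then return to normal.

def find_hiccups(response_times: list[float], threshold: float) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of contiguous above-threshold runs."""
    hiccups = []
    start = None
    for i, rt in enumerate(response_times):
        if rt > threshold and start is None:
            start = i                       # hiccup begins
        elif rt <= threshold and start is not None:
            hiccups.append((start, i - 1))  # times are back to normal
            start = None
    # A run still open at the end is sustained degradation, not a hiccup.
    return hiccups

series = [0.02, 0.02, 0.35, 0.40, 0.03, 0.02, 0.50, 0.02]
print(find_hiccups(series, threshold=0.1))  # [(2, 3), (6, 6)]
```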