
~ilities Testing



Migrating applications and platforms from VM-based infrastructure to container-based infrastructure is a challenging task.
One key strategy is to make sure the quality attributes (software -ilities) are carefully tested during the migration.
In this presentation, we share our solution for automating software -ilities testing, particularly for Operability and Reliability.

Published in: Technology


  1. ~ilities Test Automation. Xi Chen, Aldo Suwandi. Delivery and Quality Solution Group, Ecosystem Service Department
  2. Background Story
  3. Rakuten EcoSystem • Global Start Up and Expansion • Enterprise in Japan. Reliability, Recoverability, Scalability, Operability
  4. Current Eco-system review: Planned Scale Out / In; Monolithic Architecture; No Standard OPS Automation
  5. Requirements for modern platform: ZED. Microservice Architecture; High Reliability / Recoverability; Easy Scaling / Operation Standardization. https://jenkins.io/
  6. Ecosystem Service Operation. Diagram labels: User, SRE, Service A, Service B, Service C; Reliability, Operability, Recoverability / Scalability
  7. ~ility Test for modern platform • Reliability Test • Operability Test • Scalability Test • Recoverability Test
  8. ~ility Test Problems
  9. Definition. Reliability: the capability of the system to maintain its service provision under defined conditions for defined periods of time. Operability: the ability of the software to be easily operated by a given user in a given environment. (ISO 9126 Software Quality Characteristics)
  10. Reliability. Diagram labels: User requests; User; User; Pod - A, Pod - B, Pod - C; service / application
  11. Monitoring Operability. Diagram labels: kibana, SRE, app, fluentd, pod (1..X), datadog agent, elastic-search, kubernetes, application utilization, application log, kubernetes event, new relic event, alert, operate
  12. User Story: 1. As an SRE, I want to be notified by the monitoring / alert system within 5 minutes of an incident. 2. As an SRE, when I scale out the application, no error alert should be triggered by the monitoring system. 3. As QA, I want to verify that a certain percentage of requests still succeed during an incident.
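
User story 1 can be read as a deadline check: after an incident is triggered, an alert must be observed within 5 minutes. The following is a minimal sketch of such a check, not the presenters' framework. The function name `alert_raised_within`, the `alert_is_firing` callable, and the 10-second polling interval are all assumptions introduced for illustration; only the 5-minute deadline comes from the slide.

```python
import time
from typing import Callable


def alert_raised_within(
    alert_is_firing: Callable[[], bool],  # hypothetical probe of the alert system
    deadline_s: float = 300.0,            # 5 minutes, taken from user story 1
    poll_s: float = 10.0,                 # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the alert fires or the deadline passes.

    Returns True if alert_is_firing() returned True before the
    deadline elapsed, False otherwise.
    """
    start = clock()
    while True:
        if alert_is_firing():
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(poll_s)
```

The clock and sleep functions are injectable so the check can be exercised in a unit test without actually waiting five minutes.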
  13. Current Problem & Situation: It requires at least 10 days to complete operability and reliability testing • Manual execution of manifest configuration settings • Manual checking of alert system / configuration • Environment preparation
  14. Solution
  15. Main Features: 1. Operability Test 2. Reliability Test + Performance Test 3. Reliability Test + Functional Test
  16. Operability Test. Diagram labels: SRE, Reliability Test Framework, API
  17. Demo (1)
  18. Demo (2)
  19. Reliability + Performance Test. Diagram labels: QA, Reliability Test Framework; execute, trigger, result, system failure, test triggered, result. Chart: Number of Requests per Second, time axis 0:00:00 to 0:01:30. All Requests: 50, 100, 150, 200, 210, 190, 203, 185, 200. Failed Requests: 0, 0, 0, 0, 10, 8, 3, 2, 4. Successful Request: 50, 100, 150, 200, 200, 182, 200, 183, 196. https://jenkins.io/
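
The chart values on this slide are enough to illustrate the kind of verdict that user story 3 asks for (a certain percentage of requests succeeding during an incident). The sketch below computes the per-interval success rate from the "All Requests" and "Failed Requests" series shown on the slide. The 0.95 threshold and the function names are assumptions for illustration; the slides do not state the pass criterion that the framework uses.

```python
from typing import List

# Series as printed on the slide's chart (requests per second).
ALL_REQUESTS = [50, 100, 150, 200, 210, 190, 203, 185, 200]
FAILED_REQUESTS = [0, 0, 0, 0, 10, 8, 3, 2, 4]

# Hypothetical pass criterion; not stated in the slides.
MIN_SUCCESS_RATE = 0.95


def success_rates(total: List[int], failed: List[int]) -> List[float]:
    """Return the fraction of successful requests for each interval."""
    if len(total) != len(failed):
        raise ValueError("series must have the same length")
    # An interval with no traffic is treated as fully successful.
    return [(t - f) / t if t else 1.0 for t, f in zip(total, failed)]


def meets_threshold(total: List[int], failed: List[int], min_rate: float) -> bool:
    """True if every interval's success rate is at least min_rate."""
    return all(rate >= min_rate for rate in success_rates(total, failed))


if __name__ == "__main__":
    rates = success_rates(ALL_REQUESTS, FAILED_REQUESTS)
    print("worst interval success rate:", round(min(rates), 4))
    print("passes:", meets_threshold(ALL_REQUESTS, FAILED_REQUESTS, MIN_SUCCESS_RATE))
```

Subtracting the failed series from the total series reproduces the "Successful Request" values on the chart, which is a quick consistency check on the transcribed numbers.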
  20. Demo
  21. Reliability + Functional Test. Diagram labels: QA, Functional Test Framework, API, Reliability Test Framework; dependency, trigger system failure, functional test case
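
One way to picture the "trigger system failure" and "functional test case" pairing on this slide is a wrapper that injects a failure, runs the functional test while the failure is active, and always restores the system afterwards. This is a sketch under stated assumptions, not the presenters' implementation: `injected_failure`, `run_with_failure`, and the `trigger` / `recover` callables are hypothetical names, and how the real framework talks to the platform is not described in the slides.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@contextmanager
def injected_failure(
    trigger: Callable[[], None],  # hypothetical: starts the system failure
    recover: Callable[[], None],  # hypothetical: restores the system
) -> Iterator[None]:
    """Keep a failure active for the duration of the with-block."""
    trigger()
    try:
        yield
    finally:
        # Recovery runs even if the functional test raises.
        recover()


def run_with_failure(
    test_case: Callable[[], T],
    trigger: Callable[[], None],
    recover: Callable[[], None],
) -> T:
    """Run a functional test case while the injected failure is active."""
    with injected_failure(trigger, recover):
        return test_case()
```

Putting recovery in a `finally` clause means a failing functional test cannot leave the environment in a broken state for the next scenario.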
  22. Demo
  23. Conclusion
  24. Results. Before: It requires 10 days to complete due to: • Manual execution of manifest configuration settings • Manual checking of alert triggered • Environment preparation. After: It takes only approximately 2 days to finish all the tests, since all of the test setup and scenarios are automated.
  25. Summary: 1. This test framework can reduce lead time by giving the SRE team confidence in their system configurations. 2. It provides transparency to all stakeholders about operational activities. 3. It allows QA / test engineers to test from a reliability perspective.
  26. We are hiring Senior QA Engineer!
