Performance Testing Methodologies

By Marius Brecher

1. Introduction

Performance testing is a subset of performance engineering, an emerging computer science practice which strives to build performance into the design and architecture of a system, often before the actual coding effort begins.

Performance testing can serve different purposes. It can demonstrate that a system meets its performance criteria, it can compare performance between two systems, and it can identify which components of the system might cause bottlenecks or have an overall impact on performance. Performance testing can also serve to validate and verify other quality attributes of the system, such as scalability, reliability and resource usage, whilst helping to tune the system to best handle real production load without performance impacts.

The most important thing to note about the different types of performance testing (SVP: Stress, Volume and Performance) is that they are not trying to find functional defects. SVP needs to be able to execute end-to-end business transactions using different volumes and scenarios to determine how well particular aspects of the system perform under any given load.

It is critical to the cost of new projects that performance test efforts are included in the early stages of development and extend through to deployment. The later a performance defect is detected, the higher the cost of remediation.

Monitoring is needed to ensure there is proper feedback validation and that the system meets the performance metrics specified in the non-functional requirements (NFRs). An appropriately defined Monitoring Process specifies the planning, design, installation, configuration and control of the monitoring subsystem.
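As a concrete (and deliberately simplified) illustration of such feedback validation, the sketch below compares measured response times against NFR thresholds. It is a minimal Python example; the transaction names, threshold values and sample timings are invented placeholders, not figures from any real NFR document.

    # Minimal sketch: validate measured response times against NFR thresholds.
    # Transaction names, thresholds and timings are hypothetical placeholders.
    NFR_THRESHOLDS = {"login": 2.0, "search": 3.0}  # max 90th percentile, seconds

    def percentile_90(samples):
        """Return the 90th percentile of a list of response times (seconds)."""
        ordered = sorted(samples)
        return ordered[int(0.9 * (len(ordered) - 1))]

    def check_nfr(measurements):
        """Compare each transaction's 90th percentile against its threshold."""
        for transaction, samples in measurements.items():
            p90 = percentile_90(samples)
            limit = NFR_THRESHOLDS[transaction]
            print(f"{transaction}: p90={p90:.2f}s, limit={limit:.2f}s, "
                  f"{'PASS' if p90 <= limit else 'FAIL'}")

    check_nfr({"login": [1.2, 1.8, 2.5, 1.1], "search": [2.0, 2.9, 3.4, 2.2]})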
The benefits are as follows:

1. You can establish service level agreements at the 'use case' level.
2. You can turn monitoring on and off at periodic points, or to support problem resolution.
3. You can generate regular reports.
4. You have the ability to track trends over time, such as the impact of increasing user loads and growing data sets on use case level performance.

The trend analysis component of Performance Monitoring should not be undervalued. This functionality, when properly implemented and used, will enable the prediction of the application's performance degradation and system stress as transaction volumes and user concurrency gradually increase. Observing this behaviour in the early stages of development and testing will enable proper management budgeting, deployment of the resources required to keep the system running within the limits of business requirements, and overall cost reduction.

2. Types of Performance Testing

2.1 Load testing

Load testing is the simplest form of performance testing. A load test is usually conducted to understand the behaviour of the system under a specific expected load. This load can be the expected number of concurrent users on the application, performing a specific number of transactions within a set duration. The test will provide the response times of all business-critical transactions. If the database, application server and so on are also monitored, this simple test can also reveal bottlenecks in the application software.

2.2 Stress testing

Stress testing is normally used to understand the upper limits of capacity within the system. This kind of test is done to determine the system's robustness under extreme load, and helps application administrators make sure the system will perform sufficiently if the current load goes well above the expected maximum.

2.3 Endurance testing (soak testing)

Endurance testing is usually done to determine whether the system can sustain the continuous expected load. During endurance tests, memory utilisation is monitored to detect potential leaks. Also important, but often overlooked, is performance degradation: the test should confirm that throughput and/or response times after a long period of sustained activity are as good as, or better than, at the beginning of the test. Endurance testing essentially involves applying a significant load to a system for an extended period of time, with the goal of discovering how the system behaves under sustained use.
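To make the load test in 2.1 concrete, the sketch below drives a fixed number of concurrent virtual users against a single endpoint and records per-request response times; run for long enough, the same harness becomes the endurance test in 2.3. The URL, user count and duration are assumptions for illustration only, and a real test would use a dedicated load testing tool rather than hand-rolled threads.

    # Minimal load-test sketch: N concurrent virtual users for a fixed duration.
    # The target URL, user count and duration are hypothetical.
    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    TARGET_URL = "http://test-env.example.com/app"  # placeholder endpoint
    CONCURRENT_USERS = 50
    TEST_DURATION_S = 600

    def virtual_user(user_id):
        """Issue requests until the deadline, timing each one."""
        timings = []
        deadline = time.monotonic() + TEST_DURATION_S
        while time.monotonic() < deadline:
            start = time.monotonic()
            try:
                urllib.request.urlopen(TARGET_URL, timeout=30).read()
                timings.append(time.monotonic() - start)
            except OSError:
                timings.append(None)  # record the failure for later analysis
        return timings

    with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
        results = list(pool.map(virtual_user, range(CONCURRENT_USERS)))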
2.4 Spike testing

Spike testing is achieved through suddenly and significantly increasing the number of users, or the load generated by those users, and observing the consequent behaviour of the system. The goal is to determine whether the system is equipped to handle dramatic changes in load.

2.5 Configuration testing

Rather than testing for performance from the perspective of load, configuration tests are created to determine the effects of configuration changes to the system's components on performance and behaviour. A common example would be experimenting with different methods of load-balancing.

2.6 Isolation testing

Isolation testing is not unique to performance testing, but is a term used to describe repeating a test execution that resulted in a system problem. It is often used to isolate and confirm the fault domain.

3. Prerequisites

To achieve successful and accurate performance testing results, the following key steps should be followed.

3.1 Understand your testing environment

It is important to understand your test environment's capacity, including its limitations and stability in comparison to the production environment. Ideally, performance testing will be conducted on a pre-production environment or an environment that mirrors production, but that won't be the reality in most cases. Generally it is the performance engineer's responsibility to maintain communication with administrators and architects, and to understand the exact differences between the two environments in terms of hardware, database capacity and general configuration. Equally, it will be their responsibility to set up, or match, the production environment as closely as possible. All business and technical stakeholders should be aware of the differences between the two environments, and understand the potential need to scale down the performance test and extrapolate results.
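Where scaling down is unavoidable, the simplest (and admittedly crude) extrapolation assumes load capacity scales linearly with the environment's capacity ratio. The figures below are hypothetical, and real systems rarely scale linearly, so treat this only as a starting point to be validated with measurements.

    # Hypothetical scaled-down load target, assuming linear scaling.
    PROD_CPU_CORES = 64      # production capacity (assumed)
    TEST_CPU_CORES = 16      # test environment capacity (assumed)
    PROD_PEAK_USERS = 2000   # expected production peak load (assumed)

    scale_factor = TEST_CPU_CORES / PROD_CPU_CORES
    scaled_target_users = int(PROD_PEAK_USERS * scale_factor)
    print(f"Scale factor {scale_factor:.2f}: test with {scaled_target_users} users")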
3.2 Understand your system configuration

Knowing the system and application configuration is also a very important part of performance test planning, and needs to be understood by the performance test engineers. The different system components and services will need to be identified for two key reasons:

a) Identifying the different components and services used by the application allows the performance engineer to plan test scenarios and understand how testing can be incorporated in the early stages of development. This enables testing of the different components or services separately, whilst the rest of the application's functionality is still under development.

b) Understanding the different functionalities of the various components will make results analysis much easier. The reason for a failed transaction will be identified more quickly, because it can be pinpointed to the relevant component and won't require end-to-end analysis.

3.3 Know your business requirements

Business requirements need to be collated, discussed and agreed upon by all stakeholders. Equally, the following information should be a vital part of designing the complete test plan and execution (a sample workload model capturing these items appears after the list):

a) Number of concurrent users.
b) The exact production transaction mix.
c) Expected user behaviour.
d) Production volumes.
e) Expected business test cases.
f) Expected SLAs.
g) Required test inputs and outputs.
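One hypothetical way to record these agreed requirements is as a simple workload model that later drives scenario design. The transaction names, mix percentages and SLA figures below are placeholders, not recommendations.

    # Hypothetical workload model capturing the agreed business requirements.
    WORKLOAD_MODEL = {
        "concurrent_users": 500,
        "transaction_mix": {           # share of total volume per transaction
            "create_order": 0.40,
            "search_catalogue": 0.45,
            "generate_report": 0.15,
        },
        "sla_seconds": {               # expected 90th percentile SLAs
            "create_order": 3.0,
            "search_catalogue": 2.0,
            "generate_report": 10.0,
        },
    }
    # Sanity check: the transaction mix must account for 100% of the volume.
    assert abs(sum(WORKLOAD_MODEL["transaction_mix"].values()) - 1.0) < 1e-9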
Performance testing will focus on peak production volumes, as well as on predicted volume growth of up to five years. Business transaction selection will aim for the 10-15 most used transactions in production. Some rarely used transactions with a suspected performance impact will also be included, in order to identify any performance degradation they might introduce.

The performance engineer will research non-functional requirements, usually by initiating communications with the following resources:

a) Business representatives should be able to provide information on user behaviour, business test cases and the expected business SLAs. The business representative, who in most cases will approach a technical production support resource to extract the necessary information from live production data, can also provide the more 'production specific' information.

b) Production support resources can be approached directly to help in identifying performance testing requirements. Production support resources can extract the necessary information from production as per the performance engineer's specifications.

c) Capacity planning resources will be a good point of reference for a more accurate production transaction mix and environment resource utilisation, and for identifying any possible problematic transactions.

3.4 Know your weapon of choice

Testing tools hold a critical role in the whole performance testing design and should always be selected after thorough analysis of the project's (and overall business') needs. There are many free tools on the market, but each has its limitations. The selected tool will have to satisfy the customer's needs on numerous levels:

a) Usability for and beyond the current project.
b) Ability to perform the project's specific tasks.
c) Dynamic, user-friendly operation and ease of use.
d) Access to a large pool of professional resources.
e) Ongoing support.
f) Purchase and continuous maintenance costs.
When multiple performance tools already exist, tool consolidation should be recommended, because multiple tools will require multiple resources possessing different skills, and will incur higher licensing and maintenance costs.

3.5 Know your monitoring tools

Application Performance Management (APM) tools such as AppDynamics, DynaTrace, Wily Introscope or Foglight (to mention just a few) are the most common solution for application monitoring across multiple environments and multiple technologies (for example Java, .NET, virtual machines and databases).

APM tools are used to improve user satisfaction. They perform deep diagnostics for quick resolution, making sure requirements are met and environments are performing as expected. Monitoring is essential in helping performance engineers identify the root cause of every performance-impacting incident. APM tools can manage user experience from multiple perspectives, enabling the detection of performance SLA breaches and the isolation of response time issues. By capturing real user transactions, the engineer can understand how the application's design and configuration are affecting overall performance.

Monitoring will play a very important role in helping with the following:

Application Server Monitoring and Diagnostics resolve problems before they impact users and violate SLAs. They do this by simplifying management of the application server, the user transactions running through it, and the underlying infrastructure.

Database Monitoring and Management tools provide simplified, consistent performance monitoring and management across different database platforms, helping you reduce administrative costs and improve service levels.

Middleware Monitoring delivers strong application performance by monitoring the health of your middleware environment and resolving incidents before they become an issue for your business.
Network System Management minimises the wasted time (and chaos) resulting from sudden network problems by providing complete visibility of your network resources, including hardware, operating systems, virtualisation, databases, middleware, applications and services.

The ability to drill down on a problematic transaction and view the associated code will save developers and test engineers considerable effort during performance analysis.

3.6 Understand your test data

The test data used in performance testing can have a significant impact on the validity of the test, as well as on the overall test results. As such, the creation of valid data, and its ongoing maintenance, should be given high importance and planned very carefully. The test data should be thoroughly understood, because it will be expected to match the different conditions and rules needed to successfully trigger the many different components and services used by the application.

Test data should not be reused over long periods of time or across multiple projects, as this might cause 'data exhaustion', which will affect test results and might compromise the overall validity of the performance test. If and when possible, it would be beneficial to refresh the test database before each test, or at the very least to clean up activities or transactions created by test users during each test. Having the system and database in identical states before each test will ensure results are comparable and that any noticeable performance impacts are related to application changes or environment settings, not to test data, scripts or test tool issues.

3.6.1 Recommended procedures for test creation and database management

The following recommended procedures will ensure each executed test is identical in terms of the test data used, database size and data validity, and will thereby minimise data exhaustion.

3.6.1.1 Database refresh

• Database refresh with a production copy every 6-12 months (depending on production database changes, growth and usage of the test environment).
• Test data creation after completion of the database refresh, or data extraction (if possible).
• Regression test, ensuring the test data created is valid and sufficient for the expected test efforts.
• Complete test database backup. This backup will be used for continuous database refreshes as needed:
  o Before each test executed.
  o At an agreed interval (once a week, once a month), dependent on availability and environment usage.

3.6.1.2 Test data clean-up

If exercising the above is not possible due to issues with database size, resource availability or environment complexities, the following steps could be considered:

• Large amounts of data need to be created to cover the duration of the test executions.
• User transactions created during the test need to be reverted, either by creating frontend or backend scripts (if possible).
• Test data should be replaced before each test execution.

3.6.2 Recommended procedures for data creation

SQL scripts should be used for data extraction directly from the newly refreshed database. The test engineer will identify the required test data format in order to help with the data extraction. This data creation option will be the most efficient way of ensuring valid data is used without actually generating additional large amounts of test data, which would increase the database size and in some cases might slightly impact overall performance.

SQL scripts can also be used to create test data directly in the database. The test engineer will identify the exact test data requirements in order to help with the SQL script development. This data creation option will be the fastest and most reliable way of creating large amounts of test data without stressing the applications or any other services.
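As a hedged illustration of that second option, the sketch below bulk-inserts generated users directly into a database. SQLite is used purely so the example is self-contained and runnable; the table and column names are hypothetical, and a real project would target its own schema and database engine, typically via DBA-reviewed SQL scripts.

    # Sketch: create bulk test data directly in the database.
    # SQLite keeps the example self-contained; the schema is hypothetical.
    import sqlite3

    conn = sqlite3.connect("perf_test_data.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS test_users (
                        id INTEGER PRIMARY KEY,
                        username TEXT NOT NULL,
                        account_balance REAL NOT NULL)""")
    # A parameterised bulk insert avoids stressing the application frontend.
    rows = [(f"perfuser{i:05d}", 1000.0) for i in range(10_000)]
    conn.executemany(
        "INSERT INTO test_users (username, account_balance) VALUES (?, ?)", rows)
    conn.commit()
    conn.close()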
Alternatively, test engineers can create utility scripts to be executed through the application's frontend. The frontend is the most common method used for data creation. It might be the most time- and resource-consuming option, but for the test engineer it is also the most self-sufficient: when using the frontend, the test engineer does not rely on the availability of other resources for script creation and execution. As long as the environment is available, the test engineer can develop and run the utility scripts to create all the necessary test data.

4. Test Scripts and Scenarios Development

Test scripts should be developed only after all the above points are established and understood. The test engineer will have to understand the exact business requirements and transaction flows to record and develop a valid script. Ideally, the scripting will take place in the performance testing environment, with the latest stable code available, to avoid rescripting and multiple script modifications. In reality it will probably be efficient for unofficial scripting to be conducted on lower environments such as UAT, or even SYS if the code version is stable enough. This will give the test engineer enough time to build up knowledge of the application's behaviour and reduce pressure on the official performance testing schedule. Taking the initiative and engaging in the process as early as possible will benefit all parties involved.

The scripts will have to follow these basic standards and processes (a minimal example follows the list):

a) Include script descriptions and the relevant steps/actions.
b) Include descriptions of data type usage.
c) Follow the standardised transaction format and naming convention.
d) Reuse actions/functions where possible.
e) Keep scripts simple. This ensures any issues found are due to application problems, not script complexity.
f) Enter comments and logic descriptions for every request. This enables other team members to understand your scripts.
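A minimal sketch of a script action written to these standards might look as follows. The naming convention, endpoint and payload handling are hypothetical examples, not a prescribed format.

    # Script: txn_04_submit_order -- submits a standard order (hypothetical).
    # Data: consumes one row of the 'orders' test data set per iteration.
    import time
    import urllib.request

    def txn_04_submit_order(order_payload: bytes) -> float:
        """Reusable action following a txn_<nn>_<name> naming convention.

        Returns the measured response time in seconds.
        """
        start = time.monotonic()
        # Step 1: POST the order to the (placeholder) submission endpoint.
        request = urllib.request.Request(
            "http://test-env.example.com/orders", data=order_payload)
        urllib.request.urlopen(request, timeout=30).read()
        return time.monotonic() - start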
Test scenarios should be designed to follow the business requirements, production volumes and expected transaction mix established in the planning stage. When designing test scenarios, it is important to understand that we are trying to achieve a successful run that identifies performance issues. As such, there is no need to overload the environment at the beginning of the test. Ideally we would ramp up user concurrency and volumes over the first 15-30 minutes (depending on volumes), maintain the full load for 45-90 minutes, and then start the ramp down. Please note this is only a sample; different tests will require different designs.

5. Test Communication Procedures

In most cases environment resources are shared amongst different test environments, different test teams and, in some cases, even with production. For this reason, maintaining continuous communication with the different stakeholders is very important and highly recommended. If possible, a team track should be created and sent for approval before each test. The team track will include all the stakeholders (database administrators, middleware and production support, UAT testers, project managers, developers and business representatives) that might be impacted by the test execution.

A test notification email will need to be sent before each test to all stakeholders, informing them of the following:

a) Test execution times.
b) Test duration.
c) Test purpose and objectives.
d) The applications and environments that will be affected.

The notification email should be sent in reasonable time before each test, allowing the recipients to raise any concerns or suggestions in time for the test.

6. Test Reports and Documentation

When establishing test reports, it is critical to identify who the recipients are. Likewise, it is important to send your test results to both technical and business representatives. The results should be sent as soon as possible after each test, allowing the various stakeholders to understand and comment on the test results.
Most tools now have the capability to create some sort of report, which should include test transaction response times, volumes, concurrency and overall resource utilisation. Often, however, those reports will include too much technical information that will not be relevant to all stakeholders, in particular the business stakeholders. The recommended solution to this overload is to separate the level of detail for the two groups.

The communication for non-technical recipients could be in an email format and would include the following, to help them understand the high-level outcomes of the test:

a) Test objectives, duration and execution time.
b) High-level findings (was the test successful or not?).
c) Performance degradation (i.e. slow transaction response times).
d) Observations.
e) Conclusions.
f) The next planned steps.

Technical stakeholders will need additional information on top of this communication, which could be presented in an Excel spreadsheet attached to the test results email. Consider the following as part of the technical report:

a) Display only the business transactions in focus.
b) Display the minimum, average, maximum and 90th percentile for each transaction.
c) Compare results for the same transactions between the established baseline and the current test.
d) Highlight differences between the baseline and the current test in events (past transactions), average response times and the 90th percentile.
e) Include monitoring graphs to back up your findings.
f) Calculate the overall throughput.
g) Display the number of concurrent users used.
h) Display any error messages received during the test.
i) Display any available utilisation graphs.

This type of reporting will take place after each individual test; a short sketch of how the per-transaction statistics can be produced follows.
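The per-transaction statistics and baseline comparison in the technical report can be produced mechanically. The sketch below computes the minimum, average, maximum and 90th percentile for each transaction and shows the delta against a baseline average; all sample figures are invented for illustration.

    # Sketch: per-transaction report statistics with a baseline comparison.
    # All sample figures are invented for illustration.
    import statistics

    def transaction_stats(samples):
        """Return min, average, max and 90th percentile for one transaction."""
        ordered = sorted(samples)
        return {"min": min(samples), "avg": statistics.mean(samples),
                "max": max(samples),
                "p90": ordered[int(0.9 * (len(ordered) - 1))]}

    baseline_avg = {"create_order": 2.1}   # seconds, from the baseline run
    current = {"create_order": [1.9, 2.4, 2.8, 2.2, 3.1]}

    for txn, samples in current.items():
        s = transaction_stats(samples)
        delta = s["avg"] - baseline_avg[txn]
        print(f"{txn}: min={s['min']:.2f} avg={s['avg']:.2f} "
              f"max={s['max']:.2f} p90={s['p90']:.2f} "
              f"delta_vs_baseline={delta:+.2f}s")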
The final report is the 'Test Findings' report, created at the end of the performance test exercise. The Test Findings report should summarise all the test efforts, findings, conclusions and suggestions, and should include the following information:

a) Summaries of the overall test efforts (test scripts used, execution start and end times, type of test executed, applications tested).
b) Issues identified and fixes applied during the test.
c) The performance status at the end of testing.
d) Expected performance impacts on production.
e) Overall test conclusions.
f) Final comments and improvement suggestions, if applicable.

7. Conclusion

This paper highlights best practice processes used and refined by independent diagnostic experts Ecetera, and has been put together in response to what we have noted are often overlooked activities in the Performance Monitoring field. Equally, Performance Testing itself forms one part of a much wider Performance Engineering framework and is rarely considered in isolation.

To receive a Performance Engineering consultation for your organisation, contact Ecetera on (02) 8278 7068 or email firstname.lastname@example.org.