Web application performance testing, Ilkka Myllylä

Transcript

  • 1. Web application performance testing, Ilkka Myllylä, Reaaliprosessi Ltd
  • 2. Agenda
    • Research results about web-application scalability
    • Performance requirements specification
    • Load testing tools
    • Load test preparation
    • Load test execution
    • Analysis
      • What is a bottleneck and how to find it?
      • Optimization and tuning
      • What is the root cause and how to fix it?
  • 3. Research results about web-application scalability
  • 4. Is scalability a problem?
    • Research results (Newport Group Inc.)
    • Scalability in production
      • as required 48%
      • worse than requirements 52% (average 30%)
    • Schedule delays and cost overruns
      • as required: 1 month and 70 000 euros
      • worse: 2 months and 140 000 euros
    • What explains the bad results?
      • the timing of performance testing is the most important factor
  • 5. Performance testing timing
  • 6. Early start of testing is profitable
    • Efficiency
      • tests with one user and with 2-5 users reveal on average 80% of bottlenecks
    • Costs
      • fixing late costs many times more -> e.g. the risk of a wrong architecture
  • 7. When to test performance?
    • Architecture validation
    • Component performance
    • New application system performance test
    • After major changes to the application
  • 8. Realistic performance testing is not easy
    • Overly optimistic results are a common problem
      • "It worked in tests!" say 32% (Gartner Group)
    • Bad environment
      • Testing environment not equal to production
    • Bad design + implementation
      • Load testing tool not used the right way
    • Wrong tool for the job
      • The load test tool used is not functional / does not include the features needed
  • 9. Performance requirements specification
  • 10. Performance requirements testability
    • Requirements should be exact enough that we know how to test them
    • Usage information
      • Different users and their profiles?
      • Transaction volumes?
    • Response times
      • Detailed response time requirements?
    • Technical constraints
      • User terminals, software, connection speed?
      • Technical architecture
      • Database size
    • Security and reliability constraints
  • 11. Completeness of requirements
    • Sanity check: are the customer's requirements / calculations right and sensible?
    • Completeness: what information is missing?
      • Checklist
  • 12. System usage information
    • Goal: a simulation of real usage that is close enough to reality
    • Business / use case scenarios are a good starting point
    • Close enough?
      • Not all functions are tested, but the most important and most used ones are
      • 3-5 scenarios are usually enough for one application
    • Usage information: is history information available?
      • Yes: future scenarios?
      • No: estimation and calculation from known facts
  • 13. Transaction profile
  • 14. User profiles
  • 15. Do the customer's requirements / calculations make sense?
    • Expectations can be too high
      • "Response time for all functions should be below 2 sec"
      • "In the old system everything was very fast"
    • Different functions are not equal
      • requirements should be set individually for important use cases
      • and their functions
    • Costs should be involved
      • lower response times mean higher costs
      • technically "challenging" requirements?
  • 16. How long is the user willing to wait?
    • Not a simple question
      • different users have different requirements for the same functions
    • Research results (Netforecast Inc)
      • two important factors: frequency and amount
      • strong variation between applications and functions
      • satisfactory average: 10 s
  • 17. Frequency of use
    • Frequency: how often does the user need the function?
      • the more often the user needs it, the less he/she is willing to wait
    • Example 1: frequency
      • Use case: search customer info
      • A. Once an hour: response time requirement 5 s
      • B. Once a month: response time requirement 30 s
  • 18. Amount of information
    • Amount: how much valuable information do we get as a result?
      • the more information one gets, the more he/she is willing to wait
    • Example 2: amount
      • A. Save function for a few input fields: response time requirement 3 s
      • B. Search function for product information (100 fields): response time requirement 10 s
  • 19. Response time requirements
  • 20. Performance requirements in contracts
    • Requirements for performance testing
      • Appendix about performance requirements
      • Test engineer validates testability
      • Both customer and supplier benefit
    • Good for customer
      • No extra costs late in development cycle
    • Good for supplier
      • No new ”surprise” requirements late in development cycle
    • What about performance testing tools?
      • an expensive investment
  • 21. Load testing tools
  • 22. Types of performance testing
    • Concurrency
      • one to several users concurrently in risk scenarios
      • normally manual testing
    • Performance
      • requirements for scalability?
      • a load testing tool is needed
    • Load
      • max number of users?
    • Reliability
      • is long usage possible without degrading performance?
  • 23. Load testing tools - markets
    • Mercury Interactive is the market leader (> 50%)
    • The big six have 90% of the market
    • Hundreds of small companies
    • Prices for the major tools are quite high
    • A growing market
  • 24. Load testing tools – options to get one
    • Buy a licence (+ consulting)
      • usually priced per virtual user
    • Buy a service
      • load generation done externally
    • Rent a licence (+ consulting)
      • needed only for a limited time, maybe just for this one project
  • 25. Load testing tools – main functions
    • Recording and editing scripts
    • Scenario design and running
    • Analysis and reporting
  • 26. Load testing tools - features
    • Lots of obligatory features
      • script recording
      • parametrizing
      • scenario design
        • ramp up and weighting
        • different scripts running concurrently
      • online results / feedback
        • error detection
        • transaction level response times
      • HTTP protocol support
  • 27. Load testing tools - features
    • Many commonly needed features
      • Distributed clients
      • Unix/Linux clients
      • Multi protocol support
      • Multi speed support
      • Multi browser support
      • Server monitoring
      • Content validation
      • Dynamic URLs supported
  • 28. Features - Make or buy
    • Some features can be implemented manually
      • server monitoring
      • analysis and reporting
    • Usability
      • the best tools are really easy to use
      • others need lots of work and "programming experience"
    • Workarounds
      • more features than promised, with clever tricks
  • 29. Good tool combination
    • Separate load and monitoring tools
      • even from different vendors?
      • what about profiling?
    • Script reusability
      • same scripts for functional and load testing
  • 30. Load test preparation
  • 31. Testing environment
    • Same as the production environment
      • Do other applications share the same resources (firewall etc.)?
    • Controllable
      • No outside disturbance
  • 32. "Basic" optimization done
    • What is basic optimization?
      • Server parameters are validated by the responsible persons and a list of values is given to the load testers
      • Database: SQL performance checked and the necessary indexes exist
    • Without basic optimization a load test is a waste of time
      • only the first obvious bottleneck is found and no real information is gained
  • 33. Load test cases
    • First each script separately, with ramp-up
      • easier to see straight away what the problem is
    • Real usage scenario with weighted scripts
      • usually many test runs before the goals are achieved
      • time for repeated tests
      • usual usage first, then special occasions
    • What-if scenarios
      • one change at a time to see the influence of the changing factor
    • Risk-based testing
      • testing from different locations and at different connection speeds
      • hacker testing -> DoS attacks etc.
  • 34. Script selection
    • User or process scripts?
      • both are possible
    • Example : Petshop application – user oriented
      • Create order - returning customer
      • Create order - new customer
      • Searching customer
    • Example : Petshop application – process oriented
      • Registration
      • Create order
      • Search order
  • 35. Script recording and editing
    • Script = program code (C, Perl etc.) to execute the test automatically
    • Basic recording
      • execute the test case with recording on
      • check and set the recording options before starting
      • generates a script
    • Editing
      • parametrizing
      • transactions
      • think time changes
      • checkpoints
      • comments
  • 36. Parametrizing
    • The recorded script includes hard-coded input values
      • If we execute the load test with hard-coded values, the results are not realistic (too good or too bad)
    • Parametrizing = different input values for different virtual users
      • all users of the system have different user information
      • more realistic load
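As an illustrative sketch (not taken from any particular load tool), parametrizing can be pictured as each virtual-user iteration pulling its own row from a parameter file instead of replaying a hard-coded value. The record format and field names below are assumptions:

```python
import itertools

# Hypothetical parameter rows, e.g. loaded from a "users.dat" file:
# one "username;password" record per line.
USER_RECORDS = ["alice;pw1", "bob;pw2", "carol;pw3"]

# Cycle through the rows so every iteration gets its own data
# (wraps around when there are more iterations than records).
params = itertools.cycle(USER_RECORDS)

def login_step():
    """One virtual-user step: log in with parametrized credentials."""
    username, password = next(params).split(";")
    # A real load tool would send these in an HTTP login request here.
    return {"user": username, "password": password}

print(login_step())  # {'user': 'alice', 'password': 'pw1'}
print(login_step())  # {'user': 'bob', 'password': 'pw2'}
```

Real tools differ in how rows are assigned (per user, per iteration, unique or cycled), but the effect is the same: each request carries realistic, varying data.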
  • 37. Test data generation
    • Parameter data with the right distribution
      • Generate test data into text files that the load tools can use
    • A realistic amount of data in the databases
      • Backup and restore procedures
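One way to sketch the generation step in Python; the file name, field names and the 60/30/10 country split are illustrative assumptions, not from the slides:

```python
import csv
import random

random.seed(42)  # reproducible test data between runs

def generate_customer_rows(n):
    """Yield n customer records with a weighted (assumed) country distribution."""
    countries = ["FI", "SE", "DE"]
    weights = [0.6, 0.3, 0.1]
    for i in range(1, n + 1):
        yield {"id": i,
               "name": f"customer{i}",
               "country": random.choices(countries, weights=weights)[0]}

# Write the rows to a text file the load tool can read as parameter data
with open("customers.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name", "country"])
    writer.writeheader()
    writer.writerows(generate_customer_rows(1000))
```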
  • 38. Transaction
    • Detailed response time information inside the script
    • Exact execution times and problem transactions can be seen
      • a script with 10 transactions -> when response time increases, are all the transactions equally slow or just some?
  • 39. Checkpoint
    • Functions in the script that check the correctness of results during execution
    • In some tools they can be set automatically
      • others need manual implementation
    • Finds errors not otherwise seen
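A minimal sketch of what a checkpoint does, independent of any specific tool:

```python
def checkpoint(response_text, expected):
    """Fail the step if the expected content is missing from the response.

    Catches cases where the server returns HTTP 200 with an error page:
    response-time measurement alone would count such a request as a success.
    """
    if expected not in response_text:
        raise AssertionError(f"checkpoint failed: {expected!r} not found")

checkpoint("<h1>Order confirmed</h1>", "Order confirmed")  # passes silently
```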
  • 40. Think time
    • Think time = the time the user spends looking and typing before making the next request to the server
    • Important parameter when estimating usage
      • Less think time means more frequent requests and more load on the servers
    • Example
      • 100 users logged into the system with a 10 s average think time = 1 user makes 6 transactions/minute, so 100 users make 600 t/min = 10 tps
      • If the think time is 30 s, the load is about 3 tps
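The arithmetic above generalizes to a small formula (ignoring response time itself, as the example does):

```python
def offered_load_tps(users, think_time_s, response_time_s=0.0):
    """Transactions per second from a closed population of users.

    Each user repeatedly thinks for think_time_s and then issues one
    request taking response_time_s (left at 0 here, as in the example).
    """
    return users / (think_time_s + response_time_s)

print(offered_load_tps(100, 10))  # 10.0 tps, as in the example
print(offered_load_tps(100, 30))  # ~3.3 tps (the slide rounds to 3)
```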
  • 41. Comments
    • Making and testing scripts = software development
      • comments are needed for maintenance
      • the tool's own naming is not always good -> changes needed for readability
  • 42. Script testing
    • Execute a single script successfully
      • at least twice
      • with checkpoints and parameters
  • 43. Scenario creation and testing
    • Usage information and ramp-up of the different scripts in the same scenario
    • The designed counters are available and working for the test
    • Test run with a couple of users
  • 44. Ramp-up
    • Ramp-up
      • The user count increases little by little
      • In real life user counts usually do not change immediately
      • When the user count increases little by little, it is easy to see how response time and utilization develop
      • Stabilize before the next load level
    • Example: 1000 users use the system at the same time
      • first 50 users, then 50 more every 10 minutes until response times get bad or errors start to increase
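The example schedule can be sketched as a simple generator (a sketch only; real tools configure this declaratively):

```python
def ramp_up_schedule(target_users, step_users, step_minutes):
    """Yield (minute, active_users) pairs for a stepwise ramp-up."""
    minute, users = 0, 0
    while users < target_users:
        users = min(users + step_users, target_users)
        yield minute, users
        minute += step_minutes

# 1000 users, adding 50 every 10 minutes, as in the example
schedule = list(ramp_up_schedule(1000, 50, 10))
print(schedule[0])   # (0, 50)
print(schedule[-1])  # (190, 1000)
```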
  • 45. Collection of performance counters
    • Responsibility for collecting performance counters is usually divided between
      • administrators
      • developers
      • testers -> load tool monitors
    • Load tool monitors should be tested
      • it is not always as easy to get information from the servers as the vendor says
  • 46. Load test execution
  • 47. Reset and warm-up
    • Reset the situation
      • Old tests should not influence the results
    • Warm-up
      • Before the actual test, some usage needs to be done
  • 48. Synchronizing people involved
    • The test manager gets a ready signal from everyone involved
    • When the test ends, synchronize again to stop the monitors
    • Collection of results
  • 49. Active online following
    • Counters
      • following online monitors
      • response time and throughput
      • client and server system counters (cpu, memory, disk)
    • Error messages
      • if lots of errors occur, the test should be stopped
      • errors often occur before the application runs out of system resources
  • 50. Response time
    • The most important performance counter
    • Response time = the time the user needs to wait before being able to continue
    • Industry standard for response time: 8 seconds
    • Usage information is needed along with the response time
      • the number of simultaneous users and what most of them are doing
    • Example
      • 100 simultaneous sessions, 50% updates and 50% searches
      • Response time requirement: 4 s for 95% of bill inserts and updates. For other functions the requirement is 8 s.
  • 51. Throughput
    • Another important counter for validating scalability
    • The number of transactions, events, bytes or hits per second
    • the usual counter is tps (= transactions per second)
    • Requirements can be stated as a throughput value
    • A bottleneck can be seen easily from the saturation point of the throughput
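Spotting the saturation point can be automated over measured (users, tps) pairs; the 5% growth threshold below is an illustrative assumption, not a standard:

```python
def saturation_point(results, growth_threshold=0.05):
    """Return the first load level where throughput stopped growing.

    results: (users, tps) pairs measured at increasing load levels.
    A throughput gain below growth_threshold over the previous level
    is taken as saturation (an assumed rule of thumb).
    """
    for (_, t_prev), (users, t) in zip(results, results[1:]):
        if t_prev > 0 and (t - t_prev) / t_prev < growth_threshold:
            return users
    return None  # throughput still growing: no saturation seen yet

measured = [(50, 10.0), (100, 19.5), (150, 28.0), (200, 28.5)]
print(saturation_point(measured))  # 200
```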
  • 52. Throughput and response time
  • 53. Performance ”knee”
  • 54. What is a bottleneck and how to find it?
  • 55. What is a bottleneck and why is it important?
    • Any resource (software, hardware, network) that limits the speed of the application
      • below requirements
      • from good to even better (changing requirements)
    • A bottleneck is a symptom
      • the root cause should be analysed and fixed
      • for example, disk I/O is the bottleneck and the fix is to distribute the log file and the database files to different disks
    • "A chain is as strong as its weakest link"
      • the application is only as fast as its worst bottleneck permits
  • 56. How can we identify a bottleneck?
    • Use the application and measure
      • response time and throughput
      • resource usage
    • One user
      • relative slowness and resource utilization (= not yet a bottleneck, but possible to see that a larger number of users will cause one)
    • Several users
      • trends can already be seen (= 1 user 1 s, 5 users 3 s, 1000 users ? s)
    • Required number of users
      • the actual max usage scenario
  • 57. Not so nice features of bottleneck
    • A real bottleneck influences the load on other resources
      • "everything influences everything"
      • when the disk is the bottleneck, the processor looks like one too (but is not)
      • when the real bottleneck is fixed, the other problems get solved too
      • if we increase processor power, it does not help
    • A real bottleneck "creates" other problems but hides them too
      • the first bottleneck should be solved in order to see the next real bottleneck
  • 58. Amount and finding of bottlenecks
    • One application usually has many bottlenecks
      • many changes are needed in order to fix them all
    • One test finds only one bottleneck
      • many iterations are needed in order to fix all the bottlenecks
  • 59. Most common bottlenecks in web-applications
  • 60. Server counters and profiling
    • What counters and log/profile information do we need in order to see the bottleneck and its root cause?
    • Two levels of counters
      • system counters – e.g. CPU utilization %
      • application software counters – e.g. Oracle cache hit ratio %
    • Log/profile information
      • detailed resource usage information
  • 61. Collecting system counters
    • Memory, CPU, network and disk counters can be collected
      • with operating-system-dependent programs like the Windows Performance Monitor or Unix sar, top etc.
      • with load testing programs like LoadRunner or QALoad
    • Collecting with load testing programs is easier, and the information is in an easy-to-analyze/report form
    • Counters for all four are needed
  • 62. Interpreting system counters
    • Most important counters
      • CPU – queue length tells if it is too busy
      • Disk – queue length tells if it is too busy
      • Network – queue length tells if it is too busy
      • Memory – hard page faults (to disk) tell if it is too small
    • However, one counter is not enough
      • to be sure, more counters are needed
      • to see the root cause, more counters are needed
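A first-pass interpretation of those counters can be sketched as a rule set; the threshold values below are illustrative rules of thumb only (real limits depend on the hardware and operating system), and the counter names are assumed:

```python
def suspect_resources(counters):
    """Flag resources whose queue/fault counters look saturated.

    Thresholds are assumptions for illustration, not standards; a flagged
    resource is a suspect to investigate further, not a confirmed bottleneck.
    """
    suspects = []
    if counters.get("cpu_queue_length", 0) > 2:
        suspects.append("cpu")
    if counters.get("disk_queue_length", 0) > 2:
        suspects.append("disk")
    if counters.get("network_queue_length", 0) > 2:
        suspects.append("network")
    if counters.get("hard_page_faults_per_s", 0) > 100:
        suspects.append("memory")
    return suspects

print(suspect_resources({"disk_queue_length": 8, "cpu_queue_length": 1}))  # ['disk']
```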
  • 63. Application counters
    • Collecting with load testing programs is easy, and the information is in an easy-to-analyze/report form
    • However, not all counters are available to load testing tools
      • online monitors (WebSphere, WebLogic) can be used to complement the information
    • Different products have different counters
      • understanding of that particular product is needed
  • 64. Profiling tools
    • Collect exact information at call level
      • memory usage
      • disk I/O usage
      • response time
    • Collecting the information may influence the results quite a lot
      • one solution is to make two test runs: one without logging/profiling and one with it
  • 65. Example 1: one clear bottleneck
    • One of the four system resources is busy
      • easy to see the bottleneck
  • 66. Example 2: more than one system resource looks bad
    • However, only one resource is the real bottleneck
      • the others are "side effects" of the real bottleneck
  • 67. Example 3: none of the system resources looks bad
    • Where is the bottleneck then?
      • usually some application software works inefficiently internally, or an interface queue to external systems does not work efficiently
  • 68. What is the root cause and how to fix it?
  • 69. How to see the root cause?
    • Application-level information is usually needed and always good to have
      • Code problems can be solved once we see which function is slow
    • Some root causes are easy to see, while others need sophisticated monitoring and profiling
  • 70. Software implementation
    • Database server
      • SQL that is bad from a performance point of view (works, but not efficiently)
      • No indexes, or not good enough indexes, used
    • Application server
      • Object references not freed -> too many objects in the heap
      • Methods that are bad from a performance point of view
    • The idea is to decrease the load on the hardware resources
  • 71. Efficiency
    • Inefficient use of the existing hardware resources
      • Parametrizing and configuring help
  • 72. Capacity
    • A resource is too slow to handle events fast enough
    • Add resources or reconfigure the existing ones
      • e.g. move CPU capacity from the web server to the DB server
  • 73. Hard constraints and requirements
    • The client's complicated business logic requirements
      • too many bytes needed in the user interface (slow network speed)
      • too many different sources of information needed (synchronous)
      • long transactions; a single function needs many chained updates
    • Security requirements
      • too many requests to the web server -> encrypted network traffic
    • Online data needed
      • many big updates needed immediately
  • 74. Bad design
    • Application tiers
      • distribution of tiers possible (= EJB vs pure servlets)
    • Technology
      • too much information in the session object
    • Infrastructure
      • incompatible versions from different vendors, from a performance point of view
      • needed functionality not available (= distribution not supported)
  • 75. Tuning
    • Tuning
      • application
      • server software
      • operating system
      • network
    • Usually good choice
      • fast to do
      • small regression risk
    • Usually tuning is not enough
      • changes are needed
  • 76. Changing
    • Application code
    • Application software
    • Hardware
    • Network infrastructure
  • 77. Tuning vs change
    • Tuning is not so risky
    • Change is not always possible
    • In practice both are valid and equally considered
  • 78. Example : Tuning vs change
    • A sales system has an application server processor bottleneck
    • It can be removed with
      • More processing power
      • Less processing needed -> application code changes
    • If the application logic needs to be changed a lot
      • more processing power is chosen
    • If the application logic needs only small changes
      • the application code change is chosen
    • If both are fast, easy and low cost
      • both are chosen
  • 79. Removing bottlenecks
    • Idea: remove the root causes of the bottlenecks one by one
    • Rerun the same test to see the influence
  • 80. Testing part of system
    • Sometimes it is difficult to see the bottleneck and its root cause
      • More information is needed in order to understand the system better
      • Testing just one suspect at a time is usually possible but can take much effort
    • Testing only one component at a time is the ultimate way
  • 81. Top-down optimizing
    • When there is plenty of time
      • not very fast, but effective
    • Idea: optimize one level at a time
    • -> Level-by-level readiness
      • No jumping between levels
    Application code -> Application software -> Operating system -> Hardware
  • 82. Memory, cache and pool area usage
    • Idea: the data or service the application needs is already in memory as much as possible
    • System level
      • enough memory -> not much swapping needed
      • a proxy server caches content
    • Application level
      • a big enough database connection pool -> new objects not needed
      • a big enough database sort area -> not much swapping needed
  • 83. Connection and thread pools
    • Create many objects at startup
      • a new user gets an object from the pool
      • after use, the object is returned to the pool
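A minimal sketch of that create-at-startup, acquire-and-release cycle (a dummy factory stands in for opening real database connections):

```python
import queue

class ConnectionPool:
    """Minimal pool sketch: objects are created once and then reused."""

    def __init__(self, factory, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())  # create all objects at startup

    def acquire(self, timeout=None):
        return self._pool.get(timeout=timeout)  # blocks if the pool is empty

    def release(self, conn):
        self._pool.put(conn)  # return the object for the next user

# object() stands in for a real connection factory
pool = ConnectionPool(factory=object, size=5)
conn = pool.acquire()
# ... use the connection ...
pool.release(conn)
```

The pool size bounds concurrent use of the expensive resource, which is exactly why the slide before this one recommends making it "big enough".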
  • 84. Synchronous and asynchronous traffic
    • If possible, actions should happen asynchronously (= no need to wait for the action to finish)
    • Interfaces to other systems
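One common way to make an interface asynchronous is to put a queue in front of it, so the user request returns immediately while a background worker drains the queue. This is an illustrative in-process sketch; `send_to_external_system` is a hypothetical stand-in for the slow call:

```python
import queue
import threading

sent = []  # records what the "external system" has received

def send_to_external_system(order):
    sent.append(order)  # stand-in for a slow external call

outbox = queue.Queue()

def handle_request(order):
    """Enqueue the work and return immediately (fire-and-forget)."""
    outbox.put(order)
    return "order accepted"

def interface_worker():
    """Background thread drains the queue at its own pace."""
    while True:
        order = outbox.get()
        send_to_external_system(order)
        outbox.task_done()

threading.Thread(target=interface_worker, daemon=True).start()
print(handle_request("order-1"))  # "order accepted", without waiting
outbox.join()                     # only needed here to observe completion
```

In production this role is usually played by a message queue or middleware between the systems rather than an in-process thread.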
  • 85. Distributing load
    • Between servers
      • load balancing
    • Inside server
      • cpus
      • disks
    • Between networks
      • segments
  • 86. Cut down features
    • Sometimes the only possibility is to cut down features and requirements
      • the deadline is too near for other optimization
      • the costs or risks of doing anything else are too big
  • 87. Making recommendations for corrective actions
    • Interpreting the results usually needs input from different persons
      • however, understanding and critical judgement are needed
    • The results should be clear
      • the usual "It is not our software but yours" conversations can be avoided if nobody can question the results and recommendations
      • there is a need to show where the problem is not, too!
  • 88. Example : Internet portal
    • Application: many background systems feed data to this portal
    • Response time in the USA is 5 s, where connections are fast
    • In Asia every connection takes 2 s, and moving elements between server and client is slow too
    • Logic: 12 frames inside each other
    • Result: opening the first page takes 2 s * 12 + 30 s = 54 s
    • Requirement: 8 s
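The arithmetic behind the 54 s figure, written out:

```python
# Each of the 12 nested frames needs its own connection setup (2 s in
# Asia), on top of roughly 30 s for transferring the page content.
frames = 12
connection_setup_s = 2
content_transfer_s = 30

first_page_s = frames * connection_setup_s + content_transfer_s
print(first_page_s)  # 54 -- far above the 8 s requirement
```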
  • 89. Corrective actions and ideas
    • Idea 1: a faster connection
      • not possible -> thousands of internet customers
    • Idea 2: content nearer to the customer
      • pictures partly on client workstations -> security regulations partly prevent this
      • content on servers near the customers (Content Delivery Network)
      • helps some, but not enough
    • Idea 3: packaging the data
      • helps some, but not enough
    • Idea 4: application logic change -> fewer frames
      • costs a lot
      • requirements achieved
  • 90. Errors and lessons learned
    • Internet users with slow connections in different geographical areas
      • can be an important user group
      • the technical design failed for this group
    • Performance testing late in the development cycle
      • too late
      • real usage not simulated well enough
    • Pilot users saved a lot
      • not widely used when the problems were seen
    • A solution was found (as usual)
      • but fixing it took much time and money