Application Performance Lecture


Published in: Technology
  • The GPD service performance testing and root cause analysis is an engagement between Company-X and TechVoyant Private Limited to identify the problems that Company-X faces with the performance of its project reporting web service, called Global Delivery Dashboard.
  • Application Performance Lecture

    1. 1. Application Performance Analysis Finding Bottlenecks in End-user Experience of Applications
    2. 2. Introduction <ul><li>VISHWANATH RAMDAS </li></ul><ul><li>TECHVOYANT INFOTECH PRIVATE LTD </li></ul><ul><li>IT INFRASTRUCTURE DESIGN AND MANAGEMENT SERVICES </li></ul>
    3. 3. Application Performance depends on <ul><li>Client Nodes & Configuration </li></ul><ul><li>Network Infrastructure </li></ul><ul><ul><li>Devices | Links Capacity </li></ul></ul><ul><ul><li>Usage of the Capacity Patterns </li></ul></ul><ul><li>Server nodes hardware </li></ul><ul><ul><li>Capacity to Handle Transactions </li></ul></ul><ul><ul><li>Available Resources </li></ul></ul><ul><ul><ul><li>CPU | Memory </li></ul></ul></ul><ul><ul><li>OS Services </li></ul></ul><ul><ul><ul><li>Web Server </li></ul></ul></ul><ul><li>Application Design </li></ul><ul><ul><li>Application Code </li></ul></ul><ul><ul><li>Database Queries </li></ul></ul>
    4. 4. Typical Stages in Performance Analysis <ul><li>Understand the Problem </li></ul><ul><li>Design the Experiment </li></ul><ul><ul><li>State the Null Hypothesis </li></ul></ul><ul><ul><li>Factor all Elements in the End user Experience. </li></ul></ul><ul><ul><li>Design experiments to eliminate variables. </li></ul></ul><ul><li>Collect Data </li></ul><ul><ul><li>Where do you measure </li></ul></ul><ul><ul><li>How do you measure </li></ul></ul><ul><ul><li>What do you measure </li></ul></ul><ul><li>Analyze Data </li></ul><ul><ul><li>Verify THE HYPOTHESIS </li></ul></ul><ul><ul><li>Analyze and contextualize information. </li></ul></ul><ul><ul><li>Arrive at causes and suggest remedies </li></ul></ul>
    5. 5. Capture All the Elements – Fish Bone Diagram GPD
    6. 6. Key Challenges in Performance Analysis <ul><li>How do you design the experiment to isolate the variables in the problem? </li></ul><ul><li>How do you probe and measure? </li></ul><ul><ul><li>Most infrastructure elements are known </li></ul></ul><ul><ul><li>Context-specific measurement of the elements is difficult </li></ul></ul><ul><li>Which mathematical model do you use during analysis? </li></ul>
    7. 7. Large Intranet Web Application Case Study ..
    8. 8. Global Projects Delivery (GPD) – Context <ul><li>The GPD server is based out of Mumbai </li></ul><ul><li>The server is accessed by users from multiple locations </li></ul><ul><ul><li>Bangalore 3 locations </li></ul></ul><ul><ul><li>Mumbai 3 locations </li></ul></ul><ul><ul><li>Hyderabad </li></ul></ul><ul><li>The server is critical to monitoring & managing project progress across all Development Centers in India </li></ul><ul><li>User base of 6000+ users </li></ul>
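    9. 9. GPD Architecture Overview (diagram labels) <ul><li>Web Layer / IIS </li></ul><ul><li>HTTP/HTT </li></ul><ul><li>COM Components </li></ul><ul><li>Services Layer </li></ul><ul><li>ASP | ASP | ASP </li></ul><ul><li>Messaging/Mail </li></ul><ul><li>Extraction </li></ul><ul><li>Business Processing </li></ul><ul><li>Validation </li></ul><ul><li>Graphs/Charts </li></ul><ul><li>Query/Report Building </li></ul><ul><li>MS Project Utility </li></ul><ul><li>Active Report Server </li></ul><ul><li>Microsoft Project </li></ul><ul><li>Pinnacle Graphic Server </li></ul><ul><li>Microsoft CDONTS </li></ul><ul><li>SQLServer </li></ul><ul><li>Data Access Layer </li></ul><ul><li>ADO </li></ul><ul><li>Report Browser </li></ul>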
    10. 10. Current Concern Area <ul><li>GPD Server response has been erratic </li></ul><ul><li>User experience has been varying </li></ul><ul><ul><li>Across locations, worsening in remote locations </li></ul></ul><ul><ul><li>Across time </li></ul></ul><ul><li>Frequent transaction failures and disconnects </li></ul>
    11. 11. Probes – Onion Ring formation for differential Analysis. Based on the network and system design given by Company-X, we planned to deploy 12 probes over different layers of the infrastructure
    12. 12. Test Implementation – Approach. <ul><li>Set up Server </li></ul><ul><li>Set up probes at key locations on the network </li></ul><ul><li>Simulate user transactions </li></ul><ul><li>Assign Simulations & Schedule </li></ul><ul><li>Transact & Collect performance Data </li></ul>(Diagram labels: Probes; Simulation; Probing; Data Collect; Test Application)
    13. 13. Typical Internet Transaction <ul><li>DNS resolution: "Where is the server?" The client identifies the server. </li></ul><ul><li>Connect: the client connects to the server with a request (GET). </li></ul><ul><li>Server response: the server responds to the request with the initial byte of data. Includes web server, application server, database .. </li></ul><ul><li>Data transfer: time to download the data fully, as the requested content is transmitted to the client. Includes page layout, page objects (images, frames, which form requests) and page content. </li></ul>
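The four phases above can be timed separately from a client. The following is a minimal Python sketch, not the HP tool used in the study: the function name `time_transaction` and the returned keys are hypothetical, it handles plain HTTP only, and it treats "server response" as the wait between sending the GET and receiving the first bytes.

```python
import socket
import time


def time_transaction(host, port=80, path="/", timeout=10.0):
    """Time one plain-HTTP GET and return the duration of each phase in seconds."""
    t0 = time.perf_counter()
    # Phase 1: DNS resolution ("Where is the server?")
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    t1 = time.perf_counter()

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        # Phase 2: TCP connect
        sock.connect(addr)
        t2 = time.perf_counter()

        # Phase 3: send the GET and wait for the first bytes of the response
        request = (f"GET {path} HTTP/1.0\r\n"
                   f"Host: {host}\r\n"
                   "Connection: close\r\n\r\n")
        sock.sendall(request.encode("ascii"))
        first = sock.recv(4096)
        t3 = time.perf_counter()

        # Phase 4: read the rest of the response until the server closes
        size = len(first)
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            size += len(chunk)
        t4 = time.perf_counter()
    finally:
        sock.close()

    return {
        "dns": t1 - t0,
        "connect": t2 - t1,
        "server_response": t3 - t2,
        "data_transfer": t4 - t3,
        "bytes": size,
    }
```

Note that this times a single request. A page built from frames and images triggers further requests, each of which goes through the connect, response and transfer phases again.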
    14. 14. Study was done in 3 STAGES <ul><li>Study Web Server Logs for user access Patterns </li></ul><ul><ul><li>Log files of the GPD Server </li></ul></ul><ul><ul><li>Identify most accessed pages within GPD. </li></ul></ul><ul><li>Setup Experiment Server </li></ul><ul><ul><li>Setup & Deploy HP Internet Services Tool to probe </li></ul></ul><ul><ul><ul><li>DNS Time; Connect Time; Server response; Data transfer; </li></ul></ul></ul><ul><ul><li>Set Up probes in critical customer locations & Collect Data for over 1 week </li></ul></ul><ul><ul><li>De-duplicate and massage data collected </li></ul></ul><ul><li>Analyze & Report </li></ul><ul><ul><li>Slice data collected across dimensions </li></ul></ul><ul><ul><li>Use differential data to triangulate on the root cause </li></ul></ul>
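The first stage, finding the most accessed pages from the web server logs, can be sketched as below. This assumes the logs are in the W3C extended format that IIS can write, with a `#Fields:` header naming a `cs-uri-stem` column; the function names and the sample lines are invented for illustration and are not taken from the GPD logs.

```python
from collections import Counter


def count_page_hits(lines):
    """Count hits per page from W3C extended log lines."""
    fields = []
    hits = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns used by the entries that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        stem = row.get("cs-uri-stem")
        if stem:
            # IIS URLs are case-insensitive, so fold case before counting.
            hits[stem.lower()] += 1
    return hits


def hit_shares(hits):
    """Return (page, percent of all hits) pairs, most accessed first."""
    total = sum(hits.values())
    return [(page, 100.0 * n / total) for page, n in hits.most_common()]


# Made-up sample lines, for illustration only.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2000-01-01 09:00:01 GET /DailyActivity.asp 200",
    "2000-01-01 09:00:02 GET /CommonList.asp 200",
    "2000-01-01 09:00:03 GET /DailyActivity.asp 200",
    "2000-01-01 09:00:04 GET /dailyactivity.asp 500",
]

for page, pct in hit_shares(count_page_hits(sample)):
    print(f"{pct:5.1f}%  {page}")
```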
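    15. 15. Page-wise analysis: 8 pages constitute 80% of hits … selected as the target pages in the test environment <ul><li>DailyActivity.asp: 47.2% </li></ul><ul><li>CommonList.asp: 8.8% </li></ul><ul><li>IBIssueList.asp: 7.0% </li></ul><ul><li>Introduction.asp: 7.0% </li></ul><ul><li>DeveloperDatabase.asp: 5.3% </li></ul><ul><li>CommonPage.asp: 4.1% </li></ul><ul><li>DailyActivityMatrix.asp: 3.3% </li></ul><ul><li>IBIssueAssgn.asp: 3.2% </li></ul><ul><li>ReportUIBld.asp: 1.4% </li></ul><ul><li>PMDashboard.asp: 1.1% </li></ul><ul><li>EmployeeSel.asp: 1.1% </li></ul><ul><li>PM/ProjectRES.asp: 1.0% </li></ul><ul><li>IB/IBQueryBld.asp: 0.8% </li></ul><ul><li>IBIsEntForRvw.asp: 0.7% </li></ul><ul><li>PM/DetailsDA.asp: 0.6% </li></ul><ul><li>Others: 7.5% </li></ul>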
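The cumulative share behind the choice of target pages can be checked with a short Python sketch. The percentages are the ones on this slide; the names `HIT_SHARE` and `cumulative_shares` are only illustrative. By these figures the eight most accessed pages add up to 85.9% of hits.

```python
# Hit share per page, in percent, as listed on the slide ("others" excluded).
HIT_SHARE = [
    ("DailyActivity.asp", 47.2),
    ("CommonList.asp", 8.8),
    ("IBIssueList.asp", 7.0),
    ("Introduction.asp", 7.0),
    ("DeveloperDatabase.asp", 5.3),
    ("CommonPage.asp", 4.1),
    ("DailyActivityMatrix.asp", 3.3),
    ("IBIssueAssgn.asp", 3.2),
    ("ReportUIBld.asp", 1.4),
    ("PMDashboard.asp", 1.1),
    ("EmployeeSel.asp", 1.1),
    ("PM/ProjectRES.asp", 1.0),
    ("IB/IBQueryBld.asp", 0.8),
    ("IBIsEntForRvw.asp", 0.7),
    ("PM/DetailsDA.asp", 0.6),
]


def cumulative_shares(shares):
    """Return (page, running total in percent) pairs, most accessed page first."""
    ranked = sorted(shares, key=lambda item: item[1], reverse=True)
    running = 0.0
    result = []
    for page, pct in ranked:
        running += pct
        result.append((page, round(running, 1)))
    return result


for rank, (page, total) in enumerate(cumulative_shares(HIT_SHARE), start=1):
    print(f"{rank:2d}  {total:5.1f}%  {page}")
```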
    16. 16. Navigation bar forms 71% of access > large overhead to access relevant pages <ul><li>¾ of transacted pages are the navigation bar. </li></ul><ul><li>Among the remaining ¼ of transactions, the following pages are key: </li></ul><ul><ul><li>Dailyactivity.asp </li></ul></ul><ul><ul><li>Commonlist.asp </li></ul></ul><ul><ul><li>ProjectResources.asp </li></ul></ul><ul><ul><li>TaskAssignment.asp </li></ul></ul>
    17. 17. In English! These key pages are: <ul><li>DailyActivity.asp (47.2%): Task update </li></ul><ul><li>CommonList.asp (8.8%): Project list for logged user </li></ul><ul><li>IBIssueList.asp (7.0%): List work request received </li></ul><ul><li>Introduction.asp (7.0%): Back end page that runs with login </li></ul><ul><li>DeveloperDatabase.asp (5.3%): ?? </li></ul><ul><li>CommonPage.asp (4.1%): Project task list for user </li></ul><ul><li>DailyActivityMatrix.asp (3.3%): Task update for a period </li></ul><ul><li>IBIssueAssgn.asp (3.2%): Assign work request </li></ul>
    18. 18. Time analysis of traffic: most of the volume occurs at 3 points in the day <ul><li>The design uses a probe schedule of 5 minutes of access within each 15-minute interval, to cover the </li></ul><ul><ul><li>Morning session peak </li></ul></ul><ul><ul><li>Post-lunch session peak </li></ul></ul><ul><ul><li>Evening close-out peak </li></ul></ul><ul><li>This would be the same for all 8 probes across the network </li></ul>
    19. 19. Usage pattern on 2nd November <ul><li>Large usage early in the morning </li></ul><ul><li>A marginal peak in the evening </li></ul>
    20. 20. Server response is within control! <ul><li>Maximum response time for any page was less than 8 seconds </li></ul><ul><li>96% + responses were within 0.05 seconds! </li></ul>
    21. 21. Very few page/transaction drops @ server! <ul><li>2% of transactions failed </li></ul><ul><li>Failed transactions were mainly due to: </li></ul><ul><ul><li>Server error (83%) </li></ul></ul><ul><ul><li>Page Not Found: 404 (15%) </li></ul></ul>
    22. 22. 98% of response time is in data transfer <ul><li>DNS resolution (~0%): "Where is the server?" The client identifies the server. </li></ul><ul><li>Connect (~0%): the client connects to the server with a request (GET). </li></ul><ul><li>Server response (1.6%): the server responds to the request with the initial byte of data. Includes web server, application server, database .. </li></ul><ul><li>Data transfer (98%): time to download the data fully, as the requested content is transmitted to the client. Includes page layout, page objects (images, frames, which form requests) and page content. </li></ul>
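A breakdown like this one comes from dividing each phase's time by the total. The Python sketch below shows the arithmetic only: `phase_shares` is a hypothetical helper, and the timings in `example` are made-up values chosen to resemble the pattern on the slide, not measurements from the study.

```python
def phase_shares(timings):
    """Return each phase's share of total response time, in percent."""
    total = sum(timings.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return {phase: round(100.0 * seconds / total, 1)
            for phase, seconds in timings.items()}


# Illustrative timings in seconds (assumed values, not data from the probes).
example = {
    "dns": 0.002,
    "connect": 0.010,
    "server_response": 0.080,
    "data_transfer": 4.908,
}

shares = phase_shares(example)
bottleneck = max(shares, key=shares.get)
print(shares)
print("largest share:", bottleneck)
```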
    23. 23. The 98% data-transfer share of response time holds regardless of location or time. Overall: SERVER IS NOT THE BOTTLENECK
    24. 24. In the 2nd run, key pages were smaller but download times did not reduce! <ul><li>Key pages are ~50% smaller in size </li></ul><ul><li>YET! Download (transfer) times are approximately the same </li></ul><ul><li>Network is still the bottleneck! (98.4%) </li></ul>
    25. 25. For example, in CommonList and DailyActivity the transfer time increases 3–5× with more requests.
    26. 26. Multiple requests are causing a slowdown in data transfer > APPLICATION DESIGN! <ul><li>Pages with multiple requests result in data transfer at much lower rates. </li></ul><ul><li>For example, Commonlist.asp </li></ul><ul><ul><li>Project list of the user </li></ul></ul><ul><li>Multiple requests also do not perform better during lean hours like midnight! </li></ul>
    27. 27. 2nd run – multiple requests continue to affect transfer throughput <ul><li>Throughput is still 50% slower in multiple-request pages. </li></ul>
    28. 28. How Does a typical page load? <ul><li>Initially the layout is loaded (frames | tables) </li></ul><ul><li>Frame Content is downloaded as requests </li></ul><ul><li>Objects in each frame are requested from the server. </li></ul><ul><li>Pages with multiple frames download the data in multiple requests </li></ul><ul><li>That’s like a waiter bringing your dinner as individual items rather than with a tray! </li></ul>
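One way to see why the waiter's extra trips matter is a deliberately simplified model in which every request costs one network round trip before its data flows. The Python sketch below is an illustration under that assumption, not the analysis used in the study: the function names are hypothetical, all numbers are invented, and real browsers overlap some requests, which this model ignores.

```python
def page_load_time(total_bytes, n_requests, rtt_s, bandwidth_bytes_per_s):
    """Estimated load time when requests are issued one after another."""
    return n_requests * rtt_s + total_bytes / bandwidth_bytes_per_s


def effective_throughput(total_bytes, n_requests, rtt_s, bandwidth_bytes_per_s):
    """Bytes delivered per second of total page load time."""
    return total_bytes / page_load_time(
        total_bytes, n_requests, rtt_s, bandwidth_bytes_per_s)


# Assumed link: 0.2 s round trip, 250,000 bytes/s.
RTT = 0.2
BANDWIDTH = 250_000

single = page_load_time(100_000, 1, RTT, BANDWIDTH)       # one request
framed = page_load_time(100_000, 20, RTT, BANDWIDTH)      # same bytes, 20 requests
framed_half = page_load_time(50_000, 20, RTT, BANDWIDTH)  # half the bytes, 20 requests

print(f"1 request:   {single:.2f} s")
print(f"20 requests: {framed:.2f} s")
print(f"20 requests, half the bytes: {framed_half:.2f} s")
```

In this model the round trips dominate once a page needs many requests, so halving the page size barely changes the load time, while cutting the number of requests changes it a great deal.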
    29. 29. Some insights <ul><li>Frequently visited pages are large, with multiple frames. These pages have large footprints. </li></ul><ul><ul><li>Multiple requests sought by web/app server to generate a single page </li></ul></ul><ul><li>Task update (dailyactivity.asp) is the most accessed page. </li></ul><ul><ul><li>Access is 6–10 times more compared to other pages </li></ul></ul><ul><ul><li>Multiple keystrokes & pages to reach the desired page </li></ul></ul><ul><ul><ul><li>Leading to excessive data transfer over the network </li></ul></ul></ul><ul><ul><li>Small relevant payload </li></ul></ul><ul><ul><ul><li>Large redundant data overhead, rich with images & frames. </li></ul></ul></ul>
    30. 30. Overall 7% of transactions failed: mostly in login.asp ~ IIS Web server? <ul><li>The pages most sensitive to transaction failure </li></ul><ul><ul><li>20% > Login.ASP > Is there a problem with IIS? </li></ul></ul><ul><ul><li>9% > Employee Select.ASP </li></ul></ul><ul><ul><li>8% > Common List.ASP > nothing conclusive! SQL ? COM+ </li></ul></ul><ul><li>Page failure was not related to page size! </li></ul>
    31. 31. Conclusions <ul><li>There is a need to reduce data volume at source by </li></ul><ul><ul><li>Changing the presentation layer </li></ul></ul><ul><ul><li>Changing the sequence of pages </li></ul></ul><ul><ul><li>Removing redundant objects/images </li></ul></ul><ul><ul><li>De-linking application logic from presentation and data </li></ul></ul><ul><li>There is a need to re-engineer the network to handle more traffic by </li></ul><ul><ul><li>Relocating the server closer to the majority of users </li></ul></ul><ul><ul><li>Shifting the network hub closer to the server </li></ul></ul><ul><ul><li>Removing bottlenecks across the network </li></ul></ul><ul><ul><ul><li>Router, switch, firewall configuration </li></ul></ul></ul><ul><ul><ul><li>Expanding access pipes </li></ul></ul></ul><ul><ul><ul><ul><li>QOS, Shaping </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Physically </li></ul></ul></ul></ul>(Diagram: volume of data vs. capacity of pipe, current and improved)
    32. 32. Actions from Discussion (columns: ACTION | IMPACT | EFFORT) <ul><li>NETWORK </li></ul><ul><ul><li>Bring server closer to mass of users | Impact HI: reduces load on the link (2/3 users are in BLR & HYD >> free pipe by 50% traffic | Effort HI: need to relocate production server to BLR </li></ul></ul><ul><ul><li>Bring network hub close to server location | Impact MED: fewer hops, fewer drops | Effort HI: move hub to server location </li></ul></ul><ul><ul><li>Increase pipe allocation from current burst max of 2 MB to 3 MB | Impact HI: more space, less latency | Effort LO: efforts minimal, but impact on other applications could be adverse </li></ul></ul><ul><ul><li>Introduce more bandwidth on to the pipe | Impact HI: more space, less latency | Effort HI: direct cost increase </li></ul></ul><ul><ul><li>Study and optimize data routing between clients and server | Impact HI: identify specific device & design related issues | Effort HI: consulting engagement </li></ul></ul><ul><li>APPLICATION </li></ul><ul><ul><li>Reduce # of requests by reducing frames; images (not as critical) </li></ul></ul><ul><ul><li>Improve presentation flow | Ensure users need fewer clicks to access important pages like daily Activity </li></ul></ul><ul><ul><li>Remove images on the pages </li></ul></ul><ul><ul><li>Ratings listed for the three actions above, in slide order: HI: potential to improve data transfer speeds by multiples; HI: change design of pages; HI: reduce redundant hits to server; LO: reduce redundant hits to server; MED: reduce multiple requests; NIL </li></ul></ul><ul><ul><li>Create separate portals for Managers & Users. Users could have frameless simple pages | Impact HI: reduces data volume | Effort HI: need to create the extra access </li></ul></ul>GPD ; Company-X