2. Cloud Computing Defined: “Cloud Computing is large pool of easily usable and accessible virtualized resources that can be dynamically reconfigured to adjust to a variable load and operated on a pay-per-use model” (ACM). “Cloud Computing is a style of computing where massively scalable IT-related capabilities are provided ‘as a service’ across the Internet to multiple external customers” (Gartner).
11. Today’s ‘Era of Tera’. Uncertainty: uncertainty in business (“Slashdot/Techcrunched”), uncertainty in the economy. User and data flood: millions of users and PBs of data. Latency matters: performance is now directly related to customer service. Global scale: spanning multiple geographies. Diverse environments: mobile platforms; middleware on a variety of DB servers and app servers.
13. Scale: 50 servers to 5000 servers in 3 days. Amazon EC2 easily scaled to handle additional traffic. [Chart: number of EC2 instances, 4/12/2008 to 4/20/2008. Annotations: steady state of ~40 instances; launch of Facebook modification; “Techcrunched”; peak of 5000 instances.]
15. How to Test in this “Era of Tera”. How will you test whether your website is spike-proof? How will you test your website if 50M users are going to hit it in the next 2 hours? How will you test for 750K concurrent users? How will you test your latency from different parts of the world? How will you test when you have a minimal testing budget in this economy? How will you test on different environments?
16. Everything’s Changed, Nothing’s Different: Stress Testing, Load Testing, Web Performance Testing, Web App (AJAX) Testing, Usability Testing, Unit Testing, Regression Tests, Integration Tests
18. Common problems in our world of testing: “I cannot reproduce the bug” (environment mismatch). “It just takes too much time to configure the tools.” “Site works fine in the US, but does not work from the EU.” “It’s too expensive to set up, maintain, and update a test lab.” “It takes too much time and effort to set up a test lab.” “Test phase lasts for only 2 months: underutilized test boxes.”
19. On-demand Test Labs. Physical test labs become outdated too fast. Maintaining test labs is a pain (configuration, latest patches). A test lab when you need it, for the duration you need it: “Need it now”, “Need it only for a 3-month test cycle”. Elastic scale (grow and shrink requirements based on a pre-defined SLA). Throw-away test labs (get a brand new lab every time). No more begging for more servers.
21. AMIs for Reuse and Repros. Virtualization: create test environments dynamically. Bundle AMIs: with basic dependencies and the OS of your choice. Share AMIs: share entire environments with dev/prod teams in a few clicks.
22. Testing as a Service: “Push it to the Cloud”. Traditional enterprise solutions are complex and incur high upfront license fees. Open source tools have a steep learning curve. Testing as a Service: stress, load, and performance testing services; pay as you go (meter bandwidth in/out, meter instance usage hours, meter CPU); usage-based costing model.
38. “Let’s run it again!” Test more and test often: an iterative process of test-analyze-fix-test. Testing is a background activity, with real-time results in dashboards. Automation through Web Services: set up test labs on demand, with automated scripts to launch the infrastructure you need. Cost-effective automated testing: infrastructure is up only during build and test time. A build run at 2 AM on 2 instances for 2 hours costs ~$1/day.
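The "~$1/day" figure is instance-hours multiplied by an hourly price. A minimal sketch of that arithmetic follows; the hourly rate is an assumption chosen only because it reproduces the slide's number, not a quoted EC2 price, and the function name is hypothetical.

```python
def daily_build_cost(instances: int, hours: float, hourly_rate_usd: float) -> float:
    """Cost of one daily run: instances x hours x price per instance-hour."""
    if instances < 0 or hours < 0 or hourly_rate_usd < 0:
        raise ValueError("inputs must be non-negative")
    return instances * hours * hourly_rate_usd


# ASSUMPTION: placeholder rate that reproduces the slide's "~$1/day".
# It is not a published price; substitute the real rate for your instance type.
ASSUMED_RATE_USD = 0.25

# The slide's scenario: 2 instances, up for 2 hours per night.
nightly = daily_build_cost(instances=2, hours=2, hourly_rate_usd=ASSUMED_RATE_USD)

# For comparison: the same 2 instances left running all day.
always_on = daily_build_cost(instances=2, hours=24, hourly_rate_usd=ASSUMED_RATE_USD)

print(f"nightly build: ${nightly:.2f}/day, always on: ${always_on:.2f}/day")
```

Under the assumed rate, keeping the same two instances up around the clock costs 12 times as much, which is the point of bringing infrastructure up only for the build and test window.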
41. Test-Fix-Test Iterative Process Timeline. Goal: 3M users/hits in 1 hour, 200K concurrent users. Test #0: local, 100 concurrent users; fail. Test #1: 60 EC2 instances spawned; crash point: 500 concurrent users. Test #25: 300 EC2 instances spawned; crash point: 170K concurrent users; served 3M users in 1 hour. Test #60: 650 EC2 instances spawned; served 500K concurrent users and 10 million hits in 1 hour. Goal: exceeded. Timeline: 3 months. Actual testing time: 60 hours.
46. User Testing at WeoGeo “We created a 6 question survey focused on our registration and email validation process and offered 2 cents per completed survey. After only 3 batches of 6 surveys (18 total for a whopping 36 cents!) we identified and confirmed problems with AOL, MSN Hotmail, and Yahoo! Mail. Other EC2 users had reported similar problems which quickly led us to a solution”
47. Create actual test scenarios (Selenium); usability testing; cross-browser testing; analyze test results; test links on the website; create surveys to rate the look and feel, navigation, and search features of your website.
48. Generations of Testing * James Whittaker Blog posts on “Future of Testing” TestSourcing = CrowdSourcing + CloudComputing
49. A Test problem : New Video Startup Suppose you just launched a new website that embeds videos on other websites... ... and you just landed a biz dev deal that will add 200X more load in less than two weeks! You want to know your site can handle 2000 concurrent video streams and the associated AJAX calls in between each clip. What would you do?
50. A Solution. The future of software (and testing) requires ultra-tight iteration loops. Cloud computing is poised to be a rocket on the back of agile techniques. Virtualize for consistent state management. Crowd-source for quick human intelligence. Massive parallelization using both. But first: how quickly can you obtain 2000 Firefox browsers?
51. On-demand load testing service (pay only for what you use). Uses real Firefox browsers (based on Selenium automation technology). Bypasses the traditional load testing approach of simulating HTTP traffic. Only possible because of cloud computing. Runs from EC2 US-East and EU-West Regions.
52. 2000 Browsers in 15 Minutes. Massive amount of hardware required, yet available in minutes: 334 High-CPU Extra Large EC2 instances, 2.6 TB of RAM, 2672 CPU cores, over 550 Mbps throughput.
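The fleet numbers on this slide can be cross-checked with a little arithmetic. The sketch below derives cores per instance from the slide's totals and shows that 334 instances is what you get when packing 2000 browsers at 6 per instance; the 6-per-instance density is an inference from the slide's figures, not something the slide states, and `instances_needed` is a hypothetical helper.

```python
import math

# Figures taken from the slide.
BROWSERS = 2000
INSTANCES = 334
TOTAL_CORES = 2672


def instances_needed(browsers: int, browsers_per_instance: int) -> int:
    """Smallest number of instances that can host the requested browsers."""
    if browsers_per_instance <= 0:
        raise ValueError("browsers_per_instance must be positive")
    return math.ceil(browsers / browsers_per_instance)


# Cores per instance implied by the slide's totals.
cores_per_instance = TOTAL_CORES / INSTANCES

# ASSUMPTION: browser density inferred by rounding 2000 / 334 up.
assumed_density = math.ceil(BROWSERS / INSTANCES)

print(f"cores per instance: {cores_per_instance:g}")
print(f"assumed browsers per instance: {assumed_density}")
print(f"instances needed: {instances_needed(BROWSERS, assumed_density)}")
```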
53. Using Amazon EC2. 15 minutes before a test: prepare hundreds of EC2 instances. Each instance runs a Firefox browser and a VNC X server. Failures are visually captured as screenshots. Data is consolidated in a local EC2 availability zone and uploaded to S3. Our costs only occur when we have revenues, so our pricing can be very low.
55. Parallel Machines and People. BrowserMob is just the tip of the iceberg! Imagine: what if quality could be verified in minutes instead of hours (1000s of minutes vs. 1 minute)? The key is parallel execution: running automated tests in parallel (e.g., unit tests, integration tests, browsers) and using the crowd to temporarily increase your QA staff by 100X.
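To make the parallel-execution point concrete, here is a minimal sketch using Python's standard `concurrent.futures` module on one machine. The `fake_test` function is a stand-in that just sleeps; in practice each worker would be a real test, and the workers would be spread across cloud instances rather than threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_test(test_id: int) -> bool:
    """Hypothetical stand-in for one automated test; sleeps to mimic waiting on I/O."""
    time.sleep(0.05)
    return True


def run_serial(test_ids):
    """Run the tests one after another."""
    return [fake_test(i) for i in test_ids]


def run_parallel(test_ids, workers):
    """Run the tests concurrently on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_test, test_ids))


test_ids = list(range(10))

start = time.perf_counter()
serial_results = run_serial(test_ids)
serial_seconds = time.perf_counter() - start

start = time.perf_counter()
parallel_results = run_parallel(test_ids, workers=10)
parallel_seconds = time.perf_counter() - start

print(f"serial: {serial_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
```

The results are identical either way; only the wall-clock time changes, which is why independent tests are good candidates for fanning out.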
56. Cloud Computing is inevitable. Testing in the Cloud: Instant Test Labs in Minutes; Testing as a Service; Virtualization/AMI for Reuse and Repros; Web Services for Automation; On-Demand Workforce of Testers; Client and Server Parallelization.
Stress test in the Cloud: Create AMIs with libraries and dependencies. Add “computer power” when needed and turn it off to reduce costs.
Load test in the Cloud: Generate load from one Availability Zone to test on another. Start up a pre-configured TestBox (EC2 instance) in minutes.
Performance test in the Cloud: Test at global scale (latency from different parts of the world). Store all instrumentation data on S3 and SimpleDB.
Web App testing: Browser-based Ajax/Selenium testing from different availability zones (US and EU). Create different deployment environments using scripts.
Usability Testing: On-demand workforce.
What does Testing in the Cloud mean: Automated, virtual test labs that are live only when you need them.
Stress test in the Cloud: Find the source of latency, potential crashes, and/or points of failure. Get profile information through logs and instrumentation, and measure.
Load test in the Cloud: Generating load from one Availability Zone to other “staging” servers, on or off. Example: 100 concurrent browsing users that randomly click on links. Then, the load can be increased by 100 users every 10 minutes until the total expected user load of 100,000 users is reached.
Performance test in the Cloud: How fast the page is loading for a given user in a given state.
Usability Testing
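The stepped ramp in the load-test note (start at 100 users, add 100 every 10 minutes, stop at 100,000) can be written down as a schedule. This is a minimal sketch; `ramp_schedule` is a hypothetical helper, not part of any load-testing tool.

```python
def ramp_schedule(start_users, step_users, step_minutes, target_users):
    """Yield (minute, concurrent_users) pairs for a stepped load ramp."""
    if step_users <= 0 or step_minutes <= 0:
        raise ValueError("steps must be positive")
    minute, users = 0, start_users
    while users < target_users:
        yield minute, users
        minute += step_minutes
        # Never overshoot the target on the final step.
        users = min(users + step_users, target_users)
    yield minute, users


# The numbers from the note above.
schedule = list(ramp_schedule(start_users=100, step_users=100,
                              step_minutes=10, target_users=100_000))

first_step = schedule[0]
last_step = schedule[-1]
total_hours = last_step[0] / 60

print(f"steps: {len(schedule)}, first: {first_step}, last: {last_step}")
print(f"time to reach target: {total_hours:g} hours")
```

With these exact numbers the ramp has 1,000 steps and reaches the target after 9,990 minutes, so the step size or interval is the knob to turn when a test has to fit in a shorter window.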
Testing in the Cloud: Instant Test Labs in Minutes; Testing as a Service; Virtualization/AMIs for Reuse and Repros; Web Services for Automation; On-Demand Workforce of Testers (“Elastic QA Staff”); Client and Server Parallelization.
Pay as you go: increased utilization
The Scheduler builds the test plan, spawns the Load Generators, coordinates their activities, and plays traffic cop for all other tests being conducted in LoadStorm. The Load Generators (LGs) send the requests to the target web application (server). They handle all communications with the server, including capturing returned pages and status codes, and build an extensive record of raw data regarding the test metrics. The Summarizer uses the database of findings from the LGs to calculate the metrics and KPIs that LoadStorm makes available through graphs and tables for analysis.
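The three roles described above can be sketched as a small pipeline. This is an illustration of the division of labor only, not LoadStorm's implementation: every name below is hypothetical, and the load generator simulates responses with a seeded random number generator instead of contacting a server.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class Sample:
    """One raw record captured by a load generator."""
    generator_id: int
    status: int
    response_ms: float


def load_generator(generator_id, requests, rng):
    """Hypothetical LG: fabricates a status code and latency per request."""
    samples = []
    for _ in range(requests):
        status = 200 if rng.random() < 0.95 else 500
        samples.append(Sample(generator_id, status, rng.uniform(50, 500)))
    return samples


def scheduler(generators, requests_each, seed=0):
    """Hypothetical scheduler: spawns the generators and collects raw data."""
    rng = random.Random(seed)
    raw = []
    for generator_id in range(generators):
        raw.extend(load_generator(generator_id, requests_each, rng))
    return raw


def summarizer(raw):
    """Hypothetical summarizer: reduces raw samples to a few metrics."""
    errors = sum(1 for s in raw if s.status >= 400)
    return {
        "requests": len(raw),
        "error_rate": errors / len(raw),
        "mean_ms": statistics.mean(s.response_ms for s in raw),
    }


raw_samples = scheduler(generators=3, requests_each=4)
summary = summarizer(raw_samples)
print(summary)
```

Keeping raw capture separate from summarization means metrics can be recomputed later from the stored samples without rerunning the test.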
Testing as Background/Daily activity
QTRAX: New Music Site for FREE Music Downloads, with 300,000 registered users
TESTING CHALLENGE: Wanted to test QTRAX.com “staging” sites located in LA, London, and HK. Wanted to test several different “real world” user scenarios (global). Wanted to test over 3M users hitting the web site in a one-hour period. Wanted to test a “burst” of 200K concurrent users. QTRAX’s largest previous load test was 100 users.
QTRAX TEST SETUP: The SOASTA team worked with QTRAX to create 20 user scenarios. The SOASTA team then provisioned 650 servers located in NJ and the UK in 15 minutes. QTRAX decided to monitor over 800 areas of network, system, and applications.
QTRAX TEST: Stress and load tests were performed on the QTRAX site located in the Los Angeles data center. The iterative test process lasted over 3 months, with a total of 60 hours of actual test time. Ramped up and spiked to 500,000 concurrent users, or 2.32 Gbit per sec. Tested over 10M hits per hour on the QTRAX site. Recorded several TB of test analytics and results data.
QTRAX RESULTS: Aggregated, correlated test data displayed live through real-time dashboards. Problems were found, fixed, and retested until the goal of 500,000 concurrent users was hit.
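The headline figures in these notes convert into per-second and per-user rates with simple division. A minimal sketch, assuming 1 Gbit = 10^9 bits and treating the peak bandwidth as spread evenly across users, which is a simplification:

```python
# Figures from the notes above.
CONCURRENT_USERS = 500_000
PEAK_BITS_PER_SEC = 2.32e9      # 2.32 Gbit/s, ASSUMING 1 Gbit = 10**9 bits
HITS_PER_HOUR = 10_000_000
USERS_PER_HOUR_GOAL = 3_000_000

SECONDS_PER_HOUR = 3600

# Average bandwidth per concurrent user at the peak (even split assumed).
bits_per_user_per_sec = PEAK_BITS_PER_SEC / CONCURRENT_USERS

# Sustained request rate implied by 10M hits in an hour.
hits_per_sec = HITS_PER_HOUR / SECONDS_PER_HOUR

# Arrival rate implied by the original goal of 3M users in an hour.
users_per_sec = USERS_PER_HOUR_GOAL / SECONDS_PER_HOUR

print(f"per-user bandwidth: {bits_per_user_per_sec:.0f} bit/s")
print(f"hit rate: {hits_per_sec:.0f} hits/s")
print(f"goal arrival rate: {users_per_sec:.0f} users/s")
```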
The instant when average response times increased, and at what user load. Information about which application servers were not being balanced properly by the load balancers. Information about which application servers were having connection problems. Information about which servers (database, application, web) were hitting CPU limitations at low virtual user levels. Information about which user scenarios scaled well as the user load increased. Metrics around errors, error rates, and the causes of those errors.
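The first finding, the load at which average response times started to climb, can be located from per-step averages. The sketch below flags the first step whose average exceeds a multiple of the baseline; the threshold rule and the sample data are illustrative assumptions, not the method or results from this test.

```python
def first_degradation(samples, factor=1.5):
    """Return the first (user_load, avg_ms) whose average response time
    exceeds `factor` times the baseline (the first sample), or None.

    `samples` must be ordered by increasing user load.
    """
    if not samples:
        return None
    baseline_ms = samples[0][1]
    for user_load, avg_ms in samples:
        if avg_ms > baseline_ms * factor:
            return user_load, avg_ms
    return None


# HYPOTHETICAL data: (concurrent users, average response time in ms).
measurements = [
    (100, 120.0),
    (200, 125.0),
    (400, 130.0),
    (800, 210.0),
    (1600, 900.0),
]

knee = first_degradation(measurements)
print(f"response times first degraded at: {knee}")
```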
Because some WeoGeo Market users reported that they were not receiving email notifications, we had a need for User Testing across a variety of email platforms.
TestSourcing = CrowdSourcing + Cloud Computing
Cloud Computing is changing. In this Era of Tera, testing for scale is imperative. Testing as a Service; On-Demand; Cloud Testing; Virtualization for Reuse; Virtualization for Repros; Test Labs in Minutes.