Reducing Build Time - Aman King


  • * Green icon represents a stable green CI build
    * Orange icon represents a CI build in progress
    * Red icon represents a broken CI build
  • * It is good to immediately know where you stand: whether all is good or something is broken
    * That’d enable you to fix something almost as soon as you break it
  • * High functional coverage implies that almost all functionality of the system is tested and any error introduced would be caught
    * Short CI build time implies that quick feedback is in place and any error introduced in the covered functionality would be immediately highlighted
  • * A website with a lot of content that comes from various sources, with a focus on creating an online community of people with similar interests
  • * Failures not caused by programmatic errors: possible reasons could include AJAX response delays, unpredictable browser responsiveness, network slowness, etc
  • * If a build breaks on what appears to be a non-deterministic failure, someone has to manually trigger another build in the hope that the error does not recur
    * A retriggered build takes just as long, further delaying feedback
  • * One main process will fork multiple sub-processes in parallel, each running only a subset of the original collection of test scenarios
    * When all sub-processes complete, the union of their results will be the final result of the entire test run
    * If the sub-processes are run on the same machine, a powerful multicore machine is desirable
    * Another approach is to run the sub-processes on different machines on a grid
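The fork-and-combine idea above can be sketched in plain Ruby (this illustrates the concept, not the internals of parallel_tests; it assumes a Unix platform where `Process.fork` is available):

```ruby
# Round-robin partition so the slices of test scenarios stay balanced.
def partition(features, workers)
  slices = Array.new(workers) { [] }
  features.each_with_index { |f, i| slices[i % workers] << f }
  slices
end

# Fork one sub-process per slice; the union of their exit statuses is the
# final result of the run. In the real suite each child would exec
# Cucumber on its slice; here we just simulate a passing run.
def run_in_parallel(features, workers: 3)
  pids = partition(features, workers).map do |slice|
    Process.fork do
      slice.each { |f| puts "running #{f}" }
      exit 0
    end
  end
  # The overall build passes only if every sub-process passed.
  pids.map { |pid| Process.waitpid2(pid).last.success? }.all?
end
```

Running the sub-processes on one machine makes a powerful multicore box desirable, exactly as the notes say; the same partitioning applies when the slices are dispatched to different machines on a grid instead.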
  • * Chart of build times vs build numbers: the drastic drop is when parallelization was implemented
    * Slight variances in build times, including minor drops, also occur based on memory and CPU load at the time
  • * Used a Ruby library called parallel_tests to fork multiple Ruby processes, each running a subset of Cucumber features
    * This also involved coming up with custom rake tasks, including multiple pre-steps and post-steps
    * Instead of using parallel_tests to parallelize on the same machine, Selenium Grid is an alternative that leverages grid computing
  • * Capture each forked process’s results independently
    * Generate a consolidated HTML report as an after-step when all forked processes finish
  • * Rails’ database.yml was modified to suffix process number to the name of the database
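The suffixing works via the `TEST_ENV_NUMBER` environment variable that parallel_tests exports to each worker: it is the empty string for the first worker and "2", "3", ... for the rest, which is what the ERB interpolation in database.yml (and sunspot.yml) relies on. A small sketch of the resulting names (`myapp_test` is the illustrative base name used later in the deck):

```ruby
# TEST_ENV_NUMBER as set by parallel_tests: "" for worker 1, "2", "3", ...
# for the rest (worker is 1-based here).
def test_env_number(worker)
  worker == 1 ? "" : worker.to_s
end

# database: myapp_test<%= ENV['TEST_ENV_NUMBER'] %> in database.yml
# therefore yields myapp_test, myapp_test2, myapp_test3, ...
def database_name(worker, base = "myapp_test")
  "#{base}#{test_env_number(worker)}"
end
```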
  • * Sunspot gem’s sunspot.yml was modified to suffix process number to the paths that Solr was to use
    * Monkey-patched Sunspot code to allow above parameterization in sunspot.yml
  • * Although each sub-process starts its independent Firefox instance, Selenium uses a shared set of ephemeral ports to communicate with the browsers
    * Contention occurs for these ephemeral ports which get locked when in use for a particular Firefox instance
    * Monkey-patched Capybara’s Selenium driver code to retry communicating with Firefox if a couple of attempts fail
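The deck does not show the actual patch, but its shape is a generic retry wrapper around the flaky Selenium-to-Firefox call; a minimal sketch of that idea (module and method names are illustrative, and the real patch would target Capybara's driver internals):

```ruby
# Retry a flaky operation a couple of times before giving up, re-raising
# the last error if all attempts fail.
module RetryOnFailure
  def with_retries(attempts = 3)
    tries = 0
    begin
      yield
    rescue StandardError
      tries += 1
      retry if tries < attempts
      raise
    end
  end
end
```

Wrapping only the port-contention-prone call keeps genuine test failures visible while absorbing the transient communication errors.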
  • * A particular machine can only allow so much parallelization based on its hardware specifications
    * Beyond a certain number of sub-processes, the non-deterministic failures will increase
    * After a few tries, we stabilized on having 6 concurrent sub-processes on the machine we were using
  • * Cucumber provides a rerun mode which retries all failed tests one more time before deciding the final status of the test suite run
    * Customized a rake task to invoke Cucumber’s rerun sequentially, after the parallel results come in
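Cucumber's rerun formatter is the documented mechanism behind this: a pass run with `--format rerun --out rerun.txt` records failing scenario locations, and `cucumber @rerun.txt` re-executes just those. A sketch of what the customized rake task might compute (file name and helper names are illustrative):

```ruby
RERUN_FILE = "rerun.txt"

# First (parallel) pass: record failing scenario locations to RERUN_FILE.
def first_pass_command
  "cucumber features --format rerun --out #{RERUN_FILE}"
end

# Sequential second pass over only the recorded failures.
def rerun_command
  "cucumber @#{RERUN_FILE}"
end

# Only rerun if the first pass actually recorded failures.
def rerun_needed?
  File.exist?(RERUN_FILE) && !File.read(RERUN_FILE).strip.empty?
end
```

Running the rerun sequentially, after the parallel results come in, keeps the retried scenarios away from the CPU contention that made them flaky in the first place.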
  • * We stubbed some external calls to return known values to cut down on communication delays
    * Used a Ruby library called WebMock
    * Examples: authentication servers, email servers, Facebook, Twitter, etc
  • * It helps to reduce the number of independent tests without reducing the functionality coverage
    * Example: if email and zipcode are mandatory during sign-up, they don’t need independent tests; one test can begin by leaving email and zipcode empty, verify that error messages are displayed for both, then fill them in and proceed with the valid sign-up scenario
    * This cuts out repeated setup work, such as clearing the browser or visiting the same page again and again without it contributing to the test steps
  • * There are multiple benefits of having quick feedback; implementing parallelization is one approach to achieve that
    * A complementary approach is to introspect and work on improving performance and responsiveness of the app itself, reducing build time by reducing waiting time
    * Being aware of build times and improving the same continuously is a good habit; even with parallelization in place, as more functionality gets added, build times will creep up
Transcript

    1. Saturday, 26th March, 2011
    2. Reducing build time when patience is not a virtue. Aman King, Application Developer, ThoughtWorks
    3. Recognize these?
    4. Rank these
    5. Did you choose this? 1 3 2
    6. Or did you choose this? 1 2 3
    7. Patience is not always a virtue!
    8. Fail fast! quick feedback
    9. Rank these: High functional coverage, Short build time
    10. Did you choose this? a. High functional coverage b. Short build time
    11. Or did you choose this? a. Short build time b. High functional coverage
    12. You can choose both! a. Short build time a. High functional coverage
    13. But before we see how…
    14. Project background
    15. Content driven, community oriented website
    16. Ruby on Rails; Cucumber + Capybara + Selenium
    17. 4 : 1 ratio of Dev : QA. Everyone writes functional tests in a common automation suite. Acceptance Test Driven Development (ATDD)
    18. Our problems
    19. Long build time: ~55 minutes
    20. Non-deterministic failures
    21. Manual reruns needed
    22. Our solution: Parallelization
    23. Basic idea
    24. Reduced build time: ~5 minutes
    25. Build time chart
    26. What we did
    27. Parallelization on a single multicore machine: used Ruby library parallel_tests
    28. Report generation in parallel: wrote custom report formatter
    29. Isolating databases: each process needs its own database
    30. database.yml:
        test: &test
          adapter: mysql2
          encoding: utf8
          reconnect: false
          database: myapp_test<%= ENV['TEST_ENV_NUMBER'] %>
          pool: 5
          username: root
          password:
    31. Isolating external dependencies (Solr): each process needs its own Solr instance
    32. sunspot.yml:
        test:
          solr:
            hostname: localhost
            port: <%= 8982 - ENV['TEST_ENV_NUMBER'].to_i %>
            log_level: INFO #WARNING
            log_file: <%= File.join(::Rails.root, 'log', "solr_#{ENV['TEST_ENV_NUMBER'].to_i}.log") %>
            data_path: <%= File.join(::Rails.root, 'solr', 'data', "#{ENV['TEST_ENV_NUMBER'].to_i}") %>
            pid_path: <%= File.join(::Rails.root, 'solr', 'pids', "#{ENV['TEST_ENV_NUMBER'].to_i}") %>
    33. Handling multiple Firefox instances and Selenium’s use of shared ephemeral ports
    34. Monitoring memory and CPU usage: adjust number of parallel processes accordingly
    35. Auto rerunning failed tests: non-deterministic failures are likelier with the CPU under stress
    36. Beyond parallelization…
    37. Stubbing external calls: reduce dependency-related delays where avoidable
    38. Consolidating scenarios: combine similar test scenarios into single runs instead of separate tests
    39. Maintaining conventions when writing automated tests: avoid time-based wait statements, use test framework APIs that take less time, etc
    40. Conclusion
    41. Resources
    42. Thank you