Automated Test Hell, or There and Back Again
Wojciech Seliga
wojciech.seliga@spartez.com, @wseliga
Escaping Test Hell - ACCU 2014
My talk, delivered on 10 April 2014 at the ACCU Conference in Bristol.
It combines a few talks I delivered in 2012 and 2013, with some recent updates.

This is an experience report based on the work of many developers from Atlassian and Spartez who worked for years on Atlassian JIRA.
If you have (or are going to have) thousands of automated tests and want to know how they may affect you, this presentation is for you.
Published in: Engineering
Escaping Test Hell - ACCU 2014

  1. Automated Test Hell, or There and Back Again. Wojciech Seliga, wojciech.seliga@spartez.com, @wseliga
  2. About me
      • Coding since age 6
      • Former C++ developer (’90s, early ’00s)
      • Agile practices (incl. TDD) since 2003
      • Dev Nerd, Tech Leader, Agile Coach, Speaker, PHB
      • 6.5 years with Atlassian (JIRA Dev Manager)
      • Spartez Co-founder & CEO
  3. XP Promise [chart: cost of change over time, Waterfall vs. XP]
  4. The Story
  5. 2.5 years ago
  6. About 50 engineers
  7. Obsessed with Quality
  8. Almost 10 years of accumulating garbage automatic tests
  9. 18 000 tests* on all levels
      • 13 000 unit tests
      • 4 000 functional and integration tests
      • 1 000 Selenium tests
      * excluding tests of the libraries
  10. Atlassian JIRA
  11. Our Continuous Integration environment
  12. Test frameworks
      • JUnit 3 and 4
      • JMock, Easymock, Mockito
      • Powermock, Hamcrest
      • QUnit, HTMLUnit, Jasmine.js, Sinon.js
      • JWebUnit, Selenium, WebDriver
      • Custom runners
  13. Bamboo Setup
      • Dedicated server with 70+ remote agents (including Amazon Elastic)
      • Build engineers
      • Bamboo devs on-site
  14. Looks good so far?
  15. for each main branch
  16. [diagram labels: Run first; Run in parallel in batches]
  17. There is Much More
  18. Type of tests
      • Unit
      • Functional
      • Integration
      • Platform
      • Performance
  19. Platforms
      • Dimension - DB: MySQL, PostgreSQL, MS SQL, Oracle
      • Dimension - OS: Linux, Windows
      • Dimension - Java ver.: 1.5, 1.6, 1.7, 1.8
      • Dimension - CPU arch.: 32-bit, 64-bit
      • Dimension - Deployment Mode: Standalone, Tomcat, Websphere, Weblogic
      [slide annotations: Run Nightly; Coming]
  20. Triggering Builds
      • On Commit (hooks, polling)
      • Dependent Builds
      • Nightly Builds
      • Manual Builds
  21. Very slow (long hours) and fragile feedback loop
  22. Serious performance and reliability issues
  23. It takes time to fix it...
  24. Sometimes very long
  25. You commit at 3 PM
      You get a “Unit Test Green” email at 4 PM
      You get a flood of “Red Test X” emails from 4 to 9 PM
      Your colleagues on the other side of the globe
      You happily go home
      You
  26. “We probably spend more time dealing with the JIRA test codebase than the production codebase”
  27. Dispirited devs accepting RED as a norm
  28. Broken window theory
  29. Feedback Speed, Test Quality
  30. Problem: Catching up with UI changes. Solution: Page Objects Pattern
  31. Page Objects Pattern
      • Page Objects model UI elements (pages, components, dialogs, areas) your tests interact with
      • Page Objects shield tests from changes to the internal structure of the page
      • Page Objects generally do not make assertions about data. They can assert the state.
      • Designed for chaining
  32. Page Objects Example
      public class AddUserPage extends AbstractJiraPage
      {
          private static final String URI =
              "/secure/admin/user/AddUser!default.jspa";

          @ElementBy(name = "username")
          private PageElement username;

          @ElementBy(name = "password")
          private PageElement password;

          @ElementBy(name = "confirm")
          private PageElement passwordConfirmation;

          @ElementBy(name = "fullname")
          private PageElement fullName;

          @ElementBy(name = "email")
          private PageElement email;

          @ElementBy(name = "sendemail")
          private PageElement sendEmail;

          @ElementBy(id = "user-create-submit")
          private PageElement submit;

          @ElementBy(id = "user-create-cancel")
          private PageElement cancelButton;

          @Override
          public String getUrl()
          {
              return URI;
          }

          ...

          @Override
          public TimedCondition isAt()
          {
              return and(username.timed().isPresent(),
                  password.timed().isPresent(), fullName.timed().isPresent());
          }

          // Fills in the form and returns this page so calls can be chained.
          public AddUserPage addUser(final String username,
              final String password, final String fullName, final String email,
              final boolean receiveEmail)
          {
              this.username.type(username);
              this.password.type(password);
              this.passwordConfirmation.type(password);
              this.fullName.type(fullName);
              this.email.type(email);
              if (receiveEmail) {
                  this.sendEmail.select();
              }
              return this;
          }

          public ViewUserPage createUser()
          {
              return createUser(ViewUserPage.class);
          }

          // Submits the form and binds the page object expected next.
          public <T extends Page> T createUser(Class<T> nextPage, Object... args)
          {
              submit.click();
              return pageBinder.bind(nextPage, args);
          }
      }
  33. Using Page Objects
      @Test
      public void testServerError()
      {
          jira.gotoLoginPage().loginAsSysAdmin(AddUserPage.class)
              .addUser("username", "mypassword", "My Name",
                  "sample@email.com", false)
              .createUser();
          // assertions here
      }
  34. Problem: Opaque Test Fixtures. Solution: REST-based Set-up
  35. REST-based Setup
      @Before
      public void setUpTest() {
          restore("some-big-xml-file-with-everything-needed-inside.xml");
      }

      VS

      @Before
      public void setUpTest() {
          restClient.restoreEmptyInstance();
          restClient.createProject(/* project params */);
          restClient.createUser(/* user params */);
          restClient.createUser(/* user params */);
          restClient.createSomethingElse(/* ... */);
      }
  36. Problem: Flaky Tests. Solution: Timed Conditions, Mock Unreliable Deps, Test-friendly Markup
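The idea behind a timed condition can be sketched in plain Java: instead of asserting immediately (and racing the page) or sleeping for a fixed time, the test polls a condition until it holds or a timeout elapses. This is only a minimal sketch. `TimedWait` and `waitUntil` are hypothetical names and are not the Atlassian Selenium `TimedCondition` API used in the slides.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating the polling idea; not a real library API.
final class TimedWait {
    private TimedWait() {}

    /** Polls the condition until it holds or the timeout elapses. */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;                      // condition met, stop waiting early
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;                     // timed out without the condition holding
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and give up
                return false;
            }
        }
    }
}

final class TimedWaitDemo {
    public static void main(String[] args) {
        AtomicInteger polls = new AtomicInteger();
        // Stand-in for "element is present": becomes true on the third poll.
        boolean appeared = TimedWait.waitUntil(() -> polls.incrementAndGet() >= 3, 1_000, 10);
        System.out.println("appeared=" + appeared + " after " + polls.get() + " polls");
    }
}
```

The wait returns as soon as the condition holds, so a fast page costs almost nothing, while a slow page still gets the full timeout.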
  37. Problem: Flaky Tests. Solution: Quarantine, Fix, Eradicate
  38. Quarantine
      • @Ignore
      • @Category
      • Quarantine on CI server
      • Recover or Die
  39. Non-deterministic tests are a strong inhibitor of change instead of a catalyst
  40. Execution Time: Test Level [diagram: Unit Tests, REST API Tests, JWebUnit/HTMLUnit Tests, Selenium/WebDriver Tests along Speed and Confidence axes]
  41. Our example: Front-end-heavy web app
      • 100 WebDriver tests: 15 minutes
      • 100 QUnit tests: 1.2 seconds
  42. Test Pyramid [diagram: Unit Tests (including JS tests); REST / HTML Tests; Selenium] Good!
  43. Test Code is Not Trash: Design, Maintain, Refactor, Share, Review, Prune, Respect, Discuss, Restructure
  44. Optimum Balance: Isolation, Speed, Coverage, Level, Access, Effort
  45. Dangerous to tamper with: Maintainability, Quality / Determinism
  46. Two years later…
  47. People - Motivation: Making GREEN the norm
  48. Shades of Red
  49. Pragmatic CI Health
  50. Build Tiers and Policy
      • Tier A1 - green soon after all commits: unit tests and functional* tests
      • Tier A2 - green at the end of the day: WebDriver and bundled plugins tests
      • Tier A3 - green at the end of the iteration: supported platforms tests, compatibility tests
  51. Wallboards: Constant Awareness
  52. Training
      • assertThat over assertTrue/False and assertEquals
      • avoiding races - Atlassian Selenium with its TimedElement
      • Favouring unit tests over functional tests
      • Promoting Page Objects
      • Brownbags, blog posts, code reviews
  53. Quality
  54. Automatic Flakiness Detection, Quarantine: re-run failed tests and see if they pass
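Re-running failed tests to detect flakiness can be sketched as a small classifier: a test that passes at once is fine, one that fails and then passes on a re-run is flaky (a quarantine candidate), and one that keeps failing is a real failure. This is an illustrative sketch only. `FlakinessDetector`, `Verdict`, and the re-run count are assumptions, not the mechanism the JIRA team actually built on Bamboo.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; a test is modelled as a BooleanSupplier returning true on pass.
final class FlakinessDetector {
    enum Verdict { PASSED, FLAKY, FAILED }

    private FlakinessDetector() {}

    /** Runs the test once, then re-runs it up to maxReruns times if it failed. */
    static Verdict classify(BooleanSupplier test, int maxReruns) {
        if (test.getAsBoolean()) {
            return Verdict.PASSED;
        }
        for (int i = 0; i < maxReruns; i++) {
            if (test.getAsBoolean()) {
                return Verdict.FLAKY;   // failed, then passed with no code change
            }
        }
        return Verdict.FAILED;          // consistently red: a genuine failure
    }
}

final class FlakinessDemo {
    public static void main(String[] args) {
        AtomicInteger runs = new AtomicInteger();
        // Simulated racy test: fails on the first run, passes afterwards.
        BooleanSupplier racy = () -> runs.incrementAndGet() > 1;
        System.out.println(FlakinessDetector.classify(racy, 2));        // FLAKY
        System.out.println(FlakinessDetector.classify(() -> true, 2));  // PASSED
        System.out.println(FlakinessDetector.classify(() -> false, 2)); // FAILED
    }
}
```

A FLAKY verdict should not turn the build green silently; in the spirit of the slides it marks the test for quarantine so that it is fixed or removed.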
  55. Quarantine - Healing
  56. SlowMo - expose races
  57. Selenium 1
  58. Selenium ditching: the sky did not fall in
  59. Ditching - benefits
      • Freed build agents - better system throughput
      • Boosted morale
      • Gazillions of developer hours saved
      • Money saved on infrastructure
  60. Ditching - due diligence
      • conducting the audit - analysis of the coverage we lost
      • determining which tests need to be rewritten (e.g. security related)
      • rewriting the tests (good job for new hires + a senior mentor)
  61. Flaky Browser-based Tests
      • Races between test code and asynchronous page logic
      • Playing with "loading" CSS class does not really help
  62. Races Removal with Tracing
      // in the browser:
      function mySearchClickHandler() {
          doSomeXhr().always(function() {
              // This executes when the XHR has completed (either success or failure)
              JIRA.trace("search.completed");
          });
      }
      // In production code JIRA.trace is a no-op

      // in my page object:
      @Inject
      TraceContext traceContext;

      public SearchResults doASearch() {
          Tracer snapshot = traceContext.checkpoint();
          getSearchButton().click(); // causes mySearchClickHandler to be invoked
          // This waits until the "search.completed"
          // event has been emitted, *after* previous snapshot
          traceContext.waitFor(snapshot, "search.completed");
          return pageBinder.bind(SearchResults.class);
      }
  63. Speed: Can we halve our build times?
  64. Parallel Execution - Theory [diagram: batches between Start of Build and End of Build]
  65. Parallel Execution [diagram: batches between Start of Build and End of Build]
  66. Parallel Execution - Reality Bites [diagram: batches between Start of Build and End of Build, delayed by agent availability]
  67. Dynamic Test Execution Dispatch - Hallelujah
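The slides do not show how dynamic dispatch was implemented. One plausible reading is that tests are no longer split into fixed batches up front; instead each agent pulls the next test from a shared queue as soon as it is free. The simulation below compares the two strategies under that assumption. `DispatchSimulation` and the test durations are made up for illustration.

```java
import java.util.PriorityQueue;

// Hypothetical simulation of build length (makespan) under two dispatch strategies.
final class DispatchSimulation {
    private DispatchSimulation() {}

    /** Static strategy: tests are cut into equal-sized contiguous batches, one per agent. */
    static long staticBatches(long[] durations, int agents) {
        long[] load = new long[agents];
        int batchSize = (durations.length + agents - 1) / agents; // ceiling division
        for (int i = 0; i < durations.length; i++) {
            load[i / batchSize] += durations[i];
        }
        long makespan = 0;
        for (long l : load) {
            makespan = Math.max(makespan, l);
        }
        return makespan;
    }

    /** Dynamic strategy: whichever agent becomes free first takes the next queued test. */
    static long dynamicDispatch(long[] durations, int agents) {
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int a = 0; a < agents; a++) {
            freeAt.add(0L);
        }
        long makespan = 0;
        for (long d : durations) {
            long end = freeAt.poll() + d; // earliest-free agent runs this test
            freeAt.add(end);
            makespan = Math.max(makespan, end);
        }
        return makespan;
    }

    public static void main(String[] args) {
        long[] minutes = {9, 8, 1, 1, 1, 1, 1, 1}; // two slow tests, six fast ones
        System.out.println("static:  " + staticBatches(minutes, 2));   // 19
        System.out.println("dynamic: " + dynamicDispatch(minutes, 2)); // 12
    }
}
```

With static batches the two slow tests land on the same agent while the other agent sits idle; with a shared queue the load evens out on its own.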
  68. "You can't manage what you can't measure." (not by W. Edwards Deming) If you believe in just that, you are doomed.
  69. You can't improve something if you can't measure it: Profiler, Build statistics, Logs, statsd → Graphite
  70. Anatomy of Build*
      • Compilation
      • Packaging
      • Executing Tests
      • Fetching Dependencies
      • SCM Update
      • Agent Availability/Setup
      • Publishing Results
      * Any resemblance to a Maven build is entirely accidental
  71. JIRA Unit Tests Build
      • Compilation (7 min)
      • Packaging (0 min)
      • Executing Tests (7 min)
      • Fetching Dependencies (1.5 min)
      • SCM Update (2 min)
      • Agent Availability/Setup (mean 10 min)
      • Publishing Results (1 min)
  72. Decreasing Test Execution Time to ZERO alone would not let us achieve our goal!
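This claim can be checked with simple arithmetic on the phase timings from the "JIRA Unit Tests Build" slide, assuming the goal is the halving asked about earlier ("Can we halve our build times?"). The phases add up to 28.5 minutes, so the target is 14.25 minutes, but everything except test execution already takes 21.5 minutes. The class and method names below are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Worked arithmetic on the phase timings quoted in the slides (minutes).
final class BuildBudget {
    private BuildBudget() {}

    static Map<String, Double> phases() {
        Map<String, Double> p = new LinkedHashMap<>();
        p.put("Compilation", 7.0);
        p.put("Packaging", 0.0);
        p.put("Executing Tests", 7.0);
        p.put("Fetching Dependencies", 1.5);
        p.put("SCM Update", 2.0);
        p.put("Agent Availability/Setup", 10.0); // mean value
        p.put("Publishing Results", 1.0);
        return p;
    }

    static double totalMinutes() {
        double sum = 0;
        for (double minutes : phases().values()) {
            sum += minutes;
        }
        return sum;
    }

    /** Build length if test execution took no time at all. */
    static double minutesWithoutTests() {
        return totalMinutes() - phases().get("Executing Tests");
    }

    /** Assumed goal: half of the original build time. */
    static double halvedTarget() {
        return totalMinutes() / 2;
    }

    public static void main(String[] args) {
        System.out.println("total          = " + totalMinutes());        // 28.5
        System.out.println("without tests  = " + minutesWithoutTests()); // 21.5
        System.out.println("halving target = " + halvedTarget());        // 14.25
    }
}
```

Since 21.5 minutes is still well above 14.25, the remaining phases, agent availability above all, have to shrink too.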
  73. Agent Availability/Setup
      • starved builds due to busy agents building very long builds
      • time synchronization issue - NTPD problem
  74. SCM Update - Checkout time
      • Proximity of SCM repo
      • shallow git clones are not so fast and lightweight + generating extra git server CPU load
      • git clone per agent/plan + git pull + git clone per build (hard links!)
      • Stash was thankful (queue)
      2 min → 5 seconds
  75. Fetching Dependencies
      • Fix Predator
      • Sandboxing/isolation agent trade-off:
        rm -rf $HOME/.m2/repository/com/atlassian/*
        into
        find $HOME/.m2/repository/com/atlassian/ -name "*SNAPSHOT*" | xargs rm
      • Network hardware failure found (dropping packets)
      1.5 min → 10 seconds
  76. Compilation
      • Restructuring multi-pom Maven project and dependencies
      • Maven 3 parallel compilation FTW: -T 1.5C*
      * optimal factor thanks to scientific trial and error research
      7 min → 1 min
  77. Unit Test Execution
      • Splitting unit tests into 2 buckets: good and legacy (much longer)
      • Maven 3 parallel test execution (-T 1.5C)
      7 min → 5 min
      3000 poor tests (5 min), 11000 good tests (1.5 min)
  78. Functional Tests
      • Selenium 1 removal did help
      • Faster reset/restore (avoid unnecessary stuff, intercepting SQL operations for debug purposes - building stacktraces is costly)
      • Restoring via Backdoor REST API
      • Using REST API for common setup/teardown operations
  79. Functional Tests
  80. Publishing Results
      • Server log allocation per test → now using Backdoor REST API (was Selenium)
      • Bamboo DB performance degradation for rich build history - to be addressed
      1 min → 40 s
  81. Unexpected Problem
      • Stability issues with our CI server
      • The bottleneck changed from I/O to CPU
      • Too many agents per physical machine
  82. JIRA Unit Tests Build Improved
      • Compilation (1 min)
      • Packaging (0 min)
      • Executing Tests (5 min)
      • Fetching Dependencies (10 sec)
      • SCM Update (5 sec)
      • Agent Availability/Setup (3 min)*
      • Publishing Results (40 sec)
  83. Improvements Summary
      Tests              Before    After    Improvement %
      Unit tests         29 min    17 min   41%
      Functional tests   56 min    34 min   39%
      WebDriver tests    39 min    21 min   46%
      Overall            124 min   72 min   42%
      * Additional ca. 5% improvement expected once new git clone strategy is consistently rolled-out everywhere
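The improvement column follows from the before and after timings as (before - after) / before, rounded to a whole percent. The short sketch below recomputes it for the figures quoted in the summary. `ImprovementSummary` is a hypothetical name introduced only for this check.

```java
// Recomputes the "Improvement %" column from the before/after minutes in the summary.
final class ImprovementSummary {
    private ImprovementSummary() {}

    /** Relative time saved, as a percentage rounded to the nearest integer. */
    static long improvementPercent(int beforeMinutes, int afterMinutes) {
        return Math.round(100.0 * (beforeMinutes - afterMinutes) / beforeMinutes);
    }

    public static void main(String[] args) {
        String[] names = {"Unit tests", "Functional tests", "WebDriver tests"};
        int[] before = {29, 56, 39};
        int[] after = {17, 34, 21};

        int totalBefore = 0;
        int totalAfter = 0;
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": " + improvementPercent(before[i], after[i]) + "%");
            totalBefore += before[i];
            totalAfter += after[i];
        }
        // The overall row is the sum of the three builds: 124 min before, 72 min after.
        System.out.println("Overall: " + improvementPercent(totalBefore, totalAfter) + "%");
    }
}
```

The per-build results are 41%, 39% and 46%, and the overall figure is 42%, which matches the table.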
  84. Better speed increases responsibility: fewer commits (authors) per single build [comparison diagram: vs.]
  85. The Quality Follows
  86. But that's still bad. We want a CI feedback loop of a few minutes maximum
  87. Splitting The Codebase
  88. Inevitable Split - Fears
      • Organizational concerns - understanding, managing, integrating, releasing
      • Mindset change - if something worked for 10+ years, why change it?
      • Trust - does this library still work?
      • We damned ourselves with big buckets for all tests - where do they belong?
  89. Splitting code base
      • Step 0 - JIRA Importers Plugin (3.5 years ago)
      • Step 1 - New IssueView and Navigator
      • Step 2 - now everything else follows
      [slide label: JIRA 6.0]
  90. We are still escaping hell. Hell sucks in your soul.
  91. Conclusions
      • Visibility and problem awareness help
      • Maintaining a huge testbed is difficult and costly
      • Measure the problem - to baseline
      • No prejudice - no sacred cows
      • Automated tests are not a one-off investment, they are a continuous journey
      • Performance is a damn important feature
  92. Test performance is a damn important feature!
  93. XP vs Sad Reality [chart: cost of change over time, Waterfall vs. XP - ideal vs. Sad Reality]
  94. Interested in such stuff? http://www.spartez.com/careers We are hiring in Gdańsk
  95. Images - Credits
      • Turtle - by Jonathan Zander, CC-BY-SA-3.0
      • Loading - by MatthewJ13, CC-SA-3.0
      • Magic Potion - by Koolmann1, CC-BY-SA-2.0
      • Merlin Tool - by L. Mahin, CC-BY-SA-3.0
      • Choose Pills - by *rockysprings, CC-BY-SA-3.0
      • Flashing Red Light - by Chris Phan, CC BY 2.0
      • Frustration - http://www.flickr.com/photos/striatic
      • Broken window - http://www.flickr.com/photos/leeadlaf/
  96. Thank You!