About Me
James Farrier
● From: Auckland, New Zealand
● Studied: Software Engineering
● Working in test automation for 12 years
● Recently founded Appsurify
With less time (thanks, Agile) we need to improve our testing efficiency
What does ‘Shift Left’ mean?
Testing Manifesto
How our CI should look
Commit → Build (10 sec) → Unit Tests (10 sec) → API Tests (30 sec) → UI Tests (2 min) → Deploy (10 sec) → Manual Tests (1 day) → Release (30 sec) → No bugs found → PARTY!
Context switching is terrible for your team's productivity
To rely on automation, we need a reliable signal
What really happens - Dev 1, 9 am
Commit → Build (1 min) → Unit Tests (1 min) → API Tests (after 10 min: 1000 tests fail, WTF?!) → Investigation (10 min) → Let's rerun! → Fix → Ready (10:30 am)
What really happens - Dev 2, 10 am
Commit → Build (1 min) → Unit Tests (1 min) → API Tests (after 10 min: 1000 tests fail, WTF?!) → Investigation (10 min) → Meeting (10:30 am) → Queue (11:30 am) → Lunch
What really happens - Dev 1, 10:30 am
Commit → Build (1 min) → Coffee (15 min) → Unit Tests (1 min) → API Tests (30 min, all pass after reruns) → UI Tests (2 hours, 100 failures)
What really happens - Dev 1, 11 am
What really happens - Dev 1, 1 pm
Commit → Build (1 min) → Coffee (15 min) → Unit Tests (1 min) → API Tests (30 min, all pass after reruns) → UI Tests (2 hours, 100 failures) → Investigation (40 min) → Create Fix
What really happens - Dev 2, 1 pm
Commit → Build (1 min) → Coffee (15 min) → Unit Tests (1 min) → API Tests (30 min, all pass after reruns) → UI Tests (2 hours, 100 failures) → Investigation (30 min)
What really happens - Dev 2, 3:30 pm
Commit → Build (1 min) → Coffee (15 min) → Unit Tests (1 min) → API Tests (30 min, all pass after reruns) → UI Tests (2 hours, 100 failures) → Investigation (30 min)
At 4 pm: Re-queue → Dev 1's change labeled critical → Build cancelled → Go home
What really happens - Dev 1, 4:30 pm
Commit → Build (1 min) → Coffee (15 min) → Unit Tests (1 min) → API Tests (30 min, all pass after reruns) → UI Tests (removed) → Deploy (1 hour) → Manual Tests (3 hours; testers leave at 9 pm) → Bug missed → Release (2 am) → Bug found
Developers hate flaky and slow CI/build pipelines
Dev 1
● 70 min waiting on API tests
● 10 min investigating API tests
● 2 hours waiting on UI tests
● 40 min investigating UI tests
● Loses 2+ hours context switching
● 1 hour trying to understand new code
● Productivity < 2 hours
Dev 2
● 10 min investigating same API failures as Dev 1
● 30 min investigating same UI failures as Dev 1
● 2 hours waiting on UI tests
● 2 hours waiting on Dev 1’s changes
● Productivity < 2 hours
Shifting Left Challenges
● Tests Are Unreliable
● Not Enough Time To Test
● Tests Take Too Long To Run
Elite Companies Test Automation
1. Written in Code
2. Shared Responsibility
3. Visible Results
4. Gated CI/CD
What can we do about long-running test suites?
Run fewer tests? But then we might miss something!
What about code coverage? That’s difficult to set up and still runs too many tests!
The best companies run a prioritized set of tests
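One way to picture a prioritized test run is the rough sketch below. It is not how any particular company or tool does it: the `CandidateTest` fields, the weights in `score`, and the time budget are all assumptions made up for illustration. The idea is simply to rank tests by how relevant they are to the change and how often they have caught something recently, then run what fits in the time available.

```python
from dataclasses import dataclass


@dataclass
class CandidateTest:
    name: str
    area: str                # product area the test exercises (hypothetical label)
    duration_sec: float      # typical runtime
    recent_fail_rate: float  # share of recent runs that found a real failure


def prioritize(tests, changed_areas, budget_sec):
    """Pick the highest-scoring tests that fit in the time budget."""
    def score(t):
        # Assumed weighting: tests in a changed area matter most, then
        # tests that caught failures recently; cheap tests rank higher.
        relevance = 1.0 if t.area in changed_areas else 0.1
        return relevance * (0.5 + t.recent_fail_rate) / t.duration_sec

    selected, used = [], 0.0
    for t in sorted(tests, key=score, reverse=True):
        if used + t.duration_sec <= budget_sec:
            selected.append(t.name)
            used += t.duration_sec
    return selected


tests = [
    CandidateTest("test_transfer_limits", "transfers", 20, 0.10),
    CandidateTest("test_payment_refund", "payments", 30, 0.05),
    CandidateTest("test_transfer_ui_flow", "transfers", 120, 0.02),
    CandidateTest("test_login", "auth", 10, 0.00),
]
print(prioritize(tests, changed_areas={"transfers"}, budget_sec=150))
```

With a 150 second budget and a change in the transfers area, the payments test is the one left out; the remaining tests would still run later in the pipeline rather than being dropped.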
To rely on automation, we need a reliable signal
74% of test failures at Google are caused by Flaky Tests
“Almost 16% of our tests have some level of flakiness associated with them!”
John Micco, Google
Manually remove flaky tests from test runs
Delete flaky tests
Fix flaky tests
Manually dealing with flaky tests
Advantages
● Stops flaky tests breaking the build
Disadvantages
● Misses defects
● Eventually recreate the same tests
● Disheartening
● May not be possible (3rd party)
● Doesn’t scale
Rerun flaky tests: great in theory, can be bad in practice
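To make the rerun trade-off concrete, here is a minimal sketch of a rerun wrapper. The function names and the three verdict labels are assumptions for illustration, not any test runner's real API. It reports a test that only passed on a rerun as "flaky" instead of quietly turning it green.

```python
def run_with_reruns(test_fn, max_attempts=3):
    """Run test_fn until it passes or the attempts run out.

    Returns (verdict, attempts). The verdict is "pass" (first try),
    "flaky" (failed, then passed on a rerun) or "fail" (never passed).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            test_fn()
        except Exception:
            continue  # failed this attempt, try again
        return ("pass" if attempt == 1 else "flaky", attempt)
    return ("fail", max_attempts)


def make_test(failures_before_pass):
    """Build a stand-in test that fails a fixed number of times, then passes."""
    calls = {"n": 0}

    def test():
        calls["n"] += 1
        if calls["n"] <= failures_before_pass:
            raise AssertionError("intermittent failure")

    return test


print(run_with_reruns(make_test(0)))
print(run_with_reruns(make_test(1)))
print(run_with_reruns(make_test(5)))
```

Every rerun adds runtime (the API stage in the earlier timeline grew from 10 min to 30 min once reruns were involved), and a "flaky" verdict can hide a real intermittent defect, which is why the verdict is surfaced rather than discarded.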
Using a classification algorithm to detect flaky tests and then quarantine the unreliable results
Which is likely to be flaky? A failure early in the test or late?
Some of the most common failures with Selenium occur early, e.g. browser setup
Which is likely to be flaky? An error or an assert failure?
Which is likely to be flaky? If we change the Transfers section: a Transfers test that fails or a Payment test that fails?
Which is likely to be flaky? A new error or an error that’s caused flaky failures before?
Which is likely to be flaky? A test that has never flaked before or a test that looks like Xmas lights?
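The questions above can be read as input signals for a flakiness classifier. The sketch below is a hand-weighted heuristic standing in for a trained classification algorithm. The `Failure` fields, the equal 0.2 weights, the direction assumed for each signal, and the quarantine threshold are all assumptions chosen for illustration; a real system would learn them from historical results.

```python
from dataclasses import dataclass


@dataclass
class Failure:
    failed_step: int               # 0-based index of the step that failed
    total_steps: int
    is_assert: bool                # True: assert failure, False: error
    test_area: str
    changed_areas: frozenset       # areas touched by the commit under test
    error_seen_flaky_before: bool
    recent_results: str            # e.g. "PFPFPPFP", oldest first


def flip_rate(results):
    """Share of adjacent runs whose outcome changed (the 'Xmas lights' look)."""
    if len(results) < 2:
        return 0.0
    flips = sum(a != b for a, b in zip(results, results[1:]))
    return flips / (len(results) - 1)


def flaky_score(f):
    """Return 0..1; higher means more likely flaky (assumed weights)."""
    score = 0.0
    if f.failed_step / max(f.total_steps, 1) < 0.25:  # early, e.g. browser setup
        score += 0.2
    if not f.is_assert:                               # error, not an assert failure
        score += 0.2
    if f.test_area not in f.changed_areas:            # unrelated to the change
        score += 0.2
    if f.error_seen_flaky_before:
        score += 0.2
    score += 0.2 * flip_rate(f.recent_results)
    return round(score, 2)


QUARANTINE_THRESHOLD = 0.6  # assumed cut-off

likely_flaky = Failure(0, 10, False, "payments", frozenset({"transfers"}),
                       True, "PFPFPFPF")
likely_real = Failure(8, 10, True, "transfers", frozenset({"transfers"}),
                      False, "PPPPPPPP")

for failure in (likely_flaky, likely_real):
    s = flaky_score(failure)
    print(s, "quarantine" if s >= QUARANTINE_THRESHOLD else "report to developer")
```

Quarantined results would be kept out of the build signal but still recorded, so the build is not broken by them and nothing is deleted.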
Now we have useful automation
Manual Testing Challenges
● Unsure What To Test
● Use Your Gut To Decide
● Not Enough Time
Identify risky commits and the areas that are likely to have defects
Analyze the commit and defect history
Who is more likely to create a bug? Someone who created bugs in the past or someone who hasn’t created a bug before?
Who is more likely to create a bug? Someone experienced with this area or someone new to this area of the code?
Who is more likely to create a bug? Someone experienced with this type of task or someone new to this task type?
Where is a bug more likely? An area that 10 devs have changed or an area that only one dev has changed?
Which is more likely to have a bug? A small commit or a really big commit?
Which is more likely to have a bug? A Java file change or a C++ file change?
Which is more likely to have a bug? A sane commit message or something like this?
Who is more likely to create a bug in the morning? Someone you know got no sleep or someone well rested?
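Several of these questions can be answered from commit and defect history, so they can be combined into a rough per-commit risk score. The sketch below is only an illustration: the `Commit` fields, the weights, the thresholds (5 prior commits, 10 developers, 500 lines, 3 words), and the direction assumed for each signal are all hypothetical. Signals such as file language or how well someone slept are left out.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    author_past_bug_rate: float  # share of the author's past commits linked to defects
    author_commits_in_area: int  # prior commits by the author in this area
    devs_touching_area: int      # distinct developers who have changed this area
    lines_changed: int
    message: str


def risk_score(c):
    """Return a 0..1 risk estimate from hand-picked, assumed weights."""
    score = 0.0
    score += 0.3 * min(c.author_past_bug_rate, 1.0)
    score += 0.2 * (1.0 if c.author_commits_in_area < 5 else 0.0)  # new to the area
    score += 0.2 * min(c.devs_touching_area / 10, 1.0)             # many hands
    score += 0.2 * min(c.lines_changed / 500, 1.0)                 # big commit
    score += 0.1 * (1.0 if len(c.message.split()) < 3 else 0.0)    # terse message
    return round(score, 2)


risky = Commit(0.4, 0, 10, 800, "fix")
safe = Commit(0.0, 40, 1, 20, "Validate transfer amount before submit")

# Review the riskiest commits first when manual testing time is short.
for commit in sorted([safe, risky], key=risk_score, reverse=True):
    print(risk_score(commit), commit.message)
```

Sorting commits this way gives manual testers a starting order that is based on history rather than on gut feel alone.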
Now we can reduce our testing scope whilst finding all the bugs
Testing Manifesto
Why are we creating so many bugs?
Assign tasks based on who is available
Not everyone has the same skill set
jamesfarrier@appsurify.com
Thanks for Listening

Shifting Testing Left - The Pain Points and Solutions