Leveraging data visualization to improve the efficiency of large-scale test automation infrastructure.
Watch the full talk at: http://youtube.com/watch?v=oRIci6n566w
1. Big Data
Makes The Flake Go Away
Leveraging data visualization to
improve the efficiency of large-scale
test automation infrastructure
Dave Cadwallader
Automation Infrastructure
2. What are we going to learn?
1. What is test flake?
2. What problems does it cause?
3. Why is it so hard to prevent?
4. What can we do to stop it?
4. Who am I
and why should you listen to me?
60,000 minutes (41 days) of
testing per day
Dave Cadwallader
Sr. Engineering Manager
Automation Infrastructure
5. Who am I
and why should you listen to me?
Dave Cadwallader
Co-Creator of TestArmada
We don’t make the test libraries you use.
We make the test libraries you use better.
6. Takeaways You’ll Get
1. Understanding of the various types of test flake
2. How to use statistics and data visualization to help
measure your own flake levels
3. How to squash test flake once you’ve found it
4. How to get involved in a community-driven effort to
end test flake
102. What have we learned?
1. When trends appear chaotic, find
another dimension to slice
2. Keep slicing until a difference is
found between slices
3. Use those differences to narrow
down root causes
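As a rough illustration of the slicing idea above, the sketch below groups test-run records by one dimension at a time and compares failure rates between slices. The record fields (`test`, `browser`, `passed`), the function name `failure_rate_by`, and the sample data are all hypothetical and exist only for this example. They are not taken from the talk or from any TestArmada tool.

```python
from collections import defaultdict


def failure_rate_by(runs, dimension):
    """Group run records by one dimension and return the failure rate per slice."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for run in runs:
        key = run[dimension]
        totals[key] += 1
        if not run["passed"]:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}


# Made-up records for illustration only; the field names are assumptions.
runs = [
    {"test": "test_a", "browser": "browser_x", "passed": True},
    {"test": "test_a", "browser": "browser_x", "passed": True},
    {"test": "test_a", "browser": "browser_y", "passed": False},
    {"test": "test_a", "browser": "browser_y", "passed": True},
    {"test": "test_b", "browser": "browser_x", "passed": True},
    {"test": "test_b", "browser": "browser_y", "passed": True},
]

# Slice by one dimension, then by another, and look for a difference.
by_test = failure_rate_by(runs, "test")
by_browser = failure_rate_by(runs, "browser")
```

If one slice shows a clearly higher failure rate than its siblings, that dimension is a candidate for further slicing. If all slices look alike, try a different dimension.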
103. Common Causes of Flake
1. Long-Running Tests
2. Live Network Calls
3. Non-Deterministic
Application Bugs
105. Bloop Roadmap
1. Open Source it!
2. More Stats!
3. Determine test run order
based on flake/timing history
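The slide does not say how run order would be derived from history, so the following is only one possible policy, sketched under stated assumptions: schedule the tests with the highest expected cost first, so that slow or retry-prone tests start early in a parallel run. The `history` structure, its field names, the cost formula, and the function `order_tests` are hypothetical and are not part of Bloop.

```python
def order_tests(history):
    """Return test names sorted by expected cost, most expensive first.

    Assumed cost model: average duration scaled up by the historical
    flake rate, as a crude stand-in for time lost to retries.
    """
    def expected_cost(name):
        stats = history[name]
        return stats["avg_seconds"] * (1 + stats["flake_rate"])

    return sorted(history, key=expected_cost, reverse=True)


# Made-up history for illustration only.
history = {
    "test_a": {"avg_seconds": 30.0, "flake_rate": 0.0},
    "test_b": {"avg_seconds": 120.0, "flake_rate": 0.25},
    "test_c": {"avg_seconds": 60.0, "flake_rate": 0.5},
}

run_order = order_tests(history)
```

Other policies are equally plausible, for example running historically flaky tests last so they do not block fast feedback. The point is only that recorded flake and timing data can drive the ordering.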
----- Meeting Notes (11/16/16 10:07) -----
TestArmada is how we made large-scale test automation successful at Walmart
----- Meeting Notes (11/16/16 10:07) -----
We run a full cross-browser Selenium suite on every single pull request.
----- Meeting Notes (11/16/16 10:07) -----
That means devs are waiting, and they are very sensitive to time.
----- Meeting Notes (11/16/16 10:07) -----
also when tests slow -
----- Meeting Notes (11/16/16 09:47) -----
So we all wind up having an emotional, stressful reaction because of this word.
----- Meeting Notes (11/16/16 09:47) -----
"Oh, I'll just ignore that failure because it's flaky." That attitude allows real app bugs to slip through.
----- Meeting Notes (11/16/16 09:47) -----
We're going to turn this into a positive thing.
----- Meeting Notes (11/16/16 10:07) -----
If you can measure it, you can control it.
----- Meeting Notes (11/16/16 11:04) -----
"Hope is not a strategy" - Google SRE. We can't keep hiding from flake. We need to acknowledge that flake exists and go hunting for it. When we find it, we measure it, and when we measure it, we can start to control it.
----- Meeting Notes (11/16/16 11:04) -----
So we called up Sauce Labs and BOOM, we went from 100 concurrent VMs to 1000.
----- Meeting Notes (11/16/16 10:07) -----
High concurrency is dramatically affected by test flake. To see how, let's briefly dive into how we orchestrate massive concurrency.
----- Meeting Notes (11/16/16 10:07) -----
Two main benefits:
1. Massively parallel runner
2. Fault tolerant (handles retries, and only reports a test as a failure if it fails 3 times)
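The retry behavior described in the second benefit can be sketched roughly as follows. This is a minimal single-process illustration, not the actual runner: the function names `run_with_retries` and `make_flaky` are hypothetical, and the parallel scheduling part is left out entirely.

```python
def run_with_retries(test_fn, max_attempts=3):
    """Run test_fn up to max_attempts times.

    Returns (passed, attempts). The test is reported as failed only
    if every attempt raises an exception.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            test_fn()
            return True, attempt
        except Exception:
            continue  # treat any exception as a failed attempt and retry
    return False, max_attempts


def make_flaky(failures_before_pass):
    """Build a fake test that fails a fixed number of times, then passes."""
    state = {"calls": 0}

    def test():
        state["calls"] += 1
        if state["calls"] <= failures_before_pass:
            raise AssertionError("simulated flake")

    return test


flaky_result = run_with_retries(make_flaky(2))   # passes on the third attempt
broken_result = run_with_retries(make_flaky(3))  # fails all three attempts
```

Returning the attempt count alongside the pass/fail result keeps the retries visible: a test that passes only on a later attempt is exactly the kind of flake signal that can be measured.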
----- Meeting Notes (11/16/16 10:07) -----
Remember this last test: "amend order cancel".
----- Meeting Notes (11/16/16 10:07) -----
Imagine flake is like a bruised apple, but we don't know which parts are safe to eat. We keep slicing to separate the good parts from the bad. We use metrics and data viz to slice our data the same way, looking to separate what's flaky from what's not.
----- Meeting Notes (11/16/16 09:32) -----
We have tests dipping into the yellow and red zone. Risk of timing out.
----- Meeting Notes (11/16/16 10:07) -----
When we're looking for flake, instead of trying to pretend it doesn't exist, it's exciting when we find it. Even more exciting when we narrow it down!
----- Meeting Notes (11/16/16 10:07) -----
Come get TA stickers!