Big Data Makes The Flake Go Away

Leveraging data visualization to improve the efficiency of large-scale test automation infrastructure. Watch the full talk at: http://youtube.com/watch?v=oRIci6n566w

1. Big Data Makes The Flake Go Away Leveraging data visualization to improve the efficiency of large-scale test automation infrastructure Dave Cadwallader Automation Infrastructure

2. What are we going to learn? 1. What is test flake? 2. What problems does it cause? 3. Why is it so hard to prevent? 4. What can we do to stop it?

3. Who am I and why should you listen to me? Dave Cadwallader Sr. Engineering Manager Automation Infrastructure

4. Who am I and why should you listen to me? 60,000 minutes (41 days) of testing per day Dave Cadwallader Sr. Engineering Manager Automation Infrastructure

5. Who am I and why should you listen to me? Dave Cadwallader Co-Creator of TestArmada We don’t make the test libraries you use. We make the test libraries you use better.

6. Takeaways You’ll Get 1. Understanding of the various types of test flake 2. How to use statistics and data visualization to help measure your own flake levels 3. How to squash test flake once you’ve found it 4. How to get involved in a community-driven effort to end test flake

7. What does Automation mean at …and how does flake get in the way?

16. When Tests Fail in CI Wow, thanks! My app has a bug! CI is flakey!

17. flake

18. problems caused by flake

19. confidence erosion a growing mistrust of the systems designed to keep us safe

20. why is flake such a pain?

21. non-deterministic “…given the same input, exhibits different behaviors on different runs” https://en.wikipedia.org/wiki/Nondeterministic_algorithm

22. pass/fail flake

23. performance flake

24. take back flake

25. #hooray4flake

26. Bright Idea We Had: High Concurrency will make test suites fast!

31. Smoosh the Sandcastle!

34. What we Wanted

35. What we Got

40. Retry Tail Extra rounds of testing tacked onto the end of a test suite when one or more failing tests are retried.
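
To make the retry tail concrete, here is a minimal sketch of how a single retried test can stretch a highly concurrent suite. It assumes unlimited concurrency (every pending test runs in parallel in each round) and uses hypothetical durations; `suite_wall_time` is an illustrative name, not part of any TestArmada tool.

```python
def suite_wall_time(tests):
    """Estimate wall-clock time for a suite under unlimited concurrency.

    tests: list of (duration_seconds, attempts_needed) tuples (hypothetical data).
    Each round runs every pending test in parallel, so a round lasts as long
    as its slowest test. Retried tests form extra rounds: the retry tail.
    """
    total = 0.0
    round_number = 1
    pending = list(tests)
    while pending:
        total += max(duration for duration, _ in pending)
        # Tests that need more attempts than this round move to the next round.
        pending = [(d, a) for d, a in pending if a > round_number]
        round_number += 1
    return total


clean = [(60, 1), (90, 1), (120, 1)]
flaky = [(60, 1), (90, 1), (120, 2)]  # the slowest test needs one retry

print(suite_wall_time(clean))  # 120.0
print(suite_wall_time(flaky))  # 240.0
```

In this model one retry of the slowest test doubles the suite's wall-clock time even though every test eventually passes.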

41. High Concurrency will make test suites fast, but it is weakened by test flake.

42. Magellan by a test orchestrator (runs your existing test library) Massively Parallel Fault Tolerant

54. 1% flake might not cause failures, but it does cause perceived slowness. Every blip counts.

55. BLOOP by

57. the sound nicknamed Bloop is the most likely to come from some sort of animal…

58. Icequakes!

59. Anatomy of Bloop UDP

60. https://microchip.wdfiles.com/local--files/tcpip:tcp-vs-udp/TCP_vs_UDP.JPG

61. Let’s Measure Some Stuff!

62. Flake Rate Tests that eventually passed after requiring one or two retries
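
The flake rate definition above can be sketched as a small calculation. This is an assumption-laden illustration: the slide does not state the denominator, so the sketch divides by the total number of tests in the run, and the input format (attempts used per test, `None` for a test that never passed) is hypothetical.

```python
def flake_rate(attempts_per_test, max_attempts=3):
    """Fraction of tests that passed only after one or two retries.

    attempts_per_test maps test name -> attempts used before passing,
    or None if the test never passed (hypothetical input format).
    """
    total = len(attempts_per_test)
    if total == 0:
        return 0.0
    flaky = sum(
        1
        for attempts in attempts_per_test.values()
        if attempts is not None and 2 <= attempts <= max_attempts
    )
    return flaky / total


run = {"login": 1, "search": 2, "checkout": 3, "amend order cancel": None}
print(flake_rate(run))  # 0.5
```

A test that never passes is counted as a real failure here, not as flake.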

67. Suite Runtime Total Duration of All Tests in a Suite

72. Per-Team Operational Health

74. Individual Test Timings

77. Individual Test Timings By Test

79. Keep on Slicing

81. Interpreting Lines is Tough… We Need Statistics!

83. Standard Deviation

84. Standard Deviation “…a measure of how spread out numbers are.” http://www.mathsisfun.com/data/standard-deviation.html

85. http://www.mathsisfun.com/data/standard-deviation.html The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and 300mm. …the mean (average) height is 394 mm. The Standard Deviation is 147 mm.
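
The figures quoted on this slide can be reproduced with Python's standard library. The quoted 147 mm matches the population standard deviation (dividing by the number of values), so the sketch uses `statistics.pstdev` and not the sample version.

```python
from statistics import mean, pstdev

# Shoulder heights in millimetres, as quoted on the slide.
heights_mm = [600, 470, 170, 430, 300]

average = mean(heights_mm)
spread = pstdev(heights_mm)  # population standard deviation

print(round(average))  # 394
print(round(spread))   # 147
```

The same two calls work on a list of test durations: a test whose timings have a large standard deviation relative to its mean is a candidate for performance flake.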

91. Individual Test Timings By Browser

94. Individual Test Timings By Test and Browser

98. Pass/Fail Rate By Test and Browser

102. What have we learned? 1. When trends appear chaotic, find another dimension to slice 2. Keep slicing until a difference is found between slices 3. Use those differences to narrow down root causes
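
The slicing approach summarized above can be sketched as a group-by over timing records. The record fields (`test`, `browser`, `seconds`) and the sample data are hypothetical; the talk does not show Bloop's actual data model.

```python
from collections import defaultdict
from statistics import mean, pstdev


def slice_timings(records, *dimensions):
    """Group timing records by the given dimensions.

    Returns {slice_key: (mean_seconds, stdev_seconds)} so slices can be compared.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[dimension] for dimension in dimensions)
        groups[key].append(record["seconds"])
    return {key: (mean(values), pstdev(values)) for key, values in groups.items()}


records = [
    {"test": "checkout", "browser": "chrome", "seconds": 10},
    {"test": "checkout", "browser": "chrome", "seconds": 10},
    {"test": "checkout", "browser": "ie", "seconds": 10},
    {"test": "checkout", "browser": "ie", "seconds": 30},
]

# Sliced by test alone, the spread is visible but its source is not.
print(slice_timings(records, "test"))
# Adding the browser dimension shows that only one slice is unstable.
print(slice_timings(records, "test", "browser"))
```

Adding one dimension at a time mirrors the advice to keep slicing until a difference appears between slices.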

103. Common Causes of Flake 1. Long-Running Tests 2. Live Network Calls 3. Non-Deterministic Application Bugs

105. Bloop Roadmap 1. Open Source it! 2. More Stats! 3. Determine test run order based on flake/timing history
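
One way the third roadmap item could work is sketched below. The heuristic is an assumption, not something the talk specifies: schedule the historically flakiest tests first, then the longest, so that any retries start early and do not extend the end of the suite. The `history` format is hypothetical.

```python
def order_tests(history):
    """Order test names for scheduling.

    history maps test name -> (flake_rate, mean_seconds), both hypothetical
    values taken from earlier runs. Flakiest tests come first; ties are
    broken by running longer tests earlier.
    """
    return sorted(history, key=lambda name: (-history[name][0], -history[name][1]))


history = {
    "a": (0.0, 30),
    "b": (0.2, 10),
    "c": (0.0, 90),
}
print(order_tests(history))  # ['b', 'c', 'a']
```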

109. @TestArmada @geek_dave testarmada.github.io Join the Community #hooray4flake

Editor's Notes

----- Meeting Notes (11/16/16 10:07) ----- TestArmada is how we made large-scale test automation successful at Walmart
----- Meeting Notes (11/16/16 10:07) ----- We run a full cross-browser Selenium suite on every single pull request.
----- Meeting Notes (11/16/16 10:07) ----- That means devs are waiting, and they are very sensitive to time.
----- Meeting Notes (11/16/16 10:07) ----- also when tests slow -
----- Meeting Notes (11/16/16 09:47) ----- So we all wind up having an emotional, stressful reaction because of this word.
----- Meeting Notes (11/16/16 09:47) ----- "Oh, I'll just ignore that failure because it's flakey." That attitude allows real app bugs to slip through.
----- Meeting Notes (11/16/16 09:47) ----- We're going to turn this into a positive thing.
----- Meeting Notes (11/16/16 10:07) ----- If you can measure it, you can control it.
----- Meeting Notes (11/16/16 11:04) ----- "Hope is not a strategy" - Google SRE. We can't keep hiding from flake. We need to acknowledge that flake exists, and go hunting for it. When we find it, we measure it, and when we measure it, we can start to control it.
----- Meeting Notes (11/16/16 11:04) ----- So we called up Sauce Labs and BOOM, we went from 100 concurrent VMs to 1000.
----- Meeting Notes (11/16/16 10:07) ----- High concurrency is dramatically affected by test flake. To see how, let's briefly dive into how we orchestrate massive concurrency.
----- Meeting Notes (11/16/16 10:07) ----- two main benefits: 1. massively parallel runner 2. fault tolerant (handles retries, only reports a test as a failure if it fails 3x)
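
The fault-tolerance rule in this note (report a failure only after three failed attempts) can be sketched as a small retry wrapper. This illustrates the idea only and is not Magellan's actual API; `run_with_retries` and its result format are hypothetical.

```python
def run_with_retries(test_fn, max_attempts=3):
    """Run test_fn until it passes or max_attempts is exhausted.

    A test counts as failed only if every attempt raises AssertionError.
    The attempt count is returned so flake can be measured, not hidden.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            test_fn()
        except AssertionError:
            continue  # failed this attempt; retry if attempts remain
        return {"passed": True, "attempts": attempt}
    return {"passed": False, "attempts": max_attempts}


def always_passes():
    assert 1 + 1 == 2


print(run_with_retries(always_passes))
```

Recording `attempts` alongside the pass/fail result is what makes a flake rate measurable later: a pass with `attempts` greater than 1 is flake, not a clean pass.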
----- Meeting Notes (11/16/16 10:07) ----- rapid freezing - causes these crazy noises
----- Meeting Notes (11/16/16 10:07) ----- remember this last test - "amend order cancel"
----- Meeting Notes (11/16/16 10:07) ----- Imagine flake is like a bruised apple, but we don't know which parts are safe to eat. We keep slicing to separate out the good parts from the bad. We use metrics and data viz to slice our data the same way, looking to separate what's flakey from what's not.
----- Meeting Notes (11/16/16 09:32) ----- We have tests dipping into the yellow and red zone. Risk of timing out.
----- Meeting Notes (11/16/16 10:07) ----- When we're looking for flake, instead of trying to pretend it doesn't exist, it's exciting when we find it. Even more exciting when we narrow it down!
----- Meeting Notes (11/16/16 10:07) ----- Come get TA stickers!
