EuroSTAR Software Testing Conference 2012 presentation on How To Regression Test A Billion Rows Of Financial Data Every Sprint by Matt Archer.
See more at: http://conference.eurostarsoftwaretesting.com/past-presentations/
Regression testing is testing performed after changes to a system to detect whether new errors were introduced or old bugs have reappeared. It should be done after changes to requirements, new features added, defect fixes, or performance improvements. There are various strategies for regression testing including re-running all tests, test selection, test prioritization, and focusing on areas like frequently failing tests or recently changed code. While regression testing helps ensure system quality, managing large test suites over time poses challenges in minimizing tests while achieving coverage. Automating regression testing can help address these challenges.
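The selection and prioritization strategies the summary mentions can be sketched in a few lines. In this hypothetical example (test metadata, module names, and failure counts are all invented for illustration), tests touching recently changed code run first, and the rest are ordered by how often they have failed:

```python
def prioritise(tests, recently_changed, failure_counts):
    """Order tests so those touching recently changed modules come first,
    then the most frequently failing; a sketch, not a real tool's API."""
    def key(test):
        touches_change = any(m in recently_changed for m in test["modules"])
        # sorted() is ascending: False (touches a change) sorts before True,
        # and more failures -> more negative -> earlier.
        return (not touches_change, -failure_counts.get(test["name"], 0))
    return sorted(tests, key=key)

tests = [
    {"name": "t_report", "modules": ["reporting"]},
    {"name": "t_login",  "modules": ["auth"]},
    {"name": "t_export", "modules": ["reporting"]},
]
order = prioritise(tests, recently_changed={"auth"},
                   failure_counts={"t_export": 3, "t_report": 1})
assert [t["name"] for t in order] == ["t_login", "t_export", "t_report"]
```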
This document provides an overview of performance tuning best practices for Scala applications. It discusses motivations for performance tuning such as resolving issues or reducing infrastructure costs. Some common bottlenecks are identified as databases, asynchronous/thread operations, and I/O. Best practices covered include measuring metrics, identifying bottlenecks, and avoiding premature optimization. Microbenchmarks and optimization examples using Scala collections are also presented.
This document discusses software testing techniques and best practices. It covers test design techniques like equivalence partitioning and boundary value analysis. It emphasizes the importance of tests being fast, isolated, repeatable, self-validating, and thorough. The testing pyramid hierarchy of tests is explained. Test-driven development and various test utilities are also outlined. The conclusions emphasize that tests help increase confidence in code, prevent accidental breaks, and ensure documentation remains relevant.
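To illustrate the equivalence partitioning and boundary value analysis techniques named above, here is a minimal sketch against a hypothetical validator (the 1-100 range is invented for illustration):

```python
def is_valid_quantity(qty: int) -> bool:
    """Hypothetical rule under test: quantities from 1 to 100 are accepted."""
    return 1 <= qty <= 100

# Equivalence partitions: below range, in range, above range -- one
# representative each. Boundary values: both edges and one step beyond.
assert is_valid_quantity(0) is False    # just below the lower boundary
assert is_valid_quantity(1) is True     # lower boundary
assert is_valid_quantity(50) is True    # representative of the valid partition
assert is_valid_quantity(100) is True   # upper boundary
assert is_valid_quantity(101) is False  # just above the upper boundary
```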
Performed predictive data analytics on the “Black Friday Sales” dataset, in which the company wants to predict the purchase amount for each product, using the RapidMiner tool.
The document benchmarks 20 machine learning models on two datasets to compare their accuracy and speed. On the smaller Car Evaluation dataset, bagged decision trees, random forests and boosted decision trees achieved over 99% accuracy, while neural networks, decision stumps and support vector machines exceeded 95% accuracy. On the larger Nursery dataset, similar models exceeded 99% accuracy, while other models like decision rules and k-nearest neighbors exceeded 95% accuracy. However, models varied significantly in speed depending on the hardware, with decision trees, mixture discriminant analysis and gradient boosting the fastest on Car Evaluation, and mixture discriminant analysis, one rule and boosted decision trees the fastest on Nursery. The findings underline the importance of benchmarking regularly.
29 Advanced Google Tag Manager Tips Every Marketer Should Know - Mike Arnesen
Google Tag Manager is an incredibly powerful tool and one you're likely not using to its full potential. In my talk from MozCon 2016, I delivered 29 rapid-fire tips intended to empower marketers to overcome seemingly insurmountable odds and circumnavigate roadblocks using this powerful marketing tool.
The promise of DevOps is that we can push new ideas out to market faster while avoiding delivering serious defects into production. Andreas Grabner explains that testers are no longer measured by the number of defect reports they enter, nor are developers measured by the lines of code they write. As a team, you are measured by how fast you can deploy high quality functionality to the end user. Achieving this goal requires testers to increase their skills. It’s all about finding solutions—not just problems. Testers must transition from reporting “app crashes” to providing details such as “memory leak caused by bad cache implementation.” Instead of reporting “it’s slow,” testers must discover “wrong hibernate configuration causes too much traffic from the database.” Using three real-life examples, Andreas illustrates what it takes for testing teams to become part of the DevOps transformation—bringing more value to the entire organization.
MeasureWorks - Why people hate to wait for your website to load (and how to f... - MeasureWorks
My slides from DrupalJam 2014... About why users abandon your website and best practices to align content and speed to create a fast user experience, and continue to keep it aligned for every release
CQRS and Event Sourcing: A DevOps perspective - Maria Gomez
This document discusses challenges of deploying, monitoring, and debugging systems using CQRS and event sourcing from a DevOps perspective. It describes using a blue/green deployment approach, implementing consistent and usable logging, monitoring key metrics and data streams, and employing distributed tracing to identify the origin of requests in order to quickly debug problems. The overall goal is to build scalable, resilient, and automated systems while facilitating operational tasks through iterative improvements to tools and processes.
This document provides instructions for creating a mapping in Informatica Power Center to perform data quality checks on financial account data from a source table to load into a target table. It describes importing the source and target tables, creating a filter transformation to select records where the account number length is 8 characters and the difference between open and close dates is not less than 30 days, and generating the mapping. The objective is to map data that meets specific rules for the target system.
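The two filter rules described above can be restated in plain code. This is not Informatica's filter expression syntax, just a hypothetical Python re-statement of the conditions a record must meet to reach the target:

```python
from datetime import date

def passes_quality_checks(account_number: str,
                          open_date: date, close_date: date) -> bool:
    """Keep a record only if the account number is exactly 8 characters
    long and the account stayed open for at least 30 days (i.e. the
    open/close difference is not less than 30 days)."""
    return len(account_number) == 8 and (close_date - open_date).days >= 30

assert passes_quality_checks("12345678", date(2020, 1, 1), date(2020, 3, 1))
assert not passes_quality_checks("1234567", date(2020, 1, 1), date(2020, 3, 1))   # too short
assert not passes_quality_checks("12345678", date(2020, 1, 1), date(2020, 1, 15)) # < 30 days
```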
Drupal content management system (cms) based e commerce portal - Sandeep Kumbhar
A content management system is computer software used to manage the creation and modification of digital content. CMSs are typically used for enterprise content management and web content management. Here we will look at a Drupal CMS-based e-commerce portal.
This document provides an overview of GraphQL basics and how to use it with LeanIX. It describes GraphQL as a query language for APIs that allows clients to request specific data fields and relationships from an application's data model in a single request. The document demonstrates how to access the integrated GraphQL IDE from a LeanIX workspace and compile queries step-by-step using autocomplete and documentation. It shows how GraphQL enables more efficient and flexible data retrieval compared to REST APIs.
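A minimal sketch of the single-request idea: the client builds one GraphQL request body that names exactly the fields it wants. The query text and field names below are illustrative, not the real LeanIX schema:

```python
import json

# Hypothetical LeanIX-style query; field names are invented for illustration.
QUERY = """
{
  allFactSheets(factSheetType: Application) {
    edges { node { name } }
  }
}
"""

def build_graphql_payload(query, variables=None):
    """Encode a GraphQL request body. Unlike a REST endpoint per resource,
    the query itself names every field the client wants, so a single POST
    fetches them all."""
    body = {"query": query}
    if variables:
        body["variables"] = variables
    return json.dumps(body).encode("utf-8")

payload = build_graphql_payload(QUERY)
assert json.loads(payload)["query"] == QUERY
```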
The document discusses the challenges of processing and storing billions of data inserts per day from vehicle telematics projects. Some key points:
- The project involves receiving continuous data streams from over 500 vehicles with 2500 data points captured per vehicle per second, resulting in over 1.5 billion MySQL inserts daily.
- A message queue is used to receive the streaming data and buffer inserts to help scale processing. Additional optimizations include bulk loading data via LOAD DATA INFILE for speed.
- Sharding and splitting the data across multiple databases by vehicle and time period (weekly tables) helps improve query performance for both live and historical data access.
- Techniques like asynchronous requests, caching, and a single entry point are also employed to manage the load.
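The sharding-by-vehicle and weekly-table scheme in the bullets above could look like the sketch below; the shard count and table-naming convention are invented for illustration:

```python
from datetime import date

def shard_table_name(vehicle_id: int, sample_date: date,
                     num_shards: int = 8) -> str:
    """Route a reading to a database shard by vehicle id and to a weekly
    table by date; a sketch of the idea, not the project's actual layout."""
    shard = vehicle_id % num_shards
    year, week, _ = sample_date.isocalendar()  # ISO year and week number
    return f"shard{shard}.readings_{year}w{week:02d}"

# Readings from one vehicle in one ISO week land in the same table, keeping
# both live queries (current week) and history scans (older weeks) small.
assert shard_table_name(501, date(2012, 6, 4)) == "shard5.readings_2012w23"
```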
When Data Visualizations and Data Imports Just Don’t Work - Jim Kaplan CIA CFE
When Data Visualizations and Data Imports Just Don’t Work – Importing data is a dirty job, and so is painting a final picture for users with that data. This webinar will explore the dirty little secrets of ensuring data is imported completely and accurately, as well as scenarios where a visualization may not be the best approach to meeting an audit objective. Specific learning objectives include:
o Walk through case studies of “dirty” data and how to improve them using improved data requests and cleansing tools.
o Watch case study examples of top tests to validate data tables to ensure data quality.
o Discover a host of baseline tests and other baseline statistics to validate, understand and possibly extract key trends for review.
o Understand visualization and dashboard types along with their associated analytical strengths from an audit perspective.
o Identify situations where statistics may be more effective audit extractors than relying on the human eye to spot notable events.
The document outlines a data science workflow including establishing a data science project lifecycle, assembling an effective multidisciplinary team to take on different roles, and setting up a standardized project structure and folder system to organize code, data, documentation, and deliverables. It advocates using an agile, iterative process to improve collaboration among team members throughout the data science pipeline from data acquisition and exploration to modeling, deployment, and customer acceptance.
Talk on how to easily integrate Elasticsearch with React. A similar process, with remapping of the data schema, can yield a knowledge discovery and search application for any industry consuming huge amounts of structured or unstructured data.
Cloudera Data Science Challenge 3 Solution by Doug Needham - Doug Needham
The document outlines the requirements and problems for Cloudera's Data Science certification challenge. It requires completing a test, and solving 3 problems involving flight delay prediction using machine learning, web analytics using statistical analysis, and recommending social media connections using graph analysis. Solutions are scored based on accuracy and a written abstract explaining the methodology.
MeasureWorks - Why your customers don't like to wait! - MeasureWorks
My presentation at the Zycko breakfast session... About why your users don't like to wait and why you should care as a site owner. This presentation covers the importance of perception of speed, navigation and how to do proper performance monitoring...
This document is a machine learning class assignment submitted by Trushita Redij to their supervisor Abhishek Kaushik at Dublin Business School. The assignment discusses data preprocessing techniques, decision trees, the Chinese Restaurant algorithm, and building supervised learning models. Specifically, linear regression and KNN classification models are implemented on population data from Ireland to predict total population and classify countries.
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf - Altinity Ltd
OSA Con 2022: Scaling your Pandas Analytics with Modin
Doris Lee - Ponder
Pandas is one of the most commonly used data science libraries in Python, with a convenient set of APIs for data cleaning, visualization, analysis, and exploration. However, despite its widespread adoption, Pandas suffers from severe scalability issues on large datasets. We developed the open-source project Modin, which is a fast, scalable drop-in replacement for pandas. Modin has been downloaded more than 4 million times and is used by leading data science teams, including Fortune 100 companies.
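The "drop-in replacement" claim amounts to a one-line import change; everything downstream keeps calling the same pandas API. A hedged sketch (with a fallback to stock pandas so the snippet also runs where Modin is not installed):

```python
# The swap is the import line only; the rest of the code is unchanged.
try:
    import modin.pandas as pd  # distributes pandas work across cores
except ImportError:
    import pandas as pd        # identical API, single-threaded fallback

df = pd.DataFrame({"team": ["a", "a", "b"], "rows": [10, 20, 30]})
totals = df.groupby("team")["rows"].sum()
assert int(totals["a"]) == 30 and int(totals["b"]) == 30
```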
The document provides details about a business intelligence project for a fast fashion company. It includes:
1) An overview of the company's existing operational systems, such as their online website, mobile app, and desktop app, and the limitations of these systems.
2) An analysis of the company's requirements and key metrics like KPIs and KRIs that the BI project will measure.
3) The design of the BI project using a star schema model with dimensions like products, customers, and dates.
4) Details of the implementation including the hardware/software environment, ETL process using Talend, staging the data, transforming it, and loading it into views and tables with foreign keys in SQL Server.
[Tutorial] building machine learning models for predictive maintenance applic... - PAPIs.io
The document discusses using machine learning for predictive maintenance in IoT applications compared to traditional approaches. It describes using publicly available aircraft engine data to build models in Azure ML to predict remaining useful life. Models tested include regression, binary classification, and multi-class classification. An end-to-end pipeline is demonstrated, from data preparation through deploying web services with different machine learning models.
Seagate relies heavily on big data analytics to ensure high quality in data storage. As data storage needs grow exponentially, predictive analytics are crucial to avoid costly failures. Seagate collects terabytes of manufacturing, testing, component, and field data daily. This data is analyzed using machine learning algorithms to predict and prevent drive failures, helping ensure the reliability of over 1 billion drives expected in cloud datacenters by 2020. Seagate's big data analytics infrastructure combines comprehensive data collection, large-scale analytics capabilities, and data-driven decision making to advance quality control in high-volume data storage manufacturing.
Automated Historical Performance Analysis with kmemtracer - Kyungmin Lee
This document discusses using kmemtracer to automate historical performance analysis on Android. It describes how kmemtracer uses instrumentation to track activity lifecycles and collect memory usage snapshots without modifying the application code. Snapshots containing metrics like native memory usage are saved in bundles and written to files by a ResultsWriter for later analysis. This allows measuring and improving an app's performance over time.
Why We Need Diversity in Testing - Accenture - TEST Huddle
In this webinar Rasa (Testing capability lead for Denmark) and Matthias (EALA Testing capability lead) will share some of their own experiences of why diversity matters, give insights into how Accenture as a global firm is promoting diversity, and explain how we are changing our attitudes and processes to make all of this sustainable.
Keys to continuous testing for faster delivery - EuroSTAR webinar - TEST Huddle
Your business needs to deliver faster. To accommodate, development needs to introduce fewer changes at a much more frequent cadence. This creates a challenge for test teams: keeping up with the rapid pace of change without compromising on quality. Automation is paramount to the success or failure of Continuous Delivery, and Continuous Testing enables early and frequent quality feedback throughout the CI/CD pipeline.
In this webinar, Eran & Ayal will explore how to implement Continuous Testing to ensure high-quality releases in a Continuous Delivery environment, including what to test and when to automate new functionality in order to optimize your efforts.
Why You Shouldn't Automate But You Will Anyway - TEST Huddle
The document discusses automation in software testing. It begins by outlining common claims made about the benefits of automation, such as saving time and improving quality, but argues that these claims often don't hold true. Automation does not inherently save time, guarantee quality, or reduce resources needed. It also does not always save money when development, maintenance, and infrastructure costs are considered. The document provides a formula for determining when automation is worthwhile based on how many times a test case would need to be rerun manually. It concludes by acknowledging that, despite these drawbacks, organizations will still automate testing because it is exciting, managers demand it, and it benefits careers.
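The summary mentions a break-even formula without reproducing it; the sketch below is one plausible form of that idea, with all cost figures invented for illustration:

```python
def automation_pays_off(manual_cost, automation_dev_cost,
                        maintenance_per_run, expected_runs):
    """Break-even sketch: automating is worthwhile only when building and
    maintaining the automated test costs less than executing the same test
    manually for every expected rerun. Costs are in arbitrary units."""
    automated_total = automation_dev_cost + maintenance_per_run * expected_runs
    manual_total = manual_cost * expected_runs
    return automated_total < manual_total

# A frequently rerun test amortises its build cost...
assert automation_pays_off(manual_cost=2.0, automation_dev_cost=40.0,
                           maintenance_per_run=0.2, expected_runs=50)
# ...while a rarely rerun test does not.
assert not automation_pays_off(manual_cost=2.0, automation_dev_cost=40.0,
                               maintenance_per_run=0.2, expected_runs=10)
```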
In this webinar Carsten will explore the role of the tester in a Scrum team. He will examine where the tester plays an important role in Scrum and how you can contribute to a team's performance.
Leveraging Visual Testing with Your Functional Tests - TEST Huddle
Designing and implementing (or selecting) the right automation strategy for functional testing, combined with visual testing, can give your project greater test coverage while improving test scalability.
Big Data: The Magic to Attain New Heights - TEST Huddle
This document discusses how big data and data science can be used to attain new heights, likening it to magic. It provides an overview of Ken Johnston's background and experiences in data science. It then discusses six keys to a "big" magic show with big data: trying multiple times, addressing issues with over-counting, experimentation techniques like A/B testing, infrastructure for big data, tools and skills, and security, privacy and fraud protection. The document emphasizes the importance of an assistant to help the data scientist or data engineer with various tasks.
This talk suggests how we might make sense of the tools landscape of the near future, where the pressure to modernise processes and automate is greatest, and what a new test process supported by tools might look like.
Takeaways:
- We need to take machine learning in testing seriously, but it won’t be taking our jobs just yet
- We don’t need more test automation tools; today we need tools that capture tester knowledge
- Tools that learn and think can't work for testers until we solve the knowledge capture challenge.
View On-Demand Webinar: https://youtu.be/EzyUdJFuzlE
The document discusses Test Driven Development (TDD) and Test Driven Design. It uses the analogy of building a lightsaber and later a Death Star to illustrate the TDD process and benefits. Some benefits mentioned are better test coverage, less debugging, and better design. The document provides tips for practicing TDD including planning ahead, defining boundaries, taking small steps to pass each test, and maintaining discipline. It emphasizes trying TDD in a team and considering Behavior Driven Development (BDD) as well.
Scaling Agile with LeSS (Large Scale Scrum) - TEST Huddle
In this webinar, Elad will cover the principles that the #LeSS framework has to offer in order to enable big organisations to become agile.
View webinar recording - https://huddle.eurostarsoftwaretesting.com/resource/agile-testing/scaling-agile-less-large-scale-scrum/
Creating Agile Test Strategies for Larger Enterprises - TEST Huddle
Having difficulty creating an agile test strategy for your company? Let Testing Excellence Award winner, Derk-Jan de Grood, show you how it’s done
View webinar recording here - http://huddle.eurostarsoftwaretesting.com/resource/agile-testing/creating-agile-test-strategies-larger-enterprises/
3 key takeaways
- Do you know the meaning of your organisation, system, product?
- Can you deliver the important risks right away?
- How can you communicate about the (process and product) risks you’re dealing with?
View Webinar recording: https://huddle.eurostarsoftwaretesting.com/resource/test-management/is-there-a-risk/
Are Your Tests Well-Travelled? Thoughts About Test Coverage - TEST Huddle
This document summarizes a presentation on test coverage given by Dorothy Graham. It uses an analogy of travel to different locations to explain what test coverage means and some caveats. Coverage refers to the relationship between tests and the parts of a system being tested, but achieving 100% coverage does not mean everything is tested. There are four caveats discussed: coverage only measures one aspect of testing, a single test can achieve coverage, coverage does not indicate quality, and it only applies to the existing system not missing pieces. The key recommendation is to ask "coverage of what?" when the term is used rather than assuming more coverage is always better.
Growing a Company Test Community: Roles and Paths for Testers - TEST Huddle
Over the past three years, our company’s test team has grown from three lonesome testers to a community of nine – with more planned. Since we don’t see testers as “click monkeys”, but as valuable and integrated project members who bring a specific skill set to the table, it’s important for us to choose testers well and to train them in various areas so that they can contribute, grow and see their own career path within testing.
To structure our internal tester training program, we have been developing role descriptions, education paths and career options for our testers, which I’d like to share with you in this webinar.
View webinar - https://huddle.eurostarsoftwaretesting.com/resource/webinar/growing-company-test-community-roles-paths-testers/
It’s the same argument again and again. One side says “team members should all be able to do everything, and the programmers should do their testing and all testers should be writing code”. The other side says “No, that can’t possibly work – programmers don’t know how to test, they don’t have the right mindset”. And on and on it goes.
http://huddle.eurostarsoftwaretesting.com/resource/webinar/need-testers-agile-teams/
In this webinar, Dave Haeffner (Elemental Selenium, USA) discusses how to:
- Build an integrated feedback loop to automate test runs and find issues fast
- Setup your own infrastructure or connect to a cloud provider
- Dramatically improve test times with parallelization
https://huddle.eurostarsoftwaretesting.com/resource/webinar/use-selenium-successfully/
Testers & Teams on the Agile Fluency™ Journey - TEST Huddle
The document discusses the Agile Fluency model, which aims to help teams and testers improve their agile skills and practices over time. It describes a pathway with increasing levels of fluency that provide more benefits, including delivering value, optimizing value, and innovating. Reaching higher levels requires investments in training, coaching, and changing team structures and roles. The model can help organizations determine what level of fluency they need and what investments are required for testing teams to operate at that level.
Practical Test Strategy Using Heuristics - TEST Huddle
Key Takeaways
- See what makes a good test strategy
- Learn how to make a thorough test strategy
- Identify what the ‘Heuristic Test Strategy Model’ is
- Develop a solid test strategy that fits, fast
- Discover how diversification can help you to create a test strategy
Key Takeaways:
- A diagramming method that helps discuss roles
- A one page analysis heuristic for roles
- Why roles matter on projects
https://huddle.eurostarsoftwaretesting.com/resource/people-skills/thinking-through-your-role/
Key Takeaways:
- What will this release contain
- What impact will it have on your test runs
- How can you preserve your existing investment in tests using the Selenium WebDriver APIs, and your even older RC tests
- Looking forward, when will the W3C spec be complete
- What can we expect from Selenium 4
https://huddle.eurostarsoftwaretesting.com/
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers - akankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
HCL Notes and Domino license cost reduction in the world of DLAU - panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and the licenses under the CCB and CCX model have been a hot topic for many in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new type of licensing works and what benefit it brings you. Above all, you certainly want to stay within your budget and save costs wherever possible. We understand that, and we want to help!
We will explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove superfluous or unused accounts to save money. There are also some practices that can lead to unnecessary expense, for example using a person document instead of a mail-in database for shared mailboxes. We will show you such cases and their solutions. And of course we will explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder will introduce you to this new world. It will give you the tools and the know-how to keep track of everything. You will be able to reduce your costs through an optimised Domino configuration and keep them low in the future.
These topics will be covered:
- Reducing license costs by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Real-world examples and best practices you can apply immediately
Trusted Execution Environment for Decentralized Process Mining - LucaBarbaro3
Presentation of the paper "Trusted Execution Environment for Decentralized Process Mining" given during the CAiSE 2024 Conference in Cyprus on June 7, 2024.
Dandelion Hashtable: beyond billion requests per second on a commodity server - Antonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. In a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
Generating privacy-protected synthetic data using Secludy and Milvus - Zilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
Introduction of Cybersecurity with OSS at Code Europe 2024 - Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency - ScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
5th LF Energy Power Grid Model Meet-up Slides - DanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
- Insightful presentations covering two practical applications of the Power Grid Model.
- An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
- An interactive brainstorming session to discuss and propose new feature requests.
- An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack - shyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin... - Tatiana Kojar
Skybuffer AI, built on the robust SAP Business Technology Platform (SAP BTP), is the latest and most advanced version of our AI development, reaffirming our commitment to delivering top-tier AI solutions. Skybuffer AI harnesses all the innovative capabilities of the SAP BTP in the AI domain, from Conversational AI to cutting-edge Generative AI and Retrieval-Augmented Generation (RAG). It also helps SAP customers safeguard their investments into SAP Conversational AI and ensure a seamless, one-click transition to SAP Business AI.
With Skybuffer AI, various AI models can be integrated into a single communication channel such as Microsoft Teams. This integration empowers business users with insights drawn from SAP backend systems, enterprise documents, and the expansive knowledge of Generative AI. And the best part of it is that it is all managed through our intuitive no-code Action Server interface, requiring no extensive coding knowledge and making the advanced AI accessible to more users.
Skybuffer SAM4U tool for SAP license adoption - Tatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Matt Archer - How To Regression Test A Billion Rows Of Financial Data Every Sprint - EuroSTAR 2012
1. How to regression test a billion rows of financial data every sprint
Matt Archer, Independent Software Tester, UK
www.eurostarconferences.com
@esconfs
#esconfs
2. Big data (a pension scheme example)
100,000 people x 12 months x 100 years = 1.2 million estimates
1.2 million estimates x 10s of financial models = billions of database rows (a very large database!)
3. Creating a website
Sprint 1: Pages 1, 2 and 3 to develop + test
4. Creating a website
Sprint 2: Pages 4, 5 and 6 to develop + test; Pages 1-3 to regression test
5. Creating a website
Sprint 3: Pages 7, 8 and 9 to develop + test; Pages 1-6 to regression test
6. Creating a website
Sprint 4: Pages 10, 11 and 12 to develop + test; Pages 1-9 to regression test
7. Creating a website
Sprint 5: Pages 1-3 to enhance + test + regression test; Pages 4-12 to regression test
8. A common approach
Regression Test Strategy 1 (“The Regression Pack”)
1. Re-run tests from previous sprints.
2. Select those tests using a risk-based heuristic (rule of thumb).
3. Automate as much as possible, increasing the coverage over time.
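Strategy 1 hinges on a risk-based rule of thumb for deciding which earlier tests to re-run each sprint. As a rough illustration of how such a heuristic can be mechanised (the scoring fields, weights, and test names below are invented for the sketch, not taken from the talk):

```python
# Hypothetical sketch of "The Regression Pack" selection step: score each
# existing test with a simple risk heuristic and re-run the top scorers
# that fit in the sprint's budget. All fields and weights are illustrative.
from dataclasses import dataclass

@dataclass
class TestCase:
    name: str
    failures_last_quarter: int   # how often this test caught a bug recently
    covers_changed_code: bool    # does it exercise code changed this sprint?
    business_criticality: int    # 1 (low) to 3 (high)

def risk_score(t: TestCase) -> int:
    # Rule of thumb: weight recent failures, changed code, and criticality.
    return (2 * t.failures_last_quarter
            + (5 if t.covers_changed_code else 0)
            + 3 * t.business_criticality)

def select_regression_pack(tests, budget):
    # Re-run the highest-risk tests that fit the sprint's time budget.
    return sorted(tests, key=risk_score, reverse=True)[:budget]

tests = [
    TestCase("login", 0, False, 3),
    TestCase("pension_estimate", 2, True, 3),
    TestCase("help_page", 0, False, 1),
]
print([t.name for t in select_regression_pack(tests, 2)])
# → ['pension_estimate', 'login']
```

The slide's point about growing coverage over time would amount to raising the budget, or automating the high scorers first.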
12. How many permutations for an entire site!?
Pages 1-3 to enhance + test + regression test; Pages 4-12 to regression test
13. The thoroughness vs. maintenance conundrum
Follow your instincts… create those tests if you think they’ll find bugs!
BUT more tests = more maintenance
15. An alternative approach
Regression Test Strategy 2 (“Capture and Compare”)
1. Take snapshots of the data as it is displayed in a known high-quality build.
2. Store the snapshots somewhere safe.
3. Use the snapshots to help identify unforeseen changes (bugs) in future release candidates and live releases.
16. Step 1: Identify a gold build
The “Gold” Build (1.3.2.14), spanning Pages 1-12:
“I’m so great, I’ve been the subject of… BDD, unit tests, integration tests, exploratory testing, UX inspections, SME reviews, customer demos, and maybe even… production use (with NO complaints!)”
17. Step 2: Specify the data permutations to capture
<WebPage path="/analytics">
  <QueryStringParam name="InformationType">
    <QueryStringValue value="PresentValue"/>
    <QueryStringValue value="Cashflow"/>
  </QueryStringParam>
  …
</WebPage>
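One way to turn a specification like this into concrete pages to snapshot is to expand every combination of query-string values. A minimal stand-alone sketch using the standard library (the Currency parameter and the host name are invented here to show a second axis of permutation; only InformationType appears on the slide):

```python
# Expand a capture specification (modelled on the slide's XML) into the
# full set of URLs to snapshot from the gold build.
import itertools
import xml.etree.ElementTree as ET

SPEC = """
<WebPage path="/analytics">
  <QueryStringParam name="InformationType">
    <QueryStringValue value="PresentValue"/>
    <QueryStringValue value="Cashflow"/>
  </QueryStringParam>
  <QueryStringParam name="Currency">
    <QueryStringValue value="GBP"/>
    <QueryStringValue value="EUR"/>
  </QueryStringParam>
</WebPage>
"""

def expand(spec_xml, host):
    page = ET.fromstring(spec_xml)
    params = [(p.get("name"), [v.get("value") for v in p])
              for p in page.findall("QueryStringParam")]
    names = [n for n, _ in params]
    # Cartesian product of all parameter values = all permutations to capture.
    for combo in itertools.product(*(vals for _, vals in params)):
        query = "&".join(f"{n}={v}" for n, v in zip(names, combo))
        yield f"{host}{page.get('path')}?{query}"

for url in expand(SPEC, "https://gold-build.example"):
    print(url)
```

With two values per parameter this yields four URLs; the combinatorial growth is exactly the "how many permutations for an entire site!?" problem from slide 12.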
18. Step 3: Capture and clean the HTML tables
Capture (capture specifications): String Host, String Spec_Location, String Save_Location; Void Capture(), Void Clean()
Selenium applies the capture specifications to the “gold” build and saves data snapshots such as:
<table>
  <tr>
    <td>4.24</td>
    <td>15.93</td>
    <td>12.67</td>
    …
  </tr>
  …
</table>
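The capture step grabs each page's HTML tables and "cleans" them so the snapshots diff reliably between builds. The talk drives the pages with Selenium; the sketch below substitutes a plain HTTP fetch so the cleaning logic stays runnable on its own, and the attributes it strips are assumptions about what counts as volatile:

```python
# Sketch of Capture()/Clean() from slide 18. Selenium rendering is stubbed
# out with urllib; the cleaning keeps only table markup and drops
# attributes (ids, classes, inline styles) that change between builds
# without changing the data being regression tested.
import re
from urllib.request import urlopen

def capture(url):
    # In the real setup Selenium would render the page in a browser;
    # a plain HTTP fetch stands in for it here.
    return urlopen(url).read().decode("utf-8")

def clean(html):
    tables = re.findall(r"<table.*?</table>", html, flags=re.S | re.I)
    cleaned = []
    for t in tables:
        t = re.sub(r"<(\w+)[^>]*>", r"<\1>", t)   # strip attributes
        t = re.sub(r">\s+<", "><", t).strip()     # normalise whitespace
        cleaned.append(t)
    return cleaned

snapshot = clean("""
<body><h1>Analytics</h1>
<table id="t1" class="grid">
  <tr><td style="color:red">4.24</td><td>15.93</td><td>12.67</td></tr>
</table></body>
""")
print(snapshot[0])
# → <table><tr><td>4.24</td><td>15.93</td><td>12.67</td></tr></table>
```

The cleaned strings are what gets stored "somewhere safe" as the gold-build snapshot.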
19. Step 4: Synchronise the data
The release candidate’s database is synchronised with the “gold” build’s database.
20. Step 5: Compare the release against the snapshot
Compare: String Host, String Spec_Location, String Snap_Location; Void Compare(), Void Highlight()
The capture specifications and data snapshots are compared with the release candidate to produce a highlighted release candidate.
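The compare step can then walk the release candidate's cleaned tables cell by cell against the gold snapshot. A minimal sketch (the cell-position report is illustrative; a real Highlight() would mark the differing cells in the rendered page, and would also flag added or removed rows):

```python
# Sketch of Compare()/Highlight() from slide 20: diff the release
# candidate's cleaned tables against the gold-build snapshot and report
# every cell whose value changed.
import re

def cells(table_html):
    return re.findall(r"<td>(.*?)</td>", table_html)

def compare(snapshot, candidate):
    diffs = []
    for i, (old, new) in enumerate(zip(cells(snapshot), cells(candidate))):
        if old != new:
            diffs.append((i, old, new))
    return diffs

gold = "<table><tr><td>4.24</td><td>15.93</td><td>12.67</td></tr></table>"
rc   = "<table><tr><td>4.24</td><td>15.93</td><td>12.68</td></tr></table>"

for pos, old, new in compare(gold, rc):
    # Each hit is either a regression bug or a deliberate change; deciding
    # which is slide 21's manual validation step.
    print(f"cell {pos}: expected {old}, got {new}")
# → cell 2: expected 12.67, got 12.68
```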
21. Step 6: Manual validation
What is a bug? What is a deliberate change?
22. What to capture, what to test?
What to capture? → What to test?
- The final build from the previous sprint → the final build from the current sprint
- The build currently deployed in live → the build currently deployed in staging
- The build deployed in live prior to the release → the build deployed in live after the release
• The technique can support a variety of team milestones
23. Summary
• Beware generic regression testing (or any testing) strategies
• Look for more efficient ways of detecting regression bugs
• Balance your regression techniques
24. Questions?
Matt Archer
August 2012
Twitter: @MattArcherUK
Blog: mattarcherblog.wordpress.com
Email: matthewjarcher@googlemail.com