Your feedback is important and valuable.
Submit by 5pm Friday, November 16th to win prizes.
3 Ways to Access:
• Go to passSummit.com
• Download the GuideBook App and search: PASS Summit 2018
• Follow the QR code link displayed on session signage throughout the conference venue and in the program guide
MVP & Partner, Crafting
Author of Developing Azure Solutions
MVP for 8 years
Speaker at PASS Summit, SQL Bits,
Redgate SQL in the City, and many
Teach courses on SQL Server, SQL
Intelligence, Azure Cloud, and
• Why do we test? What’s the purpose?
• What is an Enterprise Data Architecture?
• What are the challenges of testing in an EDA?
• What solutions exist for those challenges?
• What challenges will likely always be there?
• What are some best practices in general to work within the
constraints we have?
Purpose of automatic testing
• To verify that changes to a production system do not break
behavior that previously worked as expected
• To preserve intent and self-document functionality
• To automatically deploy a system once all tests run successfully
• To increase our rate of change and match the pace the
business expects from us
• To test things unattended so the QA department is not a
bottleneck for product delivery
Testing: The epicenter of a deployment
What’s better at preserving intent?
Comments, Source Control Comments, or Tests?
• Source control comments are done AFTER the changes, not
before or with the changes
• Comments are often outdated or ignored, and the compiler
can’t check them
• Tests fail when intent is violated, pointing to the code that
failed, making it inviolate
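As a minimal sketch of the last bullet, assuming a hypothetical business rule (orders above 1000 get a 10% discount) and a hypothetical function name `order_total`, a test states the intent in executable form. If a later change violates the rule, the named test fails and points at the code involved:

```python
import unittest


def order_total(subtotal: float) -> float:
    """Hypothetical rule: orders above 1000 receive a 10% discount."""
    if subtotal > 1000:
        return round(subtotal * 0.9, 2)
    return round(subtotal, 2)


class OrderTotalIntent(unittest.TestCase):
    # Each test name documents one piece of intent.
    def test_large_orders_are_discounted(self):
        self.assertEqual(order_total(2000), 1800.0)

    def test_threshold_itself_is_not_discounted(self):
        self.assertEqual(order_total(1000), 1000.0)
```

Unlike a comment, these checks run on every build, so they cannot silently drift out of date.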
Testing: Rate of change
[Figure: time it takes to make a change (y-axis) plotted against time since project began (x-axis)]
QA Dept as Bottleneck
• Over time, developers can outpace a QA team. As more
application surface area is built, there is more to test
completely to see the impact of each change.
• Even the smallest changes begin having the largest impact
Enterprise Data Architecture
• Data Mart/Data Warehouse ODS
• Data Lake Architecture
• Lambda Architecture
• Kappa Architecture
• IoT Architecture
What do all of these architectures have in common?
• Some move data more than others, but there is always an
element of moving data from one place to the other
• They have processes that drastically change the data so that
it doesn’t resemble what it originally looked like
• They store the exact same data several times. I’ve seen
company EDAs that store the same source data seven times
before it arrives at the final destination
• I’ve seen companies that store different views of the same
aggregation: Weekly, Monthly, Quarterly, MTD, YTD, YOY
What are the common challenges of
testing these environments?
Why do so many companies fail at automated testing and give up?
Well, it’s not their fault, and it’s not yours
• EDAs have some insane complexity
• Most EDAs have all the things that software developers
avoid when writing code
Problem #2: Side-Effects
Stored procedures always violate the DRY
principle: Don’t repeat yourself
They do this for a variety of reasons:
• It is difficult to pass tables of data between
• It is easy to load a temp table up and
perform 100s of operations against it
• Stored procedures don’t have any of the
components necessary to avoid repetition
like inheritance, encapsulation,
polymorphism, arrays, foreach loops, etc.
All of this means we end up getting very large
stored procedures with 1000s of lines that
comprise a lot of our ETL
Problem #3: Queries take a long time to run
• If a suite of tests takes hours to
run, then we can’t run it very often
• We will never have good test
coverage of our architecture if
we can’t run the tests in between
changes to the system
• Queries time out
• Queries against source systems
are a bad idea, meaning we can’t
tell if we got all the data with
Problem #4: Source systems don’t really want
If we’re pulling data from transactional systems, testing against
the OLTP database creates a load. Create too much, and we’ll
get kicked off the source, pronto.
Problem #5: All of these EDAs have a lot of
repetitive data. Where do we test?
• Do we test data from source to staging? From staging to
ODS? From source to ODS? From source to DW?
• With all the repetition, we might have the manufacturing
problem that Taiichi Ohno railed against while inventing lean
manufacturing and Just-in-Time
Problem #6: Accuracy is deadly important
• Test data is not always available
• Production data must be 100% reconciled and accurate
• Cash flow statements, reports to the board, SEC filings,
auditing records, etc.
• No mistakes means no mistakes
Remember our goals with testing
• Did a change break the system?
• Are we preserving existing, needed functionality?
• Can we deploy automatically when our tests run?
OK, that’s the bad news, now what can we
do about it?
• Testing Strategies
• Testing Tools
• Testing Best Practices
• Are we all looking at the same requirements?
• Write a test plan, execute it!
• 100% test coverage is impossible, so what will we test?
• Where will we test?
• Where does it tend to break? Are we recording our outages?
• Stay focused
It’s not a perfect world, but here are some
things we can do
• Approval Tests
• Except Query Replacement Testing
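Assuming the second bullet refers to comparing two data sets with SQL's EXCEPT set operator, the idea can be sketched as follows. Python's built-in `sqlite3` stands in for the real database here, and the table names `staging_sales` and `dw_sales` are hypothetical. EXCEPT returns the rows of the first query that do not appear in the second, so running it in both directions shows what was lost and what was added or altered:

```python
import sqlite3


def rows_missing_from(conn, left, right):
    """Return rows present in table `left` but absent from table `right`."""
    sql = f"SELECT id, amount FROM {left} EXCEPT SELECT id, amount FROM {right}"
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_sales (id INTEGER, amount REAL);
    CREATE TABLE dw_sales      (id INTEGER, amount REAL);
    INSERT INTO staging_sales VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    -- Row 2 was altered and row 3 was dropped on the way to the warehouse.
    INSERT INTO dw_sales      VALUES (1, 10.0), (2, 25.0);
""")

missing_in_dw = rows_missing_from(conn, "staging_sales", "dw_sales")
unexpected_in_dw = rows_missing_from(conn, "dw_sales", "staging_sales")

# A passing test would require both lists to be empty.
print("missing in DW:", sorted(missing_in_dw))
print("unexpected in DW:", sorted(unexpected_in_dw))
```

Both result sets being empty means the two tables hold the same distinct rows. EXCEPT removes duplicates, so a separate row count check is still needed to catch duplicated rows.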
Best practices for data testing
• Test row counts
• Test random and sampled data elements
• Start with thorough manual testing and then slowly
• Alert in your audit log
• Test that records are streaming
• Make testing fast!
• Deleting tests is essential!
• It’s absolutely ok to delete tests that aren’t working for you. It is not
OK to give up on testing.
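The first two practices in the list above, row counts and sampled data elements, can be sketched as below. This is an illustration under stated assumptions, not a prescribed implementation: `sqlite3` stands in for the real source and target, and the table names, the `id` and `amount` columns, and both helper functions are hypothetical.

```python
import random
import sqlite3


def row_counts_match(conn, source, target):
    """Cheap first check: both tables hold the same number of rows."""
    def count(table):
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return count(source) == count(target)


def sampled_mismatches(conn, source, target, sample_size, seed=0):
    """Compare `amount` for a random sample of ids; return ids that differ."""
    ids = [r[0] for r in conn.execute(f"SELECT id FROM {source} ORDER BY id")]
    sample = random.Random(seed).sample(ids, min(sample_size, len(ids)))
    mismatches = []
    for row_id in sample:
        src = conn.execute(
            f"SELECT amount FROM {source} WHERE id = ?", (row_id,)).fetchone()
        tgt = conn.execute(
            f"SELECT amount FROM {target} WHERE id = ?", (row_id,)).fetchone()
        if src != tgt:
            mismatches.append(row_id)
    return sorted(mismatches)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE dw_orders  (id INTEGER PRIMARY KEY, amount REAL)")
rows = [(i, float(i)) for i in range(1, 101)]
conn.executemany("INSERT INTO src_orders VALUES (?, ?)", rows)
conn.executemany("INSERT INTO dw_orders VALUES (?, ?)", rows)
# Simulate one corrupted value in the target.
conn.execute("UPDATE dw_orders SET amount = -1 WHERE id = 42")

print(row_counts_match(conn, "src_orders", "dw_orders"))
print(sampled_mismatches(conn, "src_orders", "dw_orders", sample_size=20))
```

The row count passes here even though one value is wrong, which is why the two checks are paired. A small sample keeps the test fast but may miss the bad row on any given run; sampling every row finds it at the cost of a slower test.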
• SSRS Testing
• Power BI Testing
• Push logic into the database
• Push coloring into the database
• Test at the database
• JPG/PDF testing is still not mature enough to use
• Test report data sizes and subscription dates
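One way to read "push logic into the database", "push coloring into the database", and "test at the database" is sketched below. The status color is computed in a view, so the report only binds to a column and the rule can be tested with a plain query. The view name, the column names, and the 90% threshold are hypothetical, and `sqlite3` stands in for the reporting database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE region_sales (region TEXT, actual REAL, target REAL);
    INSERT INTO region_sales VALUES
        ('East', 120, 100), ('West', 95, 100), ('North', 60, 100);

    -- The coloring rule lives here, not in the report definition.
    CREATE VIEW region_sales_report AS
    SELECT region, actual, target,
           CASE
               WHEN actual >= target       THEN 'green'
               WHEN actual >= 0.9 * target THEN 'yellow'
               ELSE 'red'
           END AS status_color
    FROM region_sales;
""")


def status_colors(conn):
    """Return {region: status_color} as the report would receive it."""
    return dict(conn.execute(
        "SELECT region, status_color FROM region_sales_report"))


print(status_colors(conn))
```

Because the rule is a query result rather than a rendering detail, the test does not need to inspect a rendered JPG or PDF.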
Develop a culture of testing
There are teams that talk about testing in every sentence
There are teams that treat testing as a burden or someone
Guess who writes better products, has happy users, and enjoys
making changes to their own system?
Remember change lock. It will make you want to quit your job.
Write tests to avoid change lock