Light Speed with Test Data Management
Efficiency Out of Chaos
Kellyn Pot’Vin-Gorman | Technical Intelligence Manager
2© 2016 Delphix Corporation
Kellyn Pot’Vin-Gorman
Technical Intelligence Manager, Delphix
• Multi-platform DBA (Oracle, MSSQL, MySQL,
Sybase, Postgres…)
• Oracle ACE Director (Alumni)
• Oak Table Network
• APEX Women in Technology Award, CTA 2014
• STEM education with Raspberry Pi and Python
• Board of Directors, RMOUG
• Training Days Conference Director
• Author, blogger (http://dbakevlar.com)
This session focuses on the database and application
environment. No special knowledge of a particular database or
vendor platform is required to gain value from the session,
although I can answer questions on most platforms thanks to my
multi-platform technical background.
Agenda
1. Story Time
2. What is Test Data Management
3. Agile Testing, Speed is Everything
4. Data Masking
5. Code Control
Story Time
• Large company
• 4TB transactional database (small by today’s standards)
• Financial data, aggregated to other financial systems
• Agile development released (most often) directly to production
• Archaic development environment (1/3 the size of production) and a
rarely used test environment
After 4-6 Weeks of Research…
• Over 40% data corruption in the main transactional system that feeds
into the datamarts.
• An even higher percentage of corruption in the marts, due to poor
agile development practices and the percentage of highly volatile
source data.
I now have to reveal my findings to senior management as the
new Lead DBA…
“Accuracy is over-rated…”
Test Data Management
https://www.tutorialspoint.com/software_testing_dictionary/test_data_management.htm
Test Data management is very critical during the test life cycle. The
amount of data that is generated is [often] enormous to test the
required changes.
Over 80% of organizations stated that RECEIVING or REFRESHING
the data to perform tests was the largest consumer of testing time
(over 90%), leaving actual testing work to consume less than 10% of
the overall testing effort.
What Is Test Data Management?
• Tools that assist in producing data sets for testing.
• Not only produce them, but do so at an interval that matches the
testing cycle.
• An ability to quickly isolate and deliver test FAILURES for development
to investigate.
• An ability to identify code changes by version.
• To deliver in a timely manner, many TDM applications create synthetic
data or subsets of data that aren’t a full copy of the production experience.
• Many don’t take copy data management and security into consideration.
As important as development
environments are, providing proper
testing that can handle the
speed of your company’s
deployments is a bigger hurdle.
Cloning: Can’t Be Done Effectively
http://www.informationweek.com/pdf_whitepapers/approved/1345732672_back_to_basics.pdf
What is the “Right Size”?
This is not the same data, and
these are not the same challenges,
that the developer and tester will face
once the code gets to production!
http://www.informationweek.com/pdf_whitepapers/approved/1345732672_back_to_basics.pdf
Virtualization Eliminates 90%+ of Storage Issues
[Diagram: Virtualize and Deploy. Labels: PRODUCTION; Database/App Tier; 1 TB; 1 TB; 600GB; Storage Pool for Delphix; Read From Production; QA; DEV; PATCH TEST; TEST; Read AND Write]
Each virtual database takes up around 5-10GB upon creation (dependent upon parameters).
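The storage arithmetic behind this slide can be sketched in a few lines of Python. This is a rough illustration, not Delphix sizing guidance: it assumes the 600GB figure is the shared, compressed storage pool, uses the upper estimate of 10GB per virtual database, and treats 1 TB as 1000 GB. The function names are hypothetical.

```python
# Rough storage comparison: full physical copies vs. virtual databases (VDBs).
# Assumptions (not from Delphix documentation): the 600 GB figure is the
# shared, compressed pool; each VDB starts at about 10 GB; 1 TB = 1000 GB.


def physical_storage_gb(copies: int, source_gb: float) -> float:
    """Every non-production environment holds its own full copy."""
    return copies * source_gb


def virtual_storage_gb(copies: int, pool_gb: float, vdb_gb: float) -> float:
    """One shared pool plus a small footprint per virtual database."""
    return pool_gb + copies * vdb_gb


def savings(copies: int, source_gb: float = 1000, pool_gb: float = 600,
            vdb_gb: float = 10) -> float:
    """Fraction of storage avoided by virtualizing instead of copying."""
    physical = physical_storage_gb(copies, source_gb)
    virtual = virtual_storage_gb(copies, pool_gb, vdb_gb)
    return 1 - virtual / physical


if __name__ == "__main__":
    for n in (4, 7, 10):
        print(f"{n} copies: {savings(n):.1%} less storage")
```

Under these assumed numbers, four environments (QA, DEV, PATCH TEST, TEST) need roughly 84% less storage than four full copies, and the saving passes 90% once seven or more copies share the pool.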
http://www.idtheftcenter.org/images/breach/ITRCBreachReport_2016.pdf
Data Security
[Diagram labels: Confidential data; Production; Non-Production; Exposure]
[Diagram labels: Confidential data; Production; Non-Production; Exposure; Encryption; Masking; Solution]
Masking in the Picture
If 80% of a company’s data are copies, then 80% of its data isn’t secured
like a production environment. Securing this data is not just a priority;
in many cases it is subject to legal ramifications (e.g., PCI/PII).
• Masking Requirements
• Masking shouldn’t be reversible.
• The masked data should be representative of the original data type to ensure
performance is consistent.
• Referential integrity should be maintained as part of the masking process.
Masking should be a simple, repeatable process, with a user interface
that keeps it simple.
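The three masking requirements on this slide can be illustrated with a minimal Python sketch. It is not how Delphix or any particular masking product works; it swaps in a keyed one-way hash (HMAC-SHA256 from the standard library) as a stand-in technique. The `mask_ssn` function, the `MASKING_KEY` value, and the sample rows are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret. In practice it would be generated securely and kept
# out of non-production systems; anyone holding it could brute-force the
# small SSN input space, so the one-way property depends on its secrecy.
MASKING_KEY = b"replace-with-a-securely-stored-secret"


def mask_ssn(ssn: str, key: bytes = MASKING_KEY) -> str:
    """Return a masked value in the same NNN-NN-NNNN format as the input.

    The same input always yields the same output, so a masked SSN used as a
    join key still matches across tables. The keyed hash is one-way, so the
    original is not stored anywhere in the masked value.
    """
    digits = "".join(ch for ch in ssn if ch.isdigit())
    digest = hmac.new(key, digits.encode("utf-8"), hashlib.sha256).hexdigest()
    masked = f"{int(digest, 16) % 10**9:09d}"  # keep exactly nine digits
    return f"{masked[:3]}-{masked[3:5]}-{masked[5:]}"


if __name__ == "__main__":
    customers = [{"ssn": "123-45-6789", "name": "A. Example"}]
    accounts = [{"ssn": "123-45-6789", "balance": 100}]

    masked_customers = [{**row, "ssn": mask_ssn(row["ssn"])} for row in customers]
    masked_accounts = [{**row, "ssn": mask_ssn(row["ssn"])} for row in accounts]

    # The join key still lines up after masking.
    print(masked_customers[0]["ssn"] == masked_accounts[0]["ssn"])
```

Because the output is squeezed back into nine digits, two different inputs can occasionally collide. A real masking tool would need to handle uniqueness if the column is a primary key.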
Do I Have to Mask Data? Nah….
Type of Data | Year Passed | Ruling
Data Masking in the EU | 2014 | Article 29 Data Protection
HIPAA | 1996 | Health Insurance Portability and Accountability Act
PCI | 2016 (Updated) | Payment Card Industry Standards
PII | | Personally Identifiable Information
SOX | 2002 | Sarbanes-Oxley Act
Ability to Deliver and Mask Data for Testing: FAST
[Diagram: DB and APP copies across Develop, Test, and Deploy. Label: Mask PCI/PII and then virtualize]
Source Control
“A component of software configuration management, version control,
also known as revision control or source control, is the management of
changes to documents, computer programs, large web sites, and other
collections of information.”
Branching and Bookmarking
• The ability to mark each iteration of development with a
bookmark.
• A simple way to lock and deliver a consistent image for
testing via a virtual database (VDB).
• If a test goes wrong, the ability to bookmark that point (and
the subsequent snapshot) and deliver it to development to
address.
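The bookmark-and-branch idea in these bullets can be pictured with a toy Python model. This is an in-memory illustration only, not the Delphix API: the `Timeline` class, its methods, and the sample data are all hypothetical, and real virtual databases share storage blocks rather than copying dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Timeline:
    """Toy model of a virtual database's history (not the Delphix API)."""
    name: str
    snapshots: list = field(default_factory=list)   # ordered data states
    bookmarks: dict = field(default_factory=dict)   # label -> snapshot index

    def snapshot(self, state: dict) -> int:
        """Record a copy of the current data state and return its index."""
        self.snapshots.append(dict(state))
        return len(self.snapshots) - 1

    def bookmark(self, label: str) -> None:
        """Name the most recent snapshot so it can be found again."""
        if not self.snapshots:
            raise ValueError("nothing to bookmark yet")
        self.bookmarks[label] = len(self.snapshots) - 1

    def branch(self, label: str, new_name: str) -> "Timeline":
        """Start a new timeline from the state saved under a bookmark."""
        index = self.bookmarks[label]
        child = Timeline(new_name)
        child.snapshot(self.snapshots[index])
        return child


if __name__ == "__main__":
    test_vdb = Timeline("test")
    test_vdb.snapshot({"schema_version": 41, "orders": 1000})
    test_vdb.bookmark("sprint-12-start")

    # A test run leaves the data in a bad state; capture it for developers.
    test_vdb.snapshot({"schema_version": 42, "orders": 0})
    test_vdb.bookmark("failed-order-load")

    dev_vdb = test_vdb.branch("failed-order-load", "dev-investigation")
    print(dev_vdb.snapshots[0])
```

The design point is that a failed test state is captured once under a name, and development gets its own branch from exactly that state while the test timeline carries on.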
[Screenshot callouts: Environments; Snapshot; Previous Snapshots; Virtual to Physical; Timeflow Slider; Actions]
How This All Comes Together…
• Virtualization is the key to fast, efficient and FULL copies of
production environments for agile, automated testing.
• Data masking that is done once and easily maintained through a
repeatable process, with strong discovery and implementation as
part of the virtualization process, secures the 80% of data that is
outside the control of production.
• Virtualized environments built for development and testing
in Agile or DevOps shops make it simple to accomplish
what may seem impossible, and to do so at light speed.
Kellyn Pot’Vin-Gorman
Technical Intelligence Manager
kellyn@delphix.com
http://dbakevlar.com


Editor's Notes

  • #6 I’d just been hired as the Lead DBA for this company. This was agile before agile was really a thing. The DBA group reported to development, until I moved them to operations years later.
  • #7 I now have to go tell upper management about my findings. I wasn’t asked to do this; it was a deficit I recognized in the environment, and it had to be addressed before fixing everything else.
  • #8 Now, this isn’t the response I expected. I promptly cashed out any stocks I had that even remotely touched their stock data.
  • #10 Test Data Management, or TDM. Masking of critical data pushed to test and development can be time consuming, too.
  • #12 IBM says this can’t be done effectively: it’s easy to put in place, but… Time consuming, with lots of hardware and support costs. Not collaborative between the DBA and the tester (not driven by a Scrum culture). Not scalable? Risky for data security. Synthetic is safer?? Subsetting is less expensive, but it’s resource intensive.
  • #13 I think the right size for development and test is the same size and the same data as production, to ensure that they are going up against the same challenges they’ll have in production. That’s the RIGHT SIZE. At no time does this model discuss the power of virtualization.
  • #14 Point out the engine and its size after we’ve compressed and de-duplicated. Note that each of the VDBs will take approximately 5-10GB vs. 1TB to offer a FULL read/write copy of the production system, and it will do so in just a matter of minutes. This can also be done for the application tier!
  • #15 Almost 30 million users at risk already this YEAR! Almost 900 breaches reported, and this is only what was reported. 60% of breaches are said to go unreported.
  • #18 If the SSN is the reference key, then the numbers should be masked identically across the objects to ensure integrity is maintained.
  • #19 Article 29 makes it a legal requirement in EMEA not just to encrypt, but also to mask, data in non-production systems and when handling data outside of secure environments. HIPAA protects medical information. PCI protects payment information, via the internet, inside companies and in the public eye. PII protects personally identifiable information between systems (big brother) for demographics and information collection. SOX protects investor information.
  • #23 This may appear to be a traffic disaster of changes, but for developers with Agile experience, a “sprint” looks just like this. You have different sprints that are quick runs and merges, where developers are working separately on code that must merge successfully at the correct intersection and be deployed. Versioning with source control is displayed at the top, using virtual images. You can see each iteration of the sprints. In the middle section are the branches that occur during the development process. A virtual can be spun from a virtual, which means that it’s easier for developers to build on the work another developer has produced. Stopping points and release via a clone take simply minutes vs. hours or days.
  • #24 This is less overwhelming than the last image. Show how easy it is to manage and work with Delphix (specifying a virtualization product).