Fakultät für Informatik
Technische Universität München




Testing & Quality Assurance in
Data Migration Projects
Williamsburg, 26th of September 2011




Klaus Haller (2)
Florian Matthes (1)
Christian Neubert (1)
Christopher Schulz (1)

(1) Lehrstuhl I19 (sebis), Fakultät für Informatik, TU München, Garching, Germany
(2) Swisscom IT Services Finance, Testing & Quality Assurance, Zurich, Switzerland


110816_CS_ICSM 2011                                                               1
The author team

Software Engineering for Business Information Systems (sebis)
   Prof. Dr. Florian Matthes holds the chair Software Engineering for
   Business Information Systems (sebis) at TU München, Germany
   Research areas in Enterprise Architecture Management & Social Software

 Swisscom IT Services Finance
   Design, implementation, and operations of IT systems (customer-specific and
   standard software) and BPO services for ~190 banking & insurance institutions
   The Testing & QA group offers management and technical consulting, test
   automation, and testing as a service

 Christian Neubert
   PhD student at sebis, primary research area: Web 2.0 Tools, Hybrid Wikis,
   model-driven development
   Professional working experience as software engineer in the area of logistics

 Authors
   Dr. Klaus Haller (Swisscom IT Services Finance)
   Prof. Florian Matthes (TU München)
   Christopher Schulz (TU München)
Mastering data migration projects is a
challenging task


                      “83% of data migrations fail outright or exceed their allotted budgets
                      and implementation schedules.”
                                                                        [Gartner Group, 2005]

                      “...current success rate for the data migration portion of projects (that is
                      those that were delivered on time and on budget) is just 16%.”
                                                                         [Bloor Research, 2007]

                      “Few companies have the necessary skills to manage, build and
                      implement a successful data migration.”
                                                                           [Endava, 2007]




Definition, drivers, and characteristics of data
migration projects
 Data migration
   Tool-supported one-time process that migrates formatted
   data from a source structure to a target structure,
   where both structures differ on a conceptual
   and/or technical level

 Drivers
   Corporate events like mergers and acquisitions or carve-outs
   Implementation of novel business models and processes
   Technological progress and upgrades
   New statutory and regulatory requirements

 Characteristics
   Recurring replacement or consolidation of existing
   business applications
   Permanent, although infrequently performed, discipline
   Consistently underestimated in size and complexity


Research focus



      What does a comprehensive process model for migrating data to
      relational databases look like?



      Which risks frequently occur in the context of data migration
      projects, and is there an appropriate classification scheme
      that helps to structure them?



      Which dedicated testing and risk mitigation techniques cope
      with these issues from a technical and organizational point of
      view?




Migration programs rest on an architecture
 Source staging database
   Copy of source database to
   uncouple both databases

 Transformation database
   Stores intermediate results of
   the data migration programs

 Target staging database
   Stores the result of the transformation,
   ready for the upload

 Data migration program
   Transforms and moves the data & its representation from source to target database
   Comprises the subprograms extract & pre-filter, transform, and upload

 Orchestration component
   Ensures the correct starting order of the programs using a timetable-like mechanism
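
The architecture above can be sketched in a few lines. This is a minimal illustration only, assuming in-memory lists as stand-ins for the four databases; the record fields, the pre-filter rule, and all function names are hypothetical and not taken from the authors' platform.

```python
# Hypothetical in-memory stand-ins for the databases named on this slide.
source_db = [
    {"id": 1, "name": "alice", "active": True},
    {"id": 2, "name": "bob", "active": False},
    {"id": 3, "name": "carol", "active": True},
]
source_staging, transformation_db, target_staging, target_db = [], [], [], []


def extract_and_prefilter():
    # Copy the source into staging so the source database is decoupled,
    # dropping records that are out of scope (assumed rule: inactive ones).
    source_staging.extend(dict(r) for r in source_db if r["active"])


def transform():
    # Keep the intermediate result in the transformation database, then
    # place the final representation in the target staging database.
    transformation_db.extend(
        {"id": r["id"], "name": r["name"].title()} for r in source_staging
    )
    target_staging.extend(
        {"customer_id": r["id"], "display_name": r["name"]}
        for r in transformation_db
    )


def upload():
    # Move the prepared records from target staging into the target.
    target_db.extend(target_staging)


# Timetable-like orchestration: each entry is (slot, subprogram), and the
# orchestrator starts the subprograms in slot order.
timetable = [(3, upload), (1, extract_and_prefilter), (2, transform)]


def run_migration():
    executed = []
    for _slot, program in sorted(timetable, key=lambda entry: entry[0]):
        program()
        executed.append(program.__name__)
    return executed
```

Calling `run_migration()` starts the three subprograms in timetable order regardless of how the entries are listed, which is the property the orchestration component is meant to guarantee.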

Migrating data in a stepwise & iterative style
                                 Practice-proven process model
                                 consists of 4 main stages which are
                                 subdivided into 14 distinct phases

                                 1. Initialization, prepares the
                                    necessary infrastructure and
                                    organization

                                 2. Development, implements the
                                    actual data migration programs

                                 3. Testing, validates the
                                    correctness, stability, and
                                    execution time of both the data
                                    and the migration programs

                                 4. Cut-Over, finally switches to the
                                    target application by executing the
                                    migration programs

A risk model helps to turn vague migration
fears into concrete risks




      Shaped like a house, the model is subdivided into
        • business risks often articulated by the customer,
        • IT management risks with a technical focus, and
        • data migration risks covering issues associated with migration programs
      Business and IT management risks are abstract but map onto data migration risks

Different testing techniques mitigate the risks
that often emerge in data migration projects




      Concrete testing techniques, their explicit mapping onto risks, and
      dedicated testing phases assure the quality of data migration projects


Systematize the testing-based quality
assurance techniques
 Data validation
   Combination of automated and manual comparisons to validate completeness,
   semantic correctness, and consistency on the structure & data level

 Completeness and type correspondence tests
   Automated comparison of all data to identify new or missing business objects

 Appearance tests
   Manual comparison of a selection of business objects at the GUI level

 Integration tests
    Semi-automated tests dedicated to the proper functioning of the target application
    with the migrated data in the context of its interlinked applications

 Processability test
   Test focusing on coordinated interplay of target business application and new data

 Partial/Full Migration run test
   Semi-automated validation of the data migration programs in part or entirety
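
Of the techniques listed above, the completeness and type correspondence test lends itself most directly to automation. The following is a minimal sketch, assuming each business object carries a business key (`id`) and a business object type (`type`); both field names and the sample data are hypothetical, and reading "type correspondence" as "each object keeps its type" is an assumption made for this illustration.

```python
def completeness_and_type_test(source, target, key="id"):
    """Compare two collections of business objects by their business key."""
    source_by_key = {obj[key]: obj for obj in source}
    target_by_key = {obj[key]: obj for obj in target}

    # Objects present in the source but absent from the target.
    missing = sorted(source_by_key.keys() - target_by_key.keys())
    # Objects present in the target that have no source counterpart.
    new = sorted(target_by_key.keys() - source_by_key.keys())

    # Objects present on both sides whose type changed during migration.
    type_mismatches = sorted(
        k
        for k in source_by_key.keys() & target_by_key.keys()
        if source_by_key[k]["type"] != target_by_key[k]["type"]
    )
    return {"missing": missing, "new": new, "type_mismatches": type_mismatches}


# Hypothetical sample data for illustration only.
source = [
    {"id": 1, "type": "account"},
    {"id": 2, "type": "account"},
    {"id": 3, "type": "customer"},
]
target = [
    {"id": 1, "type": "account"},
    {"id": 3, "type": "account"},
    {"id": 4, "type": "customer"},
]
report = completeness_and_type_test(source, target)
```

For this sample, the report flags object 2 as missing, object 4 as new, and object 3 as a type mismatch. A real test would read both sides from the source and target databases rather than from in-memory lists.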

Each data migration risk is covered by a
different set of testing techniques


    Risk                 Testing technique
    Stability            Partial/full migration run test
    Corruption           Appearance test, processability test, integration test
    Semantics            Appearance test, processability test, integration test
    Completeness         Completeness & type correspondence test
    Execution risk       Full migration run test
    Orchestration risk   Partial/full migration run test
    Dimensioning         Partial/full migration run test
    Interference         Operational risk, no testing
    Parameterization     Appearance test, processability test, integration test,
                         completeness & type correspondence test




Project management-based quality assurance
 Involve an external data migration team
   Experienced specialists bring in methodologies, tool support, and know-how
      Reduce IT management risks of extended delays and overspends

 Exercise due diligence while performing project scoping
   Careful scoping in the strategy phase, applying source-push or target-pull principles
     Eliminate risk of data and transformation loss

 Apply a data migration platform
   Scalable and reusable platform ensures independence from source & target
   database while providing increased migration leeway for testing measures
      Mitigate risk of corruption and instability
      Prevent budget and time overruns
      Reduce risk of interference between the migration teams
      Reduce parameterization risk


Project management-based quality assurance
 Thoroughly analyze and cleanse data
   In-depth analysis helps to understand the data's semantics & structure and to
   assess the migration project's characteristics more accurately
      Prevent project delays and budget overruns
      Mitigate the risks of corruption
      Reduce performance and stability risk for target application
      Bring down the risk of unstable data migration programs

 Migrate in an incremental and iterative manner
   Early and regular generation of migration results ensures high project
   traceability and the possibility of frequent adjustments
      Reduce risk of project failure




Summary and outlook
      To deliver a data migration project on time and on budget, a stringent approach,
      proactive risk mitigation techniques, and distinct test activities are required

 This contribution…
   outlines a practice-proven process model describing how to proceed when
   shifting data from a source to a target database
   introduces and classifies dedicated risk mitigation techniques and project
   management practices helping to assure the quality in data migration projects

 Future directions
   Empirically evaluate process model, risk mitigation, and project management
   techniques in practice
   Examine the case where several source databases have to be consolidated
   resulting in data migration series
   Enhance process model with additional data harmonization activities
   Identify alternative versions of the model and techniques for NoSQL databases


Thank you for your attention!




             Any
          Questions?

Contact
Christian.Neubert@in.tum.de
Christopher.Schulz@in.tum.de
Klaus.Haller@swisscom.com

Further information
http://wwwmatthes.in.tum.de/wikis/sebis/mergers-and-acquisitions
http://finance.swisscom.com/

