Data Migration and MDM - DMM5


Presentation given by Wael Elrifai to Data Migration Matters 5 in London on 25 May, 2012.

  1. 1. The Data Migration Challenge: Elements including MDM, by Wael Elrifai. London - New York - Dubai - Mumbai - Hong Kong, 2012
  2. 2. Understanding Migration
     Assumptions versus truth:
     • Assumption: few source systems. Truth: many more source systems.
     • Assumption: specific data formats. Truth: data in unknown formats.
     • Assumption: all data available. Truth: needed data is missing.
     • Assumption: documented system interfaces. Truth: unknown system interfaces.
     • Assumption: valid data. Truth: poor data quality.
     “Migration is not just about moving the data… It’s about making the data work.”
     Confidential - not for redistribution
  3. 3. These Application Projects Have a Common Critical Requirement: Migrating Data
     • Application Implementation: from legacy into new application
     • Application Upgrade: from previous to new version
     • Application Instance Consolidation: from multiple instances to fewer
     • M&A Integration: from acquired systems
     • Legacy Retirement: from legacy into new systems
     • Outsourcing: from company to outsourcer
  4. 4. Project Overview: Data Migration to ERP
     • 200+ source systems
     • Operating in 14 languages
     • Different sets of users working in different regions with different applications and languages
     • Highly fragmented lines of business and regions
     • No concept of Data Governance or Master Data Management
     • No concept of Data Quality Analysis
  5. 5. Methodology: Practical Data Migration
     [Diagram. Labels recoverable from the slide:]
     • Landscape Analysis (LA)
     • Gap Analysis & Mapping (GAM)
     • Migration Design & Execution (MDE)
     • Legacy Decommissioning (LD)
     • Migration Strategy & Governance (MSG)
     • Data Quality Rules (DQR)
     • Technical: Migration Controller, Profiling Tool, Data Quality Tool, DMZ
     • Business Engagement: Key Data Stakeholder Management (KDSM), System Retirement Plan (SRP)
  6. 6. Team Structure & Communications
     • Primary Business Team located in Hong Kong
       • 6 Business Analysts
       • 2 Technical Coordinators
     • Primary Development Team in Hong Kong
       • 8 Developers
     • Offshore Development Team in Mumbai, India
       • 4 Developers
     • Unique Aspects
       • Agile/Scrum meetings conducted via video conference
       • Email usage limited
       • Assigned secretary, with output immediately posted on the Wiki for comments
       • Team Lead makes final “closing comments” on each issue
  7. 7. Application Migration: The Anatomy of Failure
     Long development times
       • Often many months or even years without any 'visible' signs of progress
       • CAUSE: failure to properly decompose development into practical, achievable and meaningful 'phases' and 'sprints'
     Long development times for individual ETL flows
       • Due to extensive and repeated re-working of ETL code
       • Resulting from failures in unit testing and user acceptance testing
       • CAUSE: poor and inadequate design
     Considerable variations in quality & efficiency of code
       • Increasing time for new/other developers to modify code
       • CAUSE: failure to define and firmly enforce standards
  8. 8. Application Migration: The Anatomy of Failure
     Minimal attention to data cleansing or standardisation
       • Leading to longer report development times
       • And greater inconsistencies in reporting
       • Effectively pushing data quality management to report developers and information consumers
       • CAUSE: failure to recognise the importance and impact of employing a systematic approach to managing data quality
     Poor reliability
       • Arising from 'unexpected' variations in structure or content of incoming source files
       • CAUSE: failure to cater for Murphy's Law, i.e. the most frequent and most obvious causes of
  9. 9. Application Migration: The Anatomy of Failure
     Poor performance
       • CAUSE: failure to give due consideration to the scale and complexity of ETL processes during the design stage
       • CAUSE: failure to fully understand the underlying causes when performance problems become evident
       • CAUSE: failure to routinely monitor performance or undertake adequate capacity planning to cater for gradual or step-change increases in data volumes
  10. 10. Application Migration: The Anatomy of Success
     [Diagram. Labels recoverable from the slide:]
     • Entity Level Data Model Design
     • 'Mapping' & ETL Templates
     • Reusable Components
     • Phasing
     • Sprint
     • Forensic Data Analysis
     • Hosted
     • Code Translations
     • Detailed Functional Design
     • Detailed Technical Design, including Peer Review
     • Build, with Peer Review
     • Unit Test
     • System Test
     • UAT
     • Soft Go Live
     • Go Live
     • Master Schedule
     • Enforce Standards
     • Technical Authority & Reusable Components
  11. 11. Abstraction of Rules & Reusability
     • Automated ETL mapping development based on source system metadata
     • Automated data type verification for flat file data based on header information
     • Consistent use of a single value mapping table, abstracted to accommodate data migration rules
     • Single generic “run script” which operates based on a simple dependency matrix
       • This is more important in operational than in data migration situations, but becomes important when dependencies are complex
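The generic "run script" driven by a dependency matrix, mentioned in the slide above, could look roughly like the following. This is a minimal sketch, not the project's actual script: the job names, the `DEPENDENCIES` mapping, and the `runner` callback are all hypothetical, and the ordering is delegated to Python's standard `graphlib.TopologicalSorter` (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical dependency matrix: each job maps to the set of jobs
# that must complete before it may start.
DEPENDENCIES = {
    "load_customers": set(),
    "load_products": set(),
    "load_orders": {"load_customers", "load_products"},
    "load_invoices": {"load_orders"},
}


def run_order(dependencies):
    """Return a job order in which every job follows its prerequisites."""
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(dependencies).static_order())


def run_all(dependencies, runner):
    """Call `runner(job)` for each job in dependency order; return that order."""
    order = run_order(dependencies)
    for job in order:
        runner(job)
    return order


if __name__ == "__main__":
    run_all(DEPENDENCIES, lambda job: print(f"running {job}"))
```

Keeping the dependencies in data rather than in the script means a new ETL flow is added by editing the matrix only, and a circular dependency fails fast before any job runs.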
  12. 12. Data Migration Guiding Principles: Creating Data Standards to Reduce Complexity
     [Diagram. Content recoverable from the slide:]
     Future State Environments
       • Enterprise Apps Data Models
       • ODS Data Models
     Current State Environments
       • Source Tables
       • Source Attributes
       • Upstream Sources
       • Downstream Targets
       • Create as-is Domain Model
       • Create as-is Entity Model
     Common Data Standards Enterprise Representation (ODS, DW, Customer, etc.)
       • Create Domain Model
       • Create Entity Model
       • Create Entity Relationship Model
       • Create Entity Attribute Model
     Steps
       • Rationalize Domains and Entities across Current State and Future State Environments
       • Rationalize Attributes across Current State and Future State Environments
       • Map in all Application Environments to the Enterprise Standard
       • Initial Common Data Standards and creation of:
         • Initial DQ Program
         • Initial Data Ownership Model
         • Initial Data Management
         • Governance Processes
  13. 13. Sample Architecture Diagram – Subset of Project
  14. 14. Data Governance - 14-step (sounds like a lot!) program
     1. Review available documentation on process flow
     2. Agree scope of work
     3. Plan and schedule meetings
     4. Produce initial definitions of DG framework
     5. Assemble DG working group
     6. Engage with Data Stewards
     7. AS-IS business process analysis
     8. AS-IS data analysis
     9. Define TO-BE processes
     10. Define TO-BE system requirements
     11. Assemble business glossary
     12. Introduce standardization of business-critical data items
     13. Implement DG KPI tracking and DQ exception reporting
     14. Conduct periodic audit of business processes
  15. 15. Master Data Management - Highlights
     • DON'T FORGET! Your data migration tools may end up being the real-time MDM Hub communication logic/tools as well, so design appropriately
     • Simplified load tools that can be used by analysts
     • Custom match/merge algorithms
       • Gray's coding
       • 14 languages including European, Middle Eastern (right-to-left), East Asian
       • Some transliteration rules built using statistical regression on 30m customer records
     • Match/merge algorithms with discrete variables and user interface
       • Ability to allow users to target hotspots
       • Variable “sliders”: meshed variables for hotspot analysis allow for more flexible merge sensitivity
     • Data analysis for predicting why false positives and false negatives occur
       • Role of each source
       • Types of data that most often “fail”
     • Google Maps/Address integration for matching (cloud), data enhancement, and more
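To illustrate the idea of match/merge with per-variable "sliders", here is a minimal sketch. The deck does not describe the actual algorithm, so everything below is an assumption: the field names, the weights, the 0.85 threshold, and the use of Python's standard `difflib.SequenceMatcher` for fuzzy name comparison are all hypothetical stand-ins.

```python
from difflib import SequenceMatcher

# Hypothetical "slider" settings: one weight per matching variable.
DEFAULT_WEIGHTS = {"name": 0.5, "postcode": 0.3, "country": 0.2}

# Fields compared fuzzily; all other fields are treated as discrete
# variables that either match exactly or not at all.
FUZZY_FIELDS = {"name"}


def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a, rec_b, weights=DEFAULT_WEIGHTS):
    """Weighted average of per-field scores, normalised by total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    score = 0.0
    for field, weight in weights.items():
        if field in FUZZY_FIELDS:
            field_score = similarity(rec_a[field], rec_b[field])
        else:
            field_score = 1.0 if rec_a[field] == rec_b[field] else 0.0
        score += weight * field_score
    return score / total


def is_match(rec_a, rec_b, weights=DEFAULT_WEIGHTS, threshold=0.85):
    """Decide whether two records are merge candidates."""
    return match_score(rec_a, rec_b, weights) >= threshold
```

Moving a slider corresponds to changing one entry in `weights`: raising the weight on `name` and lowering it on `postcode`, for instance, lets two records with the same name but different postcodes become merge candidates.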
  16. 16. Testing
     • Custom “Black Box” testing tool designed
     • Specialized for database tests
     • Requires addition of some metadata columns to data model
       • S_ID
       • Batch_ID
       • LOAD_TIME
     • Automatic storage of test cases
       • Test data
       • Documentation on test being run
       • User metadata
       • Test metadata
     • Sets database into a known state
     • Can generate test data
     • Single unified interface
     • Fault-Fix workflow management
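The metadata columns listed above make it possible to trace every loaded row and to return a table to a known state between tests. The sketch below shows one way this could work, using Python's standard `sqlite3` module. The `customer` table, the function names, and the reading of `S_ID` as a source-system identifier are assumptions for illustration; the deck does not define them.

```python
import sqlite3
from datetime import datetime, timezone


def create_target(conn):
    """Create a hypothetical target table carrying the three metadata columns."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS customer (
               customer_id INTEGER PRIMARY KEY,
               name        TEXT NOT NULL,
               S_ID        TEXT NOT NULL,     -- assumed: source system identifier
               Batch_ID    INTEGER NOT NULL,  -- load batch the row belongs to
               LOAD_TIME   TEXT NOT NULL      -- UTC timestamp of the load
           )"""
    )


def load_batch(conn, names, source_id, batch_id):
    """Insert rows, stamping each with source, batch and load time."""
    load_time = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO customer (name, S_ID, Batch_ID, LOAD_TIME) VALUES (?, ?, ?, ?)",
        [(name, source_id, batch_id, load_time) for name in names],
    )
    conn.commit()


def reset_batch(conn, batch_id):
    """Restore the pre-batch state by deleting one batch; return rows removed."""
    cur = conn.execute("DELETE FROM customer WHERE Batch_ID = ?", (batch_id,))
    conn.commit()
    return cur.rowcount


def count_rows(conn, batch_id=None):
    """Count all rows, or only the rows of one batch."""
    if batch_id is None:
        return conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
    return conn.execute(
        "SELECT COUNT(*) FROM customer WHERE Batch_ID = ?", (batch_id,)
    ).fetchone()[0]
```

A test can then load a batch, run its assertions, and call `reset_batch` to undo exactly that load without touching rows from earlier batches.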
  17. 17. Documentation
     • Automated
     • Driven by
       • Business requirements documented in
         • Custom testing tool
         • Wiki documentation
       • ETL tool metadata
       • Custom testing tool metadata
     This is highly contingent on being able to enforce developer rules about documentation within tools.
  18. 18. Risk Mitigation
     Extract data early
       • Data should be seen immediately. We've seen problems come up because data didn't conform to expectations.
     Convert data early
       • Our existing build will allow for the first conversion to take place within weeks for all objects.
     Convert data often
       • An iterative approach to both data quality and conversion allows for repeated analysis. This should be driven by development schedules rather than inversely by validation schedules that aren't related to development time.
     Use real data from the start
       • Conversion team should have direct access to source systems, without a dependency on another team to create extracts.
     Seek to incorporate external and up-to-date information about your Master Data
       • Tools like Google's business services, D&B, Bloomberg and others can help
  19. 19. Data Migration through Information Development: Lessons Learned
     Prioritise Planning
       • Define business priorities and start with quick wins
       • Don't do everything at once: deliver complex projects through an incremental programme
       • “Chunks” need to be appropriate, based on elements like homogeneity of front-end, single sets of business users across geographies, language usage, etc.
     Focus on the Areas of High Complexity
       • Don't wait until the 11th hour to deal with Data Quality issues: fix them early
       • Follow the 80/20 rule for fixing data: do this iteratively through multiple cycles
       • Understand the sophistication required for Application Co-Existence, and that in the short term your systems will get more complex
     Keep the Business Engaged
       • Communicate continuously on the planned approach defined in the strategy
       • The overall Blueprint is the communications document for the life of the programme
       • Try not to be completely infrastructure-focused for long-running releases: always deliver some form of new business functionality
       • Align the migration programme with analytical initiatives to give business users more access to data
       • Ensure that the Data Governance program has “teeth”
  20. 20. Questions?
     Peak Consulting UK Headquarters
     90 Long Acre, Covent Garden
     London WC2E 9RZ
     T: +44 (0)20 7849 3422
     F: +44 (0)20 7990 9478
     www.peakconsulting.eu