Best Practices: Data Admin & Data Management



I built this presentation for Informatica World in 2006. It is all about data administration, data quality, and data management. It is NOT about the Informatica product. The presentation was a hit: standing room only, with about 150 people attending. The content is still useful and applicable today. If you want to use my material, please put (C) Dan Linstedt, all rights reserved,

Published in: Technology, Business

  • The purpose of this slide show is to present and discuss the role of data administration in the data integration world. Here we define some of the business and technical problems that DAs face on a daily basis, then we move on to the types of activities that a DA will undertake in an enterprise-level initiative. Bear in mind that the DA is a role, and may not end up being a single individual, but rather a group of individuals, some of whom are directly responsible for Data Management as well.
  • In this section we define different DA roles, issues, and conceptual notions. We discuss the DA role from a 20,000-foot level, where the enterprise “sees” data administrators and begins to understand what they do. The role of the DA ranges from monitoring business user meetings to overseeing the design of data flow through business processes. Business process flow has a large impact on the world of the DA and on what they need to be capable of achieving. They need to work across multiple groups in order to achieve an enterprise vision of the data assets and models that will serve the enterprise.
  • Data must be: auditable; traceable; stored in the granular format it arrived in; a “statement of fact”. Business rules must move to the output side of the equation. Data can be integrated by the same semantic grain, but cannot be altered.
  • The Data Administrator is responsible for identifying auditable or audited sources of data. The DA will be responsible for determining which data sets can and should be used to load enterprise data warehouses. The DA will set policies and procedures for measuring, auditing, and assessing the quality of information flowing to and from the source systems.
  • The Data Administrator is responsible for assigning or classifying different groups of errors, and for deciding what will make or break the data set. They are also responsible for the integrity of the data set, and for ensuring that the data set matches the requirements set forth by the business users.
  • The Data Administrator might use a live chart like this one to examine the errors and their occurrences over time. The DA will be responsible for the quality of the data as it relates to the business metrics put forward. The DA will also be responsible for maintaining the logical models and the business processes. If the error count is too high for a specific area of expertise, then the Data Manager must be notified, and corrective action must be taken.
  • Organic Data Administration,

    1. Best Practices: Data Administration and Quality
        Daniel Linstedt, all rights reserved,
    2. Introduction and Expectations
        • Author, Inventor, Speaker – and part time photographer…
        • 25+ years in the IT industry
        • Worked in DoD, US Gov’t, Fortune 50, and so on…
        • Find out more about the Data Vault:
        • Full profile on
    3. Agenda
        • Introductions and expectations
        • Defining data administration issues
        • Applying best practices
        • Conclusions and Q&A
    4. Defining Data Administration Issues
    5. What is Data Administration?
        “What do we mean by that in the case of data administration? We mean that DA must get out of the design review committee mentality and substitute something more value-added and flexible. It must recognize that systems tend to grow organically, and be a part of that process, rather than an instiller of order upon it.”
        Eric Rawlins, 1995. Originally published by: Database Research Group, Inc
    6. The Role of Data Administration
        • Data administration and management are key roles in today's enterprise projects
        • Data administration is a part of data management; the two should be utilized together
        • Compliance, accountability, and governance provide a foundationally strong and tenable architecture, and must be a part of the DA's working knowledge
    7. Cross-Organization Roles and Responsibilities
        • Business (Owner View): Data Steward, Discipline Authority, Business Process Manager, Data Usage Contact, Data Manager, Data Modeler
        • Logical (Designer View): Data Administrator
        • Physical (Builder View): Database Administrator
        DA is a ROLE and typically involves more than one person in order to achieve success.
    8. Data Administrator Responsibilities
        • Crossing the organization, building accountability across data sets
        • Providing governance over master data and master metadata sets
        • Assisting the data modeler in managing logical data models and matching these to business processes
        • Ensuring the physical data set meets pre-designed metrics and measures, providing gap analysis between what was designed and what is implemented
        • Interfacing with the business users to ensure master metadata is meeting their needs
        • Promoting manageable and traceable systems growth through standards and metrics measurements
    9. Top 10 Data Administration Issues
        • Inadequate or missing master metadata
        • Ineffective master data management
        • Incomplete logical models
        • Undefined business process models
        • Missing process control and metrics measurements
        • Undefined user access matrices
        • Ineffective change management
        • Missing element classification system
        • Lack of user-training material
        • Mismatched data performance SLAs with DBA objectives
    10. Defining Data Administration Issues: Top 4 Examples
    11. Defining Master Metadata
        • Master Metadata
            • Information describing the elements/attributes, and the utilization of those attributes, which make up the master data structure
            • These metadata are agreed upon by the business users to be universal in definition
        • Questions to ask
            • Why is master metadata important?
            • What are the impacts of missing master metadata?
            • Why is master metadata a part of the DA world?
            • How does a DA build a master metadata management program?
    12. Defining Master Data Management
        • Master Data
            • Information housed in a single, consistent, quality-cleansed reference table, located at a single location
            • All elements except the surrogate key in the master data set are defined by master metadata at a global (enterprise-, or sometimes industry-wide) level
        • Questions to ask
            • Why is master data important?
            • What are the impacts of master data?
            • Why is master data a part of the DA world?
            • How does a DA build a master data management program?
            • Is master data connected to master metadata? How?
    13. Assessing Logical Model Viability
        • Logical Data Models
            • A business view or representation of data integration in a data modeling format containing relationships and dependencies
        • Questions to ask
            • When was the last time the logical models were compared to the business process diagrams?
            • When was the last time the logical model was reviewed with the business users?
            • Do all the elements in the logical model contain metadata defined by key business individuals?
            • Does the logical model match the physical model?
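
The last two questions on the logical-model slide (defined metadata, logical versus physical match) lend themselves to an automated gap check. The following is a minimal sketch only, assuming each model can be exported as a mapping of entity name to attribute names; the function, class, and sample entity names are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelGap:
    """Differences found for one entity (hypothetical structure)."""
    entity: str
    missing_in_physical: frozenset   # designed but not implemented
    missing_in_logical: frozenset    # implemented but never designed
    undefined_metadata: frozenset    # logical attributes with no business definition


def compare_models(logical, physical, definitions):
    """Return one ModelGap per entity that shows any difference.

    logical, physical: dict of entity name -> set of attribute names.
    definitions: dict of (entity, attribute) -> business definition text.
    """
    gaps = []
    for entity in sorted(set(logical) | set(physical)):
        l_attrs = logical.get(entity, set())
        p_attrs = physical.get(entity, set())
        undefined = {a for a in l_attrs if not definitions.get((entity, a))}
        gap = ModelGap(
            entity,
            frozenset(l_attrs - p_attrs),
            frozenset(p_attrs - l_attrs),
            frozenset(undefined),
        )
        if gap.missing_in_physical or gap.missing_in_logical or gap.undefined_metadata:
            gaps.append(gap)
    return gaps


# Illustrative data only.
logical = {"customer": {"customer_id", "name", "segment"}}
physical = {"customer": {"customer_id", "name", "created_ts"}}
definitions = {
    ("customer", "customer_id"): "Unique customer identifier",
    ("customer", "name"): "Legal name of the customer",
}

for gap in compare_models(logical, physical, definitions):
    print(gap)
```

A report like this gives the DA something concrete to bring to the review with business users: each entry is either a design that was never built, a column that was never modeled, or an element still waiting for a business definition.
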
    14. Defining Business Process Models
        • Business Process Models
            • A graphical flow of business processes including key data sets, dependencies, and key business processes
            • The processes identified are often referred to as critical path components
        • Questions to ask
            • Why is BPR a part of data administration?
            • What impact do BPMs have on data administration?
            • Do all the elements in the logical model contain metadata defined by key business individuals?
            • Does the logical model match the physical model?
    15. Applying Best Practices
    16. Revealing the DA Best Practices
        • Review/construct logical data model to meet business needs
        • Establish master data management strategy
        • Audit data on a regular basis, ensure KPIs and metrics are met
        • Meet with end-users to synchronize metadata, logical data models, and business process flows
        • Maintain standards and end-user access paths
        • Build metrics and SLAs to monitor the DA process
        • Engineer and deploy a metadata management strategy
    17. DA: MDM and Master Metadata
        Many times we see a cross-role responsibility of data management and data administration. The cross-role is responsible for the following:
        • Defining work breakdown structure (WBS) as it pertains to the data and metadata, and business processes themselves
        • Defining organizational breakdown structure (OBS) as it pertains to the data and business process ownership
        • Business process to logical data – mapping
        • Define and deploy data governance strategies
        • Classifying metrics for data errors, and auditable sources of data
        • Managing and tracking KPIs to architectural goals, aligning logical models, and business processes (data flow) to current business objectives
    18. Work Breakdown Structure
        • Assume that the requirements and the types of deliverables are fairly well understood
            • Code: source, mappings, workflows, errors, etc.
            • Documents: design spec., test plan, test cases, user manuals, etc.
            • Training: end user training, support personnel training, etc.
        • For each of the deliverables, consider the set of activities that will be employed to develop the deliverable (based on the process/procedure chosen)
        • Map the deliverable against the chosen activities, and consider the sequencing of the activities, including any inter-deliverable relationships
    19. Organizational Breakdown Structure
        • Identify the workers within the IT community who are involved with Informatica, and Informatica support
        • Also identify the business technical liaison (business lead) from the business side and ensure the sponsor’s participation.
        • Cross the OBS with the WBS for a complete view of the work assignments
            • This will also help determine the roles and responsibilities
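
Crossing the OBS with the WBS amounts to building an assignment matrix and looking for holes in it. A minimal sketch, assuming the WBS is a mapping of deliverable to activities and the OBS is a mapping of role to the deliverables that role answers for; all names and sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical WBS: deliverable -> activities needed to produce it.
wbs = {
    "mappings": ["design", "build", "test"],
    "test plan": ["write", "review"],
    "end user training": ["prepare", "deliver"],
    "user manuals": ["write"],
}

# Hypothetical OBS: role -> deliverables that role is responsible for.
obs = {
    "data administrator": ["mappings", "test plan"],
    "business lead": ["test plan", "end user training"],
}


def cross_obs_wbs(obs, wbs):
    """Return (assignments, unowned).

    assignments maps (deliverable, activity) -> sorted list of responsible roles.
    unowned lists the deliverables that no role has claimed.
    """
    owners = defaultdict(list)
    for role, deliverables in obs.items():
        for deliverable in deliverables:
            owners[deliverable].append(role)

    assignments = {}
    for deliverable, activities in wbs.items():
        for activity in activities:
            assignments[(deliverable, activity)] = sorted(owners.get(deliverable, []))

    unowned = sorted(d for d in wbs if d not in owners)
    return assignments, unowned


assignments, unowned = cross_obs_wbs(obs, wbs)
for (deliverable, activity), roles in sorted(assignments.items()):
    print(f"{deliverable:18} {activity:8} {', '.join(roles) or 'UNASSIGNED'}")
print("Deliverables with no owner:", unowned)
```

Deliverables that show up as unowned, or activities shared by several roles, are the places where roles and responsibilities still need to be settled.
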
    20. DA: Architecting Data Governance
        • Non-compliant: Source Systems, Business Rules & IQ, EDW, Data Marts
        • Compliant: Source Systems, Hard Business Rules, EDW, Soft Business Rules & IQ, Data Marts
        • Soft Business Rules & IQ: shift to process AFTER the EDW
        • Hard Business Rules: still process BEFORE the EDW
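
The compliant flow on the governance slide can be illustrated in a few lines. This is a sketch under stated assumptions, not the author's implementation: it treats data-type alignment as the only hard rule, treats "negative hours are invalid" as a soft rule, and uses hypothetical field names (`employee_id`, `hours`, `record_source`, `load_ts`).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a warehouse row is a statement of fact
class WarehouseRow:
    employee_id: str
    hours: float
    record_source: str   # where the row came from (traceability)
    load_ts: datetime    # when it arrived (auditability)


def load_to_edw(raw, record_source, load_ts):
    """Hard rule only: align data types. Values are neither filtered nor corrected."""
    return WarehouseRow(str(raw["employee_id"]), float(raw["hours"]), record_source, load_ts)


def build_mart(rows):
    """Soft rule, applied downstream of the EDW: negative hours break a business rule."""
    good = [r for r in rows if r.hours >= 0]
    bad = [r for r in rows if r.hours < 0]
    return good, bad


load_ts = datetime(2006, 5, 1, tzinfo=timezone.utc)
source_rows = [{"employee_id": 17, "hours": "8"}, {"employee_id": 42, "hours": "-4"}]

edw = [load_to_edw(r, "timecard_system", load_ts) for r in source_rows]
good, bad = build_mart(edw)
print(len(edw), "rows stored;", len(good), "pass the soft rule;", len(bad), "flagged")
```

Because the questionable row is still in the warehouse exactly as it arrived, an auditor can trace the flagged value back to its source instead of finding that the load quietly dropped or "fixed" it.
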
    21. Establishing Auditable Sources
        [Diagram labels: Source System, 2nd Source System, Sync Routines, Data Export, Data, Staging, EDW, Data Warehouse, OLTP, Oper Reports, DW Exports]
        • Secure
        • Auditable
        • Compliant
    22. DA – Defining Data Errors and Models
        • Data admin designs the logical data models used for error handling
        • Data admin assesses the implementation of the error handling architecture
        • Data admin provides data accountability to the business users
        • Data admin defines master data sets, and master metadata for both “good” and “bad” data according to the business rules
        [Diagram labels: Source System, Staging Area, Data Warehouse, Data Marts, Database, B.I. Tool; ETL Load Process (Rdr, xform, Wrtr); **Error Stage, **Error Warehouse, Error Marts. ** Not usually implemented]
    23. DA Example – Classifications of Errors
        • Soft-Errors (Business Rule Breaks)
            • Data requires parent key
            • Negative hours on a time-card were charged
            • Customer has no account, but has transactions
        • Technical Errors
            • Datatype mismatches
            • Null/Not Null issues
            • Missing data, default data, mis-calculated data
        • Hard-Errors
            • Database: out of space, bad indexes, roll-back segment
            • Network: went down, can’t locate IP
            • Machine: CPU bad, RAM bad, disk bad, machine shutting down, out of disk space
        [Slide labels: Business Owns the Error; I.T. Owns the Error]
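
The three error classes can be encoded so that every logged error is routed to an owner. A minimal sketch: the error codes are hypothetical, and the ownership split (soft errors to the business, technical and hard errors to I.T.) is an assumption, because the flattened slide text does not show which label sits beside which class.

```python
from enum import Enum


class ErrorClass(Enum):
    SOFT = "soft"            # business rule breaks
    TECHNICAL = "technical"  # datatype, null, and missing-data issues
    HARD = "hard"            # database, network, or machine failures


# Assumption: ownership by class, inferred from the two slide labels.
OWNER = {
    ErrorClass.SOFT: "business",
    ErrorClass.TECHNICAL: "IT",
    ErrorClass.HARD: "IT",
}

# Hypothetical error codes mapped onto the slide's examples.
ERROR_CODES = {
    "MISSING_PARENT_KEY": ErrorClass.SOFT,
    "NEGATIVE_TIMECARD_HOURS": ErrorClass.SOFT,
    "TRANSACTION_WITHOUT_ACCOUNT": ErrorClass.SOFT,
    "DATATYPE_MISMATCH": ErrorClass.TECHNICAL,
    "UNEXPECTED_NULL": ErrorClass.TECHNICAL,
    "DATABASE_OUT_OF_SPACE": ErrorClass.HARD,
    "NETWORK_DOWN": ErrorClass.HARD,
}


def classify(code):
    """Return (error class, owner) for a known code; raise KeyError otherwise."""
    error_class = ERROR_CODES.get(code)
    if error_class is None:
        raise KeyError(f"unclassified error code: {code}")
    return error_class, OWNER[error_class]


for code in ("NEGATIVE_TIMECARD_HOURS", "DATATYPE_MISMATCH", "NETWORK_DOWN"):
    error_class, owner = classify(code)
    print(f"{code}: {error_class.value} error, owned by {owner}")
```

Raising on an unknown code is deliberate: an error nobody has classified is itself a gap the DA needs to close.
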
    24. DA: Tracking Errors – KPIs at Work
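
The chart for this slide did not survive in the transcript, but the idea of tracking error occurrences over time against a KPI can be sketched as follows. The period labels, area names, and threshold are hypothetical; in practice the threshold would come from the business metrics agreed with the Data Manager.

```python
from collections import Counter


def error_counts(events):
    """events: iterable of (period, area) pairs, one per logged error."""
    return Counter(events)


def breaches(counts, threshold):
    """Return sorted (period, area, count) entries whose count exceeds the threshold."""
    return sorted(
        (period, area, n) for (period, area), n in counts.items() if n > threshold
    )


# Illustrative error log: one tuple per error occurrence.
events = [
    ("2006-01", "billing"),
    ("2006-01", "billing"),
    ("2006-01", "billing"),
    ("2006-01", "hr"),
    ("2006-02", "billing"),
]

counts = error_counts(events)
for period, area, n in breaches(counts, threshold=2):
    print(f"{period} {area}: {n} errors exceed the KPI; notify the Data Manager")
```
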
    25. Metadata and Data Administration
        • Metadata, data lineage, and element definition are all a part of data administration
        • How can we stitch the different metadata together?
        • Does our metadata repository capture its own versions of metadata?
        • Does our enterprise vision allow business users to access and modify metadata that they own?
        • Data administrators must be responsible for capturing, architecting, and engineering metadata standards across the organization
    26. Metadata Administration Lifecycle
        • Identify New Metadata
        • Integrate With Master Metadata Repository
        • Edit and Manage Master Metadata (Provide Business Users with Web Interface)
        • Stitch Master Metadata Together
        • Compare Master Metadata With Business Process And Objectives
        • Export Master Metadata or Deploy via SOA With Master Data Set
        Derived from Meta Integration Metadata Lifecycle
    27. Monitoring DA Efforts
        Establish KPIs for Each of the Following Areas
        • Focus on the Big Stuff
            • KPI: peer review for readability and architecture
            • KPI: peer review and sign-off for metadata definitions
            • KPI: project plan with named deliverables and phases
        • Make process rules, not artifact rules
            • KPI: measure the effectiveness of edits/updates to the existing standards
            • KPI: track the amount of time spent by the DA on the edits and changes to the rules
            • KPI: assess end-user access (queries) against the metadata, documentation, and models
        • Consult others for added value
            • KPI: track the effectiveness of the peer reviews by tying the above metrics together
            • KPI: assign specific recurring “standing” meetings, with roll call and notes with action items
        • Map the World
            • KPI: track changes, amount of time spent on data models
            • KPI: measure level of effort based on complexity, resulting from impact analysis studies
    28. Case Study for DA Results
        After Implementing DA Best Practices
        • Government Manufacturing Firm
        • Three people, 6 months from start to finish on EDW
        • Passed government and financial audits in the first 3 weeks of production release
        • Saved the company $15M, and $45M in the first 3 months
        • Reduced cycle time in manufacturing by 3 months after showing specific data lineage
        • Changed IT from a cost center to a profit center
        • Demonstrated a 15-year-old billing error on the operational reports
    29. Conclusions and Q&A
    30. Revealing the DA Best Practices (Recap)
        • Review/construct logical data model to meet business needs
        • Establish master data management strategy
        • Audit data on a regular basis, ensure KPIs and metrics are met
        • Meet with end-users to synchronize metadata, logical data models, and business process flows
        • Maintain standards and end-user access paths
        • Build metrics and SLAs to monitor the DA process
        • Engineer and deploy a metadata management strategy
    31. The Experts Say…
        • “The Data Vault is the optimal choice for modeling the EDW in the DW 2.0 framework.” Bill Inmon
        • “The Data Vault is foundationally strong and exceptionally scalable architecture.” Stephen Brobst
        • “The Data Vault is a technique which some industry experts have predicted may spark a revolution as the next big thing in data modeling for enterprise warehousing....” Doug Laney
    32. More Notables…
        • “[The Data Vault] captures a practical body of knowledge for data warehouse development which both agile and traditional practitioners will benefit from..” Scott Ambler
    33. Where To Learn More
        • The Technical Modeling Book:
        • The Discussion Forums: & events – Data Vault Discussions
        • Contact me: - web site [email_address] - email
        • World wide User Group (Free)
    34. Thank you
        Contact us today: Dan Linstedt [email_address]