  1. Information Governance and Data Discovery
     Vincent McBurney, IM Practice Lead, Focus Strategies and Solutions, [email_address]
     DQ Asia Pacific, March 2011, Sydney, Australia
  2. Data governance is a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance helps control the cost, risk and time of data-driven IT projects.
  3. IBM Information Governance Maturity Model
     • The categories of effective data governance
  4. Capability Maturity Model Integration (CMMI)
     • Based on the Capability Maturity Model (CMM) and applied to each category of data governance.
     Graphic sourced from Carnegie Mellon Software Engineering Institute
  5. What Maturity do you need?
     • Recommended Maturity Level for different types of IT projects.
  6. Recommended Online Community
  7. • Why does a simple enhancement request take so long?
     • Why are our estimates always wrong?
     • Why does everyone take so long to do things?
  8. The Victim Statements: the Business
     • They just spent so long on meetings and documentation and didn’t build anything!
     • We could have built this faster ourselves.
     • When we got to UAT there were bugs that had to be fixed over and over again.
     • We spend all this money on IT and what do we get for it?
  9. Obvious Suspect: IT Team
     • Requirements and rules kept changing right up through testing.
     • You thought that was bad, wait until you see phase 2.
     • No one told us there were three different definitions for client status.
     • It would help if the business knew what they wanted.
  10. Obvious Scapegoat: the New Guy
      • I don’t know where the application documentation is.
      • I need to update my resume.
      • I don’t even know who was managing the project.
      • Turned out the Functional Spec I was using was out of date by two years.
      • There are three different definitions for client status?
      • Wait, which client status are we talking about?
      • I didn’t get a proper handover as the guy I replaced was always out to lunch.
  11. The Coroner’s Report
      • The team did not have the information and the context for the change.
  12. The Information Server Approach
      • Metadata Workbench and Business Glossary provide context
  13. Define Business Glossary in the Unified Process
      • Define Business Problem
      • Obtain Executive Sponsorship
      • Conduct Maturity Assessment
      • Build Roadmap
      • Establish Organisation Blueprint
      • Build Data Dictionary
      • Understand Data
      • Create Metadata Repository
      • Define Metrics
      • Appoint Data Stewards
      • Manage Data Quality
      • Implement Master Data Management
      • Create Specialised Centers of Excellence
      • Manage Security & Privacy
      • Manage Life-cycle
      • Measure Results
      Legend: each step is marked as either "Enable through Process" or "Enable through Technology".
  14. • The steps to a successful Business Glossary.
  15. Glossary in a Project
      • Create your Glossary during the Understand and Define stage.
      • Use and refine your Glossary during subsequent phases.
  16. Identify Subject Areas
      • If you are using a Business Glossary to support a Data Warehouse, then start with the high-level conceptual data model.
      Diagram labels: Learning Teaching Development Management Outcome Grant Attempt Recruitment Admission Publication Unit Unit Offering Completion Staff Centre Location Research Policy Commercialisation Risk, Quality & Evaluation Course Student Award Unit Delivery Survey Alumni Organisation Faculty School Planning Health & Safety Training Accounts Performance
  17. Start with a Formal Vocabulary
      • Focus helped create a Glossary for a Data Collection at NCVER.
      • Clearly defined Data Dictionary with elements and rules.
  18. Define the Lifecycle of Terms
      • Work out the Data Stewardship Policies
          • How to use the Term status
          • Identify review groups
          • Collaborate via email
          • Track changes over time
          • Report to track progress of reviews
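One way to picture "how to use the Term status" and "track changes over time" is a small state machine with a change history. This is only a sketch: the four status names, the allowed transitions, and the `TermRecord` class are assumptions for illustration, not the behaviour of any particular glossary tool. Substitute whatever your stewardship policy defines.

```python
from dataclasses import dataclass, field
from datetime import date

# Assumed status names and transitions; adjust to your own stewardship policy.
ALLOWED = {
    "Candidate": {"Accepted", "Deprecated"},
    "Accepted": {"Standard", "Deprecated"},
    "Standard": {"Deprecated"},
    "Deprecated": set(),
}


@dataclass
class TermRecord:
    """Hypothetical record of one glossary term and its status history."""
    name: str
    status: str = "Candidate"
    history: list[tuple[date, str, str]] = field(default_factory=list)

    def move_to(self, new_status: str, on: date) -> None:
        """Change status if the policy allows it, and log the change."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"{self.status} -> {new_status} is not allowed")
        self.history.append((on, self.status, new_status))
        self.status = new_status


term = TermRecord("Client Status")
term.move_to("Accepted", date(2011, 3, 1))
```

The `history` list is what a review-progress report would read from: each entry records when a term moved and between which statuses.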
  19. Basic Glossary Entry
      • Using Glossary just for Definitions
  20. Adding Synonyms and Related Terms
      • Synonyms track different names for the term across the Enterprise.
      • Related Terms are used to define validation business rules.
  21. Assigning Physical Assets
      • External Assets: Given Name is linked to external HTTP links such as documents in SharePoint or Intranet pages.
      • Metadata Assets: Given Name has been explicitly linked to a FIRST_NAME column in the Warehouse.
      Screenshot callouts: a linked Word document; linked DB columns; change history.
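The pieces of a glossary entry described in the last three slides (definition, synonyms, related terms, external and metadata assets) can be sketched as a plain data structure. Everything here is hypothetical: the `GlossaryTerm` class, the definition text, the example URL, and the `WAREHOUSE.CUSTOMER` qualifier are made up for illustration. Only the Given Name term and the FIRST_NAME column come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    """Hypothetical in-memory model of one glossary entry."""
    name: str
    definition: str
    synonyms: set[str] = field(default_factory=set)
    related_terms: set[str] = field(default_factory=set)
    external_assets: list[str] = field(default_factory=list)  # e.g. document URLs
    metadata_assets: list[str] = field(default_factory=list)  # e.g. schema.table.column

    def matches(self, label: str) -> bool:
        """True if label is the term name or a synonym (case-insensitive)."""
        wanted = label.strip().lower()
        names = {self.name.lower()} | {s.lower() for s in self.synonyms}
        return wanted in names


given_name = GlossaryTerm(
    name="Given Name",
    definition="The first name by which a person is known.",  # illustrative wording
    synonyms={"First Name", "Forename"},
    related_terms={"Family Name"},
    external_assets=["https://intranet.example.com/naming-standard.docx"],
    metadata_assets=["WAREHOUSE.CUSTOMER.FIRST_NAME"],
)
```

Looking a label up through `matches` is how synonyms earn their keep: a developer who only knows the column as "First Name" still lands on the governed term.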
  22. Custom Browse and Data Entry Forms
      • Using the Glossary API to write our own authoring forms
      Screenshot callouts: better date entry; different column order; better validation.
  23. Business Term Linkage
      • Context is everything.
      Diagram labels: Business Term, System of Record DB, DW Table, Synonyms, Homonyms, Related Terms, Cognos, Data Model, Metadata Workbench
  24. Get a Fast Start with Imports
      • The quality of Business Glossary imports has a major impact on the success of the implementation.
          • Excel imports using templates.
          • Build, copy, paste and prepare content quickly.
          • Email content around for updates and review.
      • 300-400 terms in the first three weeks.
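Because import quality matters so much, it can help to check a term sheet before loading it. The sketch below assumes the spreadsheet template has been saved as CSV with `Name`, `Definition` and `Synonyms` columns (synonyms separated by semicolons). Those column names and the `load_terms` helper are assumptions, not the actual import template of any product.

```python
import csv
import io


def load_terms(csv_text: str) -> tuple[list[dict], list[str]]:
    """Parse term rows and return (valid_terms, problems)."""
    terms, problems, seen = [], [], set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        name = (row.get("Name") or "").strip()
        definition = (row.get("Definition") or "").strip()
        if not name:
            problems.append(f"row {line_no}: missing term name")
            continue
        if name.lower() in seen:
            problems.append(f"row {line_no}: duplicate term '{name}'")
            continue
        if not definition:
            problems.append(f"row {line_no}: '{name}' has no definition")
            continue
        seen.add(name.lower())
        raw_synonyms = (row.get("Synonyms") or "").split(";")
        synonyms = [s.strip() for s in raw_synonyms if s.strip()]
        terms.append({"name": name, "definition": definition, "synonyms": synonyms})
    return terms, problems


sample = """Name,Definition,Synonyms
Given Name,The first name by which a person is known.,First Name;Forename
Client Status,,
given name,Duplicate entry,
"""
terms, problems = load_terms(sample)
```

With this sample, one term loads cleanly and two rows are reported back for review (a missing definition and a duplicate name), which is the kind of list you would email around before the real import.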
  25. • Profiling
      • Primary Foreign Key Discovery
      • Transformation Discovery
      • Unified Schema Build
  26. Data Profiling
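As a rough idea of what column profiling reports, the sketch below counts rows, nulls, distinct values and the most frequent values for one column. It is a minimal illustration in plain Python, not how InfoSphere Discovery or Information Analyzer work internally; `profile_column` and the sample values are hypothetical.

```python
from collections import Counter


def profile_column(values: list) -> dict:
    """Summarise one column: row count, nulls, distinct values, top frequencies."""
    # Treat None and blank strings as nulls.
    non_null = [v for v in values if v is not None and str(v).strip() != ""]
    freq = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(freq),
        "top_values": freq.most_common(3),
    }


client_status = ["Active", "active", "A", "Active", None, "Closed", ""]
profile = profile_column(client_status)
```

Even this tiny profile surfaces the "three different definitions for client status" problem: `Active`, `active` and `A` show up as separate values that someone has to reconcile.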
  27. Primary and Foreign Key Discovery
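Key discovery can be approximated with two simple data tests: a column is a candidate primary key if its values are unique and never null, and a column is a candidate foreign key if all of its values appear in a parent key column. The sketch below is a naive version of that idea. The function names and sample tables are hypothetical, and a real discovery tool would also handle composite keys, sampling and near-matches.

```python
def candidate_keys(rows: list[dict]) -> list[str]:
    """Columns whose values are all non-null and unique."""
    if not rows:
        return []
    keys = []
    for col in rows[0]:
        values = [r[col] for r in rows]
        if None not in values and len(set(values)) == len(values):
            keys.append(col)
    return keys


def is_foreign_key(child_values: list, parent_values: list) -> bool:
    """Inclusion test: every non-null child value exists in the parent column."""
    parent = set(parent_values)
    return all(v in parent for v in child_values if v is not None)


clients = [
    {"client_id": 1, "status": "Active"},
    {"client_id": 2, "status": "Active"},
    {"client_id": 3, "status": "Closed"},
]
orders = [
    {"order_id": 10, "client_id": 1},
    {"order_id": 11, "client_id": 1},
    {"order_id": 12, "client_id": 3},
]

client_keys = candidate_keys(clients)
order_keys = candidate_keys(orders)
links_to_clients = is_foreign_key(
    [o["client_id"] for o in orders],
    [c["client_id"] for c in clients],
)
```

On small samples these tests produce false positives (any short unique column looks like a key), so the results are candidates for a person to confirm, not final answers.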
  28. Unified Schema Build Example
      • Three different source systems, three tables each.
  29. Overlap Analysis
      • Find overlapping columns and data using profiling results.
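Overlap between two columns can be estimated by comparing their distinct values. The sketch below reports how many values are shared and what fraction of each side that covers. The `overlap` helper and the sample identifiers are assumptions for illustration only, and a real tool would work from profiling results rather than raw lists.

```python
def overlap(left: list, right: list) -> dict:
    """Compare the distinct non-null values of two columns."""
    a = {v for v in left if v is not None}
    b = {v for v in right if v is not None}
    shared = a & b
    return {
        "shared": len(shared),
        "left_coverage": len(shared) / len(a) if a else 0.0,
        "right_coverage": len(shared) / len(b) if b else 0.0,
    }


crm_ids = [101, 102, 103, 104]       # hypothetical client ids in one source system
billing_ids = [103, 104, 105]        # hypothetical client ids in another
result = overlap(crm_ids, billing_ids)
```

High coverage on both sides suggests the two columns hold the same business concept and are candidates to merge in a unified schema; lopsided coverage suggests one system holds a subset of the other.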
  30. Unified Schema Build
  31. Unified Column Analysis
      • See your data quality before you move the data.
  32. InfoSphere Discovery
      • Data Warehouse
          • Data Inventory: Profiling and Primary/Foreign Keys
          • Design and Prototype: Schema Build and Overlap Profiling
          • Load: Mapping and Transformation Discovery
      • Application Consolidation/Migration
          • Data Inventory
          • Rule Discovery: document old rules, define new rules
          • Source to Target: map from old to new
      • MDM
          • Data Inventory: Overlapping Master Data and Conformance
          • Design and Prototype: build a unified MDM registry
  33. Unified Metadata Approach
      • Business Glossary, Discovery, FastTrack, Information Analyzer
      • Discovery: Profiling, Values, Frequencies, Overlap and Links, Transform Discovery. Assign Terms. Create FastTrack Maps.
      • Audit: Define valid reference data values and show them in Glossary. Show Profiling Stats in Glossary.
      • Mapping: Create source to target mappings with columns and terms. Map by physical names or business names. Automap by business names.
      • Cognos: Turn Framework Manager metadata into a Business Glossary. Popup field help. Link KPI definitions to related terms. Find in Cognos.
      • Data Model: Turn a Glossary into a Logical Model. Turn a Logical Model into a Glossary.
      • Metadata Workbench: Link Terms to Assets in bulk. Link Stewards in bulk. Report on changed and stale terms.
      • Blueprint Director
  34. Getting Started with Data Governance
      Six things everyone can do today:
      • Define your desired outcomes from Data Governance
      • Be clear about the problems you are solving
      • Define a realistic organisational structure for your environment
      • Focus on a DG pilot program that can deliver outcomes with business benefits
      • Take advantage of best practices and models from organisations like the Data Governance Council and MIKE 2.0
      • Be realistic about organisational challenges, funding requirements, scope and duration of deliverables