April 2004 - IRM Roundtable - Part 2 - Metadata
Published in: Education, Technology
  • Our end state is convergence of these 3 areas. We want to be able to track DQ issues, make improvements based on the monitor results and improve overall DQ.
  • Examples of Applications that could be described with Application Runtime Metadata are security, regulatory compliance and BAU applications
  • Transcript

    • 1. Information Resource Management (IRM) Round Table April 7, 2004 Hosted by:
    • 2. Agenda
      • Introductions & Agenda 9:00 – 9:15
      • Metadata Management 9:15 – 10:00
          • Gregg Wyant
      • Break 10:00 - 10:15
      • Metadata Management 10:15 – 11:00
          • Juanita Mercado
      • Metadata – The Promise vs Reality 11:00 – 11:45
          • Cass Squire
      • Wrap Up 11:45 – 12:00
    • 3. Metadata Management Gregg Wyant Chief Data Architect Intel
    • 4. Metadata Management
      • Strategy / Approach:
        • Leverage progress of TQdM program
          • The TQdM program at Intel has made data issues visible
          • Metadata management needs to intersect the positive aspects of TQdM
        • “Repeatable process, standard deliverables”
          • Metadata cannot be managed without consistent processes and fixed deliverables
      • Progress To Date:
        • Metadata program established and funded
        • Focus on data-oriented metadata management
        • First release of Enterprise Metadata Repository completed
      • Issues / Roadblocks:
        • Culture rewards solving each problem anew
          • This is slowly changing; pushing Reuse Awards to recognize desired behavior
          • Creating metrics which can be used as carrot and stick
        • Consensus management approach
          • Obtaining commitment to a proposed change is extremely time-consuming
    • 5. Metadata is the Enabler for Reuse
      • Metadata must be managed to be of any value
        • A repeatable process to identify and define metadata
        • Standard deliverables in which to capture metadata
        • Repeatable governance process to provide metadata credibility
      • Enterprise Architecture Framework for data explicitly defined
        • Deliverables defined for conceptual, logical, and physical data models
      • Reuse is Tracked by each Project
      • Data Analyst Community trained on deliverables, processes, and criticality of metadata management
      • Enterprise Metadata Repository (EMR)
        • Drives connections between process, data, apps, and tech
        • Implemented in phases to connect various metadata
    • 6. Enabling Metadata
    • 7. Browse Metadata/Run a Report
    • 8. Metadata Program Major Phases
      • Phase 1: Tracking and Managing Metadata
      • Phase 2: Operational Metadata
      • Phase 3: Strategic Metadata
      • Phase 4: Enterprise-wide Integration And TQdM Platform
      • Phase 5: Unstructured Metadata
      • Phase 6: CMMI Integrations
      • Phase 7: Legacy Migration
      • Phase 8: Enterprise-wide Integration And TQdM Platform
      • Timeline labels: Q4, Q1, Q2, Q3, Q4, 2005, 2006
      • Roadmap callouts:
        • Tracking and managing metadata repository; search, presentation, change management, and metadata exchange between certified Data modeling tool. Integrate legacy metadata into EMR
        • Data monitoring, alerts, health indicators, DPM trending. (Integration with Reporting Tool)
        • Business Process metadata support. (certified Process modeling tool exchange)
        • Single metadata repository supporting viewers (portals) for multiple audiences, EAM Integration
        • Standardized physical database design processes & governance for all environments (OLTP, DSS, XML, EAI, etc.)
        • Ongoing deliverable and repository alignment
    • 9. Metadata, Standards and Reuse
      • Diagram labels: Critical Business Process models; iPAG Standards; Data models; Process Tool; Data Tool; Data Architecture Review; Enterprise Metadata Repository; iDAG Standards; Certified Deliverables; Certified Data Objects; Certified Processes based on Certified Data Objects; Centralized Data Model Repository; Centralized Process Model Repository
    • 10. EAF, Metadata and the Zachman Framework
      • Diagram labels: Metadata; Data; Process
      • Portions of this page include copyrighted material. See full disclaimer in backup.
    • 11. Metadata Architecture Framework Vision
      • Convergence of DQ, Metadata, BAM and PAF
        • Clearly defined architecture (boundaries between) for Metadata and the 3 major areas DQ, BAM and PAF
        • Track operational metadata, its monitors and alerts
      • Monitor the number of ROOs/RORs tied to applications
      • Metadata-driven capability to enable capture and notification of ROO/ROR defects and data movement excursions
      • Monitor Reporting
        • Real time reporting done from the monitoring tool
        • Analytics reporting is done from EDW
      • Feed DQ Corporate Scorecard
        • Single governance process
        • Track data quality issues to the data element level
        • Track DQ issues, measure effectiveness and drive improvements
    • 12. Information Resource Round Table Metadata Management Presented by: Juanita M. Mercado Lead Data Architect 2004-April
    • 13. Types of Metadata
      • Definition
        • Formal specification about ways to accurately and unambiguously describe information elements that are critical to the business
        • Standard definitions for meaning, acceptable content and relationships
      • System Integration
        • Formal specification on how to map equivalent data structures to support a federated operating model
        • Supports a uniform way of accessing various data formats such as flat files and databases
      • Application Runtime
        • Formal specification describing parameters for running an application executable. Includes dependency checks and runtime statistics.
      • Infrastructure
        • Formal specification for describing the system environment as far as networks, firewall, hardware and software, workstations, servers, etc.
    • 14. Roles and Functions
      • Enterprise Data Architect
        • Keeper of the formal specifications
      • Data Steward
        • Keeper of the business content
      • Application Data Architect
        • Applies the formal specifications and related business content to meet particular processing requirements
    • 15. Visa Global Business Elements (VGBE) Definition Metadata
      • What it is
        • Formal specification about ways to accurately and unambiguously describe information elements that are critical to VisaNet interoperability
        • Standard definitions for meaning, acceptable content and relationships
      • What it accomplishes
        • Share metadata consistently across Visa and with IT partners
        • Assures consistent characteristics and behavior for all implementations
        • Extensible and flexible to support evolving business strategies
      • How it works
        • A structurally stable yet dynamic document (UML) that integrates into development environments
        • Automates inheritance of definitions, behaviors and relationships directly into application code
    • 16. The Role of Definition Metadata Data Quality is a measure of how the data is tracking to the definitions established by the Data Architecture. Automation of this measurement is possible with metadata that is constructed using a formal specification. Definitions are also kept consistent when a specification is used.
    • 17. ISO 11179
      • An international standard for the specification of data elements
        • Framework for the specification and standardization of data elements
        • Classification for data elements
        • Basic attributes of data elements
        • Rules and guidelines for the formulation of data definitions
        • Naming and identification principles for data elements
    • 18. VGBE Specification
      • Identifying Attributes
        • Name
        • Identifier
        • Version
        • Synonymous Name
        • Abbreviated Name
        • Representation
        • ISO Field Number
        • XML Tag
        • MDR Attribute
      • Definition Attributes
        • Business Definition
      • Representational Attributes
        • Minimum Storage Length
        • Maximum Storage Length
        • List of Permissible Values
        • Numeric Precision
        • Numeric Scale
      • Administrative Attributes
        • Responsible Organization
        • Submitting Organization
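The attribute groups above lend themselves to a machine-readable definition record, which is what makes the automated data quality measurement described on slide 16 possible. The following is a minimal sketch under stated assumptions, not the VGBE implementation: the class `ElementDefinition`, the function `conformance_rate`, and the sample element (name, identifier, code list) are all hypothetical, and only a subset of the listed attributes is modeled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ElementDefinition:
    """Hypothetical, simplified definition record for one business element."""
    name: str
    identifier: str
    version: str
    min_length: int                      # Minimum Storage Length
    max_length: int                      # Maximum Storage Length
    permissible_values: Optional[frozenset] = None  # None means any value is allowed

    def conforms(self, value: str) -> bool:
        """Return True when a stored value satisfies the representational attributes."""
        if not (self.min_length <= len(value) <= self.max_length):
            return False
        if self.permissible_values is not None and value not in self.permissible_values:
            return False
        return True


def conformance_rate(definition: ElementDefinition, values: Sequence[str]) -> float:
    """Fraction of values that track to the definition (1.0 for an empty input)."""
    if not values:
        return 1.0
    return sum(definition.conforms(v) for v in values) / len(values)


# Illustrative element; the name, identifier and code list are invented.
currency = ElementDefinition(
    name="Currency Code", identifier="BE-0001", version="1.0",
    min_length=3, max_length=3,
    permissible_values=frozenset({"USD", "EUR", "JPY"}),
)
sample = ["USD", "EUR", "usd", "US", "JPY"]
print(conformance_rate(currency, sample))  # 3 of the 5 sample values conform
```

Because the check reads only the definition record, the same function can score any element once its definition is registered, which is one way to keep measurement consistent across implementations.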
    • 19. VGBE Specification
      • Business Entity Rule
        • Business Domain
        • Subject Area
        • Business Entity
        • Business Entity Attribute
          • Nullability
          • Default Value
        • Primary Key
        • ISO Field Number
      • Business Element Interaction Rule
        • Relationship Type
        • Related Business Element
        • Cardinality
        • Optionality
      • Roled Business Element Rule
        • Roled Name
        • Roled Definition
        • Fundamental Business Element
        • Identifier
        • Version
        • Roled Synonymous Name
        • Roled Abbreviated Name
        • Roled Context
      • Action Assertion Rule
        • Conditional Business Element
        • Influenced Business Element
        • Assertion Rule
    • 20. VGBE Registry
    • 21. VGBE Services Framework Use Case 1
    • 22. VGBE Services Framework Use Case 2
    • 23. Demo: Use Case 2
    • 24. Metadata The Promise Versus The Reality Cass Squire Associate Partner IBM Business Consulting Services (650) 520-7247 [email_address]
    • 25. Topics
      • The Need
      • A Strategy & Approach for Solving it
      • Progress To Date
      • The Real World: Issues and Road Blocks
    • 26. The Need
      • An entirely new set of data for a new kind of analytics is being rolled out
      • The business community has not been properly engaged
      • The business community needs to understand:
        • What data is available
        • What it means
        • What its source is
        • What its currency is
        • Who to go to with questions about the data and its meaning
      • A new tool for querying (Business Objects) is being rolled out as well
    • 27. Strategy & Approach
      • A clear need for a mechanism for capturing and sharing metadata surfaces as essential to the successful roll out of the new analytical environment
      • AbInitio is the corporate ETL tool – leverage its metadata for technical metadata
      • Determine the applicability of its repository – the Enterprise Metadata Environment (EME) for serving as the repository for all metadata
      • ERwin contains Business and Technical names, definitions, and allowable values – use it as the source for this metadata
      • Determine Data Stewards in both the business and technical arenas to be the go-to people for questions
      • Evaluate other tools for applicability
      • Integrate Operational Metadata and SOX auditability
      • KISS
    • 28. In IBM’s Business Intelligence Reference Architecture, Metadata is one of the components that glues the whole process together.
      • Data Sources: Enterprise; Unstructured; Informational; External
      • Data Integration: Extraction; Transformation; Load / Apply; Synchronization; Transport / Messaging; Information Integrity (Data Quality, Balance & Controls)
      • Data Repositories: Operational Data Stores; Data Warehouses; Data Marts; Staging Areas; Metadata
      • Analytics: Collaboration; Query & Reporting; Data Mining; Modeling; Scorecard; Visualization; Embedded Analytics
      • Access: Web Browser; Portals; Devices; Web Services
      • Other diagram labels: Business Applications; Data flow and Workflow; Metadata; Security and Data Privacy; Systems Management & Administration; Network Connectivity, Protocols & Access Middleware; Hardware & Software Platforms
    • 29. Progress to Date
      • AbInitio EME determined to be extensible enough to hold all metadata required – at least for initial phases
      • Their web-based reporting is deemed acceptable for technical users but not for business users
      • The ODBC API into their flat-file based repository is new and performance is “not ready for prime time” yet
      • The decision was made to extract from the EME to relational tables (15 +/-)
      • Business Objects (web version) chosen as the user interface to the metadata, since that’s what users will use to see the data – single universe and a dozen or so queries
      • Extracts from the EME into the relational tables have been built.
      • Data lineage simplified to ultimate source to ultimate target (intermediary ETL steps hidden) for business users
      • Processes and extracts built for getting data from ERwin into the repository
      • Processes for ensuring data analysts on all projects use the same tools/processes for capturing and publishing metadata for the repository
      • Identified Data Stewards and created a cross reference between them and the entities
      • Business users have applauded it as very useful in helping them understand and use the new data for analytics
      • Common processes for capturing Operational Metadata (Statistics, Error Reporting, Auditing) built and used by every ETL process
      • Unicorn evaluated as a possible tool – received high praise, especially for its ability to speed up the mapping process – but the determination was made to postpone further testing of it until the above environment has been in production for a while
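The lineage simplification mentioned above (ultimate source to ultimate target, with intermediary ETL steps hidden) can be sketched as a graph traversal. This is an illustration only, assuming step-level lineage is available as (source, target) pairs and contains no cycles; the function name `end_to_end_lineage` and the dataset names are hypothetical and do not describe the EME extract.

```python
from collections import defaultdict


def end_to_end_lineage(edges):
    """Collapse step-level lineage into (ultimate source, ultimate target) pairs.

    edges: iterable of (source, target) pairs, one per ETL step; assumed acyclic.
    """
    downstream = defaultdict(set)
    has_upstream = set()
    nodes = set()
    for src, dst in edges:
        downstream[src].add(dst)
        has_upstream.add(dst)
        nodes.update((src, dst))

    # Ultimate sources are datasets that nothing feeds into.
    ultimate_sources = sorted(nodes - has_upstream)
    pairs = set()
    for source in ultimate_sources:
        stack, seen = [source], {source}
        while stack:
            node = stack.pop()
            children = downstream.get(node, ())
            if not children:
                # Nothing is derived from this dataset, so it is an ultimate target.
                pairs.add((source, node))
            for child in children:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
    return sorted(pairs)


# Invented step-level lineage with two staging tables hidden from business users.
steps = [
    ("crm.customer", "stage.customer"),
    ("stage.customer", "dw.customer_dim"),
    ("web.clicks", "stage.clicks"),
    ("stage.clicks", "dw.customer_dim"),
    ("stage.clicks", "dw.click_fact"),
]
for source, target in end_to_end_lineage(steps):
    print(source, "->", target)
```

The full step-level edges stay available for technical users; only the business-facing view is collapsed.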
    • 30. The Real World: Issues and Road Blocks
      • Lots of hype about metadata – little in the way of tools to deliver
      • There are lots and lots of types of metadata – picking the right subset to implement is key
      • Ensuring automated maintenance is key
      • New data unfamiliar to the users
      • Demographics data – initial rollout a fiasco; queries produce wildly inaccurate numbers because the data is not well enough understood
      • Lots of churning while technical team got enough understanding of the data to identify the problems
      • User confidence in the data seriously hurt
      • An education program on the data, and on which queries generate correct results, was required
    • 31. Demographic Data: Conceptual Model
      • Make sure you know at what level the data applies!
      • Entity labels: Geographic Area; Geographic Area Demographics; Physical Address; Household; Household Demographics; Consumer; Individual Demographics; Name; Household Member; Account; Screen Name; Lifestyle; Customer; Member; Prospect
      • Relationship labels: has; has; has; is part of the key to a / has; consists of; has one or more; last name is part of the key to a; contains one or more; may play a role as more than one; can have up to 10 (7 active); may have more than 1; may have many; characteristics of; may have one or more / has a primary member
    • 32. The Need and Promise is Great. The delivery isn’t there yet. The Perfect Repository
      • They often take more effort to feed than the benefit derived
      • Many repository tools/vendors won’t expose (share) their metadata
      • They tend to be passive, and thus can get out of synch with the real world
    • 33. How to implement?
      • “It’s like pinning jello to the wall”
      • There are no “best practices”
      • Are there analogies we can use?
    • 34. Layers & Perspectives of Data & Metadata
      • Schema Description Layer (Meta-Entity Types): Entity-Type; Relationship-Type; Attribute-Type and Attribute-Group-Type
      • Schema Layer (Entity-Types, Relationship-Types, Attribute-Types): ELEMENT, RECORD, USER; RECORD-CONTAINS-ELEMENT; LENGTH, ALLOWABLE-RANGE
      • Dictionary Layer (Describing the Real World: Entities, Relationships, Attributes): Employee-ID-Number, Social-Security-Number, Personnel-Record, Payroll-Record, Finance-Department, Personnel-Department; Payroll-Record CONTAINS-Employee-ID; 9 (Characters), 1 (Low-of-Range), 10 (High-of-Range)
      • Data in Production Database (Example Instances of Entities and Relationships with Data): 229-21-5941, 0285762, (Personnel record for John Jones), (Payroll record for Pam Smith), Department-24037, Department-74941; (Payroll record for John Jones contains 0285762); (Attributes do not appear as discrete instances in a production database. They provide information used in the IRDS to represent real-world entities and attributes)
      • The 1984/5 (+/-) Information Resources Dictionary Standard (IRDS) was an attempt to define a syntax for metadata exchange.
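The layering on this slide can be read as each layer constraining the one beneath it: the schema layer names the types that dictionary entries are allowed to use. A rough sketch of that idea follows. It is not IRDS syntax; the type names come from the slide, while the dictionary structure, the function `validate_dictionary`, and the pairing of the length attribute with a particular element are assumptions made for illustration.

```python
# Schema layer: the types a dictionary entry may use (type names taken from the slide).
SCHEMA = {
    "entity_types": {"ELEMENT", "RECORD", "USER"},
    # relationship type -> (entity type of the first member, entity type of the second)
    "relationship_types": {"RECORD-CONTAINS-ELEMENT": ("RECORD", "ELEMENT")},
    "attribute_types": {"LENGTH", "ALLOWABLE-RANGE"},
}

# Dictionary layer: descriptions of real-world data, typed by the schema layer.
# Which attribute belongs to which entity is an assumption made for illustration.
DICTIONARY = {
    "entities": {
        "Social-Security-Number": {"type": "ELEMENT", "attributes": {"LENGTH": 9}},
        "Employee-ID-Number": {"type": "ELEMENT", "attributes": {}},
        "Payroll-Record": {"type": "RECORD", "attributes": {}},
    },
    "relationships": [
        ("RECORD-CONTAINS-ELEMENT", "Payroll-Record", "Employee-ID-Number"),
    ],
}


def validate_dictionary(schema, dictionary):
    """Return a list of messages for entries that the schema layer does not permit."""
    errors = []
    entities = dictionary["entities"]
    for name, entry in entities.items():
        if entry["type"] not in schema["entity_types"]:
            errors.append(f"{name}: unknown entity type {entry['type']}")
        for attribute in entry["attributes"]:
            if attribute not in schema["attribute_types"]:
                errors.append(f"{name}: unknown attribute type {attribute}")
    for rel_type, first, second in dictionary["relationships"]:
        expected = schema["relationship_types"].get(rel_type)
        if expected is None:
            errors.append(f"{first}/{second}: unknown relationship type {rel_type}")
            continue
        actual = tuple(entities.get(n, {}).get("type") for n in (first, second))
        if actual != expected:
            errors.append(f"{rel_type}: expected {expected}, got {actual}")
    return errors


print(validate_dictionary(SCHEMA, DICTIONARY))  # an empty list means the dictionary is valid
```

The production-database layer would sit one level further down, holding instance values such as 0285762 that the dictionary entries describe.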
    • 35. Simple Metadata Model
      • Diagram labels: Physical Environment; Data Element; Organization; Creators; Owners/Proponents; Users; File/Table; Occurrences & Data Lineage (sources & targets); Relational Edits; Domain/Allowable Values; Transformation Rules; Programs; Triggers; Time; Events; Location; Integrity Rules; Reports
    • 36. John Zachman’s Enterprise Architecture Framework also provides us a way to categorize the sources of metadata.
    • 37. The Enterprise Architecture defines five views and six aspects of the enterprise.
      • John Zachman compares delivering technology to the enterprise to building an airplane. It is a complicated task involving several stages of design and many builders whose activities must be coordinated. He asks: Who would build an airplane without conceptual drawings? Without detailed sub-assembly charts? His premise is that the IT industry often has these design drawings and sub-assembly charts, but fails to file, cross-reference and maintain them.
      • Not only do IT practitioners need to keep multiple views of the enterprise, the relationships between the cells in the framework must also be tracked. It is important to know which functions use which data elements.
      • He says that as builders of data warehouses we have many of these “specification sheets” and he states that they contain metadata.
    • 38. Over time, repositories and metadata management tools have changed with, in spite of, or regardless of, the IT industry focus.
    • 39. Now Let’s Look at This From Another Perspective
      • Diagram labels: Operational and External Data; Find and Understand; Analyze / Architect; Extract; Transform; Distribute; Store; Automate and Manage; Display, Analyze, Discover; Subject Areas, Cubes; Metadata; Elements; Mappings; Business Views; Templates; DATA
      • All is not lost! There are other tools that help with metadata needs.
    • 40. From Raw Data to Standardized Information to Usefulness
      • Discover
        • The Problems: What’s in the source data? Does it mean what you think it should? How is it structured? How might it be structured?
        • Sample Tools: ProfileStage (MetaRecon); Evoke Axio
      • Assess & Monitor
        • The Problems: Does it contain what you think it should? How complete is it? How clean is it? Does it follow the business rules? How is the quality changing over time?
        • Sample Tools: AuditStage (Quality Manager); Ab Initio Data Profiler
      • Match & Merge
        • The Problems: Resolve duplicates. Standardize names. Assign unique Id’s. Identify households.
        • Sample Tools: QualityStage (Integrity); Trillium; First Data; Innovative Systems
      • Enrich & Transform
        • The Problems: Correct and improve it. Change to standard values. Transform codes to meaningful terms. Summarize it.
        • Sample Tools: DataStage; Informatica; Ab Initio; Warehouse Manager
      • Deliver
        • The Problems: Deliver new sets of data on a periodic basis. Capture changes as required. Deliver updates/new transactions in as timely a fashion as required.
        • Sample Tools: ETL tools; Propagation; Change Data Capture tools; MQ
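The "Assess & Monitor" questions (how complete is it, how clean is it) reduce to simple column statistics that can be recomputed on each load and compared over time. Below is a minimal sketch of such a profile, not a depiction of any of the tools named on the slide; the function `profile_column` and the sample values are hypothetical.

```python
def profile_column(values):
    """Compute basic profile statistics for one column of raw values."""
    total = len(values)
    # Treat None and blank strings as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    return {
        "rows": total,
        "completeness": len(present) / total if total else 0.0,
        "distinct": len(set(present)),
    }


# Invented sample: two missing values and one inconsistent spelling ("ca").
state_codes = ["CA", "ca", "", None, "NY", "CA"]
print(profile_column(state_codes))
```

Storing each run's result alongside a load date would give the trend line needed to answer how the quality is changing over time.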
    • 41. Data Becomes Information If and Only If You:
      • The Problem (with Tools, Techniques, & Processes beneath each item)
      • Have the data and
        • Capture Process; Business Process Re-Engineering
      • Know you have it and
        • Metadata; Evangelism
      • Can access it and
        • BI Environment; Data Structured for Access; End-User Analysis tools
      • Can use it and
        • Business Metrics Captured
      • Can trust it!
        • Data Quality Process; Metadata