Data Architecture for Data Governance


Published in: Technology
  • The typical state of affairs in most organizations is as follows: we know we have data; some of the lineage is tracked, but it is probably out of date; there is not a lot of clarity on ownership; the goal may be single sources of data, but over time ad hoc data services and processes pop up; and data quality is unclear, or assumed but not guaranteed.
  • The problem is that data is not static. A snapshot taken at any point in time is the truth only for that moment. Even with processes to manage the addition of data, change management for master data and metadata, and a set of guidelines, you can still end up with out-of-control data systems. The root cause is a lack of governance, and you do not currently have governance. See why this is a tough problem to solve? The solution prevents the problem, but most organizations have no clue where to start. Those that do start may fail, or may complete the effort without a transition plan to maintain governance once it is established, and over time the organization finds itself back in the same or a similar state.
  • There are a number of things that a Data Governance Council can do; it all depends on risk, priorities and so on. As part of the lineage exercise, and purely through the formation of the council and the communication of its existence, it is possible to start with a backlog of items to be addressed at the first meeting. So first and foremost, define how intake will work, how decisions will be made and how issues will be managed: the basics. Get the structure in place. Then make sure that you have a plan not only for today but one flexible enough to be sustained over time, because just as change is inevitable in data, the need for governance will not go away. Data governance councils identify goals and priorities for achieving a level of proficiency across the organization, which encompasses everything from policies and processes to monitoring and communications. Individuals outside the council may approach the group with problems and findings, which lead to investigations, studies and eventually standards for the management and handling of data. These standards cover anything from data retention periods to best practices. The overall goal of the council is to ensure consistency in the collection, storage and delivery of data to support the business, with cost effectiveness as a balancing factor.
  • If you walk into a room and mention governance, not many people will stick around for the conversation, unless they have had the epiphany that governance actually enables the business and is not about locking things down and keeping people out. Some concepts, such as process and governance, have bad reputations. These reputations are formed by bad implementations, overly zealous implementations or an inability to sustain the change over time. Worse yet, there are a lot of misconceptions about what governance actually is.
  • In most cases, governance has been overcomplicated. It is easier to talk about what governance isn’t than about what governance is. The verb ‘to govern’ is at the root, which translates to: (1) have political authority: to be responsible officially for directing the affairs, policies, and economy of a state, country, or organization; (2) control something: to control, regulate, or direct something; (3) have influence over something: to have or exercise an influence over something. This drives a lot of the perceptions people have about governance. Instead, let’s think of governance as having goals such as defining expectations, granting power, and verifying performance.
  • The first step in the governance lifecycle is to form a team. This team may not end up being the final governing body or governance board. But as you develop the concept of governance, you will want equal representation of all data-producing, data-storing and data-consuming groups; in other words, almost every team in the organization will have representation at the table. What is important at the formative stage is that the team is made up of individuals with working knowledge of the data.
  • The first action is to catalog the data of the organization. The size of the data does not track the size of the company: a very small company with only a few employees may process and store petabytes of data a day, so don’t assume the data will align with the size of the organization. Data lineage is an iterative process, and the size of the enterprise will determine how deep you go into the processes. Instructor: Flip to the next slide to review the BPMN process level definitions, then return to this slide. Choose a level that can be achieved in a short period of time. The goal would be to hit level 3 or 4, but in a larger enterprise you may only be able to hit level 1 or 2. If all you can accomplish in 2-3 weeks is a list of all data repositories with owners, purpose and some basic information about data sources, then you have enough to get started. The goal behind capturing lineage is to tie data to a purpose, typically a KPI (Key Performance Indicator). Gather the KPIs for the company by department, then travel backwards from each end metric to find all the processes and sources that lead to that data. No matter what, the lineage effort should catalog systems and data. Instructor: Skip ahead past the next slide to the following slide.
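The backwards walk from a KPI to its sources can be sketched as a simple graph traversal. This is only an illustration, not part of the original deck: the catalogue contents, the node names and the `trace_lineage` and `root_sources` helpers are all hypothetical, and a real lineage inventory would more likely live in a metadata repository than in a dictionary.

```python
from collections import deque

# Hypothetical lineage catalogue: each node maps to the nodes that feed it.
# Every name below is invented for illustration.
UPSTREAM = {
    "kpi:monthly_churn_rate": ["report:churn_dashboard"],
    "report:churn_dashboard": ["mart:customer_mart"],
    "mart:customer_mart": ["warehouse:customer_dim", "warehouse:subscription_fact"],
    "warehouse:customer_dim": ["source:crm"],
    "warehouse:subscription_fact": ["source:billing", "source:crm"],
}


def trace_lineage(kpi, upstream):
    """Walk backwards from a KPI and return every upstream node, breadth-first."""
    found = []
    visited = {kpi}
    queue = deque([kpi])
    while queue:
        node = queue.popleft()
        for parent in upstream.get(node, []):
            if parent not in visited:
                visited.add(parent)
                found.append(parent)
                queue.append(parent)
    return found


def root_sources(kpi, upstream):
    """Nodes in the lineage with no recorded upstream: the originating systems."""
    return sorted(n for n in trace_lineage(kpi, upstream) if not upstream.get(n))
```

Calling `root_sources("kpi:monthly_churn_rate", UPSTREAM)` lists the systems where the metric's data originates, which is roughly the "list of data repositories" the note treats as enough to get started.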
  • While the initial team is compiling the data lineage, they should also be assessing maturity. Do not assume that maturity is consistent across the enterprise, or even along the lineage of a single metric.
  • By now it is time to form the Governance Body. You can form this body at any point in the process, but until there is an overview of the data environment and its state of maturity, the Governance Body will mostly sit back and support the discovery process. Let’s call this the Data Governance Council. The Council will be made up of representatives of each functional area of the organization. Members of the original working team may sit on the Council, or their management may.
  • Instructor: Ask, “How do you define governance?” Use this as an example of how the individuals who make up the council will each have their own definition of governance. As you can see, each one of you has a different definition of governance. This will also be true of the members of the council. Instructor: Ask, “Why do you have the council define governance?” You are looking for two major inputs: (1) establish a common definition; (2) get all council members aligned to the definition. Whenever you bring together a group of people with varying roles and backgrounds, their definitions of governance are likely to vary, so it is important to have the council define governance to ensure that everyone has the same expectations. A shared definition of governance applies only to this council, for this organization, at this time. Over time the definition may change, and it should be revisited each year.
  • The best way to revisit the definition to ensure it is still effective and applicable is to create a charter which includes the definition of governance.
  • Instructor: Ask, “What do you think the scope of a data governance council should be?” Answer: this will vary. Typically you are looking for answers about creating rules or standards, not about actually doing or building. Note: if there are a lot of comments about building systems, ask, “Is there a risk of conflict of interest if the governance council develops systems rather than governing data?” Scope = focus. You can’t tackle everything; unreasonable goals are one of the reasons governance councils fail. Each year the council should have a set of goals that are reasonable and achievable, followed by stretch goals. It is important to remember that everyone has a ‘day’ job and that the governance council typically gets about 10% of their attention.
  • Define a set of goals and expectations; typically these align to the terms of the council. A reasonable set of commitments for the first year of a council would be to: establish lineage; establish the current level of maturity; identify the optimal level of maturity to reach; define a roadmap; execute first-year actions; and so on. The important thing is to review progress regularly and identify when things get off track. Adjust the plan if completion takes longer than expected or if the outcome does not meet the expectations of the council. Don’t wait until the end and let things slowly fall apart. A governance council often fails when success is assumed and not monitored. When defining the roadmap, it is important to think of the goals as multi-generational. You will also have to break these goals down into actions, accountabilities, responsibilities, constraints and dependencies.

    1. 1. Data Governance
    2. 2. IASA isa non-profit professional association run by architects for all IT architects centrally governed and locally run technology and vendor agnostic Th
    3. 3. Information Architecture Module 0: Course Intro, Architecture  Module 3: Data Integration Fundamentals Introduction: Data,  Lesson 1: Integrating at the Company Level Information and Knowledge  Lesson 2: Data Characteristics  Lesson 3: Data Integration at the System level Module 1: Information for Business  Lesson 1 – Information as Strategy  Module 4: Data Quality and Governance  Lesson 1 – Data and Information Quality  Lesson 2 – Relating Information to Value  Lesson 2: Data Compliance  Lesson 3 – Information Scope and  Lesson 3 – Data and Information Governance Governance  Module 5: Advanced Information Management Module 2: Information Usage  Lesson 1: Data Warehousing  Lesson 1 – Who Uses Your Information  Lesson 2: Business Intelligence  Lesson 2 – How, When, Where and Why  Lesson 3: Data Security and Privacy is Information Used  Lesson 4: Metadata and Taxonomy Management  Lesson 3 – Form Factors  Lesson 5: Knowledge Management  Lesson 4 – Usage Design 1  Module 6: Architecture throughout the Lifecycle  Lesson 5 – Quality Attributes for Information Architecture  Lesson 6 – Data Tools and Frameworks
    4. 4. Typical State of Affairs
    5. 5. The Problem
    6. 6. What is data governance?
    7. 7. Governance is Not a Popular Topic
    8. 8. Governance is Overcomplicated • It is not about politics • It is not about delivering verdicts • It is not about defining solutions
    9. 9. Step 1: Form a Team
    10. 10. Step 2: Establish Lineage
    11. 11. Step 3: Assess Maturity
    12. 12. Step 4: Create Governance Body
    13. 13. Step 5: Define Governance
    14. 14. Step 6: Define Charter
    15. 15. Step 7: Define Scope
    16. 16. Step 8: Create a Plan
    17. 17. Career Path
    18. 18. Iasa Architect Services • Adopt Iasa Standards: skills taxonomy, role descriptions, compensation models • Set your goals for value • Assess your current team: gap analysis • Implement processes • Develop and educate people • Certify employees and vendors • Build communities
    19. 19. December 6th – 7th, 2012 // Austin, TX • Training: Dec. 3rd – 5th, 2012 • Certification: Dec. 7th – 9th, 2012 • Leading Innovation in Architecture. Stay ahead of the technology curve and shape the future of architecture in your organization. • Connect and share insights with the largest global network of technology and enterprise architect practitioners. • Attend sessions on the most current breakthroughs, case studies and key topics in architecture, presented by a mix of practicing architects from top performing businesses and organizations. • Participate in pre-conference workshops and training. Maximize your time at the summit by taking part in one of six intensive training courses designed to provide immediate solutions to benefit your everyday practice. • Featuring Industry Leading Keynote Speakers: Sheila Jeffrey (Senior Information Architect, Bank of America); David Del Giudice (Principal Architect, AstraZeneca); Paul Preiss (President, Founder, Iasa Global); David Manning (IT Enterprise Architect, Idaho Power Company); Scott Whitmire (Enterprise Architect, Nordstrom); Cat Susch (Enterprise Architect, Microsoft)
    20. 20. End Module
    21. 21. Data Architecture for Data Governance. November 8, 2012. Presented by Malcolm Chisholm, Ph.D. Telephone 732-687-9283. © Inc., 2012. All rights reserved
    22. 22. Agenda • Introducing Enterprise Architecture (EA) • Introducing Data Governance • Data Architecture and Data Governance • Data Architecture Patterns • Data Architecture, Governance, and The Business
    23. 23. Introducing Enterprise Architecture (EA)
    24. 24. Need for Enterprise Architecture (EA) [Diagram labels: Strategy is Set at High Level (CxOs: “Here is where we want to be in 5 years…”); Operationalization (Enterprise Architecture, …Others…); Myriad of Independent Projects in IT] • EA is responsible for ensuring overall business strategy is implemented by IT
    25. 25. The IT Mindset. CxO: “We have a high-level strategy…” IT: “That’s not a clear requirement… Tell me what you want me to build. Then I will design it. Then I will build it. Then I will turn it over to you. Then I will walk away.” • IT people just want to build stuff and hand it off to other people • They want business sponsors who will pay for this stuff and tell them exactly what is needed • Strategy does not match this mindset
    26. 26. Design without Architecture: All Design, No Architecture [Image labels: Stairway to Ceiling; “Circular” Stairway] Rooms: 160; Doorways: 467; Doors: 950; Fireplaces: 47 (gas, wood, coal); Bedrooms: 40. Constructed 1884 – 1922 (38 continuous years); Cost: $5.5M. Blueprints: Never made; individual rooms sketched out by Sarah Winchester on paper or other media (e.g. tablecloths). “…staggering amount of creativity, energy, and expense poured into each and every detail” • The Winchester Mystery House • Lots of design, but no architecture • There is a difference between architecture and design
    27. 27. Design vs. Architecture [Diagram labels: Source Mirror; Changed Data Capture; Integration; Data Flow; DDL] • Architecture deals with many instances of a component type that must interact • Design deals with one instance of a component type, without regard to interaction • E.g. Perspective of Databases : Data Environment (BI or Integration Environment in this example)
    28. 28. Basic EA Taxonomy (“BAIT”): Enterprise Architecture comprises Business Architecture, Application Architecture, Information Architecture and Technology Architecture • EA is clearly not a monolithic discipline • It looks more like a collection of dissimilar components • Each component is a competency, so this matters • Yet there is no common agreement
    29. 29. Introducing Data Governance
    30. 30. What is Data Governance? Starter Definitions From the Data Governance Institute: “The exercise of decision-making and authority for data-related matters.” “A system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.”
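One way to make "who can take what actions with what information, under what circumstances" concrete is to record decision rights as data and look them up. The sketch below is an assumption-laden illustration, not something the Data Governance Institute or the deck prescribes: the roles, actions, data domains and the `may` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRight:
    role: str         # who
    action: str       # what action
    data_domain: str  # with what information
    condition: str    # under what circumstances (free text in this sketch)


# Hypothetical register of decision rights agreed by a council.
RIGHTS = [
    DecisionRight("data_steward", "approve_definition_change", "customer", "after council review"),
    DecisionRight("data_owner", "grant_access", "customer", "for a documented business need"),
    DecisionRight("data_owner", "grant_access", "finance", "for a documented business need"),
]


def may(role, action, data_domain, rights=RIGHTS):
    """Return True if the register grants this role this action on this domain."""
    return any(
        r.role == role and r.action == action and r.data_domain == data_domain
        for r in rights
    )
```

The register only records that a right exists; checking the stated condition is left to the business process that uses it.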
    31. 31. Data Governance is about Processes. Processes for… • Deciding who has what rights regarding data • Making decisions about data • Implementing decisions about data. In other words… • Identifying the processes needed to manage data • Identifying the actors in these processes • Designing these processes • And designing the processes to design these processes
    32. 32. Data Governance – A Caution. Data Governance is the design, building, and operation of formal business processes to manage the enterprise data resource. WRONG: “The Business has… Business Processes; IT has… Governance” • The decision rights of actors are defined within the context of these business processes • There is no real distinction between the processes of “data governance” and the “business processes” of the business [Compare HR] • A distinction that identifies the set of business processes that manage the enterprise data resource as “data governance” is valid [Compare HR] • Use of the term data governance to further the pretence that IT is a universe totally outside of the enterprise is invalid • Therefore, the processes of data governance are truly business processes
    33. 33. Data Architecture and Data Governance
    34. 34. Architectural Levels
        Reference Architecture: Assimilate best architectural practices from the IT industry. All relevant component classes and patterns are understood.
        Conceptual Architecture: Match/select architectural component classes and patterns to the business needs (including future plans).
        Logical Architecture: Detailed design of future state architecture.
        Physical Architecture: Influencing actual physical implementations to ensure conformance with logical architecture.
        A reference architecture is the highest level of architecture. It can be undertaken prior to even thinking about the enterprise. Its advantages are that it can be very high level and can cover the entire enterprise. Useful for thinking about high level data layers.
    35. 35. Reference Data Architecture - Example • Example of a simple reference data architecture • Just data layers, no services or components • BUT just having a picture can be problematic
    36. 36. Need Definitions, Explanations – Not Just Picture
        Layers: Transactional Application; Data Warehouse; Data Mart
        Purpose. Transactional Application: Data produced via the automation of business processes. Data Warehouse: View of data across the enterprise; supports dissemination, derivation of knowledge and history. Data Mart: Data structured and filtered to support specific information needs of small groups of users.
        Data Life Cycle. Transactional Application: All base (non-derived) data originates here. Data Warehouse: Derivations (including aggregations) produced here, and history is inferred. Data Mart: Data from Warehouse is transformed to support specific reporting.
        Data Operations. Transactional Application: Create / Read / Update / Delete / Archive. Data Warehouse: Extract / Transform / Load / Derive / Publish / Archive. Data Mart: Source / Subscribe / Transform / Archive.
        Data Model. Transactional Application: Normalized to 3NF. Data Warehouse: Subject Oriented / Snowflaked / Conformed Dimensions. Data Mart: Information Requirement Oriented / Snowflaked / Conformed Dimensions.
        • Much more is needed than the above • Definitions are a technical reference; explanations help stakeholders to understand the reference architecture
    37. 37. Subject Area Model • A Subject Area Model is a taxonomy of the major areas in the enterprise that are relevant to data management • A necessary starting point for MDM • 1 – A subject area should have stable definitions of entities • 2 – Production of master data should not cross subject area boundaries • Identify the major data concepts in each subject area; many will be master data entities
    38. 38. A Subject Area Model is Not Costly • 10-15 subject areas per model is the norm • Do not need elaborate data modeling tools • Much more about a standardized view of the enterprise • Can be done with 1-3 people • Gets a lot of bang for the buck • Is high-level and important
    39. 39. Example of Use: Establish Vendor Data Profiles [Diagram labels: Data Vendors 1 to 5; Data Management: Create & Manage Subject Area Model(s); Map Vendor’s Data to Subject Areas; Map Vendors to Uses; Map Subject Areas to User Requirements; Subject Area Model(s); Users] • A Subject Area Model provides the basis for a common understanding of the data vendors provide • Will need to go to major entity clusters within Subject Area
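The three mappings named on this slide (vendor data to subject areas, subject areas to user requirements, vendors to uses) can be sketched with plain sets. Everything below is hypothetical: the vendor names, subject areas, user groups and both helper functions are invented for illustration, and the slide itself notes that real profiles would need to go down to major entity clusters.

```python
# Hypothetical mapping of each vendor's data to subject areas.
VENDOR_SUBJECT_AREAS = {
    "Vendor 1": {"Securities", "Pricing"},
    "Vendor 2": {"Pricing"},
    "Vendor 3": {"Counterparty"},
}

# Hypothetical mapping of user groups to the subject areas they require.
USER_REQUIREMENTS = {
    "Risk": {"Pricing", "Counterparty"},
    "Settlement": {"Securities"},
}


def vendors_by_use(vendor_areas, user_requirements):
    """Derive 'vendors to uses': which vendors supply at least one area a user group needs."""
    return {
        user: sorted(v for v, areas in vendor_areas.items() if areas & needed)
        for user, needed in user_requirements.items()
    }


def uncovered_areas(vendor_areas, user_requirements):
    """Subject areas that some user group requires but no vendor supplies."""
    supplied = set().union(*vendor_areas.values())
    required = set().union(*user_requirements.values())
    return sorted(required - supplied)
```

Because both mappings share the same subject area names, the vendor-to-use view falls out of a set intersection; that shared vocabulary is the "common understanding" the subject area model is meant to provide.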
    40. 40. A Subject Area Model is a Taxonomy. “I don’t like your definition of this subject area!” “Your model does not tell me what is financial data!” “You mean to tell me every one of these subject areas includes access security for data!” • People think a taxonomy is a way of unlocking the secrets of the universe • Taxonomies are tools; you can have as many as you need to do your job • You can never expect the boundaries of a subject area model to be exactly precise and never changing
    41. 41. Create a Metadata Subject Area Model: The Fundamental Ontology of Metadata for the Enterprise
    42. 42. Metadata Subject Area Model. Advantages: 1. Standardizes the metadata in the enterprise at a high level. 2. Can be used to prioritize projects and programmes. 3. Useful in communications. 4. Shows what metadata is being produced by subject area. This is a high-level inventory. 5. Can distinguish data-related metadata from other data. 6. Within each subject area, definitions should be constant. There may be variations across subject areas. 7. Subject areas are candidates for conceptual models. Disadvantages: 1. It is a taxonomy and can be argued over. No one taxonomy will satisfy every perspective. We use data categorization for that (see later). The Subject Area Model should be the most common and natural perspective. 2. The Subject Area Model often has more things expected of it than it can deliver. Managing expectations can be tricky. The Subject Area Model is itself Metadata and needs to be managed.
    43. 43. Data Architecture Patterns
    44. 44. What is an Architectural Pattern? “Pattern in architecture is the idea of capturing architectural design ideas as archetypal and reusable descriptions” - Christopher Alexander, quoted in Business Model Generation by Osterwalder and Pigneur (2010). Antipattern: A pattern that is commonly used, but which is ineffective or damaging. E.g. Analysis Paralysis. - Term created by Andrew Koenig, 1995. • An architectural style is a central, organizing concept for a system. • An architectural pattern describes a coarse-grained solution at the level of subsystems or modules and their relationships. • A system metaphor is more conceptual and it relates more to a real-world concept over a software engineering concept. - Quoted from A Practical Guide to Enterprise Architecture by McGovern, Ambler et al, at Styles-Patterns • Patterns occur a lot in architecture • There are also styles, metaphors and antipatterns • Seem to have emerged from the application architecture competency (e.g. rare to hear of dimensional modeling as a pattern or style)
    45. 45. Example: Farm and Market Pattern [Diagram labels: Traditional MDM; Integration Hub; Farm and Market Pattern for MDM]
    46. 46. Six Patterns for Data Hubs: 1 – Publish/Subscribe • Publisher pushes data to hub • Subscriber pulls data from hub • No data integration • Publisher may not know who the subscribers are • Weak governance: there may not even be SLAs
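The push and pull behaviour of this pattern can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions: the `PublishSubscribeHub` class, its method names and the topic and subscriber names are hypothetical, and a real hub would add persistence, security and (ideally) SLAs.

```python
from collections import defaultdict


class PublishSubscribeHub:
    """Toy hub: publishers push, subscribers pull, and the hub integrates nothing."""

    def __init__(self):
        self._topics = defaultdict(list)  # topic -> payloads in arrival order
        self._cursors = defaultdict(int)  # (subscriber, topic) -> index of next unread payload

    def publish(self, topic, payload):
        # The publisher never sees the subscriber list; it only pushes to a topic.
        self._topics[topic].append(payload)

    def pull(self, subscriber, topic):
        # Each subscriber pulls whatever arrived since its own last pull.
        items = self._topics[topic]
        start = self._cursors[(subscriber, topic)]
        self._cursors[(subscriber, topic)] = len(items)
        return items[start:]
```

Payloads are stored and handed back exactly as published, which mirrors the "no data integration" point: any reconciliation between publishers is left to the subscribers.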
    47. 47. Six Patterns for Data Hubs: 2 – ODS for Integrated Reporting • Supports principle that transaction applications will not do operational reporting (adverse effect on performance) • Some form of integration • Anti-pattern of real-time data warehousing • History (changed data capture) not considered
    48. 48. Six Patterns for Data Hubs: 3 – ODS for Data Warehouses • No reporting in this kind of ODS • Probably evolved from ODS for reporting • Strong integration • Just to feed warehouses • At odds with Data Warehouse that has own staging, integration
    49. 49. Six Patterns for Data Hubs: 4 – Traditional MDM Hub • Master Data only • Integration does happen (typically just Trust & Survivorship) • Some content management needed, typically to correct data quality problems • DQ functionality is typically an afterthought • Distributes “Golden Copy” Master Data to enterprise
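A common reading of "Trust & Survivorship" is: rank the source systems by trust, then let each attribute of the golden copy survive from the most trusted source that actually has a value. The sketch below assumes that reading; the trust scores, field names and the `golden_record` helper are hypothetical, and commercial MDM hubs typically offer far richer rules.

```python
# Hypothetical trust scores per source system (higher means more trusted).
TRUST = {"crm": 3, "billing": 2, "web_signup": 1}


def golden_record(records, trust=TRUST):
    """Build a golden copy from records carrying 'source', 'updated' (ISO date) and attributes."""
    # Most trusted source first; ISO dates sort as text, so the newer record wins ties.
    ranked = sorted(
        records,
        key=lambda r: (trust.get(r["source"], 0), r["updated"]),
        reverse=True,
    )
    golden = {}
    for record in ranked:
        for field, value in record.items():
            if field in ("source", "updated"):
                continue
            # Survivorship: the first non-empty value in trust order is kept.
            if value not in (None, "") and field not in golden:
                golden[field] = value
    return golden
```

Note that this only chooses between existing values; it does not repair them, which is consistent with the slide's remark that data quality functionality tends to be an afterthought in this pattern.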
    50. 50. Six Patterns for Data Hubs: 5 – Message Hub • Messages – real-time or near-real-time - not batch data • Message switch • Command and control for message orchestration • Not just “listening” – does routing • Messages also stored in database
    51. 51. Six Patterns for Data Hubs: 6 – Integration Hub • Like message hub, but for batch data • Event data equivalent of MDM hub • Just one place where integration done • Both warehouse and transaction applications need integrated data • Removes need for each application to do integration by itself
    52. 52. Data Architecture, Governance, and The Business
    53. 53. Another View of EA. Many years ago, “the business” knew the business; today, it does so much less • Long ago, people had to know the business rules concerning what they did • These are now in applications, and staff are more concerned with knowing how to make the applications work • Data is an asset, but staff tend to know little about the data • Loss of business knowledge
    54. 54. Who Knows The Business? • The business seems to be more fragmented than long ago (though that is difficult to prove) • Managers who know the business know less of it and are so time-constrained they cannot help much on aligning projects to enterprise strategy
    55. 55. Populate EA with Business Experts. IT Tendency: Populate EA with Architects. A Different EA Vision: Architects + Business Experts • Business experts provide better relations with the rest of the business • Architects can stay oriented purely to architecture • Immediate business credibility for EA • They get poached
    56. 56. Questions and Answers