All Together Now: A Recipe for Successful Data Governance


The Briefing Room with David Loshin and Phasic Systems

Slides from the Live Webcast on July 10, 2012

Getting disparate groups of professionals to agree on business terminology can take forever, especially when big dollars or major issues are at stake. Many data governance programs languish indefinitely because of simple hang-ups. But a new approach has recently achieved monumental results for the United States Navy. The detailed process has since been codified and combined with a NoSQL technology that enables even the most complex data models and definitions to be distilled into simple, functional data flows.

Check out this episode of The Briefing Room to hear Analyst David Loshin of Knowledge Integrity explain why effective Data Governance requires cooperation. Loshin will be briefed by Geoffrey Malafsky of Phasic Systems who will tout his company's proprietary protocol for extracting, defining and managing critical information assets and processes. He'll explain how their approach allows everyone to be "correct" in their definitions, without causing data quality or performance issues in associated information systems. And he'll explain how their Corporate NoSQL engine enables real-time harmonization of definitions and dimensions.

Published in: Technology, Education

All Together Now: A Recipe for Successful Data Governance

  1. 1. Eric.kavanagh@bloorgroup.com | Twitter Tag: #briefr | 7/10/12
  2. 2.
      • Reveal the essential characteristics of enterprise software, good and bad
      • Provide a forum for detailed analysis of today's innovative technologies
      • Give vendors a chance to explain their product to savvy analysts
      • Allow audience members to pose serious questions... and get answers!
  3. 3.
      • July: Disruption
      • August: Analytics
      • September: Integration
      • October: Database
      • November: Cloud
      • December: Innovators
  4. 4.
      • Disruptive innovation produces an unexpected new market and value network, and is usually geared toward a new set of customers.
      • The consumer technology market teems with such game-changers: mp3 players, iPhones/iPads, portable storage devices, digital media, etc.
      • While disruptive technologies often take time to gain a foothold in the market, they can have a serious impact on industry incumbents, who can be slow to innovate.
  5. 5. David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author on business intelligence best practices and has written numerous books and papers on data management, including the just-published “Practitioner’s Guide to Data Quality Improvement.” David is a frequent invited speaker at conferences, web seminars, and sponsored web sites and channels. His best-selling book, “Master Data Management,” has been endorsed by data management industry leaders. David can be reached at (301) 754-6350.
  6. 6.
      • Focuses on agility and flexibility for data governance and standards
      • Offers a core technology suite, DataStar, that delivers data modeling, integration, aggregation and automation
      • Developed a NoSQL alternative for data consolidation
  7. 7. Dr. Geoffrey Malafsky earned a Ph.D. in Nanotechnology from Pennsylvania State University. He was a research scientist at the Naval Research Laboratory before becoming a technology consultant in advanced system capabilities for numerous Government agencies and corporate clients. He has over thirty years of experience and is an expert in multiple fields including Nanotechnology, Knowledge Discovery and Dissemination, and Information Engineering. He founded and operated the technology consulting company TECHi2 prior to founding Phasic Systems Inc., where he is the CEO and CTO.
  8. 8. Bringing Agility and Flexibility to Data Design and Integration
      Phasic Systems Inc
      Delivering Agile
  9. 9. Introduction to Phasic Systems Inc
      • Bringing Agile capabilities to data lifecycle for business success
      • Methods and tools tested and refined over years of in-depth large-scale efforts
      • Solve toughest data problems where traditional methods fail
      • Based on extensive consulting lessons learned and real-world results
      • Began in 2005 to commercialize advanced Agile methods successfully deployed in competitive development contracts
  10. 10. Phasic Systems Inc Management
      • Geoffrey Malafsky, Ph.D, Founder and CEO
        ▫ Research scientist
        ▫ Supported many organizations in their quest to access the right information at the right time
      • Tim Traverso, Sr VP Federal
        ▫ Technical Director, Navy Deputy CIO
      • Marshall Maglothin, Sr VP HealthCare
        ▫ Sr. Executive multiple large health care systems
      • Deborah Malafsky, Sr VP Business Development
  11. 11. Our Agile Methods
      • Why be Agile?
        ▫ Provide flexibility and adaptability to changing business needs while maintaining accuracy and commonality
        ▫ Segmented approach is too slow, rigid, and costly
      • How?
        ▫ Treat data lifecycle as one continuous operation from governance to modeling to integration to warehouses to Business Intelligence
        ▫ Emphasize value produced at each step and overall coordination
        ▫ Seamlessly fit with existing organization, procedures, tools but add Agility, commonality, flexibility, and reduced cost and time
      • We are Agile and comprehensive
        ▫ Typical 60-90 day engagement
        ▫ Deliver completed products not just plans or partial results
  12. 12. Methods and Tools
      • DataStar Discovery: Agile data governance, standards and design
        ▫ Add business and security context to data
        ▫ Flexible, common data definitions/semantics, models
      • DataStar Unifier: Agile warehousing and aggregation
        ▫ Simplified, common semantics using Corporate NoSQL™
        ▫ Source to target mapping with flexibility, standardization
        ▫ Aggregate data using all use case and system variations simply and easily into standard or NoSQL databases
  13. 13. PSI Customer Testimonial
      “As a COO of a Wall Street firm and a former Vice Admiral in the United States Navy in charge of a large integrated organization of thousands of people and numerous IT systems, I have seen firsthand the critical role that high-quality enterprise data plays in day-to-day operations of an organization. Without timely access to reliable and trusted data all of our operations were vulnerable to poor decision making, weak performance, and a failure to compete. With Phasic Systems Inc.’s agile methodology and technology, we were finally able to solve our data challenges at a fraction of the time, cost, and organizational turmoil that all the previous and more expensive, time-consuming approaches failed to do. Phasic Systems Inc. offers a new and much-needed approach to this important area of Business Intelligence.”
      VADM (ret) J. “Kevin” Moran
  14. 14. The Business Case
      Today’s Response Timeline (15 to 27 Months)
      • Business Groups (3 to 6 Months): Requirements; Conceptual/Logical Models; Data Quality; Business Rules; Standards
      • IT Groups (6 to 9 Months): Develop Systems & Applications; Physical Data Models; Databases / Data Warehouse; ETL controls; MDM
      • BI Groups (3 to 6 Months): BI Data Models; Reports; Dashboards
      • Users (3 to 6 Months): Capability Problems; New Capabilities; Missing Data
      Tomorrow’s Initial Response Timeline with PSI: 2 to 6 Months (Subsequent Response Timeline: Days)
      • Requirements; Conceptual Data Model; Logical Data Model; Business Rules; Standards; BI Data Models
      • Develop Systems & Applications; Physical Data Models; Databases / Data Warehouse; ETL controls; MDM; Data Quality
  15. 15. Agile: Overcome Hurdles
      • Group rivalry
        ▫ Embrace important business variations; recognize no valid reason to force everyone to use only one view exclusively.
      • Terminology confusion
        ▫ Use a guided framework of well-known concepts to rapidly identify and implement variations as related entities.
      • Poor knowledge sharing
        ▫ Use integrated metadata where important products (business models, data models, glossaries, code lists, and integration rules) are visible, coordinated, and referenceable
      • Inflexible designs
        ▫ Use a hybrid approach (Corporate NoSQL™) for Agile warehousing and integration, blending traditional tables and NoSQL for its immense flexibility and inherent speed
  16. 16. Schema Are Not Enough
      [Diagram labels: Governance, Integration, Design, MDM; CEO/CFO/CIO; SAP/IBM/ORACLE; Sales, Accounting. Source: D. Loshin 2008]
      Which value? Whose? My “customer” or your “customer”? How is data used?
      • Must be agile in order to adapt quickly to new business needs
        ▫ Continuous change is the norm: requirements, consolidation
        ▫ We must use all the important business variations of key terms (e.g. account, client, policy). No such thing as single version for all!
  17. 17. Status Quo: Non-Agile | Agile: Visible, Common
  18. 18. Unified Business Model™: Intuitive, List-based
  19. 19. Real Estate Listing Example
      • Seems simple and well-defined
        ▫ Each house has a type, id, address, etc.
        ▫ Industry standards: OSCRE, RETS
      • Yet, data systems are very different
        ▫ Data model tied tightly to business workflow
        ▫ Extensions and “make-it-work” changes added over time
      • Similar to customer relationship mgmt, ERP, and many other fields
  20. 20. Semantic Conflict in Real Estate Models (NKY vs. HOMESEEKERS)
      • The NKY attribute ‘basement’ does not have a counterpart in HOMESEEKERS
  21. 21. Data Value Semantic Errors = Inconsistent, Difficult to Merge, Report, Analyze
      • Lot_dimensions: implied semantics for size data. Actually has all sorts of data
      • Semiannual_taxes: implied semantics for numeric data. Actually has all sorts of data
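One way to see the "all sorts of data" problem in a column such as Lot_dimensions is to profile its values against a few expected shapes. The following is a minimal sketch only: the patterns and the sample values are hypothetical and are not taken from the NKY or HomeSeekers data sets.

```python
import re
from collections import Counter

# Hypothetical patterns for what a "lot dimensions" value might look like.
PATTERNS = [
    ("width_x_depth", re.compile(r"^\s*\d+(\.\d+)?\s*[xX]\s*\d+(\.\d+)?\s*$")),
    ("acres", re.compile(r"^\s*\d+(\.\d+)?\s*(ac|acre|acres)\s*$", re.IGNORECASE)),
    ("plain_number", re.compile(r"^\s*\d+(\.\d+)?\s*$")),
]

def classify(value):
    """Return the name of the first pattern the value matches, else 'other'."""
    if value is None or not value.strip():
        return "empty"
    for name, pattern in PATTERNS:
        if pattern.match(value):
            return name
    return "other"

def profile(values):
    """Count how many values fall into each category."""
    return Counter(classify(v) for v in values)

# Illustrative values only.
sample = ["50 x 120", "0.5 acres", "irregular", "", "7500", "see plat"]
print(profile(sample))
```

A large "other" bucket is a hint that the column carries semantics beyond what its name implies.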
  22. 22. NKY | HomeSeekers | Texas
  24. 24. Fully Integrated Metadata for Business, IT, and BI
  27. 27. DataStar Corporate NoSQL™
      • Large systems use NoSQL for its flexibility, performance, and adaptability
        ▫ But, it is poorly suited for corporate use; it lacks connection to business
      • DataStar Corporate NoSQL™ (Speed & Agility)
        ▫ Blends traditional techniques and NoSQL
        ▫ Entities come directly from Unified Business Model
        ▫ Object structure with simple tables
        ▫ Key-value pairs are basic repeating structure of all tables
        ▫ Business driven terminology
        ▫ Easily handles semantic variations & updates w/o changes to logical or physical models
        ▫ Can be as ‘dimensional’ or ‘normalized’ as desired
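The idea that key-value pairs can absorb semantic variations without a model change can be illustrated with a small sketch. This is not DataStar's actual design: the `KeyValueStore` class, the `VOCABULARY` mapping, and the sample keys are all hypothetical.

```python
from collections import defaultdict

# Hypothetical vocabulary: source-specific keys mapped to a common business term.
VOCABULARY = {
    "lot_dimensions": "lot_size",
    "lotsize": "lot_size",
    "basement": "basement",
}

class KeyValueStore:
    """Stores each entity as a set of (key, value) rows rather than fixed columns."""

    def __init__(self):
        self._rows = defaultdict(dict)  # entity_id -> {common_key: value}

    def put(self, entity_id, key, value):
        # Unknown keys are kept as-is, so a new variation needs no schema change.
        common_key = VOCABULARY.get(key.lower(), key.lower())
        self._rows[entity_id][common_key] = value

    def get(self, entity_id):
        return dict(self._rows[entity_id])

store = KeyValueStore()
store.put("listing-1", "Lot_dimensions", "50 x 120")
store.put("listing-2", "LotSize", "0.5 acres")
store.put("listing-2", "garage_spaces", 2)  # new attribute, no table change
print(store.get("listing-2"))
```

In this sketch, adding a new source variation means adding a vocabulary entry rather than altering a table definition.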
  28. 28. Position Data Model
  29. 29. Results
      • Applied to production data:
        ▫ Fully cleaned and integrated data, governance approved
          - Requirement: 500,000 records in 2 hrs on Sun E25K
          - Actual: 50 minutes on 3-year-old low-cost server
      • Governance documents produced and approved
        ▫ Legacy data models: first time in ten years
        ▫ Common data model: directly derived from ontology. Position-Resume model
      • Standing governance board created with short monthly decision-making meetings
        ▫ Position-Resume Governance Board
      • Process approach and technology applied to new IT systems
  30. 30. Navy HR Data Analysis
      • Groups “share” data and control only if they don’t lose project control or funds
      • Governance, business process, and data engineers create separate designs and don’t know how to coordinate
      • Try hard to follow industry guidance but get stuck
      • Actual data is very different from policy and from management’s awareness of it
        ▫ Example 1: Multiple Rate/Rating entries. Person xxxxxx has 5 entries: 4 end on the same date, 2 have start dates after their end dates, 2 start and end on the same days but are different
        ▫ Example 2: 30 different values used for RACE but only 6 allowed values in the Navy Military Personnel Manual derived from DoD policy
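Both examples above are the kind of problem simple data rules can surface. The sketch below shows one possible shape for such checks. The record layout is assumed, and the allowed RACE codes are placeholders, since the slides do not list the six permitted values.

```python
from datetime import date

# Placeholder codes; the actual six allowed values are not listed in the slides.
ALLOWED_RACE_CODES = {"A", "B", "C", "D", "E", "F"}

def check_date_order(entries):
    """Return entries whose start date falls after their end date."""
    return [e for e in entries if e["start"] > e["end"]]

def check_value_domain(values, allowed):
    """Return the distinct values that are not in the allowed domain."""
    return sorted(set(values) - set(allowed))

# Illustrative records only.
entries = [
    {"rating": "R1", "start": date(2005, 1, 1), "end": date(2006, 1, 1)},
    {"rating": "R2", "start": date(2007, 6, 1), "end": date(2006, 1, 1)},
]
print(check_date_order(entries))
print(check_value_domain(["A", "B", "Z", "unk", "A"], ALLOWED_RACE_CODES))
```

Running checks like these against actual data, rather than relying on policy documents, is what exposes the gap the slide describes.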
  31. 31. Agile Warehousing and BI
  32. 32. Agile Warehousing and BI
  33. 33. Resume Data Model
  34. 34. Key-Value Vocabulary: Resume Identifiers
  35. 35. Key-Value Vocabulary: Competency KSAs
  37. 37. Agility and Collaboration for Data Governance. David Loshin, Knowledge Integrity, Inc. © 2012 Knowledge Integrity, Inc. (301) 754-6350
  38. 38. Business Metadata Interdependencies [diagram labels: Context, Concept, Process, Business Policy]
  39. 39. Objective: Translate Business Policies into Data Rules
      [Diagram labels: Business Goals, Business Policy, Information Policy, Business Rules, Data Rules, Metadata]
      • Operational governance integrates monitoring conformance to data rules
  46. 46. Motivation: Complexity in Data Meanings & Semantics
      • What is a customer?
        ▫ Sales: Someone who pays for our products or services
        ▫ Support: Someone who has a license for use of our product
      • These are potentially conflicting definitions of “customer”
      • Representations and underlying meanings from different business functions may differ
      [Diagram labels: Customer Service, Finance, Human Resources, Sales, Legal, Marketing, Compliance]
  47. 47. Build from the Bottom Up (layers, top to bottom)
      • Data Governance: Information Usage, Information Quality, Data Quality SLAs, Access Control
      • Information Architecture: Entity Models, Relational Tables, Domain Directory
      • Data Elements: Critical Data Elements, Data Element Definitions, Data Formats, Aliases/Synonyms
      • Reference Metadata: Conceptual Domains, Value Domains, Reference Tables, Mappings
      • Business Definitions: Business Terms, Concepts, Definitions, Semantics
  48. 48. Business Terms
      • Within different contexts, business terms may be used with a specific definition to refer to:
        ▫ An action
        ▫ An entity
        ▫ A characteristic
      • A business term may be used multiple times with different definitions
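A glossary that allows one term to carry several definitions can key each definition by both term and context. The sketch below is one assumed structure, not a description of any particular tool. The two "customer" definitions come from the earlier motivation slide, while the `Glossary` class is hypothetical.

```python
from collections import defaultdict

class Glossary:
    """Keeps one definition per (term, context) pair instead of one per term."""

    def __init__(self):
        self._definitions = defaultdict(dict)  # term -> {context: definition}

    def define(self, term, context, definition):
        self._definitions[term.lower()][context] = definition

    def lookup(self, term, context):
        return self._definitions.get(term.lower(), {}).get(context)

    def contexts(self, term):
        """List every context in which the term has its own definition."""
        return sorted(self._definitions.get(term.lower(), {}))

glossary = Glossary()
glossary.define("customer", "Sales", "Someone who pays for our products or services")
glossary.define("customer", "Support", "Someone who has a license for use of our product")
print(glossary.contexts("Customer"))
```

A term that returns more than one context is a candidate for harmonization review: the definitions may be consolidated, or kept distinct and explicitly differentiated.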
  49. 49. Example: Identifying Business Terms
      • Order Confirmation: If you do not receive a confirmation number (in the form of a confirmation page or email) after submitting payment information, or if you experience an error message or service interruption after submitting payment information, it is your responsibility to confirm with FizzDizzle Customer Service whether or not your order has been placed.
  50. 50. Example: Identifying Business Terms (nouns in the Order Confirmation text)
      • You
      • Confirmation number
      • Confirmation page
      • Confirmation email
      • Payment information
      • Error message
      • Service interruption
      • FizzDizzle Customer Service
      • Order
  52. 52. Example: Identifying Business Terms (verbs in the Order Confirmation text)
      • Receive
      • Submitting
      • Experience
      • Confirm
      • Placed
  53. 53. Bring it All Together: The Chain of Definition
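Once candidate terms have been read out of a policy, a quick way to see how heavily each is used is to count its occurrences in the text. The following is a minimal sketch. The policy text and candidate nouns are from the slides, while the `count_terms` helper is hypothetical and uses plain phrase matching rather than any linguistic analysis.

```python
import re

POLICY_TEXT = (
    "If you do not receive a confirmation number (in the form of a "
    "confirmation page or email) after submitting payment information, or if "
    "you experience an error message or service interruption after submitting "
    "payment information, it is your responsibility to confirm with FizzDizzle "
    "Customer Service whether or not your order has been placed."
)

# Candidate terms taken from the noun list on the slide.
CANDIDATE_TERMS = [
    "confirmation number", "payment information", "error message",
    "service interruption", "FizzDizzle Customer Service", "order",
]

def count_terms(text, terms):
    """Count whole-phrase, case-insensitive occurrences of each candidate term."""
    counts = {}
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

print(count_terms(POLICY_TEXT, CANDIDATE_TERMS))
```

Literal matching misses implied terms such as "confirmation email", which the policy only expresses as "confirmation page or email", so a human pass over the text is still needed.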
  54. 54. Harmonization
      Element     Data Type        Element     Data Type
      FirstName   VARCHAR(35)      First       VARCHAR(25)
      LastName    VARCHAR(40)      Middle      VARCHAR(25)
      SSN         CHAR(11)         Last        VARCHAR(30)
      Telephone   VARCHAR(20)      SocialSec   CHAR(9)
      • Use Chain of Definition to determine when:
        ▫ Similarly-named data elements refer to the same data element concept
        ▫ Same-named data elements refer to different data element concepts
      • Consolidating when possible and differentiating when necessary
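The comparison on this slide can be sketched by assigning each element to a data element concept and then flagging concepts whose instances disagree on data type. The element names and types below are from the slide. The system labels "A" and "B" and the concept assignments in `CONCEPT_OF` are illustrative assumptions.

```python
from collections import defaultdict

# (system, element, data type) as shown on the slide; the system labels
# "A"/"B" and the concept assignments below are illustrative assumptions.
ELEMENTS = [
    ("A", "FirstName", "VARCHAR(35)"), ("A", "LastName", "VARCHAR(40)"),
    ("A", "SSN", "CHAR(11)"), ("A", "Telephone", "VARCHAR(20)"),
    ("B", "First", "VARCHAR(25)"), ("B", "Middle", "VARCHAR(25)"),
    ("B", "Last", "VARCHAR(30)"), ("B", "SocialSec", "CHAR(9)"),
]
CONCEPT_OF = {
    "FirstName": "first_name", "First": "first_name",
    "LastName": "last_name", "Last": "last_name",
    "SSN": "social_security_number", "SocialSec": "social_security_number",
    "Middle": "middle_name", "Telephone": "telephone_number",
}

def group_by_concept(elements):
    """Group element instances under the concept each one is assumed to represent."""
    groups = defaultdict(list)
    for system, name, data_type in elements:
        groups[CONCEPT_OF[name]].append((system, name, data_type))
    return groups

def type_conflicts(groups):
    """Concepts represented by more than one distinct data type."""
    return sorted(c for c, members in groups.items()
                  if len({t for _, _, t in members}) > 1)

print(type_conflicts(group_by_concept(ELEMENTS)))
```

Concepts with a single instance (here, middle name and telephone) are left alone. The flagged ones are where a consolidate-or-differentiate decision is needed.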
  55. 55. Impact Assessment
      Element     Data Type        Element     Data Type
      FirstName   VARCHAR(35)      First       VARCHAR(25)
      LastName    VARCHAR(40)      Middle      VARCHAR(25)
      SSN         CHAR(11)         Last        VARCHAR(30)
      Telephone   VARCHAR(20)      SocialSec   CHAR(9)
      • Use chain of definition model to identify the instances that are impacted as a result of harmonization
  56. 56. Questions and Open Discussion
      • www.dataqualitybook.com
      • If you have questions, comments, or suggestions, please contact me: David Loshin, 301-754-6350
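As a rough illustration of impact assessment, suppose each concept were harmonized to the widest declared length among its instances. That policy is an assumption made for this sketch, not something the slides prescribe. The instances that would have to change are then the narrower ones. The system labels are also hypothetical.

```python
import re

# Element instances for one data element concept, as (system, element, type).
# The system labels are illustrative assumptions.
FIRST_NAME_INSTANCES = [("A", "FirstName", "VARCHAR(35)"), ("B", "First", "VARCHAR(25)")]
SSN_INSTANCES = [("A", "SSN", "CHAR(11)"), ("B", "SocialSec", "CHAR(9)")]

def length_of(data_type):
    """Extract the declared length from a type such as 'VARCHAR(35)'."""
    match = re.fullmatch(r"[A-Z]+\((\d+)\)", data_type)
    if match is None:
        raise ValueError(f"unsupported type: {data_type}")
    return int(match.group(1))

def impacted_instances(instances):
    """Instances narrower than the widest one, i.e. those that would change
    if the concept were harmonized to the widest declared length."""
    target = max(length_of(t) for _, _, t in instances)
    return [(s, name) for s, name, t in instances if length_of(t) < target]

print(impacted_instances(FIRST_NAME_INSTANCES))
print(impacted_instances(SSN_INSTANCES))
```

Length is only part of the picture. A CHAR(11) versus CHAR(9) difference for a Social Security number likely reflects a formatting difference (with or without separators), so the values themselves would need transformation, not just the column definition.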
  57. 57.
      • One of the common themes in the material you provided is the need for collaboration as part of the lifecycle management for the creation of a unified business model. To what extent is this collaboration driven by the software and how much requires processes designed around the software?
      • What is your approach for transferring the knowledge for identifying semantic conflicts and resolving them within the organization?
      • A lot of the slides suggest that the intent of the use of the technology is for developing data warehouse or business intelligence models. Is the use limited to consuming data from existing systems, or can it be used for reengineering operational or transaction systems, and if so how, and if not, why?
  58. 58.
      • One of the barriers to value for existing metadata and governance tools is the need for ongoing maintenance of the content. How can the product be used to facilitate ongoing management and assurance of consistency of business terminology?
      • Presuming that I am now a data consumer (say a business analyst) within the organization, how would I use this technology to clarify the definitions and lineage of business terms presented to me in a BI report?
  59. 59.
      • What is your approach for capturing the semantics of implicit business concepts? In your real estate example, one of the columns for lot dimensions had implied semantics for size data, with an implication of measurement systems, units of measure, and even “topography” of the lot size. This implies the use of business concepts that are not explicit (acreage vs. square footage, transformations across frames of reference, qualification of lot shape, presentation of dimensionality). How does the tool capture implicit semantic information?
      • Going back to collaboration: What types of interactive notifications are integrated into your environment to apprise individuals of changes to business terms, data element concepts, data elements, value domains, etc.?