
Streamline Data Governance with Egeria: The Industry's First Open Metadata Standard



Learn about the industry's new open metadata standard Egeria, introduced in September by ODPi, The Linux Foundation’s Open Data Platform initiative. Egeria supports the free flow of standardized metadata between different technologies and vendor platforms, enabling organizations to locate, manage and use their data resources more effectively. Explore how Egeria's set of open APIs, types and interchange protocols allows all metadata repositories to share and exchange metadata. From this common base, it adds governance, discovery and access frameworks for automating the collection, management and use of metadata across an enterprise. The result is an enterprise catalog of data resources that are transparently assessed, governed and used in order to deliver maximum value to the enterprise.

This presentation by ODPi Director John Mertic provides an introduction to Egeria, and explores how the standard provides a vendor-neutral approach to data governance. Learn how a group of companies led by ING, IBM and Hortonworks came together through the open source community to re-imagine data governance and delivered Egeria to automate the collection, management and use of metadata across organizations of any size and complexity. Learn how Egeria was built on open standards and is delivered under the Apache 2.0 open source license.

Published in: Technology

Streamline Data Governance with Egeria: The Industry's First Open Metadata Standard

  2. Real Data Landscapes
     • Many enterprises have had 40 years of innovation embedded into their IT systems.
     • Data science projects spend a lot of time finding and validating data.
     • There is often a delay moving real-time analytics from POC into production.
  3. Building governance maturity is a gradual process
     • Organizations may operate at different levels of maturity in different parts of their business.
     • Choices are determined by where the most value lies.
     • Many organizations aspire to provide all employees with the data they need (data citizenship*).
  4. Governance program components
  5. Observations from the maturity model
     • The number of bespoke integrations needed between tools to exchange metadata, so that it is consistently available to everyone who needs it, grows steadily with each step up in maturity.
     • With little to no standardization between vendors, the cost and time delay are borne by the organization.
  6. Using Egeria …
     • Eases the cost of metadata integration through:
       • Comprehensive standards and libraries.
       • Active vendor recruitment program.
     • Provides direct support to many governance roles, filling the gaps between the functions offered through commercial tools.
  7. Example of a simple cohort (diagram labels: Cohort A; Chief Data Office; Data Lake; Systems of Record)
  8. Connecting to multiple cohorts (diagram labels: Cohort B; Cohort A; Chief Data Office; Data Lake; Systems of Record; Mobile Apps; Data Lake; Systems of Record; Marketing)
  9. Importance of the Graph Model (diagram labels: Database Column; Glossary Term; Server 1; Server 2; Entity; Entity)
  10. Importance of the Graph Model (diagram labels: Database Column; Glossary Term; Glossary Term; Meaning; Server 1; Server 2; Reference Copy; Relationship)
  11. Importance of the Graph Model (diagram labels: Database Column; Glossary Term; Server 1; Server 3; Server 2; Database Column; Glossary Term; Meaning)
  12. Importance of the Graph Model – Using Entity Proxies (diagram labels: Database Column; Glossary Term; Server 1; Server 3; Server 2; Meaning; Database Column; Glossary Term; Entity Proxy)
  13. Metadata and governance digital platform (diagram labels: Open Metadata and Governance; Reporting Platform; ETL Platform; Analytics Platform; Virtualization Platform; Governance Platform; Data Platform)
  14. Design philosophy (diagram labels: Search; Open Metadata Access Services; Open Metadata Repository Services; Use cases, Personas, Practitioners input; Data integration, availability and integrity best practices)
  15. Coco Pharmaceuticals persona
     • Jules Keeper, CDO
     • Tessa Tube, Chief Researcher
     • Erin Overview, Information Architect
     • Faith Broker, Chief Privacy Officer
     • Bob Nitter, Integration Developer
     • Callie Quartile, Data Scientist
     • Nancy Noah, Cloud Specialist
     • Gary Geeke, IT Infrastructure
  16. Open metadata type model summary (diagram labels: Glossary; Collaboration; Governance; Models and Reference Data; Metadata Discovery; Lineage; Data Assets; Base Types, Systems and Infrastructure)
  17. Each area caters for appropriate metadata structures (diagram with areas numbered 0 to 7; labels as extracted)
     • Policy Metadata (Principles, Regulations, Standards, Approaches, Rule Specifications, Roles and Metrics)
     • Governance Actions and Processes
     • Augmentation; Mapping; Implementation
     • Business Objects and Relationships, Taxonomies and Ontologies
     • Business Attributes
     • Organization
     • Teaming Metadata (people profiles, communities, projects, notebooks, …)
     • Models and Schemas
     • Physical Asset Descriptions (Data stores, APIs, models and components)
     • Asset Collections (Sets, Typed Sets, Type Organized Sets)
     • Information Views
     • Rights Management
     • Reference Data
     • Feedback Metadata (tags, comments, ratings, …)
     • Classification Schemes
     • Classification Strategy
     • Subject Area Definition
     • Campaigns and Projects
     • Rollout
     • Discovery Metadata (profile data, technical classification, data classification, data quality assessment, …)
     • Augmentation; Instrument; Association
     • Information Process Instrumentation (design lineage)
     • Connectors
     • Basic Types, Infrastructure and Systems
     • Access
  18. Current Open Metadata Access Services (OMASs)
     • Project Management
     • Community Profile
     • Asset Catalog
     • Stewardship Action
     • Information View
     • Governance Program
     • Data Process
     • Subject Area
     • Connected Asset
     • Discovery Engine
     • Governance Engine
     • Data Protection
     • Software Developer
     • Data Platform
     • Asset Owner
     • Digital Architecture
     • Data Science
     • DevOps
     • Asset Consumer
     • Data Infrastructure
     • Data Privacy
     • Asset Lineage
  19. Automating governance example (diagram labels: IBM Information Governance Catalog; Apache Atlas; Apache Ranger; Gaian; Define Policies; Hadoop Metadata; Manage Data Access; Egeria (Open metadata exchange and federated queries); Access Data; Egeria Open Governance APIs; configure; configure)
  20. ODPi – A neutral home for collaboration
  21. Egeria Conformance Program - it's an “imitation game”
  22. Realizing open metadata and governance
     • Delivering core technology
     • Recruiting vendors
     • Assisting practitioners
     (diagram labels: Vendors; Practitioners; Core Technology; Compliance Suite; Best Practices; Project Egeria; Project Data Governance)
  23. Help wanted
     • Governance practice leaders needed to build out best practices.
     • If you buy data technology, please encourage your vendors to consume the Egeria technology.
     • Looking for developers:
       • UI development
       • Graph repository (e.g. JanusGraph/TinkerPop)
       • Python clients
     • Join the ODPi to help fund our work.
     • Tell everyone about what we do.
  24. Questions? Open forum
  25. Links
     • Press Release and Podcast
     • Open source repositories
     • sharing-exchange-and-governance-of-metadata/
     • masterclass-with-mandy-chessell-part-1/
     • masterclass-with-mandy-chessell-part-2/
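The graph-model slides (9 to 12) only survive here as diagram labels, so the following is a rough Python sketch of one way to read them, not Egeria's actual API. It assumes that each entity has a single home server, and that a server storing a relationship to an entity it does not hold keeps a lightweight entity proxy (just enough to identify the entity) instead of a full copy. The class names, the `guid` field, and the placement of the Meaning relationship on Server 3 are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A full metadata entity, owned by exactly one home server (assumption)."""
    guid: str
    type_name: str
    home_server: str
    properties: tuple = ()


@dataclass(frozen=True)
class EntityProxy:
    """A stub that identifies an entity homed elsewhere, without its properties."""
    guid: str
    type_name: str
    home_server: str


@dataclass(frozen=True)
class Relationship:
    """A typed link between two entities, stored on one server."""
    type_name: str
    end1_guid: str
    end2_guid: str
    home_server: str


class Server:
    """Hypothetical metadata server holding entities, proxies and relationships."""

    def __init__(self, name):
        self.name = name
        self.entities = {}
        self.proxies = {}
        self.relationships = []

    def add_entity(self, guid, type_name, **props):
        entity = Entity(guid, type_name, self.name, tuple(sorted(props.items())))
        self.entities[guid] = entity
        return entity

    def link(self, type_name, end1, end2):
        # Any end that is not stored locally is represented by an entity proxy.
        for end in (end1, end2):
            if end.guid not in self.entities:
                self.proxies[end.guid] = EntityProxy(
                    end.guid, end.type_name, end.home_server
                )
        relationship = Relationship(type_name, end1.guid, end2.guid, self.name)
        self.relationships.append(relationship)
        return relationship


# Mirrors the labels on slide 12: the column lives on Server 1, the term on
# Server 2, and Server 3 stores only the Meaning relationship plus two proxies.
server1 = Server("Server 1")
server2 = Server("Server 2")
server3 = Server("Server 3")
column = server1.add_entity("guid-column", "Database Column", name="CUST_ID")
term = server2.add_entity("guid-term", "Glossary Term", name="Customer Identifier")
meaning = server3.link("Meaning", column, term)
```

In this sketch Server 3 can answer "what does this column mean?" without owning either entity; a caller would follow each proxy's `home_server` to fetch the full details.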