Apache Atlas & Open Metadata
Dataworks Sydney 2017
Nigel Jones,
Software Architect
IBM
Ferd Scheepers
Chief Information Architect
ING
Open Metadata and Governance will allow…
… metadata to be captured when the data is created, moved with the data and
be augmented and processed by any of the vendor tools.
Open Metadata and Governance consists of:
1. Standardized, extensible set of metadata types
2. Metadata exchange APIs and notifications
3. Frameworks for automated governance
Open Metadata and Governance will allow you to have:
1. An enterprise data catalogue that lists all of your data, where it is located, its origin (lineage),
owner, structure, meaning, classification and quality
2. New data tools (from any vendor) connect to your data catalogue out of the box
3. Metadata being added automatically to the catalogue as new data is created and analysed
4. Subject matter experts collaborating around the data
5. Automated governance processes protect and manage your data
What is Open Metadata and Governance?
Positioning of Apache Atlas for Open Metadata
Open and
Unified Metadata
Metadata
repository
Apache Atlas
Metadata
repository
IBM
Metadata
repository
SAS
Open Metadata Repository Service
OMRS
Open Metadata Access Service
OMAS
Components defined
and being developed
by Open Metadata &
Governance project
Metadata
highway
• Apache Atlas provides an open community for developing the reference implementation
for open metadata and governance. In essence, Apache Atlas delivers two main
capabilities:
• it plays the role of a metadata repository (graph database) for a metadata end-user tool
• it plays the important role of delivering the federated/unified metadata layer
across the entire landscape of an enterprise
• The software development governance from the Apache Software Foundation (ASF)
creates confidence that the technology will be maintained and enhanced as appropriate
in an equitable manner.
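The federated/unified layer described above can be pictured with a small sketch. It is illustrative only: `MetadataRepository`, `InMemoryRepository` and `federated_search` are hypothetical names, not the OMRS API, and the asset names are invented for the example.

```python
from abc import ABC, abstractmethod


class MetadataRepository(ABC):
    """Hypothetical adapter interface; each vendor repository would supply one."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def search(self, text):
        """Return the names of assets whose name contains `text`."""


class InMemoryRepository(MetadataRepository):
    """Stand-in for a real repository (Apache Atlas, IBM, SAS, ...)."""

    def __init__(self, name, assets):
        super().__init__(name)
        self._assets = list(assets)

    def search(self, text):
        needle = text.lower()
        return [a for a in self._assets if needle in a.lower()]


def federated_search(repositories, text):
    """Query every repository and tag each hit with the repository it came from."""
    hits = []
    for repo in repositories:
        for asset in repo.search(text):
            hits.append((repo.name, asset))
    return hits


# Invented sample content for two repositories.
repos = [
    InMemoryRepository("atlas", ["hive.sales.orders", "hive.hr.employee"]),
    InMemoryRepository("vendor-x", ["warehouse.EMPLOYEE", "warehouse.PRODUCT"]),
]
federated_hits = federated_search(repos, "employee")
print(federated_hits)
```

The caller sees one result list and never needs to know which repository holds which asset, which is the point of a unified layer.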
Role of Apache Atlas
… because Apache is mostly focused on development and we are missing a governance
body for managing the adoption of and compliance to the Open Metadata and Governance
standards. We envision the following roles for ODPI:
1. Be an advocate of the Open Metadata and Governance standards, make them visible
and their value understood.
2. Facilitate discussions around the Open Metadata and Governance standards evolution,
maintenance and development.
3. Test and sign-off compliance of vendor offerings to the Open Metadata and Governance
standards.
Doing all of this under the Apache Atlas flag is not enough…
1. Hands-on Community members:
• ING
• IBM
• Hortonworks
2. Companies we have had conversations with:
• CIBC
• SAS
• Microsoft
• Oracle
• Informatica
• Waterline
• RBC
• DBS
Who is in?
1. Ambition level:
• End of September 2017: Open Metadata working demo.
• Mid-December November 2017: first version of user access.
• Google for Data
2. Next steps:
• End of Q2 2018: production ready version of Virtual Data
Connector.
Timeline and next steps
About Me
https://www.linkedin.com/in/nigelljones
https://www.twitter.com/planetf1
jonesn@uk.ibm.com
Objective
Why
How
Excite &
Engage
Apache Atlas
Open
Metadata
Atlas has graduated!
DOB: 2015-05-05
R: 0.8.1
Atlas Architecture
Storage Repository
Graph
 Type System
 REST API
 Models
 UI & Apps
Hooks &
Bridges
https://cwiki.apache.org/confluence/display/ATLAS/Open+Metadata+and+Governance
A reminder of our problem... and solution
Open and
Unified Metadata
Extend beyond Hadoop
++
Common Core Data model
Data Assets Governance Lineage
Glossary Collaboration
Models &
Reference
Data
Base Types,
Systems &
Infrastructure
Metadata
Discovery
https://cwiki.apache.org/confluence/display/ATLAS/Building+out+the+Open+Metadata+Typesystem
Open APIs - OMRS
Metadata Highway
Adapter
Plugin
Open Connector
Framework
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258803
Open APIs - OMAS
OMRS
Governance
Engine
OMAS
Glossary
OMAS
Asset OMAS
Information
View OMAS
++......
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258799
OMAS – detail
Project List
Metadata Service
Data/Asset
Community Metadata
Service
Landscape Definition
Metadata Service
Asset Catalog
Metadata Service
Classification and Mapping
Metadata Service
Information View
Metadata Service
Connector Directory
Metadata Service
Governance Definitions
Metadata Service
Information Process
Metadata Service
Glossary and Taxonomy
Metadata Service
Asset
Metadata Service
Discovery
Metadata Service
Governance Action
Metadata Service
Roles and Access
Metadata Service
Models and Schema
Metadata Service
Connector
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258799
Business
metadata
Structural
metadata for
a data store
New glossary function for semantic processing
EMPNAME EMPNO JOBCODE SALARY
EMPLOYEE
RECORD
Employee
Work Location
Annual Salary
Job Title
Employee Id
Employee Name
Hourly Pay Rate
Manager Compensation Plan
HAS-A
HAS-A
HAS-A
HAS-A
HAS-A
HAS-A
IS-A IS-A
Sensitive
IS-A
Data
00 3809890 6 7 Lemmie Stage 818928 3082 4 New York 4 27 DataStage Expert 1 45324 300 27 Code St Harlem NY 1 3
https://cwiki.apache.org/confluence/display/ATLAS/Area+3+-+Glossary
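The link between structural and business metadata on this slide can be sketched in a few lines. This is a toy model, not the Atlas glossary API: the `Term` class and helper functions are hypothetical, the column-to-term mapping is assumed from the names, and attaching "Sensitive" to "Annual Salary" is an assumption because the flattened diagram does not show which term carries it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """A business glossary term (business metadata)."""
    name: str
    has_a: list[Term] = field(default_factory=list)  # HAS-A: composition
    is_a: list[Term] = field(default_factory=list)   # IS-A: specialisation or classification


def is_classified(term, classification, _seen=None):
    """Return True if `term` reaches `classification` by following IS-A links."""
    seen = _seen if _seen is not None else set()
    if term.name in seen:
        return False  # guard against cycles in the glossary
    seen.add(term.name)
    return any(
        parent is classification or is_classified(parent, classification, seen)
        for parent in term.is_a
    )


sensitive = Term("Sensitive")
employee_name = Term("Employee Name")
employee_id = Term("Employee Id")
job_title = Term("Job Title")
annual_salary = Term("Annual Salary", is_a=[sensitive])  # assumption, see lead-in
employee = Term("Employee", has_a=[employee_name, employee_id, job_title, annual_salary])

# Structural metadata: columns of EMPLOYEE RECORD mapped to glossary terms (assumed mapping).
column_to_term = {
    "EMPNAME": employee_name,
    "EMPNO": employee_id,
    "JOBCODE": job_title,
    "SALARY": annual_salary,
}


def classified_columns(mapping, classification):
    """List the columns whose glossary term carries the given classification."""
    return sorted(c for c, t in mapping.items() if is_classified(t, classification))


sensitive_cols = classified_columns(column_to_term, sensitive)
print(sensitive_cols)
```

Because the classification lives on the glossary term, any column later mapped to that term picks it up without being tagged individually.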
Replacing v1
Taxonomy (tech
preview)
Categories
Terms
hierarchies
Rich
Relationships
Classifications
Glossary
https://cwiki.apache.org/confluence/display/ATLAS/Area+3+-+Glossary
Governance Action Framework
metadata drives enforcement
Classification (tag) based – scalable, glossary
driven
Access, Masking, Filtering
Supports Apache Ranger but open APIs for others
Audit, Rights - Exception management, Rights,
Privacy (to look at in future)
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258801
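As a rough illustration of "metadata drives enforcement", the sketch below applies masking and filtering to a row based only on the classifications attached to its columns. It is a minimal sketch under stated assumptions: `enforce`, the rule names `mask` and `filter`, the `Internal` tag and the sample row are all hypothetical, and this is neither the Governance Action Framework nor the Apache Ranger API.

```python
MASK = "****"


def enforce(row, column_tags, rules):
    """Apply the rule attached to each column's classifications.

    `filter` drops the column, `mask` hides its value, anything else passes through.
    """
    result = {}
    for column, value in row.items():
        actions = {rules[tag] for tag in column_tags.get(column, ()) if tag in rules}
        if "filter" in actions:
            continue  # filtering wins over masking
        result[column] = MASK if "mask" in actions else value
    return result


# Classifications (tags) per column and one rule per classification; all values are invented.
column_tags = {"SALARY": ["Sensitive"], "EMPNO": ["Internal"]}
rules = {"Sensitive": "mask", "Internal": "filter"}
row = {"EMPNAME": "A. Example", "EMPNO": 1001, "JOBCODE": "J7", "SALARY": 50000}

governed = enforce(row, column_tags, rules)
print(governed)
```

The number of rules grows with the number of classifications rather than with the number of columns, which is what makes a tag-based approach scale.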
Open Discovery Framework
Open
Framework
Plugins
characterize data
& relationships
Updates
metadata
with results
Initial implementation
in master
https://cwiki.apache.org/confluence/display/ATLAS/Automated+metadata+discovery
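The plugin idea on this slide (plugins characterize data, then metadata is updated with the results) might look roughly like the following. The `DiscoveryPlugin` contract, the two example plugins and `run_discovery` are hypothetical and are not taken from the Atlas code base.

```python
from typing import Protocol


class DiscoveryPlugin(Protocol):
    """Hypothetical plugin contract: inspect column values, return findings."""
    name: str

    def analyze(self, values: list) -> dict: ...


class NullRatio:
    name = "null_ratio"

    def analyze(self, values):
        if not values:
            return {"null_ratio": 0.0}
        missing = sum(1 for v in values if v in (None, ""))
        return {"null_ratio": missing / len(values)}


class DistinctCount:
    name = "distinct_count"

    def analyze(self, values):
        return {"distinct_count": len(set(values))}


def run_discovery(metadata, values, plugins):
    """Run each plugin and merge its findings into a copy of the metadata entry."""
    updated = dict(metadata)
    for plugin in plugins:
        updated.update(plugin.analyze(values))
    return updated


entry = {"column": "JOBCODE"}
discovered = run_discovery(entry, ["A1", "A1", "", "B2"], [NullRatio(), DistinctCount()])
print(discovered)
```

New analyses are added by writing another class with an `analyze` method; the runner and the metadata entry do not change.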
Open ecosystem
https://cwiki.apache.org/confluence/display/ATLAS/Open+Metadata+and+Governance
Summary
Open
Metadata
Enterprise
Catalog
Discovery
Multi Vendor
Open,
Layered APIs
Metadata
store
integration
Open Source
&
Governance
ubiquitous
Standard
Models
How can I get involved?
Discuss: Mailing List
Document, Explain: Wiki
Report, Design: Jira
Face to face
Code
Vendors!
https://cwiki.apache.org/confluence/display/ATLAS/Getting+Involved
Governance & Security BOF
Thursday 18:00
C4.7
Owen O’Malley
Nigel Jones
Ferd Scheepers
Backup
VDC End to End
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=69407333

Microservice Teams - How the cloud changes the way we work
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
 
Project Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdfProject Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdf
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
 

Apache atlas sydney 2017-v4

  • 1. Apache Atlas & Open Metadata Dataworks Sydney 2017 Nigel Jones, Software Architect IBM Ferd Scheepers Chief Information Architect ING
  • 2. Open Metadata and Governance will allow metadata to be captured when the data is created, moved with the data, and augmented and processed by any of the vendor tools.
  • 3. What is Open Metadata and Governance? Open Metadata and Governance consists of: 1. Standardized, extensible set of metadata types 2. Metadata exchange APIs and notifications 3. Frameworks for automated governance. Open Metadata and Governance will allow you to have: 1. An enterprise data catalogue that lists all of your data, where it is located, its origin (lineage), owner, structure, meaning, classification and quality 2. New data tools (from any vendor) connect to your data catalogue out of the box 3. Metadata being added automatically to the catalogue as new data is created and analysed 4. Subject matter experts collaborating around the data 5. Automated governance processes protect and manage your data
  • 4. Positioning of Apache Atlas for Open Metadata. Diagram labels: Open and Unified Metadata; Metadata repository (Apache Atlas); Metadata repository (IBM); Metadata repository (SAS); Open Metadata Repository Service (OMRS); Open Metadata Access Service (OMAS); Metadata highway. Note on diagram: components defined and being developed by Open Metadata & Governance project.
  • 5. Role of Apache Atlas. • Apache Atlas provides an open community for developing the reference implementation for open metadata and governance. In essence Apache Atlas delivers 2 main capabilities: • it plays the role of a metadata repository (Graph Database) for a metadata end-user tool • and it plays the important role of delivering the federated/unified metadata layer across the entire landscape of an enterprise. • The software development governance from the Apache Software Foundation (ASF) creates confidence that the technology will be maintained and enhanced as appropriate in an equitable manner.
  • 6. Doing all of this under Apache Atlas flag is not enough… because Apache is mostly focused on development and we are missing a governance body for managing the adoption of and compliance with the Open Metadata and Governance standards. We envision the following roles for ODPI: 1. Be an advocate of the Open Metadata and Governance standards, make them visible and their value understood. 2. Facilitate discussions around the evolution, maintenance and development of the Open Metadata and Governance standards. 3. Test and sign off compliance of vendor offerings with the Open Metadata and Governance standards.
  • 7. Who is in? 1. Hands-on community members: ING, IBM, Hortonworks. 2. Companies we have had conversations with: CIBC, SAS, Microsoft, Oracle, Informatica, Waterline, RBC, DBS.
  • 8. Timeline and next steps. 1. Ambition level: • End of September 2017: Open Metadata working demo. • Mid-December November 2017: first version of user access. • Google for Data. 2. Next steps: • End of Q2 2018: production ready version of Virtual Data Connector.
  • 11. Atlas has graduated! DOB: 2015-05-05 R: 0.8.1
  • 12. Atlas Architecture Storage Repository Graph  Type System  REST API  Models  UI & Apps Hooks & Bridges https://cwiki.apache.org/confluence/display/ATLAS/Open+Metadata+and+Governance
  • 13. A reminder of our problem.. And solution Open and Unified Metadata
  • 15. Common Core Data model Data Assets Governance Lineage Glossary Collaboration Models & Reference Data Base Types, Systems & Infrastructure Metadata Discovery https://cwiki.apache.org/confluence/display/ATLAS/Building+out+the+Open+Metadata+Typesystem
  • 17. Open APIs - OMRS Metadata Highway Adapter Plugin Open Connector Framework https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258803
  • 18. Open APIs - OMAS OMRS Governance Engine OMAS Glossary OMAS Asset OMAS Information View OMAS ++...... https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258799
  • 19. OMAS – detail Project List Metadata Service Data/Asset Community Metadata Service Landscape Definition Metadata Service Asset Catalog Metadata Service Classification and Mapping Metadata Service Information View Metadata Service Connector Directory Metadata Service Governance Definitions Metadata Service Information Process Metadata Service Glossary and Taxonomy Metadata Service Asset Metadata Service Discovery Metadata Service Governance Action Metadata Service Roles and Access Metadata Service Models and Schema Metadata Service Connector https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258799
  • 20. New glossary function for semantic processing. [Figure: structural metadata for a data store, an EMPLOYEE RECORD with fields EMPNAME, EMPNO, JOBCODE and SALARY, shown alongside business metadata, the glossary terms Employee, Work Location, Annual Salary, Job Title, Employee Id, Employee Name, Hourly Pay Rate, Manager and Compensation Plan, connected by HAS-A and IS-A relationships and a Sensitive classification; sample data rows omitted.] https://cwiki.apache.org/confluence/display/ATLAS/Area+3+-+Glossary
  • 22. Governance Action Framework: metadata drives enforcement. Classification (tag) based: scalable, glossary driven. Access, masking, filtering. Supports Apache Ranger but open APIs for others. Audit, Rights - Exception management, Rights, Privacy (to look at in future). https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258801
  • 23. Open Discovery Framework: open framework. Plugins characterize data & relationships. Updates metadata with results. Initial implementation in master. https://cwiki.apache.org/confluence/display/ATLAS/Automated+metadata+discovery
  • 26. How can I get involved? Discuss: mailing list. Document, explain: wiki. Report, design: Jira. Face to face. Code. Vendors! https://cwiki.apache.org/confluence/display/ATLAS/Getting+Involved
  • 27. Governance & Security BOF, Thursday 18:00, C4.7. Owen O’Malley, Nigel Jones, Ferd Scheepers
  • 30. VDC End to End https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=69407333
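
Slide 22 describes classification (tag) based enforcement, where metadata drives access, masking and filtering. A minimal sketch of that idea follows. It does not use the Apache Atlas or Apache Ranger APIs: the `Column` class, the `MASKING_POLICIES` table, the `apply_policies` function, the choice of SALARY as the column tagged Sensitive, and the sample row values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """A column plus the classification tags attached to it (hypothetical model)."""
    name: str
    classifications: frozenset = frozenset()


# Hypothetical policy table: classification tag -> masking function.
# In a real deployment this decision would live in a policy engine.
MASKING_POLICIES = {
    "Sensitive": lambda value: "****",
}


def apply_policies(columns, row):
    """Return a copy of row with every tagged column masked by its policy."""
    masked = {}
    for col in columns:
        value = row[col.name]
        # Sort the tags so the order of application is deterministic.
        for tag in sorted(col.classifications):
            policy = MASKING_POLICIES.get(tag)
            if policy is not None:
                value = policy(value)
        masked[col.name] = value
    return masked


# Column names come from slide 20; tagging SALARY as Sensitive is an assumption.
columns = [
    Column("EMPNAME"),
    Column("EMPNO"),
    Column("JOBCODE"),
    Column("SALARY", frozenset({"Sensitive"})),
]

# Illustrative values only, not data taken from the slides.
row = {"EMPNAME": "A. Example", "EMPNO": "0001", "JOBCODE": "J1", "SALARY": "50000"}

masked_row = apply_policies(columns, row)
print(masked_row)
```

The policy is keyed on the tag rather than on the column, so tagging a new column Sensitive is enough for it to be masked without writing a new rule, which is one reading of the "scalable" point on the slide.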