This document discusses modern data, data governance, and the Apache Atlas proposal. It defines modern data as including clickstream, web, social, geo-location, and IoT data that uses a schema-on-read approach, while traditional data refers to ERP, CRM, and SCM data that uses schema-on-write. It also discusses how modern data refers to stream processing using a streaming model, while both modern and traditional data can use batch processing. The document then defines data governance and discusses the Apache Atlas proposal, which allows governance visibility and controls for Hadoop and non-Hadoop data through services like search, lineage, access control, auditing, and lifecycle management powered by a flexible metadata repository.
Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...Matt Stubbs
Date: 14th November 2018
Location: Self-Service Analytics Theatre
Time: 13:50 - 14:20
Speaker: Stephanie McReynolds
Organisation: Alation
About: Raw data is proliferating at an enormous rate. But so are our derived data assets - hundreds of dashboards, thousands of reports, millions of transformed data sets. Self-service analytics have ensured that this noise is making it increasingly hard to understand and trust data for decision-making. This trust gap is holding your organisation back from business outcomes.
European analytics leaders have found a way to close the gap between data and decision-making. From MunichRe to Pfizer and Daimler, analytics teams are adopting data catalogues for thousands of self-service analytics users.
Join us in this session to hear how data catalogues that activate data by incorporating machine learning can:
• Increase analyst productivity 20-40%
• Boost the understanding of the nuances of data and
• Establish trust in data-driven decisions with agile stewardship
In this video, Southard Jones from Birst describes the company's new Recurring Revenue Solution Accelerator, which gives subscription-based businesses the ability to maximize revenue, accelerate growth and reduce churn by putting fast, accurate and meaningful business intelligence and data analysis capabilities at their fingertips.
Learn more: http://www.birst.com/company/press/birst-helps-subscription-based-businesses-multiply-their-revenue-new-recurring-revenue
Watch the video presentation: http://wp.me/p3RLEV-1vI
You Need a Data Catalog. Do You Know Why?Precisely
Data catalog has become a more popular discussion topic within data management and data governance circles. “What is it?” and “Do I need one?” are two common questions; along with “How does a catalog relate to and support the data governance program?”
The data catalog plays a key role in the governance process: how well information can be managed, aligned to business objectives and monetized depends in great part on what you know about your data.
In this webinar you will learn about:
- The role of the data catalog
- What kinds of information should be in your data catalog
- Those catalog items that can be harvested systemically versus those that require stewardship involvement
- The role of the catalog in your data quality program
We hope you’ll join this on-demand webinar and learn how a data catalog should be part of your governance and data quality program!
This presentation contains a broad introduction to big data and its technologies.
Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis.
Big Data is a phrase used to mean a massive volume of both structured and unstructured data that is so large it is difficult to process using traditional database and software techniques. In most enterprise scenarios the volume of data is too big or it moves too fast or it exceeds current processing capacity.
Modern Integrated Data Environment - Whitepaper | QuboleVasu S
A white paper about building a modern data platform for data-driven organisations, using a cloud data warehouse with a modern data platform architecture.
https://www.qubole.com/resources/white-papers/modern-integrated-data-environment
Building Your Enterprise Data Marketplace with DMX-hPrecisely
In the past few years third-party data marketplaces, often provided as Data as a Service, have taken off. But most organizations already own the data most relevant to their business – data pertaining to their own customers, transactions, products, etc.
That’s why the most successful organizations are applying the concepts of external data markets to create their own enterprise data marketplaces, where users can easily find and access data from across the company that is clean, trustworthy and auditable.
View this webinar on-demand to learn how to build an enterprise data marketplace of your own with DMX-h! We'll cover:
• Attributes of a successful enterprise data marketplace
• Potential roadblocks, and how to overcome them
• Examples of customers who have successfully built data marketplaces with DMX-h
Sharing a presentation highlighting some key aspects to be taken into consideration while harnessing your Digital Transformation projects as a Digital Intelligence enabler for your enterprise
Tableau’s predictive modeling feature allows users to leverage powerful statistical models to build and update predictive models efficiently while giving them the flexibility to select their predictors, collaborate on the model results within other table calculations, and comprehend and examine a large volume of data. Go through this presentation to discover how Tableau’s predictive modeling feature allows users to leverage powerful statistical models to build and update predictive models efficiently.
Joe Caserta, President at Caserta Concepts, presented "Setting Up the Data Lake" at a DAMA Philadelphia Chapter Meeting.
For more information on the services offered by Caserta Concepts, visit our website at http://casertaconcepts.com/.
Daniel is a Project Leader at Datayaan.
He has worked on designing and implementing innovative solutions for complex business problems and has helped companies with digital transformation.
Telehealth, Transport Logistics, and Telecom are some of the key areas his work covers.
On the tech side, he has broad knowledge and experience in microservices, IoT, and cloud.
He's going to talk about his approach to transforming an organization to leverage data-driven decision making.
For this he presents Transport Logistics as a use case and walks us through an overview of how the transformation takes place:
how the data is collected and processed, what we can do using the collected data, and how the organization benefits.
He is also going to shed some light on how IoT can be used to automate data collection, which is crucial for building an effective data-driven business model.
Data-Ed Online Presents: Data Warehouse StrategiesDATAVERSITY
Integrating data across systems has been a perpetual challenge. Unfortunately, the current technology-focused solutions have not helped IT to improve its dismal project success statistics. Data warehouses, BI implementations, and general analytical efforts achieve the same levels of success as other IT projects – approximately 1/3rd are considered successes when measured against price, schedule, or functionality objectives. The first step is determining the appropriate analysis approach to the data system integration challenge. The second step is understanding the strengths and weaknesses of various approaches. Turns out that proper analysis at this stage makes actual technology selection far more accurate. Only when these are accomplished can proper matching between problem and capabilities be achieved as the third step and true business value be delivered. This webinar will illustrate that good systems development more often depends on at least three data management disciplines in order to provide a solid foundation.
Takeaways:
Data system integration challenge analysis
Understanding of a range of data system-integration technologies including
Problem space (BI, Analytics, Big Data), Data (Warehousing, Vault, Cube) and alternative approaches (Virtualization, Linked Data, Portals, Meta-models)
Understanding foundational data warehousing & BI concepts based on the Data Management Body of Knowledge (DMBOK)
How to utilize data warehousing & BI in support of business strategy
Enterprise Analytics: Serving Big Data Projects for HealthcareDATA360US
Andrew Rosenberg's Presentation on "Enterprise Analytics: Serving Big Data Projects for Healthcare" at DATA 360 Healthcare Informatics Conference - March 5th, 2015
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Denodo
This presentation has been extracted from a full webinar organized by Denodo. To learn more click here: http://bit.ly/1FOMD90
Big Data, Internet of Things, Data Lakes, Streaming Analytics, Machine Learning… these are just a few of the buzzwords being thrown around in the world of data management today. They provide us with new sources of data, new forms of analytics, and new ways of storing, managing and utilizing our data. The reality however, is that traditional Data Warehouse architectures are no longer able to handle many of these new technologies and a new data architecture is required.
So what does the new architecture look like? Does the enterprise data warehouse still have a role? Where do these new technologies fit in? How can business users easily and quickly access the various sources of data and analytic results at the right time to make the right decisions in this new world order?
Dr. Claudia Imhoff addresses these questions and presents the Extended Data Warehouse architecture (XDW), demonstrating the need for each component and how an enterprise combines these into appropriate workflows for proper decision support.
Data lineage to drive compliance and as a business imperativeLeigh Hill
The importance of data lineage has escalated in recent years in response to regulatory demand and increased business understanding of the benefits it can deliver. Like all capital markets technology, data lineage presents both challenges and opportunities, so how best can it be implemented and sustained? And how can your organisation reap the rewards of successful implementation?
This webinar will outline data lineage, its progress towards automation, and why it is so important from both a regulatory and business perspective. It will also provide advice on how to select a solution and step-by-step guidance on how to implement and integrate data lineage. Finally, the webinar will discuss how to manage data lineage to ensure regulatory compliance, deliver business benefits and plan for the future.
Register for the webinar to find out more about:
-The importance of data lineage in capital markets
-How to select a solution for your organisation
-Approaches to implementation and integration
-How to achieve sustainable regulatory compliance
-The business benefits of successful implementation
The Missing Link in Enterprise Data Governance - Automated Metadata ManagementDATAVERSITY
So many companies and organizations are in the same boat. They’re drowning in their data — so much data, from so many different sources. They understand that data governance is hugely important for them to be able to know their data inside and out and comply with regulations. What many companies have not yet come to terms with when implementing their data governance strategy and supporting tools, is the criticality of metadata in the process. As the ‘data about data,’ metadata provides the value and purpose of the data content, thereby becoming an extremely effective tool for quickly locating information – a must for BI groups dealing with analytics and business user reporting.
Octopai's CEO, Amnon Drori will discuss this critical missing link in enterprise data governance and the impact of automating metadata management for data discovery and data lineage for BI. He'll demonstrate how BI groups use Octopai to not only locate their data instantly, but to quickly and accurately visualize and understand the entire data journey to enable the business to move forward.
This article is useful for anyone who wants an introduction to Big Data and to how Oracle architects Big Data solutions using Oracle Big Data Cloud solutions.
Top 10 guidelines for deploying modern data architecture for the data driven ...LindaWatson19
Enterprises are facing a new revolution, powered by the rapid adoption of data analytics with modern technologies like machine learning and artificial intelligence (AI).
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESBDenodo
Data integration is paramount. This presentation covers three different paradigms: using client-side tools, creating traditional data warehouses, and the data virtualization solution (the logical data warehouse). It compares them with each other and positions data virtualization as an integral part of any future-proof IT infrastructure.
This presentation is part of the Fast Data Strategy Conference, and you can watch the video here goo.gl/1q94Ka.
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarioskcmallu
What's the origin of Big Data? What are the real life usage scenarios where Hadoop has been successfully adopted? How do you get started within your organizations?
DOCUMENT SELECTION USING MAPREDUCE Yenumula B Reddy and Desmond HillClaraZara1
Big data refers to large volumes of structured, unstructured and semi-structured data that are difficult to manage and costly to store. Using explanatory analysis techniques to understand such raw data, and carefully balancing the benefits of storage and retrieval techniques, is an essential part of Big Data. The research discusses MapReduce issues and the framework for the MapReduce programming model and its implementation. The paper includes the analysis of Big Data using MapReduce techniques and the identification of a required document from a stream of documents. Identifying a required document is part of security in a stream of documents in the cyber world. The document may be significant in business, medical, social, or terrorism contexts.
Slow Data Kills Business eBook - Improve the Customer ExperienceInterSystems
We live in an era where customer experience trumps product features and functions. How do you exceed customer’s expectations every time they interact with your organization? By leveraging more information and applying insights you have learned over time. Turning data-driven power into delightful experiences will give you the advantages required to succeed in today’s climate of one-click shopping and crowd-sourced feedback. Whether you are a retailer, a banker, a care provider, or a policy maker, your organization must harness the power of growing data volumes, data types, and data sources to foster experiences that matter.
Are You Prepared For The Future Of Data Technologies?Dell World
We are increasingly coming upon an age where technology is a strong enabler of business success, where there are strong synergies between business strategy and technology strategy. You often cannot discuss business strategy without data and related technologies being a big part of it. And as such, business leaders are increasingly turning to IT to compete more effectively in the market. As IT management, it falls upon you to ensure that your data technology architecture (software & hardware) is built in a way that it can handle the business demands of today and in to the future. In this session, we will discuss the various big data technology architectures and associated tools, and what role each should play in your data environment. We will also give real life examples of how others are using these technologies. Build a better data architecture, to unlock the power of all data.
Enterprise Archiving with Apache Hadoop Featuring the 2015 Gartner Magic Quad...LindaWatson19
Read how Solix leverages the Apache Hadoop big data platform to provide low cost, bulk data storage for Enterprise Archiving. The Solix Big Data Suite provides a unified archive for both structured and unstructured data and provides an Information Lifecycle Management (ILM) continuum to reduce costs, ensure enterprise applications are operating at peak performance and manage governance, risk and compliance.
http://www.embarcadero.com
Data yields information when its definition is understood or readily available and it is presented in a meaningful context. Yet even the information that may be gleaned from data is incomplete because data is created to drive applications, not to inform users. Metadata is the data that holds application data definitions as well as their operational and business context, and so plays a critical role in data and application design and development, as well as in providing an intelligent operational environment that's driven by business meaning.
Big Brother Big Sister Bluemix Architecture from #HackathonCLTDave Callaghan
Big Brother and Big Sisters brought their enterprise system challenge to #HackathonCLT. I wanted to design a system that could support 10x the membership with the same expense. By using Bluemix to provide scale, the Watson APIs for both ingestion and analytics, and their Hyperledger implementation for security paired with HBase, I believe we have a potential solution. More to come!
Discuss building a trust solution for HealthIT or other regulated enterprises with blockchain using Hyperledger with Hbase for off-blockchain storage for scaling prototyped on Bluemix.
Stormwater analytics with MongoDB and PentahoDave Callaghan
Use MongoDB and Pentaho to rapidly evaluate a use case for the City of Charlotte's Stormwater Management System by creating "A Single View of a Raindrop".
There are any number of vendors and publications stating that IT departments need to invest big in Big Data and Big Analytics to meet the challenges of the Internet of Things. Let's swap out marketing and hype for logic and math and separate the signal from the noise. We'll come up with a clear problem definition and come up with an algorithmic approach to the problem. Once we have a framework, we can more intelligently choose an implementation.
2. What is Modern Data?
Clickstream, web and social, geo-location, IoT,
server logs, etc. are considered modern.
(think schema-on-read)
ERP, CRM, SCM and LOB-specific OLTP are
considered traditional.
(think schema-on-write)
Mainframe is considered legacy.
(think mission-critical)
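The schema-on-write and schema-on-read distinction on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ORDER_SCHEMA`, `write_validated`, `read_with_schema` and the sample records are hypothetical names invented for the example, not part of any product mentioned in the slides.

```python
import json

# Schema-on-write: structure is enforced BEFORE a record is stored.
# ORDER_SCHEMA is a hypothetical schema used only for this sketch.
ORDER_SCHEMA = {"order_id": int, "customer": str, "amount": float}


def write_validated(store, record):
    """Append the record only if it matches ORDER_SCHEMA exactly."""
    if set(record) != set(ORDER_SCHEMA):
        raise ValueError(f"unexpected fields: {sorted(record)}")
    for field, expected_type in ORDER_SCHEMA.items():
        if not isinstance(record[field], expected_type):
            raise TypeError(f"{field} must be {expected_type.__name__}")
    store.append(record)


# Schema-on-read: raw events are stored untouched and structure is
# applied only when somebody reads them.
def read_with_schema(raw_lines, fields):
    """Project each raw JSON line onto the fields the reader asks for."""
    for line in raw_lines:
        event = json.loads(line)
        # Missing fields become None instead of failing the read.
        yield {name: event.get(name) for name in fields}


orders = []
write_validated(orders, {"order_id": 1, "customer": "acme", "amount": 9.5})

raw_clicks = [
    '{"user": "u1", "page": "/home", "ts": 100}',
    '{"user": "u2", "page": "/cart", "referrer": "ad"}',
]
clicks = list(read_with_schema(raw_clicks, ["user", "page", "ts"]))
```

The traditional path rejects anything that does not fit up front, while the modern path accepts differently shaped events and defers interpretation to the reader.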
3. What is Modern Data?
Modern Data refers to stream processing
In a streaming data model, you store queries and then continuously
run data through the queries.
(think event-driven model)
Both Modern and traditional data refer to Batch Processing
In a traditional query model, you store data and then run queries on
the data as needed.
(think query-driven model)
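A minimal Python sketch of the two models described on this slide, assuming a toy sensor feed. `batch_query`, `StreamProcessor` and the sample events are hypothetical names chosen for illustration and do not correspond to any specific streaming engine.

```python
# Query-driven (batch): data is stored first, queries run on demand.
def batch_query(stored_events, predicate):
    return [event for event in stored_events if predicate(event)]


# Event-driven (streaming): queries are stored first, data runs through them.
class StreamProcessor:
    def __init__(self):
        self.queries = {}   # query name -> predicate (the stored queries)
        self.matches = {}   # query name -> events that have matched so far

    def register(self, name, predicate):
        self.queries[name] = predicate
        self.matches[name] = []

    def on_event(self, event):
        # Every arriving event is pushed through every stored query.
        for name, predicate in self.queries.items():
            if predicate(event):
                self.matches[name].append(event)


events = [
    {"sensor": "a", "temp": 71},
    {"sensor": "b", "temp": 98},
    {"sensor": "a", "temp": 102},
]


def is_hot(event):
    return event["temp"] > 95


# Batch: all events already exist, then the query runs once.
batch_result = batch_query(events, is_hot)

# Streaming: the query exists before any event arrives.
processor = StreamProcessor()
processor.register("hot", is_hot)
for event in events:
    processor.on_event(event)
stream_result = processor.matches["hot"]
```

Both paths produce the same matches here. The difference is timing: the streaming result is available as each event arrives, whereas the batch result exists only after someone runs the query.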
4. What is Modern Data?
Modern Data refers to data; not to technologies.
It is the responsibility of those of us who architect, develop and
implement data technologies to appreciate this difference.
There have been many hard-won lessons learned in enterprise
data management.
The criticality of Data Governance may well top this list.
5. What is Data Governance?
The process by which an organization formalizes
the fiduciary duty for the management of data
assets critical to its success.
Forrester
Data governance is a system of decision rights
and accountabilities for information-related
processes, executed according to agreed upon
models, which describe who can take what
actions with what information, and when, under
what circumstances, using what methods.
Data Governance Institute
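One way to read the second definition is as a table of decision rights that can be checked mechanically. The sketch below is an assumption-heavy illustration: `DecisionRight`, `RIGHTS`, `is_allowed`, the roles and the dataset name are all hypothetical, and the "using what methods" part of the definition is left out for brevity.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class DecisionRight:
    role: str      # who
    action: str    # can take what action
    dataset: str   # with what information
    start: time    # when (start of the allowed window)
    end: time      # when (end of the allowed window)
    purpose: str   # under what circumstances


# Hypothetical agreed-upon model, invented for this example.
RIGHTS = [
    DecisionRight("steward", "update", "customer_master",
                  time(8, 0), time(18, 0), "data_quality"),
    DecisionRight("analyst", "read", "customer_master",
                  time(0, 0), time(23, 59), "reporting"),
]


def is_allowed(role, action, dataset, at, purpose, rights=RIGHTS):
    """Return True if some decision right covers the requested action."""
    return any(
        r.role == role
        and r.action == action
        and r.dataset == dataset
        and r.purpose == purpose
        and r.start <= at <= r.end
        for r in rights
    )


steward_ok = is_allowed("steward", "update", "customer_master",
                        time(9, 30), "data_quality")
analyst_update = is_allowed("analyst", "update", "customer_master",
                            time(9, 30), "reporting")
```

Keeping the rights as data rather than code means the accountable parties can review and change them without touching the enforcement logic.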
10. Atlas Proposal
Background
Hadoop is one of many platforms in the modern enterprise data ecosystem and
requires governance controls commensurate with this reality.
Currently, there is no easy or complete way to provide comprehensive visibility
and control into Hadoop audit, lineage, and security for workflows that require
Hadoop and non-Hadoop processing.
Many solutions are usually point based, and require a monolithic application
workflow. Multi-tenancy and concurrency are problematic as these offerings are
not aware of activity outside of their narrow focus.
As Hadoop gains greater popularity, governance concerns will become
increasingly vital to increasing maturity and furthering adoption. It is a particular
barrier to expanding enterprise data under management.
11. Atlas Proposal
Apache Atlas allows agnostic governance visibility into Hadoop. These
abilities are enabled through a set of core foundational services powered
by a flexible metadata repository.
These services include:
Search and Lineage for datasets
Metadata driven data access control
Indexed and searchable centralized auditing of operational events
Data lifecycle management – ingestion to disposition
Metadata interchange with other metadata tools
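To make the idea of services powered by a metadata repository concrete, here is a toy in-memory sketch covering search, lineage and auditing. It is not the Apache Atlas API: `MetadataRepository`, its methods and the dataset names are hypothetical and exist only to illustrate how such services can share one metadata store.

```python
from collections import defaultdict


class MetadataRepository:
    """Toy metadata store; a stand-in for illustration, not Atlas itself."""

    def __init__(self):
        self.entities = {}               # dataset name -> attributes
        self.inputs = defaultdict(set)   # dataset -> datasets it derives from
        self.audit_log = []              # ordered operational events

    def register(self, name, **attributes):
        self.entities[name] = attributes
        self.audit_log.append(("register", name))

    def add_lineage(self, source, target):
        self.inputs[target].add(source)
        self.audit_log.append(("lineage", f"{source}->{target}"))

    def search(self, tag):
        """Return the names of all datasets carrying the given tag."""
        return sorted(
            name for name, attrs in self.entities.items()
            if tag in attrs.get("tags", ())
        )

    def upstream(self, name):
        """Return every dataset that feeds into `name`, directly or not."""
        seen, stack = set(), [name]
        while stack:
            for parent in self.inputs.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return sorted(seen)


repo = MetadataRepository()
repo.register("raw_clicks", tags=["clickstream"])
repo.register("crm_accounts", tags=["crm", "pii"])
repo.register("sessions", tags=["clickstream"])
repo.register("churn_report", tags=["report"])
repo.add_lineage("raw_clicks", "sessions")
repo.add_lineage("sessions", "churn_report")
repo.add_lineage("crm_accounts", "churn_report")

sources_of_report = repo.upstream("churn_report")
clickstream_sets = repo.search("clickstream")
```

Because lineage, search and audit all read from the same store, a question such as "which reports depend on data tagged pii" becomes a combination of `search` and lineage traversal rather than a separate tool.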