METADATA
Presented by:
Akshara S
JPC21001
Definition
• Metadata in a data warehouse is data that describes the other data stored in
the warehouse.
• In data warehouse terms, metadata is the summarized data that leads us to the
detailed data.
• For example, the index of a book serves as metadata for the contents of the
book.
Why do we need Metadata?
1. Data discovery: Metadata makes it easier to discover data by providing a central repository of information about the data,
including its location, structure, and content. This enables users to more easily find and access the data they need to support their
business needs.
2. Data quality: Metadata can be used to ensure the quality of data by providing information about the data's lineage, such as
where it came from and how it was transformed. This information can be used to validate the accuracy and reliability of the data.
3. Data governance: Metadata can be used to support data governance by providing information about who is responsible for
managing the data, how it is used, and how it is protected. This information can be used to ensure that data is used in an
appropriate and controlled manner.
4. Data integration: Metadata can be used to support data integration by providing information about how data from different
sources can be combined and transformed. This information can be used to ensure that data from different sources can be used
together in a consistent and meaningful way.
5. Business intelligence: Metadata can be used to support business intelligence by providing information about the content
and structure of data stored in a data warehouse. This information can be used to create reports and perform analysis to support
decision-making.
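The data discovery point can be illustrated with a small sketch. It assumes a hypothetical in-memory catalog; the `CatalogEntry` class, the `find_datasets` function, and the sample table names are invented for illustration and do not come from any particular metadata product.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """One record in a hypothetical central metadata catalog."""
    name: str          # dataset or table name
    location: str      # where the data lives
    columns: list      # structure: the column names
    description: str   # short summary of the content


def find_datasets(catalog, keyword):
    """Return the names of entries whose name, columns, or description mention the keyword."""
    needle = keyword.lower()
    hits = []
    for entry in catalog:
        searchable = " ".join([entry.name, entry.description, *entry.columns]).lower()
        if needle in searchable:
            hits.append(entry.name)
    return hits


# Sample catalog (made-up entries).
catalog = [
    CatalogEntry("sales_fact", "warehouse.sales",
                 ["order_id", "customer_id", "amount"], "Daily sales transactions"),
    CatalogEntry("customer_dim", "warehouse.customer",
                 ["customer_id", "region"], "Customer master data"),
]

print(find_datasets(catalog, "region"))    # ['customer_dim']
print(find_datasets(catalog, "customer"))  # ['sales_fact', 'customer_dim']
```

The search never touches the warehouse data itself. It reads only the descriptions, which is what makes a central catalog cheap to query.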
Categories of Metadata
Metadata falls into three categories:
1. Technical metadata
2. Business metadata
3. Operational metadata
1. Technical metadata:
This type of metadata describes the technical aspects of the data warehouse and its components, such
as data storage, data retrieval, and data processing.
2. Business metadata:
This type of metadata provides information about the business context and meaning of the data stored
in the warehouse.
3. Operational metadata:
This type of metadata provides information about the operations and processes associated with the
data stored in the warehouse, such as data ingestion, data quality, and data security.
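As a rough sketch, the three categories can be modelled as three separate records attached to the same warehouse column. All class names, field names, and sample values below are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TechnicalMetadata:
    """How the data is stored and processed."""
    table: str
    column: str
    data_type: str


@dataclass
class BusinessMetadata:
    """What the data means to the business."""
    business_name: str
    definition: str


@dataclass
class OperationalMetadata:
    """What happened to the data during ingestion."""
    last_load_status: str
    rows_loaded: int
    rows_rejected: int


@dataclass
class ColumnMetadata:
    """All three categories of metadata for one warehouse column."""
    technical: TechnicalMetadata
    business: BusinessMetadata
    operational: OperationalMetadata

    def summary(self) -> str:
        t, b, o = self.technical, self.business, self.operational
        total_rows = o.rows_loaded + o.rows_rejected
        return (
            f"{t.table}.{t.column} ({t.data_type}) = {b.business_name}; "
            f"last load {o.last_load_status}, "
            f"{o.rows_rejected} of {total_rows} rows rejected"
        )


amount = ColumnMetadata(
    TechnicalMetadata("sales_fact", "amt", "DECIMAL(10,2)"),
    BusinessMetadata("Order amount", "Value of the order before tax"),
    OperationalMetadata("succeeded", rows_loaded=980, rows_rejected=20),
)
print(amount.summary())
```

Keeping the categories in separate records mirrors the fact that they usually have different owners: engineers maintain the technical part, business users the definitions, and the load jobs the operational part.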
Roles of Metadata
• Metadata acts as a directory.
• This directory helps the decision support system locate the contents of the data warehouse.
• Metadata helps the decision support system map data when it is transformed from the
operational environment to the data warehouse environment.
• Metadata helps in summarization between current detailed data and highly summarized data.
• Metadata also helps in summarization between lightly summarized data and highly summarized data.
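The summarization role can be sketched as follows. The example assumes that the metadata records, for each summary level, how much of an ISO timestamp forms the grouping key; the `GRANULARITY` table, the `summarize` function, and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical summarization metadata: summary level -> number of leading
# characters of an ISO timestamp that form the grouping key.
GRANULARITY = {
    "lightly_summarized": 10,  # YYYY-MM-DD (one total per day)
    "highly_summarized": 7,    # YYYY-MM (one total per month)
}


def summarize(detail_rows, level):
    """Roll (timestamp, amount) detail rows up to the granularity named by level."""
    key_length = GRANULARITY[level]
    totals = defaultdict(float)
    for timestamp, amount in detail_rows:
        totals[timestamp[:key_length]] += amount
    return dict(totals)


# Current detailed data (made-up rows).
detail = [
    ("2024-01-05T09:00", 10.0),
    ("2024-01-05T14:30", 5.0),
    ("2024-01-20T11:15", 20.0),
]

print(summarize(detail, "lightly_summarized"))  # {'2024-01-05': 15.0, '2024-01-20': 20.0}
print(summarize(detail, "highly_summarized"))   # {'2024-01': 35.0}
```

The roll-up logic stays the same for every level. Only the metadata entry changes, which is the sense in which metadata connects detailed and summarized data.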
Metadata plays a different role from the warehouse data itself, yet it is just as important.
The warehouse tools and functions that rely on metadata are listed below.
• Metadata is used by query tools.
• Metadata is used by extraction and cleansing tools.
• Metadata is used by reporting tools.
• Metadata is used by transformation tools.
• Metadata plays an important role in loading functions.
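A minimal sketch of how a transformation tool might be driven by mapping metadata. The `MAPPING` rules, the field names, and the `apply_mapping` function are assumptions made for this example, not the interface of any real ETL tool.

```python
# Hypothetical mapping metadata: each rule says which operational field feeds
# which warehouse field, and which conversion to apply on the way.
MAPPING = [
    {"source": "cust_nm", "target": "customer_name", "transform": str.strip},
    {"source": "ord_amt", "target": "order_amount", "transform": float},
]


def apply_mapping(row, mapping):
    """Build a warehouse row from an operational row by following the mapping rules."""
    return {
        rule["target"]: rule["transform"](row[rule["source"]])
        for rule in mapping
    }


operational_row = {"cust_nm": "  Asha ", "ord_amt": "120.50"}
print(apply_mapping(operational_row, MAPPING))
# {'customer_name': 'Asha', 'order_amount': 120.5}
```

Because the rules are data rather than code, extraction, transformation, and loading steps can all read the same mapping, and a renamed source field needs a change in only one place.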
Metadata Repository
A metadata repository contains the following:
• Definition of data warehouse: It includes the description of the structure of the data warehouse. The
description is defined by schemas, views, hierarchies, derived data definitions, and data mart locations and
contents.
• Business metadata: It contains data ownership information, business definitions, and
changing policies.
• Operational metadata: It includes currency of data and data lineage. Currency of data means
whether the data is active, archived, or purged. Lineage of data means the history of the data's migration
and the transformations applied to it.
• Data for mapping from operational environment to data warehouse: It includes the
source databases and their contents, data extraction, data partitioning, data cleaning, transformation rules,
and data refresh and purging rules.
• Algorithms for summarization: It includes dimension algorithms, data on granularity, aggregation,
summarizing, etc.
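Currency and lineage, the two parts of operational metadata described above, can be sketched like this. The `REPOSITORY` dictionary, the dataset names, and the `trace_lineage` function are hypothetical and only show the idea.

```python
from enum import Enum


class Currency(Enum):
    """Currency of data: whether it is active, archived, or purged."""
    ACTIVE = "active"
    ARCHIVED = "archived"
    PURGED = "purged"


# Hypothetical repository: dataset -> (currency, upstream dataset, transformation applied).
REPOSITORY = {
    "orders_raw": (Currency.ARCHIVED, None, "extracted from source system"),
    "orders_clean": (Currency.ACTIVE, "orders_raw", "removed duplicate rows"),
    "orders_monthly": (Currency.ACTIVE, "orders_clean", "aggregated by month"),
}


def trace_lineage(dataset, repository):
    """Walk from a dataset back to its original source, listing each step."""
    steps = []
    current = dataset
    while current is not None:
        currency, upstream, transformation = repository[current]
        steps.append(f"{current} [{currency.value}]: {transformation}")
        current = upstream
    return steps


for step in trace_lineage("orders_monthly", REPOSITORY):
    print(step)
# orders_monthly [active]: aggregated by month
# orders_clean [active]: removed duplicate rows
# orders_raw [archived]: extracted from source system
```

Reading the steps from bottom to top gives the history of the data: where it came from and which transformations were applied to it.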
Challenges for Metadata
• Metadata in a big organization is scattered across the organization. It is spread
across spreadsheets, databases, and applications.
• Metadata could be present in text files or multimedia files. To use this data for
information management solutions, it has to be correctly defined.
• There are no industry-wide accepted standards. Data management solution
vendors have a narrow focus.
• There are no easy and accepted methods of passing metadata.
Thank You
