Introduction to Data Architecting
Understanding how data is organized, managed, and utilized in
modern enterprises. This cast covers essential concepts from data
types to quality management, providing a foundational blueprint for
data excellence.
Content By:
Jitendra Tomar
Types of Data: The Foundation of Architecture
Structured Data
Organized in fixed fields within
tables (e.g., customer records in
a relational database). Easily
searchable and quantifiable.
Semi-Structured Data
Has some organizational
properties but offers flexibility
(e.g., JSON, XML files). Common
in web data and APIs.
Unstructured Data
Lacks a predefined format (e.g.,
images, videos, emails, text
documents). Requires advanced
analytics for insights.
Understanding these types is crucial as they dictate storage solutions, processing methods, and integration
strategies within your architecture.
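As a rough illustration of how the three types differ in practice, the sketch below parses one small sample of each using only the Python standard library. The sample records, field names, and the "order <number>" pattern are invented for this example and are not from the original slides.

```python
import csv
import io
import json
import re

# Structured: fixed fields, so every row parses into the same columns.
structured = "customer_id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing keys, but fields may vary per record.
semi_structured = '[{"id": 1, "tags": ["new"]}, {"id": 2, "email": "r@example.com"}]'
records = json.loads(semi_structured)
all_keys = sorted({key for record in records for key in record})

# Unstructured: free text with no schema; values must be extracted by analysis.
unstructured = "Hi team, order 1042 arrived late. Please call me about order 1077."
order_ids = re.findall(r"order (\d+)", unstructured)

print(rows[0]["city"])
print(all_keys)
print(order_ids)
```

The structured sample can be queried by column name straight away, the semi-structured sample needs its keys discovered first, and the unstructured sample yields values only after pattern matching.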
Enterprise Data Model & Subject Area Model
Enterprise Data Model
A high-level blueprint representing all significant data
entities across the entire organization. It provides a
common language for business and IT.
Subject Area Model
Breaks down the broader enterprise model into
focused, manageable domains (e.g., Sales, Finance,
HR). Each subject area defines entities and
relationships relevant to that specific business
function.
Benefit: Ensures consistent data definitions and
relationships across departments, fostering a unified
understanding of organizational data.
Enterprise Data Model & Subject Area Model
Enterprise Data Model
A high-level data architecture representing enterprise-wide data.
Captures business entities, relationships, and rules.
Benefits: Provides a single integrated view of organizational data.
Supports consistency across departments.
(Enterprise-wide entity view, e.g., Customers, Products, Orders, Finance)
Enterprise Data Model & Subject Area Model
Subject Area Model
A subset of the Enterprise Data Model.
Focused on specific business domains (e.g., Sales, HR, Finance).
Helps in modular data design.
Example: HR Subject Area Model → Employees, Departments, Payroll.
Conceptual Models & Entity Models Explained
Conceptual Model
An abstract, business-focused representation of
data. It identifies major business concepts and
their relationships, independent of technical
implementation.
Entity Model
More detailed, defining specific entities (things of
interest, like 'Customer' or 'Order') and their
attributes, along with precise relationships
between them.
Visual tools like Entity-Relationship (ER) diagrams are essential for communicating these models clearly to
both technical and non-technical stakeholders, bridging the gap between business needs and technical
design.
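One possible way to express an entity model in code is sketched below with Python dataclasses. The 'Customer' and 'Order' entities come from the example above; the attribute names and the total_spent helper are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    # Attributes of the Order entity (hypothetical).
    order_id: int
    amount: float


@dataclass
class Customer:
    # Attributes of the Customer entity (hypothetical).
    customer_id: int
    name: str
    # Relationship: one customer places zero or more orders (one-to-many).
    orders: List[Order] = field(default_factory=list)

    def total_spent(self) -> float:
        """Sum the amounts of all orders linked to this customer."""
        return sum(order.amount for order in self.orders)


customer = Customer(customer_id=1, name="Asha")
customer.orders.append(Order(order_id=101, amount=250.0))
customer.orders.append(Order(order_id=102, amount=100.0))
print(customer.total_spent())
```

A conceptual model would stop at "Customer places Order"; the entity model adds the attributes and the cardinality of the relationship.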
Data Reporting & Query Tools: Turning Data into Insights
Reporting tools like Power BI and Tableau empower
decision-makers with interactive dashboards and
visualizations, transforming raw data into actionable
business intelligence.
Query tools, such as SQL clients, provide analysts with
the power to directly extract, filter, and manipulate
data, enabling deep-dive analysis and custom
reporting.
Purpose:
Transform data into useful insights.
Tools:
SQL & Query Languages.
Business Intelligence (BI) Tools (Power BI, Tableau).
Reporting Dashboards.
Benefits:
Faster decision-making.
Customizable data views.
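To make the "extract, filter, and manipulate" role of query tools concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The orders table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "North", 120.0), (2, "South", 80.0), (3, "North", 200.0)],
)

# Filter and aggregate: total sales per region for orders of at least 100.
query = """
    SELECT region, SUM(amount) AS total
    FROM orders
    WHERE amount >= ?
    GROUP BY region
    ORDER BY region
"""
result = conn.execute(query, (100,)).fetchall()
conn.close()
print(result)
```

A BI tool would typically issue a similar aggregate query behind a dashboard; writing the SQL directly gives the analyst control over the filter and grouping.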
Metadata: The Data About Data
Origin & Structure
Describes where data came
from, its format, and how it's
organized.
Usage & Access
Details how data is used, who
can access it, and its
permissions.
Quality & Trust
Indicates data quality, helping
ensure trustworthiness and
reliability.
Metadata is critical for robust data governance,
enabling easy discoverability of assets and
ensuring the trustworthiness of all data within
the enterprise.
It enhances data governance, discovery, and trust.
Types:
Business Metadata: Meaning, rules.
Technical Metadata: Schema, structure.
Operational Metadata: Usage, lineage.
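The three metadata types can be pictured as fields on a single catalog record. The sketch below is one hypothetical layout, not a standard schema; the class name, field names, and sample values are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetMetadata:
    # Business metadata: meaning and rules.
    description: str
    business_rule: str
    # Technical metadata: schema and structure.
    schema: dict
    # Operational metadata: usage and lineage.
    source_system: str
    allowed_roles: tuple

    def can_access(self, role: str) -> bool:
        """Check whether a role is permitted to read the dataset."""
        return role in self.allowed_roles


orders_meta = DatasetMetadata(
    description="Confirmed customer orders",
    business_rule="amount must be non-negative",
    schema={"order_id": "INTEGER", "amount": "REAL"},
    source_system="order_entry",
    allowed_roles=("analyst", "finance"),
)

print(orders_meta.can_access("analyst"))
print(orders_meta.can_access("intern"))
```

Keeping the three kinds of metadata together lets a catalog answer what the data means, how it is shaped, and where it came from in one lookup.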
Total Data Quality Management (TDQM)
01
Continuous Monitoring
Regularly checking data against
predefined quality rules.
02
Data Cleansing
Identifying and correcting errors,
inconsistencies, or duplicates.
03
Process Improvement
Optimizing data entry and
integration workflows to prevent
future issues.
TDQM is a holistic, ongoing approach to
ensure data accuracy, completeness,
consistency, and timeliness across the
entire data lifecycle.
The outcome is highly reliable data that
supports confident business decisions,
fosters trust, and ensures regulatory
compliance.
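The monitoring and cleansing steps of TDQM can be sketched in a few lines of Python. The sample records and the two quality rules are hypothetical, and the cleansing step here simply drops bad or duplicate records; a real process might correct them instead.

```python
records = [
    {"id": 1, "email": "asha@example.com", "age": 34},
    {"id": 2, "email": "", "age": 29},                  # incomplete
    {"id": 3, "email": "ravi@example.com", "age": -5},  # invalid value
    {"id": 1, "email": "asha@example.com", "age": 34},  # duplicate
]

# Predefined quality rules (illustrative): each returns True when a record passes.
rules = {
    "email_present": lambda r: bool(r["email"]),
    "age_in_range": lambda r: 0 <= r["age"] <= 120,
}


def monitor(rows):
    """Continuous monitoring: list (record id, rule name) for every failed rule."""
    return [(r["id"], name) for r in rows for name, rule in rules.items() if not rule(r)]


def cleanse(rows):
    """Data cleansing: drop duplicate ids and records that fail any rule."""
    seen, clean = set(), []
    for r in rows:
        if r["id"] in seen or not all(rule(r) for rule in rules.values()):
            continue
        seen.add(r["id"])
        clean.append(r)
    return clean


violations = monitor(records)
clean = cleanse(records)
print(violations)
print(clean)
```

The list of violations is also the input to process improvement: recurring failures of the same rule point to the entry or integration step that needs fixing.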
Layered Data Architecture: From Raw to Enriched Data
Enriched Layer
Creates tailored, high-value data products for specific business needs.
Conformed Layer
Integrates data across domains for a unified, consistent enterprise view.
Standardized Layer
Applies validation and formatting rules to ensure data consistency.
Staging/Raw Layer
Stores unaltered data directly from all source systems.
This layered approach systematically refines raw data into high-quality, actionable insights, supporting diverse
analytical and operational requirements.
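The flow through the four layers can be sketched as a small Python pipeline. The source names, field names, and the high-value threshold are assumptions made for this example; real layers would usually be tables or files rather than in-memory lists.

```python
# Staging/Raw layer: data kept exactly as received from two hypothetical sources.
raw = [
    {"source": "crm", "cust": " 001 ", "name": "asha rao", "spend": "120.50"},
    {"source": "web", "cust": "001", "name": "Asha Rao", "spend": "30"},
    {"source": "crm", "cust": "002", "name": "RAVI K", "spend": "75"},
]


def standardize(rows):
    """Standardized layer: trim whitespace, normalize case, and cast types."""
    return [
        {
            "source": r["source"],
            "customer_id": r["cust"].strip(),
            "name": r["name"].strip().title(),
            "spend": float(r["spend"]),
        }
        for r in rows
    ]


def conform(rows):
    """Conformed layer: merge all sources into one record per customer."""
    merged = {}
    for r in rows:
        entry = merged.setdefault(r["customer_id"], {"name": r["name"], "spend": 0.0})
        entry["spend"] += r["spend"]
    return merged


def enrich(customers, threshold=100.0):
    """Enriched layer: a data product listing high-value customer ids."""
    return sorted(cid for cid, c in customers.items() if c["spend"] >= threshold)


standardized = standardize(raw)
conformed = conform(standardized)
high_value = enrich(conformed)
print(conformed)
print(high_value)
```

Each function reads only from the layer below it and leaves that layer unchanged, so the raw data stays available for reprocessing if a rule changes.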
Conclusion: Why Data Architecting Matters
Strategic Blueprint
Provides the necessary structure
for managing complex data
landscapes.
Quality & Consistency
Enables consistent, high-
quality data for robust
analytics and decision-
making.
Unlock Value
Empowers organizations to
maximize the full value of their
data assets.
Next Steps: Explore hands-on modeling and practical tool usage to deepen your understanding and
application of data architecting principles.
