Introduction to Data
Architecting
Understanding how data is organized, managed, and utilized in
modern enterprises. This deck covers essential concepts from data
types to quality management, providing a foundational blueprint for
data excellence.
Content By:
Jitendra Tomar
2.
Types of Data: The Foundation of Architecture
Structured Data
Organized in fixed fields within
tables (e.g., customer records in
a relational database). Easily
searchable and quantifiable.
Semi-Structured Data
Has some organizational
properties but offers flexibility
(e.g., JSON, XML files). Common
in web data and APIs.
Unstructured Data
Lacks a predefined format (e.g.,
images, videos, emails, text
documents). Requires advanced
analytics for insights.
Understanding these types is crucial as they dictate storage solutions, processing methods, and integration
strategies within your architecture.
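A minimal Python sketch of how the three types differ in practice. The customer record, JSON payload, and email text are made-up sample values, not data from this deck, and the keyword check is only a stand-in for real text analytics.

```python
import json

# Structured: fixed fields, like one row of a relational table.
customer_row = {"customer_id": 101, "name": "Asha Rao", "city": "Pune"}

# Semi-structured: JSON carries its own flexible structure;
# fields may be nested or missing from one record to the next.
api_payload = json.loads(
    '{"customer_id": 101, "orders": [{"id": 1, "items": ["pen", "ink"]}]}'
)

# Unstructured: free text with no predefined fields.
email_body = "Hi team, the pens arrived late and one box was damaged."

# Structured data can be filtered directly by field name.
is_pune = customer_row["city"] == "Pune"

# Semi-structured data needs navigation through nested keys.
first_order_items = api_payload["orders"][0]["items"]

# Unstructured data needs an analysis step before it yields a fact;
# a keyword check stands in here for real text analytics.
mentions_damage = "damaged" in email_body.lower()
```

The amount of work needed to answer a question grows from the first case to the last, which is why each type tends to call for different storage and processing choices.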
3.
Enterprise Data Model & Subject Area Model
Enterprise Data Model
A high-level blueprint representing all significant data
entities across the entire organization. It provides a
common language for business and IT.
Subject Area Model
Breaks down the broader enterprise model into
focused, manageable domains (e.g., Sales, Finance,
HR). Each subject area defines entities and
relationships relevant to that specific business
function.
Benefit: Ensures consistent data definitions and
relationships across departments, fostering a unified
understanding of organizational data.
4.
Enterprise Data Model & Subject Area Model
Enterprise Data Model
A high-level data architecture representing enterprise-
wide data.
Captures business entities, relationships, and rules.
Benefits: Provides a single integrated view of
organizational data.
Supports consistency across departments. (Enterprise-wide
entity view, e.g., Customers, Products, Orders,
Finance)
5.
Enterprise Data Model & Subject Area Model
Subject Area Model
A subset of the Enterprise Data Model.
Focused on specific business domains (e.g., Sales, HR,
Finance).
Helps in modular data design.
Example: HR Subject Area Model → Employees,
Departments, Payroll.
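The subset relationship can be sketched with plain Python sets. The HR grouping follows the example above; the Invoice entity and the Sales and Finance groupings are assumptions added only for illustration.

```python
# Enterprise-wide entities (Invoice is a hypothetical addition).
enterprise_model = {
    "Customer", "Product", "Order",
    "Employee", "Department", "Payroll",
    "Invoice",
}

# Each subject area groups the entities of one business domain.
subject_areas = {
    "Sales": {"Customer", "Product", "Order"},
    "HR": {"Employee", "Department", "Payroll"},
    "Finance": {"Invoice"},
}

# Every subject area should be a subset of the enterprise model.
all_subsets = all(area <= enterprise_model for area in subject_areas.values())

# Entities not yet assigned to any subject area (empty means full coverage).
unassigned = enterprise_model - set().union(*subject_areas.values())
```

A check like this is one simple way to confirm that modular subject areas still add up to the full enterprise view.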
6.
Conceptual Models & Entity Models Explained
Conceptual Model
An abstract, business-focused representation of
data. It identifies major business concepts and
their relationships, independent of technical
implementation.
Entity Model
More detailed, defining specific entities (things of
interest, like 'Customer' or 'Order') and their
attributes, along with precise relationships
between them.
Visual tools like Entity-Relationship (ER) diagrams are essential for communicating these models clearly to
both technical and non-technical stakeholders, bridging the gap between business needs and technical
design.
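As a rough sketch of the difference, the conceptual model below only names concepts and a relationship, while the entity model adds attributes and a precise link between entities. The attribute names and sample values are hypothetical.

```python
from dataclasses import dataclass, fields

# Conceptual model: business concepts and how they relate, no attributes.
conceptual_model = [("Customer", "places", "Order")]

# Entity model: the same concepts with attributes and a precise relationship.
@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int  # each Order references exactly one Customer
    total: float

order_attributes = [f.name for f in fields(Order)]

def orders_for(customer, orders):
    """Follow the one-to-many relationship from a Customer to its Orders."""
    return [o for o in orders if o.customer_id == customer.customer_id]

alice = Customer(1, "Alice")
orders = [Order(10, 1, 250.0), Order(11, 2, 90.0), Order(12, 1, 40.0)]
alice_order_ids = [o.order_id for o in orders_for(alice, orders)]
```

In an ER diagram the same information would appear as two boxes with their attributes and a one-to-many line between them.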
7.
Data Reporting & Query Tools: Turning Data into Insights
Reporting tools like Power BI and Tableau empower
decision-makers with interactive dashboards and
visualizations, transforming raw data into actionable
business intelligence.
Query tools, such as SQL clients, provide analysts with
the power to directly extract, filter, and manipulate
data, enabling deep-dive analysis and custom
reporting.
Purpose:
Transform data into useful insights.
Tools:
SQL & Query Languages.
Business Intelligence (BI) Tools (Power BI, Tableau).
Reporting Dashboards.
Benefits:
Faster decision-making.
Customizable data views.
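A small sketch of the extract, filter, and aggregate pattern that a SQL client supports, using Python's built-in sqlite3 module. The orders table, its columns, and the sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "North", 120.0), (2, "South", 80.0), (3, "North", 200.0), (4, "South", 40.0)],
)

# Extract, filter, and aggregate: total sales per region for orders over 50.
rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM orders
    WHERE amount > 50
    GROUP BY region
    ORDER BY region
    """
).fetchall()
conn.close()
```

A BI dashboard typically runs queries of this shape behind the scenes and presents the result as a chart instead of rows.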
8.
Metadata: The Data About Data
Origin & Structure
Describes where data came
from, its format, and how it's
organized.
Usage & Access
Details how data is used, who
can access it, and its
permissions.
Quality & Trust
Indicates data quality, helping
ensure trustworthiness and
reliability.
Metadata is critical for robust data governance,
enabling easy discoverability of assets and
ensuring the trustworthiness of all data within
the enterprise.
It enhances data governance, discovery, and
trust.
Types:
Business Metadata: Meaning, rules.
Technical Metadata: Schema, structure.
Operational Metadata: Usage, lineage.
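One way to picture the three types is a single metadata record for one dataset. Everything in this sketch (the table, file names, roles, and helper functions) is hypothetical and does not come from any particular catalog tool. Access roles are filed under operational metadata here as a simplifying assumption.

```python
# Hypothetical metadata record for one dataset, grouped by the three types.
customer_table_metadata = {
    "business": {
        "meaning": "One row per active customer",
        "rules": ["email must be unique"],
    },
    "technical": {
        "schema": {"customer_id": "INTEGER", "email": "TEXT"},
        "structure": "relational table",
    },
    "operational": {
        "lineage": ["crm_export.csv", "staging.customers"],
        "allowed_roles": ["analyst", "steward"],
    },
}

def can_access(metadata, role):
    """Answer an access question from metadata alone, without touching the data."""
    return role in metadata["operational"]["allowed_roles"]

def origin(metadata):
    """The first lineage entry records where the data came from."""
    return metadata["operational"]["lineage"][0]
```

Questions about origin, meaning, and access can then be answered by reading the record, which is what makes assets discoverable.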
9.
Total Data Quality Management (TDQM)
01
Continuous Monitoring
Regularly checking data against
predefined quality rules.
02
Data Cleansing
Identifying and correcting errors,
inconsistencies, or duplicates.
03
Process Improvement
Optimizing data entry and
integration workflows to prevent
future issues.
TDQM is a holistic, ongoing approach to
ensure data accuracy, completeness,
consistency, and timeliness across the
entire data lifecycle.
The outcome is highly reliable data that
supports confident business decisions,
fosters trust, and ensures regulatory
compliance.
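The monitoring and cleansing steps can be sketched in a few lines of Python. The records, rule names, and thresholds are assumptions for illustration. This sketch also simplifies cleansing to dropping bad records, where a real process might correct or quarantine them instead.

```python
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "", "age": 29},               # fails completeness
    {"id": 3, "email": "c@example.com", "age": -5},  # fails validity
    {"id": 1, "email": "a@example.com", "age": 34},  # duplicate of id 1
]

# Monitoring: predefined quality rules, each returns True when a record passes.
rules = {
    "email_present": lambda r: bool(r["email"]),
    "age_in_range": lambda r: 0 <= r["age"] <= 120,
}

def failed_rules(record):
    """Names of the rules this record violates."""
    return [name for name, check in rules.items() if not check(record)]

# Map record position to the rules it failed, for reporting.
violations = {i: failed_rules(r) for i, r in enumerate(records) if failed_rules(r)}

# Cleansing: skip duplicates by id and records that fail any rule.
seen, cleaned = set(), []
for r in records:
    if r["id"] in seen or failed_rules(r):
        continue
    seen.add(r["id"])
    cleaned.append(r)
```

Process improvement would then use the violations report to fix the entry forms or integration jobs that produced the bad records in the first place.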
10.
Layered Data Architecture: From Raw to Enriched Data
Enriched Layer
Creates tailored, high-value data products for specific business needs.
Conformed Layer
Integrates data across domains for a unified, consistent enterprise view.
Standardized Layer
Applies validation and formatting rules to ensure data consistency.
Staging/Raw Layer
Stores unaltered data directly from all source systems.
This layered approach systematically refines raw data into high-quality, actionable insights, supporting diverse
analytical and operational requirements.
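A minimal sketch of data moving up through the four layers, from raw to enriched. The two source systems (crm and billing), their field names, and the sample values are hypothetical.

```python
# Staging/Raw: data exactly as received from two hypothetical source systems.
raw = {
    "crm": [{"cust": " 101 ", "name": "asha rao"}],
    "billing": [{"customer_id": "101", "amount": "250.50"}],
}

# Standardized: apply validation and formatting rules per source.
def standardize_crm(row):
    return {"customer_id": int(row["cust"].strip()), "name": row["name"].title()}

def standardize_billing(row):
    return {"customer_id": int(row["customer_id"]), "amount": float(row["amount"])}

standardized = {
    "crm": [standardize_crm(r) for r in raw["crm"]],
    "billing": [standardize_billing(r) for r in raw["billing"]],
}

# Conformed: integrate across domains on the shared customer_id key.
names = {r["customer_id"]: r["name"] for r in standardized["crm"]}
conformed = [
    {**bill, "name": names[bill["customer_id"]]}
    for bill in standardized["billing"]
    if bill["customer_id"] in names
]

# Enriched: a data product for one business need, here total billed per customer.
enriched = {}
for row in conformed:
    enriched[row["name"]] = enriched.get(row["name"], 0.0) + row["amount"]
```

Each layer reads only from the one below it and leaves the raw layer untouched, so any refined result can be rebuilt from the original source data.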
11.
Conclusion: Why Data Architecting Matters
Strategic Blueprint
Provides the necessary structure
for managing complex data
landscapes.
Quality & Consistency
Enables consistent, high-
quality data for robust
analytics and decision-
making.
Unlock Value
Empowers organizations to
maximize the full value of their
data assets.
Next Steps: Explore hands-on modeling and practical tool usage to deepen your understanding and
application of data architecting principles.