Section 1 – Introduction and Basic Overview
What is a Data Model?
• A graphical representation of the data elements of an organization or business.
• It identifies the things about which it is important to track information (entities), the details about those things (attributes), and the relationships between those things (relationships).
• An Entity-Relationship Diagram plus entity and attribute data definitions.
• Subject-oriented and designed in Third Normal Form.
• Facilitates communication between business users and IT analysts.
Data Modeling Terminology:
• Entity: A thing that is recognized as being capable of an independent existence and that can be uniquely identified. Examples: a customer, an employee, an account, a campaign, a branch. EDW examples: Party, Agreement, Organization, Individual.
• Weak Entity: An entity that cannot be uniquely identified by its attributes alone. A weak entity can be an associative entity or a subtype entity.
Examples of associative entities: Account Party, Party Address.
Examples of subtype entities: Organization, Individual.
• Relationship: A relationship captures how two or more entities are related to one another. Example: 'employee of' is a relationship between an organization and an employee.
• Attributes: An attribute describes an entity or a relationship; entities and relationships can both have attributes. Examples: Party Start Date, Party Name, Account Number.
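To make the terms concrete, a minimal Teradata-style SQL sketch is shown below. The table and column names (Party, Acct, Acct_Party) are illustrative only and are not the actual OCBC FSLDM definitions; the foreign keys stand in for the relationships, and Acct_Party plays the role of an associative (weak) entity.

```sql
-- Illustrative only: entity = table, attribute = column, relationship = foreign key.
CREATE TABLE Party (                       -- entity: Party
    Party_Id        INTEGER NOT NULL,      -- attribute / unique identifier
    Party_Name      VARCHAR(100),          -- attribute
    Party_Start_Dt  DATE,                  -- attribute
    PRIMARY KEY (Party_Id)
);

CREATE TABLE Acct (                        -- entity: Account (Agreement)
    Acct_Num        VARCHAR(20) NOT NULL,  -- attribute / unique identifier
    Acct_Open_Dt    DATE,
    PRIMARY KEY (Acct_Num)
);

-- Associative (weak) entity: identified only through the entities it links.
CREATE TABLE Acct_Party (
    Acct_Num        VARCHAR(20) NOT NULL REFERENCES Acct (Acct_Num),
    Party_Id        INTEGER     NOT NULL REFERENCES Party (Party_Id),
    Party_Role_Cd   CHAR(2),               -- attribute of the relationship itself
    PRIMARY KEY (Acct_Num, Party_Id)
);
```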
Section 2a – Data Modeling & Data Mapping guidelines
Upstream
EDW Upstream Data Model: In EDW terms, Upstream is the combination of layers (processes, databases, jobs, and de-coupling views) through which extracted source system files are loaded from the Extraction Layer into the Staging (or Loading) Layer and then transformed via the Transform Layer into the Core EDW Layer.
Upstream Model standards:
• To be modeled and extended using the Teradata FSLDM as the base.
• Tables to be in Third Normal Form, as per the Teradata FSLDM.
• Jobs to follow the Primary Key defined by the OCBC Data Modeler.
Prerequisites:
• User Requirement or Functional Specification document, to determine the data elements to bring into EDW.
• Early Data Inventory (EDI), to understand the source system that contains the required data elements. The EDI should explain:
• the business attributes,
• the primary keys of each table,
• the relationships between the tables,
• whether the source system has files that relate to other systems.
• Data Profiling documents, to understand the data demographics and integrity of the new data elements.
• Understanding of the existing EDW model or the Teradata FSLDM.
Section 2a – Data Modeling & Data Mapping guidelines
Upstream (contd.)
Upstream Logical Data Modeling Process: Once the data elements to be brought into EDW are finalized and the EDI for the corresponding source system is available, follow the process below to prepare the data model.
1. Identify the data entity: Based on the EDI and data profiling, identify the data entity corresponding to each data element that has to be brought into EDW, e.g. customer, account, branch, employee, transaction, campaign, product, collateral, etc.
2. Group the attributes of each entity in which EDW is interested.
• Attributes describe data entities, e.g. Account (entity): status, account number, account open date, etc.
3. Refer to the FSLDM guidelines to map the identified entities to the OCBC FSLDM (a mapping sketch follows this list):
• As per the FSLDM, Customer, Branch and Employee are Party.
• Transaction is an Event.
• Collateral is a Party Asset.
4. Extend the OCBC FSLDM model if the required entity does not fit the existing model, subject to the approval of the OCBC Data Modeler.
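A minimal sketch of step 3, assuming a simplified Party table in which several source entities land and are distinguished by a type code. The column name Party_Type_Cd and the code values are hypothetical, not the actual OCBC FSLDM definitions.

```sql
-- Illustrative only: Customer, Branch and Employee all map into one FSLDM-style Party entity.
CREATE TABLE Party (
    Party_Id       INTEGER NOT NULL,
    Party_Type_Cd  CHAR(2) NOT NULL,       -- hypothetical: 'CU' customer, 'BR' branch, 'EM' employee
    Party_Name     VARCHAR(100),
    Party_Start_Dt DATE,
    PRIMARY KEY (Party_Id)
);

-- Rows from three different source entities, all stored as Party:
INSERT INTO Party VALUES (1001, 'CU', 'John Tan',       DATE '2024-01-15');
INSERT INTO Party VALUES (2001, 'BR', 'Orchard Branch', DATE '1995-06-01');
INSERT INTO Party VALUES (3001, 'EM', 'Lim Mei Ling',   DATE '2018-03-12');
```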
Section 2a – Data Modeling & Data Mapping guidelines
Upstream (contd.)
Upstream Physical Data Model creation: Physical data models are derived from the logical model's entity definitions and relationships, and are constructed to ensure that data is stored and managed in physical structures that operate effectively on the Teradata database platform. The PDM creation guidelines are listed below; an illustrative DDL sketch follows the list.
• Primary Index: Target tables should use the best candidate for the Primary Index (Unique or Non-Unique), considering the access path, data distribution, join criteria, etc.
Note: Primary Index and Primary Key are different concepts.
• Partitioned Primary Index: A PPI can be implemented to enhance the performance of large tables. Refer to the Teradata technical documentation for detailed PPI guidelines.
• Default Values: Can be used to default a value when no value is passed for a table attribute.
• ETL Control Framework Attributes: All EDW target tables (except b-key, b-map and reference tables) will have seven control framework attributes:
1. Start_Dt
2. End_Dt
3. Data_Source_Cd
4. Record_Deleted_Flag
5. BusinessDate
6. Ins_Txf_BatchID
7. Upd_Txf_BatchID
• Data Type Assignment: Data types should be assigned as part of physicalization.
• Null / Not Null Handling: NULL or NOT NULL should be assigned to each attribute.
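A minimal sketch of how these guidelines might come together in a single table DDL. The table name, business columns, data types and partition range are assumptions for illustration; only the seven control framework attribute names are taken from the list above.

```sql
-- Illustrative Teradata DDL only; names, types and the partition range are assumptions.
CREATE MULTISET TABLE EDW_Core.Acct            -- hypothetical Core EDW target table
(
    Acct_Num            VARCHAR(20)  NOT NULL,          -- business attribute
    Acct_Status_Cd      CHAR(4)      DEFAULT 'UNKN',    -- default value example
    Acct_Open_Dt        DATE,
    -- seven ETL control framework attributes (data types assumed):
    Start_Dt            DATE         NOT NULL,
    End_Dt              DATE,
    Data_Source_Cd      VARCHAR(10)  NOT NULL,
    Record_Deleted_Flag CHAR(1)      DEFAULT 'N',
    BusinessDate        DATE         NOT NULL,
    Ins_Txf_BatchID     INTEGER,
    Upd_Txf_BatchID     INTEGER
)
PRIMARY INDEX (Acct_Num)                       -- PI chosen for distribution and join path
PARTITION BY RANGE_N (BusinessDate BETWEEN DATE '2020-01-01'
                                   AND     DATE '2030-12-31'
                                   EACH INTERVAL '1' MONTH);  -- PPI for a large table
```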
Section 2a – Data Modeling & Data Mapping guidelines
Upstream (contd.)
There are two data model deliverables in Upstream:
• T - EDW Platform Upstream Data Architecture: The SDLC-standard document for the data model. It describes all the business data entities and attributes and how the data elements are modeled in OCBC EDW. It should also include all model design decisions and any deviations. The data architecture document should contain the logical and physical model diagrams with relationships, primary keys, primary indexes and other physical design considerations.
• T - EDW Platform Upstream Source Target Mapping: The data architecture document is used as the input for the source-to-target mapping document. The Source to Target Mapping document captures the detailed business rules describing how source business attributes are transformed into EDW data elements. It is used as the reference document for further ETL development. An illustrative transformation rule is sketched below.
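A minimal sketch, assuming hypothetical names, of the kind of transformation rule a Source to Target Mapping row might capture. The source system, column names and code values are all illustrative, not actual OCBC mappings.

```sql
-- Hypothetical STM rule: CASA.ACCT_STATUS (source) -> EDW_Core.Acct.Acct_Status_Cd (target).
-- Rule: decode the source status into the EDW code set; unmapped values default to 'UNKN'.
SELECT
    src.ACCT_NO                       AS Acct_Num,
    CASE src.ACCT_STATUS
        WHEN 'A' THEN 'ACTV'
        WHEN 'D' THEN 'DORM'
        WHEN 'C' THEN 'CLSD'
        ELSE          'UNKN'
    END                               AS Acct_Status_Cd
FROM Staging.CASA_Acct src;           -- staging-layer table name is assumed
```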
Section 2b – Data Modeling & Data Mapping guidelines
Downstream
EDW Downstream Data Model: In EDW terms, Downstream is the combination of layers (processes, databases, tables and views) through which user requirements are modeled and met, after applying business and technical transformation rules to data read via the customized OCBC FSLDM.
Downstream Model standards:
• To be modelled to support user requirements and must be scalable.
• To be modelled so that it is easy to maintain and tuned to meet the SLA with the best possible performance.
• To be modelled with enterprise-level reporting (including geographies) in mind.
• To be modelled with the various security attributes at the most granular level.
• Tables must not be skewed (a skew-check sketch follows this list).
• Jobs to follow the Primary Key defined by the OCBC Data Modeler.
• Target tables should use the best candidate for the Primary Index (Unique or Non-Unique), considering the access path, distribution, join criteria, etc.
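A minimal sketch of one common way to check skew on Teradata: count rows per AMP for a candidate Primary Index column. The table and column names are hypothetical; space-based checks against DBC views are another usual approach.

```sql
-- Illustrative skew check (hypothetical table/column): rows per AMP for a candidate PI.
-- Roughly even counts per AMP indicate good distribution; large differences indicate skew.
SELECT
    HASHAMP(HASHBUCKET(HASHROW(Acct_Num))) AS Amp_No,
    COUNT(*)                               AS Row_Cnt
FROM EDW_Mart.Txn_Summary
GROUP BY 1
ORDER BY 2 DESC;
```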
Prerequisite: A User Requirement or Functional Specification document, to determine the data elements to be reported to users, which:
• should explain the business rules used to derive the reporting data elements,
• should specify the reporting level of each data element,
• should have details of the reporting hierarchies.
Section 2b – Data Modeling & Data Mapping guidelines
Downstream (contd.)
There are two data model deliverables in Downstream:
• T - EDW Platform Downstream Data Architecture: The SDLC-standard document for the data model. It describes all the business data entities and attributes and how the data elements are modeled in the OCBC EDW downstream data mart to support the user reporting requirements. It should also include all model design decisions and any deviations. The data architecture document should contain the logical and physical model diagrams with relationships, primary keys, primary indexes and other physical design considerations.
• T - EDW Platform Downstream Source Target Mapping: The data architecture document is used as the input for the source-to-target mapping document. The Source to Target Mapping document captures the detailed business rules describing how source business attributes are transformed into EDW data elements. It is used as the reference document for further ETL development.
Section 3a – Naming standards and conventions (contd.)
Column Name Convention: Column names follow the rules below.
• Keep the column name the same as the parent column in Core EDW.
• Remove the vowels (a, e, i, o and u), but do not remove a vowel that is the first letter of a word, for example the "A" in Amount. If the column name is still longer than 30 characters, please contact the architecture or the standards and guidelines team.
• Keep the column name in mixed case (first letter of each word capitalized), for example Data_Source_Cd instead of DATA_SOURCE_CD or Data_SOURCE_Cd.
• Control columns should be brought over as-is from Core EDW; the above rules do not apply to them.
• Example: Suppose the measure Count of Transactions is not part of Core EDW.
• Initial column name: Count_Of_Transactions / Transaction_Count
• After removing the vowels (a, e, i, o, u): Cnt_Of_Trnsctn / Trnsctn_Cnt
• After applying mixed case (first letter capitalized): Cnt_Of_Trnsctn / Trnsctn_Cnt (no further change)
• Control column rule: the column above is not a control column, so the rules apply. A control column such as Data_Source_Cd would be brought over as-is. A DDL sketch of the result is shown below.
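A minimal sketch, assuming a hypothetical downstream mart table, showing the abbreviated measure column alongside a control column kept as-is from Core EDW. The table name and the other columns are illustrative only.

```sql
-- Illustrative only: downstream column names after the naming rules (table name assumed).
CREATE MULTISET TABLE EDW_Mart.Acct_Mnthly_Smmry
(
    Acct_Num        VARCHAR(20) NOT NULL,   -- kept the same as the Core EDW parent column
    Trnsctn_Cnt     INTEGER,                -- "Transaction_Count" with vowels removed, mixed case
    Data_Source_Cd  VARCHAR(10) NOT NULL    -- control column brought over as-is
)
PRIMARY INDEX (Acct_Num);
```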
Section 4b – Best Practices (contd.)
Data Model Creation:
Below are the key steps in data modeling:
• Identify entities
• Identify attributes of entities
• Identify relationships
• Apply naming conventions
• Assign keys and indexes
• Normalize to reduce data redundancy
• Denormalize to improve performance
Section 5 – Checklist / Self-Review Process
Data Mapping:
• Please start the self-review after completing the data mapping document. Each and every column in the document should provide useful and meaningful information. Provide the information in the transformation rules as expected by the OCBC reviewer.
• Once the self-review is done, it is suggested to have an internal review by the Data Mapper before proceeding to the OCBC BADM team review, to avoid rework / escalations.
Data Modeling:
• Check closely that all names follow the correct naming conventions.
• Create domains to provide the flexibility of changing a data type.
• Create indexes as per the OCBC standards.
• Assign data types to all columns.
• Provide table and column definitions for all tables.
• Assign NULL or NOT NULL to all columns.
• Create default values, if any.
• Attach the abbreviation file with all the meaningful abbreviated table and column names.
• Generate the DDLs from Erwin, add the database name for the data mart, and send them to the OCBC Data Modeler for review. Once he/she approves, the DBA will be asked to deploy them in the respective environment.
• Make the data model presentable to the client / business users by coloring the entities and the relationships / columns.
Section 6 – DOs and DON'Ts
Please do a self-review after completing the Data Model & Data Mapping. All standards should be met as per the client's expectations.
• For Data Modeling, check all data types, null / not-null fields and indexes; generate the DDLs and pass them to the OCBC DM for approval, and he/she will pass them to the DBA to create the table structures.
• For Data Mapping, all columns should be filled in with proper formatting, coloring (if any), font, size, uppercase / lowercase, etc.
• Every page of the mapping document should be updated with the mapping name.
• Extraction criteria should be clearly mentioned.
THANK YOU