SlideShare a Scribd company logo
The ETL (Extract-Transform-Load)
Process
Data Warehousing and Data Mining
2
© Prentice Hall, 2002
The ETL Process
 Capture
 Scrub or data cleansing
 Transform
 Load and Index
ETL = Extract, transform, and load
3
© Prentice Hall, 2002
Figure 11-10: Steps in data reconciliation
Static extract = capturing a
snapshot of the source data at a
point in time
Incremental extract = capturing
changes that have occurred since
the last static extract
Capture = extract…obtaining a
snapshot of a chosen subset of the
source data for loading into the data
warehouse
4
© Prentice Hall, 2002
Figure 11-10: Steps in data reconciliation (continued)
Scrub = cleanse…uses pattern
recognition and AI techniques to
upgrade data quality
Fixing errors: misspellings,
erroneous dates, incorrect field
usage, mismatched addresses,
missing data, duplicate data,
inconsistencies
Also: decoding, reformatting, time
stamping, conversion, key
generation, merging, error
detection/logging, locating missing
data
5
© Prentice Hall, 2002
Figure 11-10: Steps in data reconciliation (continued)
Transform = convert data from
format of operational system to
format of data warehouse
Record-level:
Selection – data partitioning
Joining – data combining
Aggregation – data summarization
Field-level:
single-field – from one field to one
field
multi-field – from many fields to one,
or one field to many
6
© Prentice Hall, 2002
Figure 11-10: Steps in data reconciliation (continued)
Load/Index= place transformed data
into the warehouse and create
indexes
Refresh mode: bulk rewriting of
target data at periodic intervals
Update mode: only changes in
source data are written to data
warehouse
7
© Prentice Hall, 2002
Figure 11-11: Single-field transformation
In general – some transformation
function translates data from old form to
new form
Algorithmic transformation uses
a formula or logical expression
Table lookup
– another
approach
8
© Prentice Hall, 2002
Figure 11-12: Multifield transformation
M:1 –from many source
fields to one target field
1:M –from one
source field to
many target
fields
Data Characteristic after ETL Process
 Detailed
 Detailed rather than summarized
 Historical
 Periodic data
 Normalized
 3rd NF or higher
 Comprehensive
 Enterprise-wide perspective
 Timely
 Up-to-date (do not need be real-time)
 Quality Controlled
 Unquestioned quality
10
© Prentice Hall, 2002
Derived Data
 Objectives
 Ease of use for decision support applications
 Fast response to predefined user queries
 Customized data for particular target audiences
 Ad-hoc query support
 Data mining capabilities
  Characteristics
 Detailed (mostly periodic) data
 Aggregate (for summary)
 Distributed (to departmental servers)
Most common data model = star schema
(also called “dimensional model”)
11
© Prentice Hall, 2002
Figure 11-13: Components of a star schema
Fact tables contain
factual or quantitative
data
Dimension tables
contain descriptions
about the subjects of the
business
1:N relationship
between dimension
tables and fact
tables
Excellent for ad-hoc queries,
but bad for online transaction processing
Dimension tables
are denormalized to
maximize
performance
12
© Prentice Hall, 2002
Figure 11-14: Star schema example
Fact table provides statistics for sales
broken down by product, period and
store dimensions
13
© Prentice Hall, 2002
Figure 11-15: Star schema with sample data

More Related Content

Similar to 06_The ETL (Extract-Transform-Load) Process.ppt

Managing Data Integration Initiatives
Managing Data Integration InitiativesManaging Data Integration Initiatives
Managing Data Integration Initiatives
AllinConsulting
 
Etl process in data warehouse
Etl process in data warehouseEtl process in data warehouse
Etl process in data warehouse
Komal Choudhary
 
GROPSIKS.pptx
GROPSIKS.pptxGROPSIKS.pptx
GROPSIKS.pptx
avanceregine312
 
ETL_Methodology.pptx
ETL_Methodology.pptxETL_Methodology.pptx
ETL_Methodology.pptx
yogeshsuryawanshi47
 
Pentaho etl-tool
Pentaho etl-toolPentaho etl-tool
Pentaho etl-tool
Sreenivas Kappala
 
Modern database management jeffrey a. hoffer, mary b. prescott,
Modern database management   jeffrey a. hoffer, mary b. prescott,  Modern database management   jeffrey a. hoffer, mary b. prescott,
Modern database management jeffrey a. hoffer, mary b. prescott,
BlackIce86
 
Lecture 3 note.pptx
Lecture 3 note.pptxLecture 3 note.pptx
Lecture 3 note.pptx
TesfanehGorfu
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Subrata Kumer Paul
 
extract, transform, load_Data Analyt.ppt
extract, transform, load_Data Analyt.pptextract, transform, load_Data Analyt.ppt
extract, transform, load_Data Analyt.ppt
Neerupa Chauhan
 
expressor customer webinar with American Tower
expressor customer webinar with American Towerexpressor customer webinar with American Tower
expressor customer webinar with American Tower
guest2295a71
 
Data Warehouse Basic Guide
Data Warehouse Basic GuideData Warehouse Basic Guide
Data Warehouse Basic Guide
thomasmary607
 
ETL Process & Data Warehouse Fundamentals
ETL Process & Data Warehouse FundamentalsETL Process & Data Warehouse Fundamentals
ETL Process & Data Warehouse Fundamentals
SOMASUNDARAM T
 
Intro to Data warehousing lecture 10
Intro to Data warehousing   lecture 10Intro to Data warehousing   lecture 10
Intro to Data warehousing lecture 10
AnwarrChaudary
 
Etl data processing system which is very useful for the engineering students
Etl data processing system which is very useful for the engineering studentsEtl data processing system which is very useful for the engineering students
Etl data processing system which is very useful for the engineering students
utsav25khel
 
Chap11
Chap11Chap11
Chap11
lolanyunyu
 
Etl techniques
Etl techniquesEtl techniques
Etl techniques
mahezabeenIlkal
 
Data warehouse 5: Data Reconciliation and Transformation in Data Warehouse
Data warehouse 5: Data Reconciliation and Transformation in Data WarehouseData warehouse 5: Data Reconciliation and Transformation in Data Warehouse
Data warehouse 5: Data Reconciliation and Transformation in Data Warehouse
Vaibhav Khanna
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
vivekjv
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
PanaEk Warawit
 
D01 etl
D01 etlD01 etl
D01 etl
Prince Jain
 

Similar to 06_The ETL (Extract-Transform-Load) Process.ppt (20)

Managing Data Integration Initiatives
Managing Data Integration InitiativesManaging Data Integration Initiatives
Managing Data Integration Initiatives
 
Etl process in data warehouse
Etl process in data warehouseEtl process in data warehouse
Etl process in data warehouse
 
GROPSIKS.pptx
GROPSIKS.pptxGROPSIKS.pptx
GROPSIKS.pptx
 
ETL_Methodology.pptx
ETL_Methodology.pptxETL_Methodology.pptx
ETL_Methodology.pptx
 
Pentaho etl-tool
Pentaho etl-toolPentaho etl-tool
Pentaho etl-tool
 
Modern database management jeffrey a. hoffer, mary b. prescott,
Modern database management   jeffrey a. hoffer, mary b. prescott,  Modern database management   jeffrey a. hoffer, mary b. prescott,
Modern database management jeffrey a. hoffer, mary b. prescott,
 
Lecture 3 note.pptx
Lecture 3 note.pptxLecture 3 note.pptx
Lecture 3 note.pptx
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
 
extract, transform, load_Data Analyt.ppt
extract, transform, load_Data Analyt.pptextract, transform, load_Data Analyt.ppt
extract, transform, load_Data Analyt.ppt
 
expressor customer webinar with American Tower
expressor customer webinar with American Towerexpressor customer webinar with American Tower
expressor customer webinar with American Tower
 
Data Warehouse Basic Guide
Data Warehouse Basic GuideData Warehouse Basic Guide
Data Warehouse Basic Guide
 
ETL Process & Data Warehouse Fundamentals
ETL Process & Data Warehouse FundamentalsETL Process & Data Warehouse Fundamentals
ETL Process & Data Warehouse Fundamentals
 
Intro to Data warehousing lecture 10
Intro to Data warehousing   lecture 10Intro to Data warehousing   lecture 10
Intro to Data warehousing lecture 10
 
Etl data processing system which is very useful for the engineering students
Etl data processing system which is very useful for the engineering studentsEtl data processing system which is very useful for the engineering students
Etl data processing system which is very useful for the engineering students
 
Chap11
Chap11Chap11
Chap11
 
Etl techniques
Etl techniquesEtl techniques
Etl techniques
 
Data warehouse 5: Data Reconciliation and Transformation in Data Warehouse
Data warehouse 5: Data Reconciliation and Transformation in Data WarehouseData warehouse 5: Data Reconciliation and Transformation in Data Warehouse
Data warehouse 5: Data Reconciliation and Transformation in Data Warehouse
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 
D01 etl
D01 etlD01 etl
D01 etl
 

More from INyomanSwitrayana

sa.ppt
sa.pptsa.ppt
ch8Bayes.ppt
ch8Bayes.pptch8Bayes.ppt
ch8Bayes.ppt
INyomanSwitrayana
 
IS
ISIS
Handout-INF106-SBD-3.pptx
Handout-INF106-SBD-3.pptxHandout-INF106-SBD-3.pptx
Handout-INF106-SBD-3.pptx
INyomanSwitrayana
 
05_Decision Support and OLAP.pdf
05_Decision Support and OLAP.pdf05_Decision Support and OLAP.pdf
05_Decision Support and OLAP.pdf
INyomanSwitrayana
 
05_Skema Database.ppt
05_Skema Database.ppt05_Skema Database.ppt
05_Skema Database.ppt
INyomanSwitrayana
 
CS269-01 (1).pptx
CS269-01 (1).pptxCS269-01 (1).pptx
CS269-01 (1).pptx
INyomanSwitrayana
 
Lecture2-DT.pptx
Lecture2-DT.pptxLecture2-DT.pptx
Lecture2-DT.pptx
INyomanSwitrayana
 

More from INyomanSwitrayana (8)

sa.ppt
sa.pptsa.ppt
sa.ppt
 
ch8Bayes.ppt
ch8Bayes.pptch8Bayes.ppt
ch8Bayes.ppt
 
IS
ISIS
IS
 
Handout-INF106-SBD-3.pptx
Handout-INF106-SBD-3.pptxHandout-INF106-SBD-3.pptx
Handout-INF106-SBD-3.pptx
 
05_Decision Support and OLAP.pdf
05_Decision Support and OLAP.pdf05_Decision Support and OLAP.pdf
05_Decision Support and OLAP.pdf
 
05_Skema Database.ppt
05_Skema Database.ppt05_Skema Database.ppt
05_Skema Database.ppt
 
CS269-01 (1).pptx
CS269-01 (1).pptxCS269-01 (1).pptx
CS269-01 (1).pptx
 
Lecture2-DT.pptx
Lecture2-DT.pptxLecture2-DT.pptx
Lecture2-DT.pptx
 

Recently uploaded

LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
Anant Corporation
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
KrishnaveniKrishnara1
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
GauravCar
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
shadow0702a
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
VICTOR MAESTRE RAMIREZ
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
IJECEIAES
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
IJECEIAES
 
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
Las Vegas Warehouse
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
gowrishankartb2005
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
Divyanshu
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
Introduction to AI Safety (public presentation).pptx
Introduction to AI Safety (public presentation).pptxIntroduction to AI Safety (public presentation).pptx
Introduction to AI Safety (public presentation).pptx
MiscAnnoy1
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
171ticu
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
Madan Karki
 
International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
gerogepatton
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
insn4465
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
bijceesjournal
 
john krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptxjohn krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptx
Madan Karki
 

Recently uploaded (20)

LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
 
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
Introduction to AI Safety (public presentation).pptx
Introduction to AI Safety (public presentation).pptxIntroduction to AI Safety (public presentation).pptx
Introduction to AI Safety (public presentation).pptx
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
 
International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
 
john krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptxjohn krisinger-the science and history of the alcoholic beverage.pptx
john krisinger-the science and history of the alcoholic beverage.pptx
 

06_The ETL (Extract-Transform-Load) Process.ppt

  • 1. The ETL (Extract-Transform-Load) Process Data Warehousing and Data Mining
  • 2. 2 © Prentice Hall, 2002 The ETL Process  Capture  Scrub or data cleansing  Transform  Load and Index ETL = Extract, transform, and load
  • 3. 3 © Prentice Hall, 2002 Figure 11-10: Steps in data reconciliation Static extract = capturing a snapshot of the source data at a point in time Incremental extract = capturing changes that have occurred since the last static extract Capture = extract…obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse
  • 4. 4 © Prentice Hall, 2002 Figure 11-10: Steps in data reconciliation (continued) Scrub = cleanse…uses pattern recognition and AI techniques to upgrade data quality Fixing errors: misspellings, erroneous dates, incorrect field usage, mismatched addresses, missing data, duplicate data, inconsistencies Also: decoding, reformatting, time stamping, conversion, key generation, merging, error detection/logging, locating missing data
  • 5. 5 © Prentice Hall, 2002 Figure 11-10: Steps in data reconciliation (continued) Transform = convert data from format of operational system to format of data warehouse Record-level: Selection – data partitioning Joining – data combining Aggregation – data summarization Field-level: single-field – from one field to one field multi-field – from many fields to one, or one field to many
  • 6. 6 © Prentice Hall, 2002 Figure 11-10: Steps in data reconciliation (continued) Load/Index= place transformed data into the warehouse and create indexes Refresh mode: bulk rewriting of target data at periodic intervals Update mode: only changes in source data are written to data warehouse
  • 7. 7 © Prentice Hall, 2002 Figure 11-11: Single-field transformation In general – some transformation function translates data from old form to new form Algorithmic transformation uses a formula or logical expression Table lookup – another approach
  • 8. 8 © Prentice Hall, 2002 Figure 11-12: Multifield transformation M:1 –from many source fields to one target field 1:M –from one source field to many target fields
  • 9. Data Characteristic after ETL Process  Detailed  Detailed rather than summarized  Historical  Periodic data  Normalized  3rd NF or higher  Comprehensive  Enterprise-wide perspective  Timely  Up-to-date (do not need be real-time)  Quality Controlled  Unquestioned quality
  • 10. 10 © Prentice Hall, 2002 Derived Data  Objectives  Ease of use for decision support applications  Fast response to predefined user queries  Customized data for particular target audiences  Ad-hoc query support  Data mining capabilities   Characteristics  Detailed (mostly periodic) data  Aggregate (for summary)  Distributed (to departmental servers) Most common data model = star schema (also called “dimensional model”)
  • 11. 11 © Prentice Hall, 2002 Figure 11-13: Components of a star schema Fact tables contain factual or quantitative data Dimension tables contain descriptions about the subjects of the business 1:N relationship between dimension tables and fact tables Excellent for ad-hoc queries, but bad for online transaction processing Dimension tables are denormalized to maximize performance
  • 12. 12 © Prentice Hall, 2002 Figure 11-14: Star schema example Fact table provides statistics for sales broken down by product, period and store dimensions
  • 13. 13 © Prentice Hall, 2002 Figure 11-15: Star schema with sample data