SlideShare a Scribd company logo
1 of 10
DATA TRANSFORMATION
By
V . TEJASWINI
1117-039-16-135
&
UDAY ANAND
1117-039-16-149
DATA TRANSFORMATION
• The data extracted from the source system cann’t be stored
directly in the data warehouse mainly bcz of 2 reasons :
• First , this is raw data that must be processed to be made usable
in the data warehouse . the data in the operational system is not
usable for making strategic decisions.
• Second , bcz operational data is extracted from many old legacy
systems , the quality of data in those systems may not be good
enough for the data warehouse.
TASKS INVOLVED IN DATA TRANSFORMATION
• FORMAT REVISION :
These revisions include changes to the data types and
lengths of the individual data fields.
• DECODING OF FIELDS:
When the data comes from multiple source systems , the
same data items may have been described by different field values.
• SPLITTING OF FIELDS:
Earlier legacy systems stored names and addresses in
• MERGING OF INFORMATION:
This type of data transformation is neither the opposite
of the previous task nor it means merging a number of fields to
form a single field ; instead , it means bringing together the
relevant information from different data sources.
• CHARACTER SET CONVERSION :
This type of data transformation is done to the textual
data to convert its character set to an agreed standard character set.
• CONVERSIONS OF UNITS :
Many companies have global branches. So , the sales
amount may be represented in different currencies in different
source systems.
CONTINUED…
• DATE AND TIME CONVERSION:
The date and time values also need to be
represented in a standard format.
• SUMMARIZATION:
This type of transformation is done to derive
summarized/aggregate data from the most granular
data . the summarized data will then be loaded in the
data warehouse instead of loading the most granular
level of data.
• KEY RESTRUCTURING:
While Extracting data
from the data sources , you have
to form the primary keys for the
fact tables and the dimensions
tables. You cannot keep the
primary keys of the source data
tables as the primary keys for
the fact and dimension tables
bcz the primary keys of the
source data have built-in
• DE-DUPLICATION:
In many companies, the records for the same customer
may be stored in many files.
Generally , ETL software is of 2 types-
• That produces code and the other that produces a
parameterized run time module.
• The ETL software that generates run time module requires
the legacy data to be flattened before it can be accessed.
ROLE OF DATA TRANSFORMATION
PROCESS
• Map the input data from the source systems to data for data
warehouse repository.
• Clean the data, fill all the missing values by some default value.
• Remove duplicate the records so that they may be stored only
once in the data warehouse.
• Perform splitting and merging of fields.
• Sort the records.
• De-normalize the extracted data according to the dimensional
model of the data warehouse.
• Convert to appropriate data types.
• Perform aggregations and summarizations .
• Inspect the data for referential integrity.
• Consolidation and integration of data from multiple
source systems.
Data transformation

More Related Content

What's hot (20)

Data Preparation.pptx
Data Preparation.pptxData Preparation.pptx
Data Preparation.pptx
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data Mining
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Types of Data Validation
Types of Data ValidationTypes of Data Validation
Types of Data Validation
 
Data Exploration.pptx
Data Exploration.pptxData Exploration.pptx
Data Exploration.pptx
 
Data Wrangling_1.pptx
Data Wrangling_1.pptxData Wrangling_1.pptx
Data Wrangling_1.pptx
 
Missing data handling
Missing data handlingMissing data handling
Missing data handling
 
DATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MININGDATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MINING
 
comparison of CRD, RBD and LSD
comparison of CRD, RBD and LSDcomparison of CRD, RBD and LSD
comparison of CRD, RBD and LSD
 
Kdd process
Kdd processKdd process
Kdd process
 
Neural network
Neural networkNeural network
Neural network
 
Query processing
Query processingQuery processing
Query processing
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
 
Decision tree
Decision treeDecision tree
Decision tree
 
Statistics and data science
Statistics and data scienceStatistics and data science
Statistics and data science
 
Data Visualization
Data VisualizationData Visualization
Data Visualization
 
Weka presentation
Weka presentationWeka presentation
Weka presentation
 
Exploratory data analysis data visualization
Exploratory data analysis data visualizationExploratory data analysis data visualization
Exploratory data analysis data visualization
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 

Similar to Data transformation

Etl - Extract Transform Load
Etl - Extract Transform LoadEtl - Extract Transform Load
Etl - Extract Transform LoadABDUL KHALIQ
 
Get started with data migration
Get started with data migrationGet started with data migration
Get started with data migrationThinqloud
 
Merging Multiple Drug Safety and Pharmacovigilance Databases
Merging Multiple Drug Safety and Pharmacovigilance DatabasesMerging Multiple Drug Safety and Pharmacovigilance Databases
Merging Multiple Drug Safety and Pharmacovigilance DatabasesPerficient
 
Extract, Transform and Load.pptx
Extract, Transform and Load.pptxExtract, Transform and Load.pptx
Extract, Transform and Load.pptxJesusaEspeleta
 
ETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingVibrant Event
 
ETL Testing - Introduction to ETL Testing
ETL Testing - Introduction to ETL TestingETL Testing - Introduction to ETL Testing
ETL Testing - Introduction to ETL TestingVibrant Event
 
INTRODUCTION TO RDBMS
INTRODUCTION TO RDBMSINTRODUCTION TO RDBMS
INTRODUCTION TO RDBMSVidhya Balan
 
Unit 1: Introduction to DBMS Unit 1 Complete
Unit 1: Introduction to DBMS Unit 1 CompleteUnit 1: Introduction to DBMS Unit 1 Complete
Unit 1: Introduction to DBMS Unit 1 CompleteRaj vardhan
 
Beginning Of DBMS (data base)
Beginning Of DBMS (data base)Beginning Of DBMS (data base)
Beginning Of DBMS (data base)Surya Swaroop
 
Etl process in data warehouse
Etl process in data warehouseEtl process in data warehouse
Etl process in data warehouseKomal Choudhary
 
TPC-DI - The First Industry Benchmark for Data Integration
TPC-DI - The First Industry Benchmark for Data IntegrationTPC-DI - The First Industry Benchmark for Data Integration
TPC-DI - The First Industry Benchmark for Data IntegrationTilmann Rabl
 
Data Warehouse By Piyush
Data Warehouse By PiyushData Warehouse By Piyush
Data Warehouse By Piyushastronish
 
dbms ppt.pptx database management system
dbms ppt.pptx database management systemdbms ppt.pptx database management system
dbms ppt.pptx database management systemdeeptipal230
 

Similar to Data transformation (20)

Etl - Extract Transform Load
Etl - Extract Transform LoadEtl - Extract Transform Load
Etl - Extract Transform Load
 
Get started with data migration
Get started with data migrationGet started with data migration
Get started with data migration
 
Chapter 6.pptx
Chapter 6.pptxChapter 6.pptx
Chapter 6.pptx
 
Merging Multiple Drug Safety and Pharmacovigilance Databases
Merging Multiple Drug Safety and Pharmacovigilance DatabasesMerging Multiple Drug Safety and Pharmacovigilance Databases
Merging Multiple Drug Safety and Pharmacovigilance Databases
 
ETL_Methodology.pptx
ETL_Methodology.pptxETL_Methodology.pptx
ETL_Methodology.pptx
 
Extract, Transform and Load.pptx
Extract, Transform and Load.pptxExtract, Transform and Load.pptx
Extract, Transform and Load.pptx
 
ETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testing
 
ETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testingETL Testing - Introduction to ETL testing
ETL Testing - Introduction to ETL testing
 
ETL Testing - Introduction to ETL Testing
ETL Testing - Introduction to ETL TestingETL Testing - Introduction to ETL Testing
ETL Testing - Introduction to ETL Testing
 
INTRODUCTION TO RDBMS
INTRODUCTION TO RDBMSINTRODUCTION TO RDBMS
INTRODUCTION TO RDBMS
 
DBMS LEC-1PDF.pdf
DBMS LEC-1PDF.pdfDBMS LEC-1PDF.pdf
DBMS LEC-1PDF.pdf
 
Unit 1: Introduction to DBMS Unit 1 Complete
Unit 1: Introduction to DBMS Unit 1 CompleteUnit 1: Introduction to DBMS Unit 1 Complete
Unit 1: Introduction to DBMS Unit 1 Complete
 
Beginning Of DBMS (data base)
Beginning Of DBMS (data base)Beginning Of DBMS (data base)
Beginning Of DBMS (data base)
 
Etl process in data warehouse
Etl process in data warehouseEtl process in data warehouse
Etl process in data warehouse
 
Datawarehouse org
Datawarehouse orgDatawarehouse org
Datawarehouse org
 
TPC-DI - The First Industry Benchmark for Data Integration
TPC-DI - The First Industry Benchmark for Data IntegrationTPC-DI - The First Industry Benchmark for Data Integration
TPC-DI - The First Industry Benchmark for Data Integration
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Warehouse By Piyush
Data Warehouse By PiyushData Warehouse By Piyush
Data Warehouse By Piyush
 
dbms ppt.pptx database management system
dbms ppt.pptx database management systemdbms ppt.pptx database management system
dbms ppt.pptx database management system
 
database1.pdf
database1.pdfdatabase1.pdf
database1.pdf
 

Recently uploaded

Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with CultureSeta Wicaksana
 
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876dlhescort
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Centuryrwgiffor
 
Eluru Call Girls Service ☎ ️93326-06886 ❤️‍🔥 Enjoy 24/7 Escort Service
Eluru Call Girls Service ☎ ️93326-06886 ❤️‍🔥 Enjoy 24/7 Escort ServiceEluru Call Girls Service ☎ ️93326-06886 ❤️‍🔥 Enjoy 24/7 Escort Service
Eluru Call Girls Service ☎ ️93326-06886 ❤️‍🔥 Enjoy 24/7 Escort ServiceDamini Dixit
 
Call Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine ServiceCall Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine Serviceritikaroy0888
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesDipal Arora
 
John Halpern sued for sexual assault.pdf
John Halpern sued for sexual assault.pdfJohn Halpern sued for sexual assault.pdf
John Halpern sued for sexual assault.pdfAmzadHosen3
 
Falcon's Invoice Discounting: Your Path to Prosperity
Falcon's Invoice Discounting: Your Path to ProsperityFalcon's Invoice Discounting: Your Path to Prosperity
Falcon's Invoice Discounting: Your Path to Prosperityhemanthkumar470700
 
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLMONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLSeo
 
Monthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxMonthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxAndy Lambert
 
Call Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service NoidaCall Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service Noidadlhescort
 
Call Girls Zirakpur👧 Book Now📱7837612180 📞👉Call Girl Service In Zirakpur No A...
Call Girls Zirakpur👧 Book Now📱7837612180 📞👉Call Girl Service In Zirakpur No A...Call Girls Zirakpur👧 Book Now📱7837612180 📞👉Call Girl Service In Zirakpur No A...
Call Girls Zirakpur👧 Book Now📱7837612180 📞👉Call Girl Service In Zirakpur No A...Sheetaleventcompany
 
Phases of Negotiation .pptx
 Phases of Negotiation .pptx Phases of Negotiation .pptx
Phases of Negotiation .pptxnandhinijagan9867
 
FULL ENJOY Call Girls In Majnu Ka Tilla, Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Majnu Ka Tilla, Delhi Contact Us 8377877756FULL ENJOY Call Girls In Majnu Ka Tilla, Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Majnu Ka Tilla, Delhi Contact Us 8377877756dollysharma2066
 
Insurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageInsurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageMatteo Carbone
 
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
How to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League CityHow to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League CityEric T. Tung
 
Dr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdfDr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdfAdmir Softic
 

Recently uploaded (20)

Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with Culture
 
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Century
 
Eluru Call Girls Service ☎ ️93326-06886 ❤️‍🔥 Enjoy 24/7 Escort Service
Eluru Call Girls Service ☎ ️93326-06886 ❤️‍🔥 Enjoy 24/7 Escort ServiceEluru Call Girls Service ☎ ️93326-06886 ❤️‍🔥 Enjoy 24/7 Escort Service
Eluru Call Girls Service ☎ ️93326-06886 ❤️‍🔥 Enjoy 24/7 Escort Service
 
Call Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine ServiceCall Girls In Panjim North Goa 9971646499 Genuine Service
Call Girls In Panjim North Goa 9971646499 Genuine Service
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
 
John Halpern sued for sexual assault.pdf
John Halpern sued for sexual assault.pdfJohn Halpern sued for sexual assault.pdf
John Halpern sued for sexual assault.pdf
 
Falcon's Invoice Discounting: Your Path to Prosperity
Falcon's Invoice Discounting: Your Path to ProsperityFalcon's Invoice Discounting: Your Path to Prosperity
Falcon's Invoice Discounting: Your Path to Prosperity
 
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLMONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
 
Monthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxMonthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptx
 
Call Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service NoidaCall Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service Noida
 
Call Girls Zirakpur👧 Book Now📱7837612180 📞👉Call Girl Service In Zirakpur No A...
Call Girls Zirakpur👧 Book Now📱7837612180 📞👉Call Girl Service In Zirakpur No A...Call Girls Zirakpur👧 Book Now📱7837612180 📞👉Call Girl Service In Zirakpur No A...
Call Girls Zirakpur👧 Book Now📱7837612180 📞👉Call Girl Service In Zirakpur No A...
 
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabiunwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
 
Phases of Negotiation .pptx
 Phases of Negotiation .pptx Phases of Negotiation .pptx
Phases of Negotiation .pptx
 
VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Greater Kailash ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
FULL ENJOY Call Girls In Majnu Ka Tilla, Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Majnu Ka Tilla, Delhi Contact Us 8377877756FULL ENJOY Call Girls In Majnu Ka Tilla, Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Majnu Ka Tilla, Delhi Contact Us 8377877756
 
Insurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usageInsurers' journeys to build a mastery in the IoT usage
Insurers' journeys to build a mastery in the IoT usage
 
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
How to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League CityHow to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League City
 
Dr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdfDr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdf
 

Data transformation

  • 1. DATA TRANSFORMATION By V . TEJASWINI 1117-039-16-135 & UDAY ANAND 1117-039-16-149
  • 2. DATA TRANSFORMATION • The data extracted from the source system cann’t be stored directly in the data warehouse mainly bcz of 2 reasons : • First , this is raw data that must be processed to be made usable in the data warehouse . the data in the operational system is not usable for making strategic decisions. • Second , bcz operational data is extracted from many old legacy systems , the quality of data in those systems may not be good enough for the data warehouse.
  • 3. TASKS INVOLVED IN DATA TRANSFORMATION • FORMAT REVISION : These revisions include changes to the data types and lengths of the individual data fields. • DECODING OF FIELDS: When the data comes from multiple source systems , the same data items may have been described by different field values. • SPLITTING OF FIELDS: Earlier legacy systems stored names and addresses in
  • 4. • MERGING OF INFORMATION: This type of data transformation is neither the opposite of the previous task nor it means merging a number of fields to form a single field ; instead , it means bringing together the relevant information from different data sources. • CHARACTER SET CONVERSION : This type of data transformation is done to the textual data to convert its character set to an agreed standard character set. • CONVERSIONS OF UNITS : Many companies have global branches. So , the sales amount may be represented in different currencies in different source systems.
  • 5. CONTINUED… • DATE AND TIME CONVERSION: The date and time values also need to be represented in a standard format. • SUMMARIZATION: This type of transformation is done to derive summarized/aggregate data from the most granular data . the summarized data will then be loaded in the data warehouse instead of loading the most granular level of data.
  • 6. • KEY RESTRUCTURING: While Extracting data from the data sources , you have to form the primary keys for the fact tables and the dimensions tables. You cannot keep the primary keys of the source data tables as the primary keys for the fact and dimension tables bcz the primary keys of the source data have built-in
  • 7. • DE-DUPLICATION: In many companies, the records for the same customer may be stored in many files. Generally , ETL software is of 2 types- • That produces code and the other that produces a parameterized run time module. • The ETL software that generates run time module requires the legacy data to be flattened before it can be accessed.
  • 8. ROLE OF DATA TRANSFORMATION PROCESS • Map the input data from the source systems to data for data warehouse repository. • Clean the data, fill all the missing values by some default value. • Remove duplicate the records so that they may be stored only once in the data warehouse. • Perform splitting and merging of fields. • Sort the records. • De-normalize the extracted data according to the dimensional model of the data warehouse.
  • 9. • Convert to appropriate data types. • Perform aggregations and summarizations . • Inspect the data for referential integrity. • Consolidation and integration of data from multiple source systems.