Scenario:
XYZ Corp. is a parent corporation with 2 handbag stores located in New Jersey and New York.
XYZ needs to set up a system that will gather customer data from all of the different stores and
put it into one place.
Blindly copying data files from each store is not going to be good enough most of the time.
The corporation needs to have a standardized set of data in order to analyze it.
Task:
You will need to do the following:
1. Create a DataMart
   Create a table called: DimXYZCustomers
   Create using: DimSQLCustomers.sql
2. Build an SSIS ETL solution to get the data from the 2 stores and load it into a single data table
   Review/analyze the data from all sources (source files attached).
   Determine what needs to be standardized based on the requirements below.
   The data collected should be changed to a standard format. For instance, every state value
   should be a 2-character abbreviation such as NY or NJ.
   Extract data from all sources:
      Source file 1: NJCustomers.txt
      Source file 2: NYCustomers.csv
   Transform the data as follows:
      State: 2-character abbreviation
      First and last names: upper case
   Load to a single data location.
      Note: custAcct will hold the PK of the source tables; however, CustomerKey is the PK of this
      dimension (it will auto-increment).
      Only good data should go to the database; bad data (assume: no account number) should go to
      an error log file.
   Add annotations to your design space.
   Be sure to give each object a meaningful name.
   ETL Project name: LastnameFirstname_Week7Assignment
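The name and bad-data rules above can be sketched outside SSIS to check the logic before building the data flow. The following Python is only an illustration of what a Derived Column plus Conditional Split might do; it is not SSIS code. The function names are hypothetical, the column names come from the source file header, and the row with a blank account number is invented for the example.

```python
from typing import Iterable


def standardize_names(row: dict) -> dict:
    """Return a copy of the row with first and last names upper-cased."""
    out = dict(row)
    for col in ("firstName", "lastName"):
        value = out.get(col)
        if value is not None:
            out[col] = value.strip().upper()
    return out


def has_account_number(row: dict) -> bool:
    """A row counts as good only if acctNum is present and not blank."""
    acct = row.get("acctNum")
    return acct is not None and str(acct).strip() != ""


def split_rows(rows: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Route rows to (good, bad), in the spirit of a conditional split."""
    good, bad = [], []
    for row in rows:
        if has_account_number(row):
            good.append(standardize_names(row))
        else:
            bad.append(row)  # left untouched so the error log shows the raw input
    return good, bad


sample = [
    {"acctNum": "3", "firstName": "Alyssa", "lastName": "wint"},
    {"acctNum": "", "firstName": "Pat", "lastName": "Doe"},  # hypothetical bad row
]
good, bad = split_rows(sample)
print(good)
print(bad)
```

Bad rows are passed through unchanged here on the assumption that an error log is most useful when it shows exactly what arrived from the store.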
What to Submit:
1. A zipped folder of the entire solution
2. Screenshot of the ETL solution and Data Warehouse created
Naming convention: firstnameLastname_Week7Assignment.zip
HINT: In addition to the objects we are familiar with, use the Union All transformation
component (found in the Common toolbox).
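As a rough picture of what Union All contributes, the sketch below reads two delimited inputs and appends one to the other without sorting or removing duplicates. It is a Python stand-in, not SSIS. The two data rows are copied from the sample data, the split into store_a and store_b is illustrative only, and the extra "source" column is a hypothetical addition for traceability. Note that the header has spaces after some commas, so the reader is told to skip them.

```python
import csv
import io
from itertools import chain

HEADER = "acctNum,firstName,lastName,strNum,strName, city, state, zip,creditLimit\n"

# Rows copied from the sample data; assigning them to two stores is illustrative.
store_a = HEADER + "1,Lindon,Jacobs,5, Main St,Miller Place,New York,33176,330.5\n"
store_b = HEADER + "2,Charissa,Gaul,34, Azalea Ct,Mt. Sinai,NY,33266,1059\n"


def read_rows(text: str, source: str) -> list[dict]:
    """Parse delimited text, dropping the blank that follows some commas."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    # "source" is a hypothetical extra column recording where the row came from.
    return [{**row, "source": source} for row in reader]


def union_all(*inputs: list[dict]) -> list[dict]:
    """Append the inputs in order; nothing is sorted or de-duplicated."""
    return list(chain.from_iterable(inputs))


combined = union_all(read_rows(store_a, "store_a"), read_rows(store_b, "store_b"))
for row in combined:
    print(row["acctNum"], row["city"], row["state"], row["source"])
```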
/****** Object: Table [dbo].[DimXYZCustomers] ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DimXYZCustomers](
    [custAcct] [numeric](18, 0) NOT NULL,
    [custFirst] [nvarchar](50) NULL,
    [custLast] [nvarchar](50) NULL,
    [strNum] [nvarchar](50) NULL,
    [strName] [nvarchar](50) NULL,
    [city] [nvarchar](50) NULL,
    [state] [nvarchar](50) NULL,
    [zip] [int] NULL,
    [creditLimit] [money] NULL,
    [CustomerKey] [numeric](18, 0) IDENTITY(1,1) NOT NULL,
    CONSTRAINT [PK_DimXYZCustomers] PRIMARY KEY CLUSTERED
    (
        [CustomerKey] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
        ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
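The reason the dimension carries its own CustomerKey can be seen with a small experiment. The sketch below uses SQLite from Python as a stand-in for SQL Server, with AUTOINCREMENT playing the role of IDENTITY(1,1); the table is trimmed to a few columns and is not the script above. It assumes account numbers can repeat across stores, as appears to be the case in the sample listings in this document, which both start at account 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DimXYZCustomers (
        CustomerKey INTEGER PRIMARY KEY AUTOINCREMENT,  -- stand-in for IDENTITY(1,1)
        custAcct    INTEGER NOT NULL,                   -- source key, may repeat
        custFirst   TEXT,
        custLast    TEXT
    )
    """
)

# Two loads that reuse account number 1, as two separate stores might.
rows = [(1, "LINDON", "JACOBS"), (1, "LINDON", "JACOBS")]
conn.executemany(
    "INSERT INTO DimXYZCustomers (custAcct, custFirst, custLast) VALUES (?, ?, ?)",
    rows,
)

# CustomerKey is never supplied; the database assigns it on insert.
keys = conn.execute(
    "SELECT CustomerKey, custAcct FROM DimXYZCustomers ORDER BY CustomerKey"
).fetchall()
print(keys)
```

In the SSIS destination the same idea applies: CustomerKey is left unmapped so the database generates it.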
Ny customers.csv
acctNum,firstName,lastName,strNum,strName, city, state, zip,creditLimit
1,Lindon,Jacobs,5, Main St,Miller Place,New York,33176,330.5
2,Charissa,Gaul,34, Azalea Ct,Mt. Sinai,NY,33266,1059
3,Alyssa,wint,22, Sweetgum Lane,Farmingdale,NY,33176,150
4,Brenda,reynolds,45 B, Rocky Rd,Uniondale,New York,33266,5459.35
5,Vishnu,Ruben,123, Candid St,Port Jefferson,NY,33176,235.33
6,Renna,Kelly,4333, Louise Lane, Brookville,New York,33266,459.5
7,Chris,Rusch,12, Main St,Hempstead,NY,33176,2150
8,Lisa,Biolsi,166, Louise Lane,Brooklyn,New Y.,33266,4459
9,Marcos,Pichardo,76, Main St,Cambria Heights,NY,33176,550
10,Randy,Butler,45, Louise Lane,Rocky Point,N.Y.,33266,1459.35
Ny customer.txt
acctNum,firstName,lastName,strNum,strName, city, state, zip,creditLimit
1,Lindon,Jacobs,5, Main St,Miller Place,New York,33176,330.5
2,Charissa,Gaul,34, Azalea Ct,Mt. Sinai,NY,33266,1059
3,Alyssa,wint,22, Sweetgum Lane,Farmingdale,NY,33176,150
4,Brenda,reynolds,45 B, Rocky Rd,Uniondale,New York,33266,5459.35
5,Vishnu,Ruben,123, Candid St,Port Jefferson,NY,33176,235.33
6,Renna,Kelly,4333, Louise Lane, Brookville,New York,33266,459.5
7,Chris,Rusch,12, Main St,Hempstead,NY,33176,2150
8,Lisa,Biolsi,166, Louise Lane,Brooklyn,New Y.,33266,4459
9,Marcos,Pichardo,76, Main St,Cambria Heights,NY,33176,550
10,Randy,Butler,45, Louise Lane,Rocky Point,N.Y.,33266,1459.35
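The listings above spell the state four ways: "New York", "NY", "New Y." and "N.Y.". One possible way to reduce them to a 2-character value is a lookup on a cleaned-up key, sketched here in Python as a stand-in for whatever expression the SSIS package ends up using. The function name and lookup table are hypothetical, and the New Jersey entries are assumed by analogy because no NJ rows appear in this document.

```python
import re
from typing import Optional

# Keys are the raw value with everything but letters removed, upper-cased.
# NY spellings are taken from the listings; NJ spellings are assumptions.
STATE_LOOKUP = {
    "NY": "NY", "NEWYORK": "NY", "NEWY": "NY",
    "NJ": "NJ", "NEWJERSEY": "NJ", "NEWJ": "NJ",
}


def normalize_state(raw: Optional[str]) -> Optional[str]:
    """Map a free-form state value to its 2-character code, or None if unknown."""
    key = re.sub(r"[^A-Za-z]", "", raw or "").upper()
    return STATE_LOOKUP.get(key)


for value in ["New York", "NY", "New Y.", "N.Y."]:
    print(repr(value), "->", normalize_state(value))
```

Returning None for an unrecognized value, rather than guessing, leaves it to the package design to decide whether such rows are loaded, corrected, or logged.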