This document outlines the design and implementation of a data warehouse for KostLess, a multinational retail company. It includes details on the business case, dimensional model, data definition language to create the schema, ETL processes, sample reports, and project management considerations. The dimensional model includes facts about sales and dimensions for customers, products, time and currency. The schema uses star schema design with dimension and fact tables linked by primary and foreign keys. Sample SQL is provided to define the tables, constraints, and indexes.
Table of Contents
INTRODUCTION
BUSINESS CASE
DATA WAREHOUSE SCHEMA
    DIMENSIONAL MODEL
    DATA DEFINITION LANGUAGE
DATA WAREHOUSE DIRECTORY
EXTRACT, TRANSFORM AND LOAD
    EXTRACTION AND TRANSFORMATION PROCESS
    LOADING PROCESS
REPORTS ANALYSIS
PROJECT MANAGEMENT
    PROJECT MANAGEMENT DOCUMENTS
    COMMUNICATION PLAN
    LESSONS LEARNED
CONCLUSION
REFERENCES
INTRODUCTION
Abstract
KostLess, a retail merchant, has built an in-house, fully integrated, enterprise-wide retail merchandise management, catalog, and Point of Sale solution. This new platform is a fully managed and hosted solution that integrates marketing information, file hygiene, and analytical services. The marketing database will also incorporate an e-mail technology platform. This combination provides a powerful integrated solution that enables online and offline one-to-one communication with customers and prospects.
Easy to use and implement, KostLess Retail Merchant can be in live operation in a matter of months, not years. Your business can begin experiencing its benefits in a fraction of the time and cost of other solutions.
By reducing the total cost of ownership of your investment, KostLess Retail ensures that this seamless integration of data and systems will enable your retail business to grow and adapt to evolving business requirements.
Introduction
Point of Sale is the most important element of the retail process; the sales experience at store level will leave customers with a lasting impression of the retail
brand. Retailers need to be equipped with more than just a cash register. A complete Point of Service solution at checkout sets a clear differentiation
amongst competing retailers.
KostLess retail delivers more than just a point of sale. Equipped with powerful tools to help enhance the customer’s experience and a full range of functions
to enable the sales team to drive sales, it will create clear differentiation in the minds of customers.
• In-store inventory tracking with quantities on-hand, on-order, and on-back order
• Reason codes: tracking paid in/out, returns
• Special order handling
• Provide specialized reporting to identify sales trends
• Diagnose data and recommend actions to take including markdowns, transfers, etc…
• Price/Quantity Verification
• Customer capture/sales history
• View inventory of other stores
• Time and attendance
BUSINESS CASE
KostLess is a multinational company with business ties on several continents; its operations are concentrated in Australia, Canada, France, Germany, the United Kingdom, and the United States. KostLess needed to derive business intelligence from four years' worth of transactional data (2002, 2003, 2004, and 2005). Building a data warehouse for analysis is an appropriate way to meet that need effectively and efficiently. To achieve its stated goal of boosting profitability, KostLess must analyze its sales trends to identify the most valued customers according to their purchase history, the products that are most attractive to them, and the periods in which sales increase, so that sufficient stock can be prepared for peak periods. To accomplish this, the following business intelligence analyses must be conducted: customer profile analysis, product sales analysis, business area analysis, and currency analysis. The following questions must be answered for the work to be meaningful:
I. Who are the most-valued customers?
II. What were the most profitable products from 2002 to 2005?
III. What are the areas where business was most affluent?
IV. Which currency was transacted with the most from 2002 to 2005?
V. What were the fiscal sales per income level per calendar year?
The following Business Intelligence questions will help us design the queries that will yield the desired knowledge:
Customer Profile Analysis
I. What were the total purchases and gender of each customer?
II. What were the customer purchases per educational level?
III. What were the customer purchases per job class?
IV. What were the customer purchases per marital status?
V. What were the customer purchases per income level?
Product Sales Analysis
I. What were the sales per product for the year 2002?
II. What were the sales per product for the year 2003?
III. What were the sales per product for the year 2004?
IV. What were the sales per product for the year 2005?
Business Area Analysis
I. What were the total sales per territory?
II. What were the total sales per sales region?
III. What were the total sales per territory group?
IV. What were the total sales per country?
V. What were the total sales per city?
Currency Analysis
I. What were the total sales per currency from 2002 to 2005?
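As an illustration of how one of the Business Area Analysis questions could translate into a query, the following minimal sketch computes total sales per territory group by joining the fact table to the territory dimension. It is written in Python with an in-memory SQLite database standing in for the Oracle warehouse. The sample rows and the function names build_sample_warehouse and total_sales_per_territory_group are assumptions made for this illustration; only the table and column names follow the schema described in this document.

```python
import sqlite3


def build_sample_warehouse():
    """Create a tiny in-memory star schema with made-up rows (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE SALE_TERRITORY_DIM (
            SALES_TERRITORY_CODE  INTEGER PRIMARY KEY,
            SALES_TERRITORY_GROUP TEXT
        );
        CREATE TABLE SALES_FACT (
            ORDER_NO             TEXT PRIMARY KEY,
            SALES_TERRITORY_CODE INTEGER
                REFERENCES SALE_TERRITORY_DIM (SALES_TERRITORY_CODE),
            FINAL_AMOUNT         REAL
        );
        -- Hypothetical sample data, not taken from the real data sets.
        INSERT INTO SALE_TERRITORY_DIM VALUES (1, 'North America'), (2, 'Europe');
        INSERT INTO SALES_FACT VALUES
            ('SO1', 1, 100.0), ('SO2', 1, 50.0), ('SO3', 2, 75.0);
    """)
    return conn


def total_sales_per_territory_group(conn):
    """Join the fact table to the territory dimension and sum FINAL_AMOUNT per group."""
    query = """
        SELECT t.SALES_TERRITORY_GROUP, SUM(f.FINAL_AMOUNT) AS TOTAL_SALES
        FROM SALES_FACT f
        JOIN SALE_TERRITORY_DIM t
          ON f.SALES_TERRITORY_CODE = t.SALES_TERRITORY_CODE
        GROUP BY t.SALES_TERRITORY_GROUP
        ORDER BY TOTAL_SALES DESC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for group, total in total_sales_per_territory_group(build_sample_warehouse()):
        print(group, total)
```

The other questions in this section would presumably follow the same pattern, with a different dimension table in the join and a different column in the GROUP BY clause.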
ALTER TABLE SALES_FACT
  ADD CONSTRAINT SALFAT_CUSTNO_FK FOREIGN KEY (CUST_NO)
  REFERENCES CUSTOMER_DIM (CUST_NO) ON DELETE CASCADE;
ALTER TABLE SALES_FACT
  ADD CONSTRAINT SALFAT_CURRENNO_FK FOREIGN KEY (CURRENCY_NO)
  REFERENCES CURRENCY_DIM (CURRENCY_NO) ON DELETE CASCADE;
ALTER TABLE SALES_FACT
  ADD CONSTRAINT SALFAT_TIMEID_FK FOREIGN KEY (TIME_ID)
  REFERENCES TIME_DIM (TIME_ID) ON DELETE CASCADE;
ALTER TABLE SALES_FACT
  ADD CONSTRAINT SALFACT_TERRICODE_FK FOREIGN KEY (SALES_TERRITORY_CODE)
  REFERENCES SALE_TERRITORY_DIM (SALES_TERRITORY_CODE) ON DELETE CASCADE;
ALTER TABLE SALES_FACT
  ADD CONSTRAINT SALFACT_PRODNO_FK FOREIGN KEY (PRODUCT_NO)
  REFERENCES PRODUCT_DIM (PRODUCT_NO) ON DELETE CASCADE;
ALTER TABLE SALES_FACT
  ADD CONSTRAINT SALFACT_PK PRIMARY KEY (ORDER_NO);
CREATE INDEX SALCURRCENCYNO_IX ON SALES_FACT (CURRENCY_NO);
CREATE INDEX SALCUSTNO_IX ON SALES_FACT (CUST_NO);
CREATE INDEX SALTIMEID_IX ON SALES_FACT (TIME_ID);
CREATE INDEX SALPRODNO_IX ON SALES_FACT (PRODUCT_NO);
CREATE INDEX SALTERRITORYCODE_IX ON SALES_FACT (SALES_TERRITORY_CODE);
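Each foreign key on SALES_FACT is declared with ON DELETE CASCADE, so deleting a dimension row also deletes the fact rows that reference it. The following minimal sketch shows that effect on a two-table subset of the schema. It uses Python with an in-memory SQLite database as a stand-in for Oracle; the sample rows and the function name fact_rows_after_dimension_delete are assumptions made for this illustration.

```python
import sqlite3


def fact_rows_after_dimension_delete():
    """Return the SALES_FACT row count before and after deleting one customer."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE CUSTOMER_DIM (CUST_NO INTEGER PRIMARY KEY);
        CREATE TABLE SALES_FACT (
            ORDER_NO TEXT PRIMARY KEY,
            CUST_NO  INTEGER
                REFERENCES CUSTOMER_DIM (CUST_NO) ON DELETE CASCADE
        );
        CREATE INDEX SALCUSTNO_IX ON SALES_FACT (CUST_NO);
        -- Hypothetical sample data.
        INSERT INTO CUSTOMER_DIM VALUES (1), (2);
        INSERT INTO SALES_FACT VALUES ('SO1', 1), ('SO2', 1), ('SO3', 2);
    """)
    before = conn.execute("SELECT COUNT(*) FROM SALES_FACT").fetchone()[0]
    # Deleting customer 1 cascades to the two fact rows that reference it.
    conn.execute("DELETE FROM CUSTOMER_DIM WHERE CUST_NO = 1")
    after = conn.execute("SELECT COUNT(*) FROM SALES_FACT").fetchone()[0]
    return before, after


if __name__ == "__main__":
    print(fact_rows_after_dimension_delete())
```

In a warehouse that is meant to preserve sales history, this behaviour may or may not be desirable, so it is worth confirming that cascading deletes match the intended retention policy.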
Please open the file below for the contents of the spool file.
SCHEAM_SPOOL2.TXT
DATA WAREHOUSE DIRECTORY
Figure 2 Data Warehouse Model After Implementation
SELECT CONSTRAINT_NAME, CONSTRAINT_TYPE, TABLE_NAME, R_CONSTRAINT_NAME
FROM USER_CONSTRAINTS
WHERE TABLE_NAME IN ('CUSTOMER_DIM', 'TIME_DIM', 'PRODUCT_DIM',
                     'SALES_FACT', 'CURRENCY_DIM', 'SALE_TERRITORY_DIM')
ORDER BY TABLE_NAME;
CONSTRAINT_NAME        CONSTRAINT_TYPE TABLE_NAME         R_CONSTRAINT_NAME
---------------------- --------------- ------------------ ------------------
CURRECURRENCYNO_PK     P               CURRENCY_DIM
CUSTNOACC_PK           P               CUSTOMER_DIM
ACCOUNTNO_UK           U               CUSTOMER_DIM
CUSTDSALTERRI_CODE_FK  R               CUSTOMER_DIM       SAL_TERRICODE_PK
PRODDIM_PK             P               PRODUCT_DIM
PRODUCTCODE_UK         U               PRODUCT_DIM
SALFAT_CUSTNO_FK       R               SALES_FACT         ACCOUNTNO_UK
SALFAT_CURRENNO_FK     R               SALES_FACT         CURRECURRENCYNO_PK
SALFAT_TIMEID_FK       R               SALES_FACT         TIMETIME_ID_UK
SALFACT_TERRICODE_FK   R               SALES_FACT         SAL_TERRICODE_PK
SALFACT_PRODNO_FK      R               SALES_FACT         PRODUCTCODE_UK
SALFACT_PK             P               SALES_FACT
SAL_TERRICODE_PK       P               SALE_TERRITORY_DIM
TIMED_TIMECID_PK       P               TIME_DIM
TIMETIME_ID_UK         U               TIME_DIM
15 rows selected
SELECT CONSTRAINT_NAME, STATUS, DEFERRABLE, INDEX_NAME
FROM USER_CONSTRAINTS
WHERE TABLE_NAME IN ('CUSTOMER_DIM', 'TIME_DIM', 'PRODUCT_DIM',
                     'SALES_FACT', 'CURRENCY_DIM', 'SALE_TERRITORY_DIM')
ORDER BY TABLE_NAME;
CONSTRAINT_NAME        STATUS   DEFERRABLE      INDEX_NAME
---------------------- -------- --------------- ------------------
CURRECURRENCYNO_PK     ENABLED  NOT DEFERRABLE  CURRECURRENCYNO_PK
CUSTNOACC_PK           ENABLED  NOT DEFERRABLE  CUSTNOACC_PK
ACCOUNTNO_UK           ENABLED  NOT DEFERRABLE  ACCOUNTNO_UK
CUSTDSALTERRI_CODE_FK  ENABLED  NOT DEFERRABLE
PRODDIM_PK             ENABLED  NOT DEFERRABLE  PRODDIM_PK
PRODUCTCODE_UK         ENABLED  NOT DEFERRABLE  PRODUCTCODE_UK
SALFAT_CUSTNO_FK       ENABLED  NOT DEFERRABLE
SALFAT_CURRENNO_FK     ENABLED  NOT DEFERRABLE
SALFAT_TIMEID_FK       ENABLED  NOT DEFERRABLE
SALFACT_TERRICODE_FK   ENABLED  NOT DEFERRABLE
SALFACT_PRODNO_FK      ENABLED  NOT DEFERRABLE
SALFACT_PK             ENABLED  NOT DEFERRABLE  SALFACT_PK
SAL_TERRICODE_PK       ENABLED  NOT DEFERRABLE  SAL_TERRICODE_PK
TIMED_TIMECID_PK       ENABLED  NOT DEFERRABLE  TIMED_TIMECID_PK
TIMETIME_ID_UK         ENABLED  NOT DEFERRABLE  TIMETIME_ID_UK
15 rows selected
SELECT INDEX_NAME, INDEX_TYPE, TABLE_NAME, TABLE_TYPE
FROM USER_INDEXES
WHERE TABLE_NAME IN ('CUSTOMER_DIM', 'TIME_DIM', 'PRODUCT_DIM',
                     'SALES_FACT', 'CURRENCY_DIM', 'SALE_TERRITORY_DIM')
ORDER BY TABLE_NAME;
INDEX_NAME                INDEX_TYPE TABLE_NAME         TABLE_TYPE
------------------------- ---------- ------------------ ----------
CURRECURRENCYNO_PK        NORMAL     CURRENCY_DIM       TABLE
CUSTNOACC_PK              NORMAL     CUSTOMER_DIM       TABLE
ACCOUNTNO_UK              NORMAL     CUSTOMER_DIM       TABLE
CUSTSALTERRICD_IX         NORMAL     CUSTOMER_DIM       TABLE
PRODDIM_PK                NORMAL     PRODUCT_DIM        TABLE
SYS_IL0000059173C00008$$  LOB        PRODUCT_DIM        TABLE
PRODUCTCODE_UK            NORMAL     PRODUCT_DIM        TABLE
SALCUSTNO_IX              NORMAL     SALES_FACT         TABLE
SALTERRITORYCODE_IX       NORMAL     SALES_FACT         TABLE
SALFACT_PK                NORMAL     SALES_FACT         TABLE
SALPRODNO_IX              NORMAL     SALES_FACT         TABLE
SALCURRCENCYNO_IX         NORMAL     SALES_FACT         TABLE
SALTIMEID_IX              NORMAL     SALES_FACT         TABLE
SAL_TERRICODE_PK          NORMAL     SALE_TERRITORY_DIM TABLE
TIMETIME_ID_UK            NORMAL     TIME_DIM           TABLE
TIMED_TIMECID_PK          NORMAL     TIME_DIM           TABLE
16 rows selected
desc sales_fact
Name                     Null     Type
------------------------ -------- --------------------
CUST_NO                           NUMBER
CURRENCY_NO                       NUMBER
TIME_ID                           NUMBER
PRODUCT_NO                        NUMBER
QUNTITY_ORDERED                   NUMBER
ORDER_NO                 NOT NULL CHAR(15)
SALES_TERRITORY_CODE              NUMBER
UNIT_PRICE                        NUMBER
FREIGHT_AMOUNT                    NUMBER
TAX_AMOUNT                        NUMBER
FINAL_AMOUNT                      NUMBER
desc customer_dim
Name                     Null     Type
------------------------ -------- --------------------
CUST_NO                  NOT NULL NUMBER
ACCOUNT_NO               NOT NULL VARCHAR2(20)
FIRST_NAME                        VARCHAR2(50)
MIDDLE_NAME                       CHAR(5)
LAST_NAME                         VARCHAR2(50)
GENDER                            CHAR(10)
JOB_CLASS                         VARCHAR2(30)
PHONE                             VARCHAR2(25)
CUST_ADDRESS                      VARCHAR2(50)
DOB                               DATE
TOTAL_CHILDREN
EDUCATIONAL_LEVEL                 VARCHAR2(30)
MARITAL_STATUS                    CHAR(12)
CUST_CITY                         VARCHAR2(30)
STATE_CODE                        VARCHAR2(12)
STATE_NAME                        VARCHAR2(40)
COUNTRY_CODE                      VARCHAR2(12)
COUNTRY_NAME                      VARCHAR2(50)
POSTAL_CODE                       VARCHAR2(25)
SALES_TERRITORY_CODE              NUMBER
desc product_dim
Name                     Null     Type
------------------------ -------- --------------------
PRODUCT_NO               NOT NULL NUMBER
PRODUCT_CODE             NOT NULL VARCHAR2(35)
WEIGHT_CODE                       CHAR(6)
COLOR                             VARCHAR2(20)
MODEL_NAME                        VARCHAR2(55)
PROD_NAME                         VARCHAR2(50)
WEIGHT                            FLOAT(126)
PRODUCT_DESCRIPTION               CLOB()
PRODUCT_SIZE                      CHAR(12)
PRODUCT_CATEGORY                  VARCHAR2(50)
desc currency_dim
Name                     Null     Type
------------------------ -------- --------------------
CURRENCY_NO              NOT NULL NUMBER(38)
CURRENCY_COUNTRY_CODE             VARCHAR2(25)
CURRENCY_NAME                     VARCHAR2(50)
desc sale_territory_dim
Name                     Null     Type
------------------------ -------- --------------------
SALES_TERRITORY_CODE     NOT NULL NUMBER
SALES_TERRITORY_GROUP             VARCHAR2(25)
SALES_TERRITORY_COUNTRY           VARCHAR2(25)
SALES_TERRITORY_REGION            VARCHAR2(25)
desc time_dim
Name                     Null     Type
------------------------ -------- --------------------
CALENDAR_QUARTER                  NUMBER
CALENDAR_YEAR                     CHAR(8)
DAY_NUMBER_OF_MONTH               NUMBER
DAY_NUMBER_OF_WEEK                NUMBER
DAY_NUMBER_OF_YEAR                NUMBER
WEEK_NAME                         CHAR(12)
MONTH_NAME                        CHAR(19)
FISCAL_QUARTER                    NUMBER
FISCAL_YEAR                       CHAR(8)
TIME_ID                  NOT NULL NUMBER
WEEK_NUMBER_OF_YEAR               NUMBER
DATE_CODE                NOT NULL DATE
MONTH_NUMBER_OF_YEAR              NUMBER
EXTRACT, TRANSFORM AND LOAD
The data sets for the data warehouse tables are provided in Access 2010 database format. The data sets were extracted from MS SQL Server AdventureWorks DW. Because of OLE problems, the Excel format could not be attached, so an Access database was created to make it easier to view the entire data sets instead of the individual text files. No relationships were defined in the Access database.
DATASETS.accdb
EXTRACTION AND TRANSFORMATION PROCESS
The data sets were extracted from the MS SQL Server AdventureWorks DW dimension and fact tables dbo.FactInternetSales, dbo.DimProduct, dbo.DimProductCategory, dbo.DimProductSubcategory, dbo.DimCurrency, dbo.DimTime, dbo.DimSalesTerritory, dbo.DimCustomer, and dbo.DimGeography, as illustrated in Figure 4. A database named DB665 was created to store the tables that hold the data extracted from each source table. These tables were created with statements of the form SELECT columns INTO table_name FROM tables. This syntax is the SQL Server counterpart of CREATE TABLE table_name AS SELECT columns FROM table in Oracle.
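The create-table-from-query pattern described above can be sketched as follows. The example uses Python with an in-memory SQLite database, which supports the same CREATE TABLE ... AS SELECT form that Oracle uses, as a stand-in for both database systems. The source table mirrors a few columns of dbo.DimSalesTerritory; the sample rows and the function names build_source and copy_territory_table are assumptions made for this illustration.

```python
import sqlite3


def build_source():
    """Create a small stand-in for the DimSalesTerritory source table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DimSalesTerritory (
            SalesTerritoryKey     INTEGER PRIMARY KEY,
            SalesTerritoryRegion  TEXT,
            SalesTerritoryCountry TEXT,
            SalesTerritoryGroup   TEXT
        );
        -- Hypothetical sample data.
        INSERT INTO DimSalesTerritory VALUES
            (1, 'Northwest', 'United States', 'North America'),
            (7, 'France', 'France', 'Europe');
    """)
    return conn


def copy_territory_table(conn):
    """Create and fill a new table in one statement, as SELECT ... INTO does."""
    conn.execute("""
        CREATE TABLE SALES_TERRITORY_DIM AS
        SELECT SalesTerritoryKey, SalesTerritoryCountry,
               SalesTerritoryGroup, SalesTerritoryRegion
        FROM DimSalesTerritory
    """)
    return conn.execute(
        "SELECT SalesTerritoryKey, SalesTerritoryRegion "
        "FROM SALES_TERRITORY_DIM ORDER BY SalesTerritoryKey"
    ).fetchall()


if __name__ == "__main__":
    print(copy_territory_table(build_source()))
```

In SQLite, a table created this way copies the selected columns and rows but not the primary key or other constraints of the source table, so those have to be added separately.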
Figure 3 Database Folder MS SQL Server
Figure 4 Adventure Works DW Folder MS SQL Server
The AdventureWorks dimensional model based on the selected tables is illustrated in Figure 5.
To reduce the size of the tables in the data warehouse, several join operations were used to create new tables to hold the data.
DBO.Customer_Dim2 Table
In the AdventureWorks DW schema, dbo.DimCustomer and dbo.DimGeography are linked through GeographyKey. To select only the important attributes from both tables, a join was used to create the dbo.customer_dim2 table; the column names are listed in the SELECT statement.
USE [DB665];
-- Build the customer staging table from DimCustomer joined to DimGeography.
SELECT C.FirstName, C.MiddleName, C.LastName, C.BirthDate AS DOB, C.MaritalStatus,
       C.Gender, C.CustomerKey AS CUST_NO, C.CustomerAlternateKey AS ACCOUNT_NO,
       C.YearlyIncome, C.TotalChildren, C.NumberCarsOwned, C.NumberChildrenAtHome,
       C.EnglishOccupation AS JOB_TITLE, C.EnglishEducation AS EDUCATIONAL_LEVEL,
       C.AddressLine1 AS CUST_ADDRESS,
       C.Phone, G.City, G.CountryRegionCode, G.EnglishCountryRegionName AS COUNTRY_NAME,
       G.SalesTerritoryKey, G.PostalCode AS Postal_Code,
       G.StateProvinceCode AS STATE_CODE, G.StateProvinceName AS STATE_NAME
INTO dbo.customer_dim2
FROM AdventureWorksDW.dbo.DimCustomer C
INNER JOIN AdventureWorksDW.dbo.DimGeography G
    ON C.GeographyKey = G.GeographyKey;
Dbo.Sales_Fact Table
This table holds all the data selected from dbo.FactInternetSales. Because the source table already has a unique primary key constraint, there was no need to generate alternative keys.
USE [DB665]
select s.CustomerKey as cust_no, s.CurrencyKey as currency_no, s.OrderDateKey as Time_ID,
s.ProductKey AS PROD_NO, s.OrderQuantity AS QUNTITY_ORDERED,
s.SalesOrderNumber AS ORDER_NO, s.SalesTerritoryKey AS TERRITORY_KEY, s.UnitPrice AS UNIT_PRICE,
s.Freight, s.TaxAmt, s.SalesAmount+s.Freight+s.TaxAmt as Final_Amount
into dbo.sales_fact
from AdventureWorksDW.dbo.FactInternetSales s
Dbo.Product_Dim4 Table
dbo.DimProduct, dbo.DimProductCategory, and dbo.DimProductSubcategory are linked together in the original schema. Because of a technical problem, only dbo.DimProduct could be selected to create the dbo.product_dim4 table. A column was then added for the product category.
USE [DB665];
SELECT p.ProductAlternateKey AS product_code, p.ProductKey AS prod_no,
       p.WeightUnitMeasureCode AS weight_code,
       p.Color AS color,
       p.ModelName AS model_name,
       p.Weight AS weight, p.EnglishDescription AS PRODUCT_DESCRIPTION, p.Size AS product_szie,
       p.EnglishProductName AS PROD_NAME
INTO dbo.product_dim4
FROM AdventureWorksDW.dbo.DimProduct p;
USE [DB665];
ALTER TABLE dbo.product_dim4
ADD product_category nvarchar(50);
USE [DB665];
-- product_dim4 does not carry ProductSubcategoryKey, so the subcategory name
-- is looked up through DimProduct using the product key.
UPDATE dbo.product_dim4
SET product_category = (SELECT e.EnglishProductSubcategoryName
                        FROM AdventureWorksDW.dbo.DimProduct p
                        INNER JOIN AdventureWorksDW.dbo.DimProductSubcategory e
                            ON p.ProductSubcategoryKey = e.ProductSubcategoryKey
                        WHERE p.ProductKey = dbo.product_dim4.prod_no);
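Some products have no category that can be used as a classifier. For these products, the product category was set to 'OTHER'.
USE [DB665];
-- Products without a subcategory hold a SQL NULL, which must be tested with IS NULL.
UPDATE dbo.product_dim4
SET product_category = 'OTHER'
WHERE product_category IS NULL;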
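Products without a matching subcategory end up with a SQL NULL in product_category, and a NULL is not the same thing as the string 'NULL'. The following minimal sketch shows the difference between the two predicates and then applies the 'OTHER' default. It uses Python with an in-memory SQLite database as a stand-in for SQL Server; the sample rows and the function name compare_null_predicates are assumptions made for this illustration.

```python
import sqlite3


def compare_null_predicates():
    """Count rows matched by = 'NULL' and by IS NULL, then apply the default."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product_dim4 (
            prod_no          INTEGER PRIMARY KEY,
            product_category TEXT
        );
        -- Hypothetical sample data: two products have no category.
        INSERT INTO product_dim4 VALUES (1, 'Road Bikes'), (2, NULL), (3, NULL);
    """)
    # Comparing with the string 'NULL' only matches that literal text.
    string_matches = conn.execute(
        "SELECT COUNT(*) FROM product_dim4 WHERE product_category = 'NULL'"
    ).fetchone()[0]
    # IS NULL matches rows whose value is actually missing.
    null_matches = conn.execute(
        "SELECT COUNT(*) FROM product_dim4 WHERE product_category IS NULL"
    ).fetchone()[0]
    conn.execute(
        "UPDATE product_dim4 SET product_category = 'OTHER' "
        "WHERE product_category IS NULL"
    )
    categories = [
        row[0]
        for row in conn.execute(
            "SELECT product_category FROM product_dim4 ORDER BY prod_no"
        )
    ]
    return string_matches, null_matches, categories


if __name__ == "__main__":
    print(compare_null_predicates())
```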
Dbo.SALES_TERRITORY_DIM
This table holds all the data extracted from the AdventureWorksDW.dbo.DimSalesTerritory table.
use [DB665]
SELECT D.SalesTerritoryKey,D.SalesTerritoryCountry, D.SalesTerritoryGroup,
D.SalesTerritoryRegion
INTO
dbo.SALES_TERRITORY_DIM
FROM AdventureWorksDW.dbo.DimSalesTerritory D
Dbo.Time_Dim
This table holds all the data extracted from AdventureWorksDW.dbo.DimTime.
USE [DB665];
SELECT t.CalendarQuarter AS Calendar_Quarter, t.CalendarYear AS Calendar_Year,
       t.DayNumberOfMonth AS Day_Number_Of_Month, t.DayNumberOfWeek AS Day_Number_Of_Week,
       t.DayNumberOfYear AS Day_Number_Of_Year,
       t.EnglishDayNameOfWeek AS WEEK_NAME, t.EnglishMonthName AS MONTH_NAME,
       t.FiscalQuarter AS Fiscal_Quarter, t.FiscalYear AS Fiscal_Year,
       t.MonthNumberOfYear AS Month_Number_of_year,
       t.TimeKey AS time_id, t.WeekNumberOfYear AS week_number_of_year,
       t.FullDateAlternateKey AS date_code
INTO dbo.time_dim_new
FROM AdventureWorksDW.dbo.DimTime t;
No table was created for AdventureWorksDW.dbo.DimCurrency because its contents are short; the data was extracted directly from the SQL Server database.
The SQL Server Import and Export utility was used to export data from MS SQL Server 2008 into text files. Each table had to be extracted manually, and the contents of each table were loaded into a designated empty text file. For the Excel files, a connection between SQL Server and Excel was established using Excel's Data Connection Wizard. The data for each table was extracted into a separate worksheet within one workbook.
The SQL Server Import and Export Wizard was also used to export data from SQL Server into Oracle 11g, which made it easier to determine the data types to use when creating the data warehouse in the Oracle 10g database. A CLOB data type was used for the product description because it contains long text. The customer birth date and the date code were changed from the Timestamp data type to the Date data type in Oracle to eliminate the unneeded time portion.
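Dropping the time portion of an exported timestamp can be illustrated with a short helper. This is a minimal Python sketch that assumes the exported text looks like 2002-07-01 00:00:00.000; the actual export format is not shown in this document, and the function name to_date_only is hypothetical.

```python
from datetime import date, datetime

# Assumed text layout of an exported timestamp value (not confirmed by the source).
ASSUMED_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def to_date_only(timestamp_text: str) -> date:
    """Parse an exported timestamp string and keep only the calendar date."""
    return datetime.strptime(timestamp_text.strip(), ASSUMED_TIMESTAMP_FORMAT).date()


if __name__ == "__main__":
    print(to_date_only("2002-07-01 00:00:00.000").isoformat())
```

A value that does not match the assumed layout raises ValueError, which makes unexpected formats visible before the file is loaded.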
Oracle_data_typs (4).txt
LOADING PROCESS
Oracle SQL*Loader was used to load the data from CSV files into the Oracle 10g database, which resides on the university DB account.
Figure 6 Loading Map
-- Script to load the data into the product_dim table
LOAD DATA
INFILE 'product.csv'
into table PRODUCT_DIM
fields TERMINATED BY ","
optionally enclosed by '"'
(PRODUCT_NO,PRODUCT_CODE,WEIGHT_CODE,COLOR,MODEL_NAME,PROD_NAME,WEIGHT,
PRODUCT_DESCRIPTION,PRODUCT_SZIE,PRODUCT_CATEGORY)
Log output from the Oracle server
SQL*Loader: Release 10.2.0.4.0 - Production on Wed Nov 2 23:43:50 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Control File: product.ctl
Data File: product.csv
Bad File: product.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Table PRODUCT_DIM, loaded from every logical record.
Insert option in effect for this table: INSERT
Column Name                    Position   Len   Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
PRODUCT_NO                     FIRST      *     ,    O(") CHARACTER
PRODUCT_CODE                   NEXT       *     ,    O(") CHARACTER
WEIGHT_CODE                    NEXT       *     ,    O(") CHARACTER
COLOR                          NEXT       *     ,    O(") CHARACTER
MODEL_NAME                     NEXT       *     ,    O(") CHARACTER
PROD_NAME                      NEXT       *     ,    O(") CHARACTER
WEIGHT                         NEXT       *     ,    O(") CHARACTER
PRODUCT_DESCRIPTION            NEXT       *     ,    O(") CHARACTER
PRODUCT_SZIE                   NEXT       *     ,    O(") CHARACTER
PRODUCT_CATEGORY               NEXT       *     ,    O(") CHARACTER
Record 1: Rejected - Error on table PRODUCT_DIM, column PRODUCT_NO.
ORA-01722: invalid number
Table PRODUCT_DIM:
606 Rows successfully loaded.
1 Row not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 165120 bytes(64 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 607
Total logical records rejected: 1
Total logical records discarded: 0
Run began on Wed Nov 02 23:43:50 2011
Run ended on Wed Nov 02 23:43:57 2011
Elapsed time was: 00:00:06.96
CPU time was: 00:00:00.13
-- Script to load the data into the Currency_dim table
LOAD DATA
INFILE 'currency.csv'
into table currency_DIM
fields TERMINATED BY ","
optionally enclosed by '"'
(CURRENCY_NO,CURRENCY_COUNTRY_CODE,CURRENCY_NAME)
Log output from the Oracle server
SQL*Loader: Release 10.2.0.4.0 - Production on Thu Nov 3 00:25:43 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Control File: currency.ctl
Data File: currency.csv
Bad File: currency.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Table CURRENCY_DIM, loaded from every logical record.
Insert option in effect for this table: INSERT
Column Name                    Position   Len   Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
CURRENCY_NO                    FIRST      *     ,    O(") CHARACTER
CURRENCY_COUNTRY_CODE          NEXT       *     ,    O(") CHARACTER
CURRENCY_NAME                  NEXT       *     ,    O(") CHARACTER
Record 1: Rejected - Error on table CURRENCY_DIM, column CURRENCY_NO.
ORA-01722: invalid number
Table CURRENCY_DIM:
105 Rows successfully loaded.
1 Row not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 49536 bytes(64 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 106
Total logical records rejected: 1
Total logical records discarded: 0
Run began on Thu Nov 03 00:25:43 2011
Run ended on Thu Nov 03 00:25:45 2011
Elapsed time was: 00:00:01.58
CPU time was: 00:00:00.05
-- Script to load the data into the Time_dim table
LOAD DATA
INFILE 'time_dim.csv'
into table Time_DIM
fields TERMINATED BY ","
optionally enclosed by '"'
(CALENDAR_QUARTER,CALENDAR_YEAR,DAY_NUMBER_OF_MONTH,DAY_NUMBER_OF_WEEK,
DAY_NUMBER_OF_YEAR,WEEK_NAME,MONTH_NAME,FISCAL_QUARTER,FISCAL_YEAR,TIME_ID,
WEEK_NUMBER_OF_YEAR,DATE_CODE,MONTH_NUMBER_OF_YEAR)
Log output from the Oracle server
SQL*Loader: Release 10.2.0.4.0 - Production on Thu Nov 3 00:01:02 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Control File: time.ctl
Data File: time_dim.csv
Bad File: time_dim.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Table TIME_DIM, loaded from every logical record.
Insert option in effect for this table: INSERT
Column Name                    Position   Len   Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
CALENDAR_QUARTER               FIRST      *     ,    O(") CHARACTER
CALENDAR_YEAR                  NEXT       *     ,    O(") CHARACTER
DAY_NUMBER_OF_MONTH            NEXT       *     ,    O(") CHARACTER
DAY_NUMBER_OF_WEEK             NEXT       *     ,    O(") CHARACTER
DAY_NUMBER_OF_YEAR             NEXT       *     ,    O(") CHARACTER
WEEK_NAME                      NEXT       *     ,    O(") CHARACTER
MONTH_NAME                     NEXT       *     ,    O(") CHARACTER
FISCAL_QUARTER                 NEXT       *     ,    O(") CHARACTER
FISCAL_YEAR                    NEXT       *     ,    O(") CHARACTER
TIME_ID                        NEXT       *     ,    O(") CHARACTER
WEEK_NUMBER_OF_YEAR            NEXT       *     ,    O(") CHARACTER
DATE_CODE                      NEXT       *     ,    O(") CHARACTER
MONTH_NUMBER_OF_YEAR           NEXT       *     ,    O(") CHARACTER
Record 1: Rejected - Error on table TIME_DIM, column CALENDAR_QUARTER.
ORA-01722: invalid number
Table TIME_DIM:
1158 Rows successfully loaded.
1 Row not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 214656 bytes(64 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1159
Total logical records rejected: 1
Total logical records discarded: 0
Run began on Thu Nov 03 00:01:02 2011
Run ended on Thu Nov 03 00:01:05 2011
Elapsed time was: 00:00:03.00
CPU time was: 00:00:00.13
-- Script to load the data into the Sale_Territory_dim table
LOAD DATA
INFILE 'Sales.csv'
into table SALE_TERRITORY_DIM
fields TERMINATED BY ","
optionally enclosed by '"'
(SALES_TERRITORY_CODE,SALES_TERRITORY_GROUP,SALES_TERRITORY_COUNTRY,
SALES_TERRITORY_REGION,SALES_TERRITORY_NO)
Log output from the Oracle server
SQL*Loader: Release 10.2.0.4.0 - Production on Wed Nov 2 23:50:50 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Control File: Sales.ctl
Data File: Sales.csv
Bad File: Sales.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Table SALE_TERRITORY_DIM, loaded from every logical record.
Insert option in effect for this table: INSERT
Column Name                    Position   Len   Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
SALES_TERRITORY_CODE           FIRST      *     ,    O(") CHARACTER
SALES_TERRITORY_GROUP          NEXT       *     ,    O(") CHARACTER
SALES_TERRITORY_COUNTRY        NEXT       *     ,    O(") CHARACTER
SALES_TERRITORY_REGION         NEXT       *     ,    O(") CHARACTER
SALES_TERRITORY_NO             NEXT       *     ,    O(") CHARACTER
Record 1: Rejected - Error on table SALE_TERRITORY_DIM, column SALES_TERRITORY_CODE.
ORA-01722: invalid number
Table SALE_TERRITORY_DIM:
11 Rows successfully loaded.
1 Row not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 82560 bytes(64 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 12
Total logical records rejected: 1
Total logical records discarded: 0
Run began on Wed Nov 02 23:50:50 2011
Run ended on Wed Nov 02 23:50:51 2011
Elapsed time was: 00:00:00.86
CPU time was: 00:00:00.05
-- Script to load the data into the customer_dim table
LOAD DATA
INFILE 'customer_dim.csv'
into table customer_dim
fields TERMINATED BY ","
optionally enclosed by '"'
(CUST_NO,ACCOUNT_NO,FIRST_NAME,MIDDLE_NAME,LAST_NAME,GENDER,JOB_CLASS,PHONE,
CUST_ADDRESS,DOB,TOTAL_CHILDREN,YEARLY_INCOME,NUMBER_CHILDREN_HOME,
TOTAL_CAR_OWNED,EDUCATIONAL_LEVEL,MARITAL_STATUS,CUST_CITY,STATE_CODE,STATE_NAME,
COUNTRY_CODE,COUNTRY_NAME,POSTAL_CODE,SALES_TERRITORY_CODE)
Log output from the Oracle server
SQL*Loader: Release 10.2.0.4.0 - Production on Wed Nov 2 23:56:00 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Control File: customer_dim.ctl
Data File: customer_dim.csv
Bad File: customer_dim.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Table CUSTOMER_DIM, loaded from every logical record.
Insert option in effect for this table: INSERT
Column Name                    Position   Len   Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
CUST_NO                        FIRST      *     ,    O(") CHARACTER
ACCOUNT_NO                     NEXT       *     ,    O(") CHARACTER
FIRST_NAME                     NEXT       *     ,    O(") CHARACTER
MIDDLE_NAME                    NEXT       *     ,    O(") CHARACTER
LAST_NAME                      NEXT       *     ,    O(") CHARACTER
GENDER                         NEXT       *     ,    O(") CHARACTER
JOB_CLASS                      NEXT       *     ,    O(") CHARACTER
PHONE                          NEXT       *     ,    O(") CHARACTER
CUST_ADDRESS                   NEXT       *     ,    O(") CHARACTER
DOB                            NEXT       *     ,    O(") CHARACTER
TOTAL_CHILDREN                 NEXT       *     ,    O(") CHARACTER
YEARLY_INCOME                  NEXT       *     ,    O(") CHARACTER
NUMBER_CHILDREN_HOME           NEXT       *     ,    O(") CHARACTER
TOTAL_CAR_OWNED                NEXT       *     ,    O(") CHARACTER
EDUCATIONAL_LEVEL              NEXT       *     ,    O(") CHARACTER
MARITAL_STATUS                 NEXT       *     ,    O(") CHARACTER
CUST_CITY                      NEXT       *     ,    O(") CHARACTER
STATE_CODE                     NEXT       *     ,    O(") CHARACTER
STATE_NAME                     NEXT       *     ,    O(") CHARACTER
COUNTRY_CODE                   NEXT       *     ,    O(") CHARACTER
COUNTRY_NAME                   NEXT       *     ,    O(") CHARACTER
POSTAL_CODE                    NEXT       *     ,    O(") CHARACTER
SALES_TERRITORY_CODE           NEXT       *     ,    O(") CHARACTER
value used for ROWS parameter changed from 64 to 43
Record 1: Rejected - Error on table CUSTOMER_DIM, column CUST_NO.
ORA-01722: invalid number
Record 5440: Rejected - Error on table CUSTOMER_DIM, column MIDDLE_NAME.
ORA-12899: value too large for column "DB665A02"."CUSTOMER_DIM"."MIDDLE_NAME" (actual: 10, maximum: 5)
Record 11636: Rejected - Error on table CUSTOMER_DIM, column MIDDLE_NAME.
ORA-12899: value too large for column "DB665A02"."CUSTOMER_DIM"."MIDDLE_NAME" (actual: 6, maximum: 5)
Record 15102: Rejected - Error on table CUSTOMER_DIM, column MIDDLE_NAME.
ORA-12899: value too large for column "DB665A02"."CUSTOMER_DIM"."MIDDLE_NAME" (actual: 6, maximum: 5)
Table CUSTOMER_DIM:
18481 Rows successfully loaded.
4 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 255162 bytes(43 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 18485
Total logical records rejected: 4
Total logical records discarded: 0
Run began on Wed Nov 02 23:56:00 2011
Run ended on Wed Nov 02 23:57:05 2011
Elapsed time was: 00:01:04.81
CPU time was: 00:00:02.24
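The four rejections in the customer_dim log (a non-numeric CUST_NO on record 1 and MIDDLE_NAME values longer than 5 characters) are the kind of problem that could be caught before running SQL*Loader. The following is a minimal Python sketch of such a pre-check, not part of the original project. It assumes CUST_NO is an integer key, takes the 5-character MIDDLE_NAME limit from the ORA-12899 messages, and uses made-up sample rows. That record 1 was a header row is only a guess; the log does not say.

```python
import csv
import io

# Column order follows the control file; only the first five are needed here.
COLUMNS = ["CUST_NO", "ACCOUNT_NO", "FIRST_NAME", "MIDDLE_NAME", "LAST_NAME"]


def find_rejects(lines, numeric_cols=("CUST_NO",), max_lengths=None):
    """Return (record_number, column, reason) for rows likely to be rejected."""
    if max_lengths is None:
        max_lengths = {"MIDDLE_NAME": 5}  # limit reported by ORA-12899 in the log
    problems = []
    reader = csv.reader(lines, delimiter=",", quotechar='"')
    for record_number, row in enumerate(reader, start=1):
        values = dict(zip(COLUMNS, row))
        for col in numeric_cols:
            try:
                int(values.get(col, ""))  # assumption: the key is an integer
            except ValueError:
                problems.append((record_number, col, "not an integer"))
        for col, limit in max_lengths.items():
            value = values.get(col, "")
            if len(value) > limit:
                problems.append((record_number, col, f"length {len(value)} > {limit}"))
    return problems


# Hypothetical sample data, including a header row and one long middle name.
sample = io.StringIO(
    "CUST_NO,ACCOUNT_NO,FIRST_NAME,MIDDLE_NAME,LAST_NAME\n"
    "11000,AW00011000,Jon,V,Yang\n"
    '11001,AW00011001,Ann,"Marguerite",Lee\n'
)
for problem in find_rejects(sample):
    print(problem)
```

The sketch counts characters, whereas Oracle may enforce the column limit in bytes depending on how the column was declared, so multi-byte names could still be rejected.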
/*Script to load the data into sales_fact table*/
LOAD DATA
INFILE 'sales_fact.csv'
into table SALES_FACT
fields TERMINATED BY ","
optionally enclosed by '"'
(CUST_NO, CURRENCY_NO, TIME_ID, PRODUCT_NO, QUNTITY_ORDERED, ORDER_NO, SALES_TERRITORY_CODE,
 UNIT_PRICE, FREIGHT_AMOUNT, TAX_AMOUNT, FINAL_AMOUNT, TRANSACTION_ID)
Log output from the Oracle server
SQL*Loader: Release 10.2.0.4.0 - Production on Thu Nov 3 11:30:30 2011
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Control File: sales_fact.ctl
Data File: sales_fact.csv
Bad File: sales_fact.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Table SALES_FACT, loaded from every logical record.
Insert option in effect for this table: INSERT
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
CUST_NO FIRST * , O(") CHARACTER
CURRENCY_NO NEXT * , O(") CHARACTER
TIME_ID NEXT * , O(") CHARACTER
PRODUCT_NO NEXT * , O(") CHARACTER
QUNTITY_ORDERED NEXT * , O(") CHARACTER
ORDER_NO NEXT * , O(") CHARACTER
SALES_TERRITORY_CODE NEXT * , O(") CHARACTER
UNIT_PRICE NEXT * , O(") CHARACTER
FREIGHT_AMOUNT NEXT * , O(") CHARACTER
TAX_AMOUNT NEXT * , O(") CHARACTER
FINAL_AMOUNT NEXT * , O(") CHARACTER
TRANSACTION_ID NEXT * , O(") CHARACTER
Table SALES_FACT:
2368 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 198144 bytes(64 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 2368
Total logical records rejected: 51
Total logical records discarded: 0
Run began on Thu Nov 03 11:30:30 2011
Run ended on Thu Nov 03 11:30:49 2011
Elapsed time was: 00:00:19.63
CPU time was: 00:00:00.23
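One quick sanity check on a SQL*Loader log is to confirm that the summary counts add up. The Python sketch below, which is not part of the original project, parses the summary lines and tests whether records read equals rows loaded plus rejected plus discarded. That equality is assumed here for a single-table load with no skipped records. The figures are copied from the two logs as printed.

```python
import re

SUMMARY_PATTERNS = {
    "loaded": r"(\d+) Rows successfully loaded",
    "read": r"Total logical records read:\s*(\d+)",
    "rejected": r"Total logical records rejected:\s*(\d+)",
    "discarded": r"Total logical records discarded:\s*(\d+)",
}


def parse_summary(log_text):
    """Extract the summary counters from the text of a SQL*Loader log."""
    counts = {}
    for name, pattern in SUMMARY_PATTERNS.items():
        match = re.search(pattern, log_text)
        if match is None:
            raise ValueError(f"summary line for {name!r} not found")
        counts[name] = int(match.group(1))
    return counts


def is_balanced(counts):
    """Assumed invariant: read = loaded + rejected + discarded."""
    return counts["read"] == counts["loaded"] + counts["rejected"] + counts["discarded"]


customer_log = """
18481 Rows successfully loaded.
Total logical records read: 18485
Total logical records rejected: 4
Total logical records discarded: 0
"""
sales_log = """
2368 Rows successfully loaded.
Total logical records read: 2368
Total logical records rejected: 51
Total logical records discarded: 0
"""
for name, text in (("customer_dim", customer_log), ("sales_fact", sales_log)):
    counts = parse_summary(text)
    print(name, counts, "balanced" if is_balanced(counts) else "does not balance")
```

Under this assumption the customer_dim figures balance, while the sales_fact figures as printed do not, so that log would be worth rechecking against the original file.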
REPORTS ANALYSIS
Microsoft Excel and Power Pivot were used to create the reports.
Please see the query reports Excel file for the results.
/* =================== ===================== */
Query 1
/* =================== ===================== */
select cust.cust_no, cust.gender, RANK() OVER(ORDER BY sum(final_amount))
from customer_dim cust inner join sales_fact sal
on cust.cust_no = sal.cust_no
where final_amount is not null
group by cust.cust_no, cust.gender;
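Query 1 ranks customers by their total final_amount. Because the ORDER BY inside the OVER clause has no DESC, the smallest total receives rank 1, and tied totals share a rank, after which the numbering skips. The Python sketch below mimics that behavior on made-up totals; it illustrates the semantics and is not how Oracle computes the result.

```python
def sql_rank(totals, descending=False):
    """Mimic RANK() OVER (ORDER BY total): ties share a rank, then ranks skip."""
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=descending)
    ranks = {}
    previous_total = None
    current_rank = 0
    for position, (key, total) in enumerate(ordered, start=1):
        if total != previous_total:
            current_rank = position  # a new value takes its position as its rank
            previous_total = total
        ranks[key] = current_rank
    return ranks


# Hypothetical per-customer totals.
totals = {"A": 100, "B": 250, "C": 100, "D": 400}
print(sql_rank(totals))                   # ascending, as in Query 1
print(sql_rank(totals, descending=True))  # largest total ranked first
```

If the report is meant to put the biggest spender at rank 1, adding DESC to the ORDER BY in the query would be the corresponding change.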
/* =================== ===================== */
Query 2A
/* =================== ===================== */
select prod.product_no, ltime.fiscal_year, sum(sal.final_amount)
from product_dim prod inner join sales_fact sal
on prod.product_no = sal.product_no
inner join time_dim ltime on ltime.time_id = sal.time_id
where ltime.fiscal_year = 2002
group by grouping sets(ltime.fiscal_year, prod.product_no);
/* =================== ===================== */
Query 2B
/* =================== ===================== */
select prod.product_no, ltime.fiscal_year, sum(sal.final_amount)
from product_dim prod inner join sales_fact sal
on prod.product_no = sal.product_no
inner join time_dim ltime on ltime.time_id = sal.time_id
where ltime.fiscal_year = 2003
group by grouping sets(ltime.fiscal_year, prod.product_no);
/* =================== ===================== */
Query 2C
/* =================== ===================== */
select prod.product_no, ltime.fiscal_year, sum(sal.final_amount)
from product_dim prod inner join sales_fact sal
on prod.product_no = sal.product_no
inner join time_dim ltime on ltime.time_id = sal.time_id
where ltime.fiscal_year = 2004
group by grouping sets(ltime.fiscal_year, prod.product_no);
/* =================== ===================== */
Query 2D
/* =================== ===================== */
select prod.product_no, ltime.fiscal_year, sum(sal.final_amount)
from product_dim prod inner join sales_fact sal
on prod.product_no = sal.product_no
inner join time_dim ltime on ltime.time_id = sal.time_id
where ltime.fiscal_year = 2005
group by grouping sets(ltime.fiscal_year, prod.product_no);
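Queries 2A to 2D use GROUPING SETS with two single-column sets, so each query returns one total for the fiscal year (with product_no null) and one total per product (with fiscal_year null). The Python sketch below reproduces that shape on made-up rows, with None standing in for NULL. It is an illustration of the grouping semantics only.

```python
from collections import defaultdict


def grouping_sets(rows, key_columns, measure):
    """One group-by pass per key column; the other key columns are None."""
    result = []
    for active in key_columns:
        totals = defaultdict(float)
        for row in rows:
            totals[row[active]] += row[measure]
        for value, total in sorted(totals.items()):
            out = {col: None for col in key_columns}
            out[active] = value
            out[measure] = total
            result.append(out)
    return result


# Hypothetical fact rows, already filtered to one fiscal year.
rows = [
    {"product_no": 1, "fiscal_year": 2002, "final_amount": 10.0},
    {"product_no": 2, "fiscal_year": 2002, "final_amount": 5.0},
    {"product_no": 1, "fiscal_year": 2002, "final_amount": 2.5},
]
for line in grouping_sets(rows, ("fiscal_year", "product_no"), "final_amount"):
    print(line)
```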
/* =================== ===================== */
/* =================== ===================== */
QUERY 3 A
/* =================== ===================== */
select salter.SALES_TERRITORY_CODE, sum(final_amount)
from sale_territory_dim salter inner join sales_fact sal
on salter.SALES_TERRITORY_CODE = sal.SALES_TERRITORY_CODE
where final_amount is not null
group by rollup(salter.SALES_TERRITORY_CODE);
/* =================== ===================== */
/* =================== ===================== */
QUERY 3 B
/* =================== ===================== */
select salter.SALES_TERRITORY_REGION, sum(final_amount)
from sale_territory_dim salter inner join sales_fact sal
on salter.SALES_TERRITORY_CODE = sal.SALES_TERRITORY_CODE
where final_amount is not null
group by rollup(salter.SALES_TERRITORY_REGION);
/* =================== ===================== */
/* =================== ===================== */
QUERY 3 C
/* =================== ===================== */
select salter.SALES_TERRITORY_GROUP, sum(final_amount)
from sale_territory_dim salter inner join sales_fact sal
on salter.SALES_TERRITORY_CODE = sal.SALES_TERRITORY_CODE
where final_amount is not null
group by Cube(salter.SALES_TERRITORY_GROUP);
/* =================== ===================== */
/* =================== ===================== */
QUERY 3 D
/* =================== ===================== */
select salter.SALES_TERRITORY_COUNTRY, sum(final_amount)
from sale_territory_dim salter inner join sales_fact sal
on salter.SALES_TERRITORY_CODE = sal.SALES_TERRITORY_CODE
where final_amount is not null
group by Cube(salter.SALES_TERRITORY_COUNTRY);
/* =================== ===================== */
/* =================== ===================== */
QUERY 4
/* =================== ===================== */
select salter.SALES_TERRITORY_CODE, ltime.fiscal_year, sum(final_amount)
from sale_territory_dim salter inner join sales_fact sal
on salter.SALES_TERRITORY_CODE = sal.SALES_TERRITORY_CODE
inner join time_dim ltime
on ltime.time_id = sal.time_id
where final_amount is not null
group by rollup(salter.SALES_TERRITORY_CODE, ltime.fiscal_year);
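With two columns, ROLLUP(territory, fiscal_year) returns three levels of totals: one per territory and year, a subtotal per territory, and a grand total. The Python sketch below shows those levels on made-up rows, with None standing in for the NULL that marks a subtotal column. It does not reproduce the row order Oracle would return.

```python
from collections import defaultdict


def rollup(rows, key_columns, measure):
    """Aggregate at every prefix of key_columns, from most detailed to grand total."""
    result = []
    for depth in range(len(key_columns), -1, -1):
        active = key_columns[:depth]
        totals = defaultdict(float)
        for row in rows:
            totals[tuple(row[col] for col in active)] += row[measure]
        for key, total in sorted(totals.items()):
            out = {col: None for col in key_columns}
            out.update(zip(active, key))
            out[measure] = total
            result.append(out)
    return result


# Hypothetical fact rows.
sales = [
    {"territory": 1, "fiscal_year": 2002, "final_amount": 10.0},
    {"territory": 1, "fiscal_year": 2003, "final_amount": 5.0},
    {"territory": 2, "fiscal_year": 2002, "final_amount": 1.0},
]
for line in rollup(sales, ("territory", "fiscal_year"), "final_amount"):
    print(line)
```

For a single column, as in Queries 3A to 3D, ROLLUP and CUBE produce the same two levels: one row per value plus a grand total.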
/* =================== ===================== */
QUERY 5
/* =================== ===================== */
select d.Currency_Name, ltime.Fiscal_Year, sum(final_amount) as total_amount
from Currency_dim d
inner join SALES_FACT sal
on d.currency_no = sal.currency_no
inner join TIME_DIM ltime
on ltime.time_id = sal.Time_ID
inner join SALE_TERRITORY_DIM salter
on salter.SALES_TERRITORY_CODE = sal.SALES_TERRITORY_CODE
where ltime.Fiscal_Year in (2002, 2003, 2004, 2005)
group by ROLLUP(sal.currency_no, d.Currency_Name, ltime.Fiscal_Year);
/* =================== ===================== */
QUERY 6
/* =================== ===================== */
select d.EDUCATIONAL_LEVEL, ltime.Fiscal_Year, sum(final_amount) as total_amount
from CUSTOMER_DIM d inner join
SALES_FACT f
on f.cust_no = d.CUST_NO
inner join TIME_DIM ltime
on ltime.time_id = f.Time_ID
where ltime.Fiscal_Year in (2002, 2003, 2004, 2005)
group by rollup(d.EDUCATIONAL_LEVEL, ltime.Fiscal_Year);
/* =================== ===================== */
QUERY 7
/* =================== ===================== */
select d.JOB_CLASS, ltime.Fiscal_Year, sum(final_amount) as total_amount
from customer_DIM d inner join
SALES_FACT f
on f.cust_no = d.CUST_NO
inner join TIME_DIM ltime
on ltime.time_id = f.Time_ID
where ltime.Fiscal_Year in (2002, 2003, 2004, 2005)
group by rollup(d.JOB_CLASS, ltime.Fiscal_Year);
/* =================== ===================== */
QUERY 8
/* =================== ===================== */
select d.CUST_CITY, ltime.Fiscal_Year, sum(final_amount) as total_amount
from customer_DIM d inner join
SALES_FACT f
on f.cust_no = d.CUST_NO
inner join TIME_DIM ltime
on ltime.time_id = f.Time_ID
where ltime.Fiscal_Year in (2002, 2003, 2004, 2005)
group by rollup(d.CUST_CITY, ltime.Fiscal_Year);
/* =================== ===================== */
QUERY 9
/* =================== ===================== */
select d.MARITAL_STATUS, ltime.Fiscal_Year, sum(final_amount) as total_amount
from customer_DIM d inner join
SALES_FACT f
on f.cust_no = d.CUST_NO
inner join TIME_DIM ltime
on ltime.time_id = f.Time_ID
where ltime.Fiscal_Year in (2002, 2003, 2004, 2005)
group by ROLLUP(d.MARITAL_STATUS, ltime.Fiscal_Year);
/* =================== ===================== */
QUERY 10
/* =================== ===================== */
select d.YEARLY_INCOME, ltime.Fiscal_Year, sum(final_amount) as total_amount
from customer_DIM d inner join
SALES_FACT f
on f.cust_no = d.CUST_NO
inner join TIME_DIM ltime
on ltime.time_id = f.Time_ID
where ltime.Fiscal_Year in (2002, 2003, 2004, 2005)
group by ROLLUP(d.YEARLY_INCOME, ltime.Fiscal_Year);
/* =================== ===================== */
QUERY 11
/* =================== ===================== */
select d.YEARLY_INCOME, ltime.Fiscal_Year, sum(final_amount) as total_amount
from customer_DIM d inner join
SALES_FACT f
on f.cust_no = d.CUST_NO
inner join TIME_DIM ltime
on ltime.time_id = f.Time_ID
where ltime.Fiscal_Year in (2002, 2003, 2004, 2005)
group by ROLLUP(d.YEARLY_INCOME, ltime.Fiscal_Year);
PROJECT MANAGEMENT
PROJECT MANAGEMENT DOCUMENTS
Project Management file
Members             Role                          Backup
Luc MBENOUN         Business Analyst              Catherine
Catherine NEWSOME   DBA                           Luc
Sunny OKORO         DBA                           Ramjee
Ramjee PAHADEE      Project Manager / Team Lead   Sunny
Table 1 Work Breakdown Schedule
Software Utilized
Microsoft SQL Server 2008
Microsoft Visio 2010
Microsoft Project 2007
Microsoft Word, Excel and Access 2010
Oracle JDeveloper 11g
Oracle Data Modeler
Oracle Database 11g and 10g
Microsoft PowerPivot
COMMUNICATION PLAN
Throughout the course, the group used e-mail, chat rooms and the group conference board to communicate with each other.
LESSONS LEARNED
Technical
Working with both Oracle (11g and 10g) and Microsoft SQL Server 2008 produced many challenges. One problem is the difference in
syntax, or language structure, between the three database systems. To create a temporary table in Oracle, the SQL syntax is create table
table_name as select columns <filter or join operations>, but in SQL Server the syntax is select columns into table_name from
table_name. It took me a while to understand that when working with SQL Server I had to set aside my Oracle knowledge completely,
because it does not work most of the time. One example was the DESC command used to describe a table's structure in Oracle. Many times
I applied that command in SQL Server, only to receive error messages, instead of using USE [database_name] followed by a SQL Server
command such as EXEC sp_help 'table_name'. Similar errors found their way into Oracle as well, such as typing USE [database_name]
each time I tried to execute a SQL statement.
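The difference between the two "create a table from a query" forms can be put side by side. The Python sketch below builds the statement text for each dialect from the same inputs. The helper name copy_table_sql and the table name stage_customer are hypothetical and used only for illustration. The sketch only formats strings and does not connect to either database.

```python
def copy_table_sql(dialect, new_table, columns, source, condition=None):
    """Build a 'create a table from a query' statement for the given dialect."""
    column_list = ", ".join(columns)
    where = f" WHERE {condition}" if condition else ""
    if dialect == "oracle":
        # Oracle: CREATE TABLE ... AS SELECT
        return f"CREATE TABLE {new_table} AS SELECT {column_list} FROM {source}{where}"
    if dialect == "sqlserver":
        # SQL Server: SELECT ... INTO new_table FROM ...
        return f"SELECT {column_list} INTO {new_table} FROM {source}{where}"
    raise ValueError(f"unsupported dialect: {dialect}")


print(copy_table_sql("oracle", "stage_customer", ["cust_no", "gender"], "customer_dim"))
print(copy_table_sql("sqlserver", "stage_customer", ["cust_no", "gender"], "customer_dim"))
```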
Data extraction remained the biggest technical challenge, since I had reformatted my computer and as a result completely wiped out
all of the database systems, including SQL Server and MySQL. It took me a few days to get MS SQL Server back online after installing all
the Windows components. After I discovered a possible data source for the project, I communicated back and forth
with other members to determine the types of data we needed for the project. From those communications with Ms. Catherine, I was
able to determine the best course to take with AdventureWorksDW: eliminating unwanted columns and combining several tables
to reduce the size of the tables needed to fit our model. At times the tables would not join because of security protocols embedded in
their structure, which made me create a temporary staging table with the same columns and data types as the joined tables. The
staging tables were then populated manually with individual select statements from the two tables.
Transferring the database from SQL Server to Oracle or to other file formats reminded me of the classic chases of Tom and Jerry or
Bugs Bunny and friends, considering that each database has a different structure. I researched various tools, but each of them has a set
of protocols that never worked because of compatibility issues. I discovered a tutorial on YouTube for the MS SQL Server Import and
Export utilities, which allowed me to transfer database files into text and CSV files or into other database systems like Access and
Oracle 11g. The table and column names have to match Oracle's naming standards to allow individual columns to be selected in Oracle,
as in Select cust_no FROM customer_dim; otherwise the database would only allow Select * from Customer_dim. The tables and columns
had to be recreated in SQL Server with shorter names, using column aliases in the Select statements used to create the temporary
tables, before they were exported to Oracle.
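The renaming step can be sketched as a small helper that turns a source column name into one Oracle accepts without quoting. The Python sketch below assumes the Oracle 10g/11g rules for unquoted identifiers: at most 30 characters, starting with a letter, and containing only letters, digits, _, $ and #. The function name and the C_ prefix are hypothetical choices. It does not check for reserved words or for collisions after truncation, which a real conversion would also need to handle.

```python
import re

ORACLE_MAX_IDENTIFIER_LENGTH = 30  # limit in Oracle 10g and 11g


def to_oracle_identifier(name, max_length=ORACLE_MAX_IDENTIFIER_LENGTH):
    """Convert an arbitrary column name into an unquoted Oracle-style identifier."""
    # Replace runs of disallowed characters (spaces, brackets, dots) with "_".
    cleaned = re.sub(r"[^A-Za-z0-9_$#]+", "_", name.strip()).strip("_").upper()
    # Identifiers must start with a letter; the "C_" prefix is an arbitrary choice.
    if not cleaned or not cleaned[0].isalpha():
        cleaned = "C_" + cleaned
    return cleaned[:max_length].rstrip("_")


for source_name in ("Sales Territory Alternate Key", "2nd Address Line", "[Yearly Income]"):
    print(source_name, "->", to_oracle_identifier(source_name))
```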
Loading the data into the UMUC database failed many times. Missing data (sometimes a whole column's worth) and mismatches between
column names occurred more often than failed triggers. I regularly communicated with Ms. Catherine to discuss the status of the
loading process, and each time we identified new problems we evaluated several alternatives to fix them. One alternative I
discovered was that Oracle JDeveloper can generate insert statements from a text file based on predefined tables in an Oracle database,
which eliminates the need to write them out manually, an important saving given the large amount of data in a data warehouse. Using the
insert statements, our data warehouse could easily have been populated, bypassing the SQL*Loader process, if we had continued to
experience problems with SQL*Loader.
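The idea of generating insert statements from a delimited text file can be sketched in a few lines of Python. This is a stand-in for illustration, not a description of how JDeveloper does it. Every field is emitted as a quoted string literal, relying on Oracle's implicit conversion for numeric columns, and empty fields become NULL. Both choices are assumptions of the sketch, and the sample rows are made up.

```python
import csv
import io


def sql_literal(value):
    """Quote a CSV field as a SQL string literal; empty fields become NULL."""
    if value == "":
        return "NULL"
    return "'" + value.replace("'", "''") + "'"


def insert_statements(table, columns, lines):
    """Yield one INSERT statement per CSV record."""
    column_list = ", ".join(columns)
    for row in csv.reader(lines):
        values = ", ".join(sql_literal(field) for field in row)
        yield f"INSERT INTO {table} ({column_list}) VALUES ({values});"


# Hypothetical rows for the currency dimension.
data = io.StringIO('1,"US Dollar"\n2,\n')
for statement in insert_statements("currency_dim", ["CURRENCY_NO", "CURRENCY_NAME"], data):
    print(statement)
```

Row-by-row inserts are much slower than a bulk load for tables the size of customer_dim, so this approach fits best as a fallback for small dimensions.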
With missing data or mismatched column names, I had to go back to SQL Server to verify that the tables had the
correct data; in some cases they had to be recreated and then exported again to CSV and text files for a reload. The third
problem with data loading occurred because the trigger created to generate alternate keys for sales_fact,
sale_territory_dim and currency_dim failed. As a result, the entire schema was changed by dropping the columns and then reloading the
data into the data warehouse after verifying each table's data integrity in SQL Server.
Reverse engineering with Microsoft Visio to capture the dimensional model in Oracle was problematic, even though it worked
with MS SQL Server: there, Visio let me capture the AdventureWorksDW dimensional model by connecting to SQL Server and
selecting specific tables from which to generate the model. An alternative was to use Oracle Data Modeler to carry out the same
process and capture the dimensional model of our data warehouse after it had been implemented, including the indexes and alternate keys
that are not shown in the original model. This process made it easier for us to capture any changes to the schema without having
to redraw the model by hand. In the end, Microsoft products mostly work well with other Microsoft products, and products from other
vendors like Oracle work well with themselves.
CONCLUSION
KostLess envisions its tools allowing retailers to take full advantage of the information available in their enterprise systems.
By implementing a data warehouse solution, KostLess can store all of its sales-related data captured from point-of-sale systems in one
centralized location, which makes it easier for the various managers and decision makers within the company to generate reports in
one consistent format. With a more timely decision-making process, retailers are better able to identify and exploit
trends or competitive advantages.