PIMS Data Warehouse
COMSATS Institute of Information Technology, Islamabad
Patient Information and Monitoring
System Using Data Warehousing
By
Tahir Ayoub
SP08-BCS-052
Faraz Ahmed
SP08-BCS-015
Supervisor: Muhammad Mustafa Khattak
Bachelor of Computer Science (2008-2012)
The candidates confirm that the work submitted is their own and that appropriate credit
has been given where reference has been made to the work of others
PIMS Data Warehouse
2
DECLARATION
We hereby declare that this software, neither as a whole nor in part, has been
copied from any source. We further declare that we developed this software and the
accompanying report entirely on the basis of our personal efforts, made under the
sincere guidance of our seniors and teachers. If any part of this report is proved to be
copied or previously reported, we shall stand by the consequences.
No portion of the work presented in this report has been submitted in support of any
other degree or qualification at this or any other university or institute of learning.
Tahir Ayoub
SP08-BCS-052
Faraz Ahmed
SP08-BCS-015
CERTIFICATE OF
APPROVAL
This is to certify that the BS (CS) final year project “PATIENT INFORMATION AND
MONITORING SYSTEM USING DATA WAREHOUSE” was developed by “Tahir
Ayoub (CIIT/SP08-BCS-052)” and “Faraz Ahmed (CIIT/SP08-BCS-015)” under the
supervision of “Muhammad Mustafa Khattak”, and that in their opinion it is fully
adequate, in scope and quality, for the degree of Bachelor of Science in Computer
Science.
---------------------------------------
Supervisor
---------------------------------------
External Examiner
---------------------------------------
Head of Department
(Department of Computer Science)
EXECUTIVE SUMMARY
There are a number of reasons for which migration from a Relational Database
Management System (RDBMS) to a Data Warehouse is required.
A data warehouse is an informational environment that:
 Provides an integrated and total view of the enterprise.
 Makes the enterprise’s current and historical information easily available for
decision making.
 Makes decision-support transactions possible without hindering operational
systems.
 Renders the organization’s information consistent.
 Presents a flexible and interactive source of strategic information.
This is a solution for users with prior knowledge of data warehouse design
concepts. The warehouse will support such users in creating a data warehouse
schema from existing OLTP systems consisting of relational and file-based sources:
text files (Notepad), MS Excel, MS Access, and SQL Server 2008. The target
system, i.e. the warehouse, will be implemented in SQL Server 2008 R2.
ACKNOWLEDGMENT
ALLAH the Almighty! We are thankful to You for giving us the courage to take on
this project, and for Your infinite help in completing it. Without Your help we would
never have been able to complete this project.
Thanks to all the teachers for guiding us throughout our stay at this university, and to
all the friends for providing us beautiful company which we will never forget.
And last but not least, thanks to our parents, because without their love, affection,
and prayers for us, our studies and this project would not have been achievable.
-------------------------------- --------------------------------
Tahir Ayoub Faraz Ahmed
Abbreviations
ODS Operational Data Source
SSIS SQL Server Integration Services
SSAS SQL Server Analysis Services
SSRS SQL Server Reporting Services
DWH Data Warehouse
OLAP Online Analytical Processing
OLTP Online Transaction Processing
ETL Extract, Transform, Load
Contents
1. Introduction ..................................................................................................... 13
1.1 Brief .................................................................................................................................13
1.2 Relevance to Course Modules............................................................................................. 13
1.3 Project Background............................................................................................................ 14
1.4 Literature Review .............................................................................................................. 14
1.4.1 Area of Knowledge....................................................................................................... 14
1.4.2 Decision Support Systems (DSS)................................................................................... 15
1.4.3 Data Warehouse ........................................................................................................... 16
1.4.4 Development Lifecycle .................................................................................................16
1.4.5 Data Warehouse SDLC.................................................................................................18
1.4.6 Classical SDLC............................................................................................................ 18
1.4.7 Overview of ETL.......................................................................................................... 18
1.4.8 Major Functions ........................................................................................................... 19
1.5 Methodology and Software Life Cycle ................................................................................ 20
2 Problem Definition.............................................................................................. 22
2.1 Purpose............................................................................................................................. 22
2.2 Product Functions .............................................................................................................. 22
2.3 Proposed Architecture........................................................................................................ 23
2.3.1 Basics of Data Warehouse and ETL............................................................................... 23
2.3.2 Data Warehouse Architectures....................................................................................... 25
2.3.3 Basic Data Warehouse Architecture............................................................................... 25
Figure 3 Basic Data Warehouse Architecture.............................................................................. 25
2.3.4 Data Warehouse Architecture with Staging Area ............................................................ 26
Figure 4 Data Warehouse Architecture with Staging Area........................................................... 26
2.3.5 Data Warehouse Architecture with Staging Area and Data Marts..................................... 26
Figure 5 Data Warehouse Architecture with Staging Area and Data Marts....................................26
2.3.6 Data Warehouse Modeling ............................................................................................ 27
2.3.7 ETL Operation ............................................................................................................. 27
3 Requirements Analysis........................................................................................ 32
3.1 Project Overview............................................................................................................... 32
3.1.1 Data Profiling............................................................................................................... 32
3.1.2 Warehouse Schema Generation ..................................................................................... 32
3.1.3 Data Extraction............................................................................................................. 32
3.1.4 Data Transformation..................................................................................................... 32
3.1.5 Data Loading................................................................................................................ 32
3.2 Functional Requirements ....................................................................................................33
3.2.1 Data Profiling............................................................................................................... 33
3.2.2 Warehouse Schema Generation ..................................................................................... 33
3.2.3 Data Extraction............................................................................................................. 33
3.2.4 Data Transformation..................................................................................................... 34
3.2.5 Data Loading................................................................................................................ 35
3.3 Nonfunctional Requirements .............................................................................................. 35
3.3.1 Performance Requirements............................................................................................ 35
3.3.2 Safety Requirements ..................................................................................................... 35
3.3.3 Reliability Requirements............................................................................................... 35
3.4 External Interface Requirements ......................................................................................... 35
3.4.1 User Interface............................................................................................................... 35
3.4.2 Hardware Resources..................................................................................................... 35
3.4.3 Hardware Interfaces...................................................................................................... 36
3.5 Use Case Specifications ..................................................................................................... 36
3.5.1 Connect RDBMS User..................................................................................................36
3.5.2 Connect Data Warehouse Schema.................................................................................. 36
3.5.3 Connect database User..................................................................................................37
3.5.4 Load Relational Database Model................................................................................... 37
3.5.5 Identify Table Names....................................................................................................38
3.5.6 Identify Columns with Data Types................................................................................. 38
3.5.7 Identify Relationships between Tables ........................................................................... 39
3.5.8 Load Warehouse Schema.............................................................................................. 39
3.5.9 Map Columns............................................................................................................... 40
3.5.10 Extract Data from RDBMS........................................................................................... 41
3.5.11 Transform Extracted Data ............................................................................................. 41
3.5.12 Load Data in Warehouse............................................................................................... 42
4 The Design.......................................................................................................... 44
4.1 Modules............................................................................................................................ 44
4.1.1 Connectivity................................................................................................................. 44
4.1.2 RDBMS Details ........................................................................................................... 44
4.1.3 Schema Generation....................................................................................................... 44
4.1.4 Column Mappings ........................................................................................................ 45
4.1.5 Extraction .................................................................................................................... 45
4.1.6 Transformation............................................................................................................. 45
4.1.7 Loading ....................................................................................................................... 45
5 UML Structure Diagram....................................................................................... 45
5.1 Class Diagram................................................................................................................... 45
5.1.1 City lab........................................................................................................................ 45
Figure 7 City lab....................................................................................................................... 45
5.1.2 Health ways ................................................................................................................. 46
5.1.3 Clinic........................................................................................................................... 47
5.1.4 CMH hospital............................................................................................................... 48
5.1.5 Urwah lab.................................................................................................................... 49
5.2 Object diagram .................................................................................................................. 50
5.3 Component Diagram.......................................................................................................... 51
5.4 Deployment Diagram......................................................................................................... 52
5.5 Composite Structure Diagram............................................................................................. 53
5.6 Package diagram................................................................................................................ 54
6 UML Behavior Diagrams...................................................................................... 56
6.1 Use Case Diagram ............................................................................................................. 56
6.2 Activity Diagram ............................................................................................................... 57
6.2.1 Create new project........................................................................................................ 57
6.2.2 Open existing project....................................................................................................57
6.2.3 Close project................................................................................................................ 58
Figure 20 Close project............................................................................................................. 58
6.2.4 Create mapping ............................................................................................................ 58
6.2.5 Load RDBMS.............................................................................................................. 59
6.3 State Machine diagram....................................................................................................... 60
6.3.1 Report.......................................................................................................................... 60
6.3.2 ETL............................................................................................................................. 61
7 UML Interaction Diagrams................................................................................... 63
7.1 Sequence Diagram............................................................................................................. 63
7.1.1 Create NewProject....................................................................................................... 63
7.1.2 Open Existing Project ...................................................................................................64
7.1.3 Close Project................................................................................................................ 65
7.1.4 Load RDBMS Details ...................................................................................................66
7.1.5 Create Schema.............................................................................................................. 67
7.1.6 Create Mappings .......................................................................................................... 69
7.1.7 Data Extraction............................................................................................................. 70
7.1.8 Data Transformations....................................................................................................71
7.1.9 Report Generation ........................................................................................................ 73
7.1.10 ETL............................................................................................................................. 74
7.2 Communication Diagram ...................................................................................................75
7.2.1 ETL............................................................................................................................. 75
7.2.2 Report.......................................................................................................................... 75
7.3 Interaction Overview.......................................................................................................... 76
7.3.1 ETL............................................................................................................................. 76
7.3.2 Warehouse Interaction ..................................................................................................76
7.3.3 Access Model............................................................................................................... 77
8 Implementation.................................................................................................. 79
8.1 System Implementation...................................................................................................... 79
8.2 Back end software SQL server r2........................................................................................ 79
8.2.1 Snowflake Schema ....................................................................................................... 79
Figure 41 Snowflake Schema ....................................................................................................79
8.2.2 ETL SSIS..................................................................................................................... 80
8.2.3 Overview of PIMS ETL................................................................................................ 80
8.2.4 General Overview of PIMS City Lab ETL...................................................................... 81
8.2.5 General Overview of PIMS Clinics ETL ........................................................................ 81
8.2.6 General Overview of PIMS CMH Hospital ETL............................................................. 82
8.2.7 General Overview of PIMS Fact table ETL ....................................................................83
8.2.8 General Overview of PIMS Health ways ETL .................................................84
8.2.9 General Overview of PIMS Urwah Lab ETL..................................................................85
8.2.10 General Overview of PIMS ETL of date format.............................................................. 86
8.2.11 PIMS ETL of date format.............................................................................................. 86
8.3 SSAS................................................................................................................................ 88
8.3.1 Overview of OLAP Cube.............................................................................................. 88
8.3.2 Overview of OLAP Cube drill down.............................................................................. 89
8.4 General Overview of Dimensions ....................................................................................... 90
8.4.1 Overview of Date time dimensions ................................................................................ 90
8.4.2 Date time English month calculation.............................................................................. 91
8.4.3 Overview of Doctor Dimension ..................................................................................... 92
8.5 Graphical User Interface..................................................................................................... 93
8.5.1 Overview of SSRS report 1........................................................................................... 93
8.5.2 SSRS report 1 month wise............................................................................................. 94
8.5.3 SSRS report 2 total Patients in different cities w.r.t year.................................................. 95
8.5.4 SSRS report 3 disease wise ........................................................................................... 96
8.5.5 SSRS report 3a Total patients affected by disease in different years.................................97
8.5.6 SSRS report 3c tabular disease report............................................................................. 98
8.5.7 SSRS report 4 Patient Detail.......................................................................................... 99
8.5.8 SSRS report 5 Dr. Detail............................................................................................. 100
8.5.9 SSRS report 5a Dr. Detail .......................................................................................... 101
8.5.10 SSRS report 5b Patient Detail...................................................................................... 101
8.5.11 SSRS report 5c Report Patient Detail........................................................................... 102
9 Testing and Evaluation.......................................................................................104
9.1 Testing............................................................................................................................ 104
9.1.1 Black Box Testing ...................................................................................................... 104
9.2 Testing of PIMS Data warehouse...................................................................................... 105
9.3 Test Cases....................................................................................................................... 106
10 Future Work...............................................................................................111
11 References.................................................................................................113
Table of Figures
Figure 1 Methodology and Software Life Cycle................................................................. 20
Figure 2 Contrasting OLTP and Data Warehousing Environments....................................... 24
Figure 3 Basic Data Warehouse Architecture.................................................................... 25
Figure 4 Data Warehouse Architecture with Staging Area................................................. 26
Figure 5 Data Warehouse Architecture with Staging Area and Data Marts ......................... 26
Figure 6 ETL Operation.................................................................................................... 28
Figure 7 City lab.............................................................................................................. 45
Figure 8 Healthways....................................................................................................... 46
Figure 9 Clinic................................................................................................................. 47
Figure 10 CMH hospital................................................................................................... 48
Figure 11 Urwah lab........................................................................................................ 49
Figure 12 Object diagram................................................................................................ 50
Figure 13 Component Diagram........................................................................................ 51
Figure 14 Deployment Diagram....................................................................................... 52
Figure 15 Composite Structure Diagram........................................................................... 53
Figure 16 Package diagram.............................................................................................. 54
Figure 17 Use Case Diagram............................................................................................ 56
Figure 18 Activity Diagram(Create new project)............................................................... 57
Figure 19 Activity Diagram(Open existing project)............................................................ 57
Figure 20 Close project.................................................................................................... 58
Figure 21 Activity Diagram(Create mapping).................................................................... 58
Figure 22 Activity Diagram(Load RDBMS) ........................................................................ 59
Figure 23 State Machine Diagram (Report)....................................................................... 60
Figure 24 State Machine Diagram (ETL)............................................................................ 61
Figure 25 Sequence Diagram (Create New Project)........................................................... 63
Figure 26 Sequence Diagram (Open Existing Project) ........................................................ 64
Figure 27 Sequence Diagram (Close Project)..................................................................... 65
Figure 28 Sequence Diagram (Load RDBMS Details).......................................................... 66
Figure 29 Sequence Diagram (Create Schema).................................................................. 68
Figure 30 Sequence Diagram (Create Mappings)............................................................... 69
Figure 31 Sequence Diagram (Data Extraction)................................................................. 70
Figure 32 Sequence Diagram (Data Transformations)....................................................... 71
Figure 33 Sequence Diagram (Data Loading)..................................................................... 72
Figure 34 Sequence Diagram (Report Generation)............................................................ 73
Figure 35 Sequence Diagram (ETL)................................................................................... 74
Figure 36 Communication Diagram (ETL).......................................................................... 75
Figure 37 Communication Diagram (Report)..................................................................... 75
Figure 38 Interaction Overview (ETL) ............................................................................... 76
Figure 39 Interaction Overview (Warehouse Interaction).................................................. 76
Figure 40 Web Diagrams (Access Model).......................................................................... 77
Figure 41 Snowflake Schema........................................................................................... 79
Figure 42 Overview of PIMS ETL....................................................................................... 80
Figure 43 General Overview of PIMS City Lab ETL ............................................................. 81
Figure 44 General Overview of PIMS Clinics ETL................................................................ 81
Figure 45 General Overview of PIMS CMH Hospital ETL..................................................... 82
Figure 46 General Overview of PIMS Fact table ETL .......................................................... 83
Figure 47 General Overview of PIMS Health ways ETL........................................................ 84
Figure 48 General Overview of PIMS Urwah Lab ETL ......................................................... 85
Figure 49 General Overview of PIMS ETL of date format................................................... 86
Figure 50 PIMS ETL of date format................................................................................... 87
Figure 51 Overview of OLAP Cube.................................................................................... 88
Figure 52 Overview of OLAP Cube drill down.................................................................... 89
Figure 53 Overview of Date time dimensions.................................................................... 90
Figure 54 Date time English month calculation................................................................. 91
Figure 55 Overview of Doctor Dimension......................................................................... 92
Figure 56 Overview of SSRS report 1................................................................................ 93
Figure 57 SSRS report 1 month wise................................................................................. 94
Figure 58 SSRS report 2 total Patients in different cities w.r.t year..................................... 95
Figure 59 SSRS report 3 disease wise ............................................................................... 96
Figure 60 SSRS report 3a Total patients affected by disease in different years.................... 97
Figure 61 SSRS report Total revenue of CMH Hospitals yearly............................................ 98
Figure 62 SSRS report 4 Patient Detail.............................................................................. 99
Figure 63 SSRS report 5 Dr. Detail...................................................................................100
Figure 64 SSRS report 5b Patient Detail...........................................................................101
Figure 65 SSRS report 5c Report Patient Detail................................................................102
Chapter 1
Introduction
1. Introduction
1.1 Brief
This document includes a detailed description of the “Patient Information and
Monitoring System Using Data Warehousing”. It covers all phases of system
development, including requirements analysis, design, implementation, and
testing.
The aim of this project is to build a patient information and monitoring system using
data warehousing. The system not only facilitates hospitals and doctors but also
helps the government sector make critical decisions; our project shall assist them in
decision making. The project is divided into several parts:
First come the operational data sources (ODS); our project has five different ODSs:
 Healthways.xlx
 Urwahlab.xlsx
 CMH Hospitals SQL
 Clinics.txt
 City_Lab.accdb
Second comes the ETL part:
 The tool used for ETL is SSIS, which extracts data from the ODSs,
transforms it to our standard formats, and loads it into the DWH.
Third, after ETL we built OLAP cubes:
 The tool used was SSAS.
Fourth, after building the OLAP cubes we created reports (with queries against
both the OLAP cubes and the DWH):
 The tool used to build the reports is SSRS.
Finally, these reports are presented to front-end users; the tool used for this is
ASP.NET.
1.2 Relevance to Course Modules
The “Database” course provided us with basic knowledge of databases, which is
one of the core requirements of our project. “Human Computer Interaction”
helped us in designing a user-friendly GUI and reports. Most important were the
internet tutorials through which we learned a great deal about data warehousing,
which was very beneficial during the project.
1.3 Project Background
Health care has become one of the most important service industries undergoing
rapid structural transformation. Yet healthcare remains a paper-intensive,
minimally automated and digitized industry: CBS MarketWatch reported that an
estimated 90% of all patient information remains on paper. Worldwide, the
healthcare industry is looking for technology that can help establish online
clinical repositories, enabling rapid access to shared information that can help
find cures for prevalent medical conditions.
As there is no such concept of a data warehouse in Pakistan's healthcare sector,
we are building this project; its basic theme is a patient information and
monitoring system using data warehousing. It facilitates not only hospitals and
doctors but also the government sector in making critical decisions; our project
shall support them in decision making.
1.4 Literature Review
The origins of DSS processing hark back to the very early days of computers and
information systems. It is interesting that decision support system (DSS) processing
developed out of a long and complex evolution of information technology. Its
evolution continues today.
The data warehouse architecture has evolved throughout the different stages of
information processing. The information contained in a warehouse flows from the
same operational systems that could not be directly used to produce strategic
information. The data warehouse user, also called the DSS analyst, is a business
person first and foremost, and a technician second. The primary job of the DSS
analyst is to define and discover information used in corporate decision-making.
To develop a complete understanding, this chapter starts by explaining the data
warehouse and the basics of ETL operations.
1.4.1 Area of Knowledge
This project is mainly concerned with the design of a healthcare data warehouse
and the ETL operations required to populate it. Complete and clear knowledge of
decision support systems (DSS) and data warehouse design is essential for
understanding this project.
1.4.2 Decision Support Systems (DSS)
By the mid-1970s, online transaction processing (OLTP)
made faster access to data possible, opening whole new vistas for business and
processing. The computer could now be used for tasks not previously possible,
including driving reservations systems, bank teller systems, manufacturing control
systems, and the like.
Throughout this period, organizations accumulated growing amounts of data stored in
their operational databases. Now that such systems are commonplace,
organizations are focusing on ways to use operational data to support
decision-making, as a means of gaining competitive advantage. Business executives
have become desperate for information to stay competitive and improve the bottom
line. Although operational systems provide information to run the day-to-day
operations, these cannot be readily used to make strategic decisions. Businesses,
therefore, are compelled to turn to new ways of getting strategic information. IT
departments have been attempting to provide information to the key business
personnel in their companies for making strategic decisions. Sometimes an IT
department could produce ad hoc reports from a single application. In most cases, the
reports would need data from multiple systems, requiring the rewriting of existing
programs to create intermediary files that could be used to produce ad hoc reports.
Most such attempts by IT ended in failure. The users could not clearly
define what they wanted in the first place. Once they saw the first set of reports, they
wanted more data in different formats. The chain continued. This was mainly because
of the very nature of the process of making strategic decisions. We have been trying
all along to provide strategic information from the operational systems. Information
needed for strategic decision making has to be available in an interactive manner.
The user must be able to query online, get results, and query some more. The
information must be in a format suitable for analysis. To provide strategic
information, we must get it from altogether different types of systems. For
example, the following queries cannot be answered by a simple operational
system, since it holds only current data:
 How profitable shall the company be next quarter?
 Who are the top ten customers during the last six months?
 What was the profit last month and how much did it differ from the profit of
the same month during the last three years?
 What is the relationship between the total annual revenue generated by each
branch office and the total number of sales staff assigned to each branch
office?
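As an illustration (with made-up rows and names, not the project's actual data), the "top ten customers during the last six months" query is exactly the kind of historical aggregation a warehouse serves:

```python
from collections import defaultdict
from datetime import date

# Illustrative sales history; an operational system typically keeps only
# current transactions, while a warehouse retains months or years of rows.
sales = [
    {"customer": "Alia",  "amount": 500, "sold_on": date(2012, 1, 15)},
    {"customer": "Bilal", "amount": 300, "sold_on": date(2012, 2, 10)},
    {"customer": "Alia",  "amount": 200, "sold_on": date(2011, 6, 1)},  # outside window
    {"customer": "Bilal", "amount": 900, "sold_on": date(2012, 3, 5)},
]

def top_customers(rows, start, end, n=10):
    """Rank customers by total revenue inside a historical window."""
    totals = defaultdict(int)
    for r in rows:
        if start <= r["sold_on"] <= end:
            totals[r["customer"]] += r["amount"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

ranking = top_customers(sales, date(2012, 1, 1), date(2012, 6, 30))
```

Answering this over an operational store would require history the OLTP system has already archived away.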
Operational systems support the business processes of the company; what
decision makers need is to watch how the business runs and then make strategic
decisions to improve it. The concept of the data warehouse is deemed the
solution: a system capable of supporting decision making, fed with data from
multiple data sources.
1.4.3 Data Warehouse
A data warehouse is a relational database that is designed for query and analysis
rather than for transaction processing. It usually contains historical data derived from
transaction data, but it can include data from other sources. It separates analysis
workload from transaction workload and enables an organization to consolidate data
from several sources.
In addition to a relational database, a data warehouse environment includes an
extraction, transportation, transformation, and loading (ETL) solution, an online
analytical processing (OLAP) engine, client analysis tools, and other applications that
manage the process of gathering data and delivering it to business users.
Data warehouse is an informational environment that:
 Provides an integrated and total view of the enterprise.
 Makes the enterprise's current and historical information easily available for
decision making.
 Makes decision-support transactions possible without hindering operational
systems.
 Renders the organization's information consistent.
 Presents a flexible and interactive source of strategic information.
The users of the data warehouse environment have a completely different approach to
using the system. Unlike operational users who have a straightforward approach to
defining their requirements, the data warehouse user operates in a mindset of
discovery. The end user of the data warehouse says, “Give me what I say I want, and
then I can tell you what I really want.”
1.4.4 Development Lifecycle
Operational data is usually application oriented and, as a consequence,
unintegrated, whereas data warehouse data must be integrated. Other major
differences also exist between the operational level of data and processing and
the data warehouse level of data and processing. The development life cycles of
these systems differ profoundly. The operational environment is supported by the
classical systems development life cycle (the SDLC). The SDLC is often called the
“waterfall” development approach because the different activities are specified
and one activity, upon its completion, spills down into the next activity and
triggers its start.
The development of the data warehouse operates under a very different life cycle,
sometimes called the CLDS (the reverse of the SDLC). The classical SDLC is driven
by requirements.
The CLDS is almost exactly the reverse: The CLDS starts with data. Once the data is
in hand, it is integrated and then tested to see what bias there is to the data, if any.
Programs are then written against the data. The results of the programs are analyzed,
and finally the requirements of the system are understood. The CLDS is usually called
a “spiral” development methodology.
The classical system development life cycle (SDLC) does not work in the world of the
DSS analyst. The SDLC assumes that requirements are known at the start of design
(or at least can be discovered). In the world of the DSS analyst, though, new
requirements usually are the last thing to be discovered in the DSS development life
cycle. The DSS analyst starts with existing requirements, but factoring in new
requirements is almost an impossibility. A very different development life cycle is
associated with the data warehouse.
1.4.5 Data Warehouse Life Cycle (CLDS)
 Implement Warehouse
 Integrate data
 Test for bias
 Program against data
 Design DSS system
 Analyze results
 Understand requirements
1.4.6 Classical SDLC
 Requirements gathering
 Analysis
 Design
 Programming
 Testing
 Integration
 Implementation
The CLDS is a classic data-driven development life cycle, while the SDLC is a classic
requirements-driven development life cycle.
1.4.7 Overview of ETL
Data needs to be loaded into the data warehouse regularly so that it can serve its
purpose of facilitating business analysis. To do this, data from one or more
operational systems must be extracted and copied into the warehouse. The process
of extracting data from source systems and bringing it into the data warehouse is
commonly called ETL, which stands for extraction, transformation, and loading. The
acronym ETL is perhaps too simplistic, because it omits the transportation phase and
implies that each of the other phases of the process is distinct. We refer to the entire
process, including data loading, as ETL; the term denotes a broad process, not
three well-defined steps.
The methodology and tasks of ETL have been well known for many years, and are not
necessarily unique to data warehouse environments: a wide variety of proprietary
applications and database systems are the IT backbone of any enterprise. Data has to
be shared between applications or systems, trying to integrate them, giving at least
two applications the same picture of the world. This data sharing was mostly
addressed by mechanisms similar to what is now called ETL.
Data warehouse environments face the same challenge with the additional burden that
they not only have to exchange but to integrate, rearrange and consolidate data over
many systems, thereby providing a new unified information base for business
intelligence. Additionally, the data volume in data warehouse environments tends to
be very large.
What happens during the ETL process? During extraction, the desired data is
identified and extracted from many different sources, including database systems and
applications. Very often, it is not possible to identify the specific subset of interest;
therefore more data than necessary has to be extracted, so the identification of the
relevant data shall be done at a later point in time. Depending on the source system's
capabilities (for example, operating system resources), some transformations may
take place during this extraction process. The size of the extracted data varies from
hundreds of kilobytes up to gigabytes, depending on the source system and the
business situation. The same is true for the time delta between two (logically)
identical extractions: the time span may vary between days/hours and minutes to near
real-time. Web server log files for example can easily become hundreds of megabytes
in a very short period of time.
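The extract, transform, and load steps described above can be sketched end to end as follows. The table, column, and code values here are illustrative stand-ins, not the project's actual schema, and SQLite stands in for the warehouse database:

```python
import sqlite3

# --- Extract: rows as they might arrive from a source system (e.g. a flat file)
source_rows = [
    ("p-001", "LAHORE ", "m"),
    ("p-002", "islamabad", "F"),
]

# --- Transform: standardize casing and gender codes across sources
def transform(row):
    pid, city, gender = row
    return (pid.upper(), city.strip().title(),
            {"m": "Male", "f": "Female"}[gender.lower()])

clean_rows = [transform(r) for r in source_rows]

# --- Load: bulk-insert the cleaned rows into the warehouse
dwh = sqlite3.connect(":memory:")
dwh.execute("CREATE TABLE patient_dim (patient_id TEXT, city TEXT, gender TEXT)")
dwh.executemany("INSERT INTO patient_dim VALUES (?, ?, ?)", clean_rows)
loaded = dwh.execute("SELECT COUNT(*) FROM patient_dim").fetchone()[0]
```

In the project itself these three steps are performed by SSIS packages rather than hand-written code.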
1.4.8 Major Functions
The basic purpose of the project is to build a healthcare system that helps not
only doctors and patients but also the government sector in taking major
decisions for the health department. The main features provided by the software
use ETL to:
 Create a definition of a data warehouse.
 Configure the definitions for a physical instance of the data warehouse.
 Validate the set of definitions and their configurations.
 Create and populate the data warehouse instance.
 Data transformations.
 Deploy and initially load the data warehouse instance.
 Maintain the physical instance by conditionally refreshing.
ETL supports the design of relational database schemas, ETL processes and End User
tool environments through the client.
Source systems play an important role in an ETL solution. Instead of creating metadata
manually, ETL provides integrated components that import the relevant information
into its repository.
To ensure the quality and completeness of the data in the repository, ETL provides
extensive validation within the repository. Validation helps to keep a complex system
in an accurate and coherent state.
1.5 Methodology and Software Life Cycle
Figure 1 Methodology and Software Life Cycle
Chapter 2
Problem definition
2 Problem Definition
2.1 Purpose
As there is no such concept of a data warehouse in Pakistan's healthcare sector,
the basic theme of this project is to build a patient information and monitoring
system using data warehousing. Worldwide, the healthcare industry is looking for
technology that can help establish online clinical repositories, enabling rapid
access to shared information that can help find cures for prevalent medical
conditions. The system facilitates not only hospitals and doctors but also the
government sector in making critical decisions; our project shall support them in
decision making.
Health care has become one of the most important service industries undergoing
rapid structural transformation, yet it remains a paper-intensive, minimally
automated and digitized industry: CBS MarketWatch reported that an estimated
90% of all patient information remains on paper.
2.2 Product Functions
A PIMS based on a data warehouse not only contains historical data but also
helps specific users in decision making.
 It helps users take decisions for the healthcare department.
 It helps users decide which patients' cities to focus on for disease-cure
purposes.
 The system provides not only graphical reports but also drill-down tabular
reports to help understand healthcare issues.
 The system calculates the spread of disease not only in the required cities but
also with respect to time (year, quarter, month, and day).
 The system tells doctors which patient age groups to focus on for a given
disease.
 The system also tells users how much revenue they generate per year, quarter,
month, day, hour, minute, and second, and from which patient.
 The system informs doctors not only how many patients they are handling but
also of each patient's gender, previous reports, disease status, and so on.
2.3 Proposed Architecture
2.3.1 Basics of Data Warehouse and ETL
2.3.1.1 What is a Data Warehouse?
A common way of introducing data warehousing is to refer to the defining
characteristics of a data warehouse:
 Subject Oriented
 Integrated
 Nonvolatile
 Time Variant
2.3.1.2 Subject Oriented
Data warehouses are designed to help you analyze data. For example, to learn more
about your company’s sales data, you can build a warehouse that concentrates on
sales. Using this warehouse, you can answer questions like "Who was our best
customer for this item last year?" This ability to define a data warehouse by subject
matter (sales, in this case) makes the data warehouse subject oriented.
2.3.1.3 Integrated
Integration is closely related to subject orientation. Data warehouses must put data
from disparate sources into a consistent format. They must resolve such problems as
naming conflicts and inconsistencies among units of measure. When they achieve
this, they are said to be integrated.
2.3.1.4 Nonvolatile
Nonvolatile means that, once entered into the warehouse, data should not change.
This is logical because the purpose of a warehouse is to enable you to analyze what
has occurred.
2.3.1.5 Time Variant
In order to discover trends in business, analysts need large amounts of data. This is
very much in contrast to online transaction processing (OLTP) systems, where
performance requirements demand that historical data be moved to an archive. A data
warehouse’s focus on change over time is what is meant by the term time variant.
2.3.1.6 Contrasting OLTP and Data Warehousing Environments
Figure 2 Contrasting OLTP and Data Warehousing Environments
Data warehouses and OLTP systems have very different requirements. Here are some
examples of differences between typical data warehouses and OLTP systems:
2.3.1.7 Workload
Data warehouses are designed to accommodate ad hoc queries. You might not know
the workload of your data warehouse in advance, so a data warehouse should be
optimized to perform well for a wide variety of possible query operations.
OLTP systems support only predefined operations. Your applications might be
specifically tuned or designed to support only these operations.
2.3.1.8 Data modifications
A data warehouse is updated on a regular basis by the ETL process (run nightly,
weekly, monthly, or yearly) using bulk data modification techniques. The end users of
a data warehouse do not directly update the data warehouse.
In OLTP systems, end users routinely issue individual data modification statements to
the database. The OLTP database is always up to date, and reflects the current state of
each business transaction.
2.3.1.9 Schema design
Data warehouses often use denormalized or partially denormalized schemas (such as a
star schema) to optimize query performance.
OLTP systems often use fully normalized schemas to optimize update/insert/delete
performance, and to guarantee data consistency.
2.3.1.10 Typical operations
A typical data warehouse query scans thousands or millions of rows. For example,
"Find the total sales for all customers last month."
A typical OLTP operation accesses only a handful of records. For example, "Retrieve
the current order for this customer."
2.3.1.11 Historical data
Data warehouses usually store many months or years of data. This is to support
historical analysis.
OLTP systems usually store data from only a few weeks or months. The OLTP
system stores only historical data as needed to successfully meet the requirements of
the current transaction.
2.3.2 Data Warehouse Architectures
Data warehouses and their architectures vary depending upon the specifics of an
organization's situation. Three common architectures are:
 Data Warehouse Architecture (Basic).
 Data Warehouse Architecture (with a Staging Area).
 Data Warehouse Architecture (with a Staging Area and Data Marts).
2.3.3 Basic Data Warehouse Architecture
Figure 3 Basic Data Warehouse Architecture
2.3.4 Data Warehouse Architecture with Staging Area
Figure 4 Data Warehouse Architecture with Staging Area
2.3.5 Data Warehouse Architecture with Staging Area and Data Marts
Figure 5 Data Warehouse Architecture with Staging Area and Data
Marts
2.3.6 Data Warehouse Modeling
One question that very often arises in data warehousing discussions is: which
data modeling tool is best for data warehousing? The answer is simple: your brain.
While all the various data modeling tools have their pros and cons, none of them is so
intrinsically better than the rest for data warehousing as to rate a recommendation. For
example, none of the current data modeling tools cleanly diagrams or records any
meta-data regarding how facts and aggregates might use partitioning and/or
materialized views. For data warehousing, the physical data model is useful merely as
a roadmap for the ETL programmers. The real physical object implementation is far
too complex for modeling tools to handle.
Some basic steps for transforming an OLTP model into a star schema design are:
 Denormalize lookup relationships.
 Denormalize parent/child relationships.
 Create and populate a time dimension.
 Create hierarchies of data within dimensions.
 Consider using surrogate or meaningless keys.
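The "create and populate a time dimension" step can be sketched as follows; the attribute set (year, quarter, month name, day) is a minimal illustrative choice:

```python
from datetime import date, timedelta

def build_date_dimension(start, end):
    """Generate one row per calendar day with the usual roll-up attributes."""
    rows, current = [], start
    while current <= end:
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),  # surrogate key
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month_name": current.strftime("%B"),
            "day": current.day,
        })
        current += timedelta(days=1)
    return rows

dim_date = build_date_dimension(date(2012, 1, 1), date(2012, 12, 31))
```

The date dimension is populated once up front, since every calendar day is known in advance.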
In dimensional modeling of a Data Warehouse, there are generally only two kinds of
tables:
2.3.6.1 Dimensions
Dimensions are relatively small, denormalized lookup tables containing business
descriptive columns that end-users reference to define their restriction criteria for ad-
hoc business intelligence queries.
2.3.6.2 Facts
Facts are extremely large tables whose primary keys are formed from the
concatenation of all the columns that are foreign keys referencing related dimension
tables. Facts also possess numerically additive, non-key columns utilized to satisfy
calculations required by end-user ad-hoc business intelligence queries. The key
point is that, to be successful, fact table implementations must accommodate
these differing requirements.
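A minimal star schema of this shape, one fact table keyed by its dimension foreign keys plus two small dimensions, might be declared as in this sketch (SQLite DDL for illustration; the table and column names are hypothetical, not the project's actual warehouse objects):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Small, denormalized lookup tables holding descriptive columns
CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY,
                          name TEXT, gender TEXT, city TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY,
                          year INTEGER, quarter INTEGER, month INTEGER);

-- Large fact table: primary key formed from the dimension foreign keys,
-- plus an additive measure (the billed amount)
CREATE TABLE fact_visit (
    patient_key INTEGER REFERENCES dim_patient(patient_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    amount      REAL,
    PRIMARY KEY (patient_key, date_key)
);
""")
tables = {r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")}
```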
2.3.7 ETL Operation
ETL involves three (3) major operations:
 Data Extraction.
 Transformation.
 Loading.
Figure 6 ETL Operation
2.3.7.1 Data Extraction
Extraction is the operation of extracting data from a source system for further use in a
data warehouse environment. This is the first step of the ETL process. After the
extraction, this data can be transformed and loaded into the data warehouse.
The source systems for a data warehouse are typically transaction processing
applications. For example, one of the source systems for a sales analysis data
warehouse might be an order entry system that records all of the current order
activities.
Designing and creating the extraction process is often one of the most time-consuming
tasks in the ETL process and, indeed, in the entire data warehousing process. The
source systems might be very complex and poorly documented, and thus determining
which data needs to be extracted can be difficult. The data has to be extracted
normally not only once, but several times in a periodic manner to supply all changed
data to the warehouse and keep it up-to-date. Moreover, the source system typically
cannot be modified, nor can its performance or availability be adjusted, to
accommodate the needs of the data warehouse extraction process.
These are important considerations for extraction and ETL in general. The
discussion here assumes that the data warehouse team has already identified the
data to be extracted, and covers common techniques for extracting data from
source databases.
Designing this process means making decisions about the following two main aspects:
 Which extraction method do I choose?
This influences the source system, the transportation process, and the time
needed for refreshing the warehouse.
 How do I provide the extracted data for further processing?
This influences the transportation method, and the need for cleaning and
transforming the data.
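One common answer to the first question is incremental extraction against a modification-timestamp watermark; the sketch below uses illustrative rows and field names:

```python
from datetime import datetime

# Rows as they might sit in a source OLTP table, each with a
# last-modified timestamp maintained by the source application
source_table = [
    {"id": 1, "modified": datetime(2012, 5, 1, 9, 0)},
    {"id": 2, "modified": datetime(2012, 5, 2, 14, 30)},
    {"id": 3, "modified": datetime(2012, 5, 3, 8, 15)},
]

def extract_changed(rows, watermark):
    """Pull only rows modified since the last successful extraction."""
    return [r for r in rows if r["modified"] > watermark]

# Watermark recorded after the previous run; only later changes are re-extracted
delta = extract_changed(source_table, datetime(2012, 5, 2, 0, 0))
```

This keeps the repeated, periodic extractions small relative to a one-time initial full load.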
2.3.7.2 Data Transformation
Data transformations are often the most complex and, in terms of processing time, the
most costly part of the ETL process. They can range from simple data conversions to
extremely complex data scrubbing techniques. Many, if not all, data transformations
can occur within a database, although transformations are often implemented outside
of the database (for example, on flat files) as well.
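A simple conversion-style transformation might look like the sketch below; the per-source date formats are assumptions for illustration (the source names echo the ODSs listed in Chapter 1):

```python
from datetime import datetime

# Each source ships dates in its own format; the warehouse wants ISO dates
SOURCE_DATE_FORMATS = {
    "clinics_txt": "%d/%m/%Y",   # assumed format for Clinics.txt
    "citylab_mdb": "%Y-%m-%d",   # assumed format for City_Lab.accdb
    "cmh_sql": "%b %d %Y",       # assumed format for CMH Hospitals SQL
}

def standardize_date(value, source):
    """Scrub and convert one date field to the warehouse's ISO format."""
    try:
        return datetime.strptime(
            value.strip(), SOURCE_DATE_FORMATS[source]).date().isoformat()
    except ValueError:
        return None  # unparseable values are flagged for cleansing, not loaded

row_a = standardize_date("25/12/2011", "clinics_txt")
row_b = standardize_date("Dec 25 2011", "cmh_sql")
bad = standardize_date("not-a-date", "citylab_mdb")
```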
2.3.7.3 Data Loading
Data is loaded into a data warehouse in two fundamental ways: a record at a time
through a language interface or en masse with a utility. As a rule, loading data by
means of a utility is much faster. In addition, indexes must be efficiently loaded at the
same time the data is loaded. In some cases, the loading of the indexes may be
deferred in order to spread the workload evenly.
As the burden of the volume of loading becomes an issue, the load is often
parallelized. When this happens, the data being loaded is divided into one of several
job streams. Once the input data is divided, each job stream is executed independently
of the other job streams. In doing so, the elapsed time needed for loading is reduced
by the number of job streams (roughly speaking).
Another related approach to the efficient loading of very large amounts of data is
staging the data prior to loading. As a rule, large amounts of data are gathered into a
buffer area before being processed by extract/transfer/load (ETL) software. The
staged data is merged, perhaps edited, summarized, and so forth, before it passes into
the ETL layer. Staging of data is needed only where the amount of data is large and
the complexity of processing is high.
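The job-stream idea can be sketched as follows; the stream count, table, and rows are illustrative, with `executemany` standing in for an en-masse utility load:

```python
import sqlite3

def split_into_streams(rows, n_streams):
    """Round-robin partition of the input into independent job streams."""
    streams = [[] for _ in range(n_streams)]
    for i, row in enumerate(rows):
        streams[i % n_streams].append(row)
    return streams

rows = [(i, f"patient-{i}") for i in range(10)]
dwh = sqlite3.connect(":memory:")
dwh.execute("CREATE TABLE fact_stub (id INTEGER, label TEXT)")

# Each stream would normally run in its own worker process; a bulk insert
# per stream stands in for the utility-style (en masse) load, which is
# much faster than row-at-a-time inserts through a language interface.
for stream in split_into_streams(rows, 3):
    dwh.executemany("INSERT INTO fact_stub VALUES (?, ?)", stream)

loaded = dwh.execute("SELECT COUNT(*) FROM fact_stub").fetchone()[0]
```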
2.3.7.4 Data Profiling
Data profiling is not a glamorous task. It is also not something that you can do once
and forget about it. Proper data profiling methodology must become a standard part of
both your business and IT infrastructure to allow you to diagnose the health of your
systems.
Today, many organizations attempt to conduct data profiling tasks manually. With
very few columns and minimal rows to profile, this may be practical. But
organizations today have thousands of columns and millions (or billions) of records.
Profiling this data manually would require an inordinate amount of human
intervention that would still be error-prone and subjective.
In practice, your organization needs a data profiling tool that can automatically
process hundreds or thousands of columns across many data sources. Data
profiling in practice consists of three distinct phases:
 Initial profiling and data assessment.
 Integration of profiling into automated processes.
 Handoff of profiling results to data quality and data integration processes.
The most effective data management tools can address all of these initiatives. Data
analysis reporting alone is just a small part of your overall data initiative. The results
from data profiling serve as the foundation for data quality and data integration
initiatives. Look for a data profiling solution that allows you to construct data
correction, validation and verification routines directly from the profiling reports. This
shall help you combine the effort of data inspection and correction phases, helping to
streamline your data management process.
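An automated column profile of the kind described, covering inferred type, maximum length, null rule, and unique rule, can be sketched over illustrative values:

```python
def profile_column(values):
    """Derive basic profiling facts for one column of data."""
    non_null = [v for v in values if v is not None]
    return {
        "inferred_type": "INTEGER" if all(isinstance(v, int) for v in non_null)
                         else "TEXT",
        "max_length": max((len(str(v)) for v in non_null), default=0),
        "nullable": len(non_null) < len(values),        # null rule
        "unique": len(set(non_null)) == len(non_null),  # unique rule
    }

city_profile = profile_column(["Lahore", "Karachi", None, "Lahore"])
id_profile = profile_column([101, 102, 103])
```

Running such a profile per column over every source feeds the data quality and integration steps that follow.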
2.3.7.5 Assumptions and Dependencies
2.3.7.5.1 Assumptions
 The software can be used round the clock.
 The requirements can change with time.
2.3.7.5.2 Dependencies
 The user should have basic working knowledge of computers.
 The DWH should be maintained from time to time.
2.3.7.6 Project Deliverables
 Executable in running condition
 Detailed Final Draft (Report)
2.3.7.7 Operating Environment
2.3.7.7.1 Software
 Operating system: Windows XP/7
 Dreamweaver, Visual Studio
 Backend data mart
2.3.7.7.2 Web browsers
 Opera
 Mozilla Firefox
 Microsoft Internet Explorer
 Apple Safari
 Google Chrome
Chapter 3
Requirement Analysis
3 Requirements Analysis
The analysis phase defines the requirements of the system, independent of how these
requirements shall be accomplished. This phase defines the problem that the end user
is trying to solve. The deliverable result at the end of this phase is a requirement
document. Ideally, this document states in a clear and precise fashion what is to be
built. This analysis represents the "what" phase. The requirement document tries to
capture the requirements from the end user’s perspective by defining goals and
interactions at a level removed from the implementation details.
3.1 Project Overview
The product shall provide following functionality regarding ETL Operations:
3.1.1 Data Profiling
 Analyzing the column properties.
 Analyzing the relationships.
3.1.2 Warehouse Schema Generation
The user shall provide an OLTP source system, from which a target warehouse
schema, a star schema, shall be generated.
3.1.3 Data Extraction
 The data from the OLTP system shall be extracted into files.
 Extracted data shall be further processed before loading into the warehouse.
3.1.4 Data Transformation
The extracted data is raw and cannot be placed in the data warehouse without
enriching it.
 The extracted data shall be processed within the staging area according to the
required format.
 Its quality shall be improved, and
 Shall be made ready to be loaded into the data warehouse.
3.1.5 Data Loading
This process again is quite cumbersome and shall require special techniques and
methods so that all the records are applied successfully to the data warehouse.
 The data prepared after transformation shall be applied to the data warehouse
database and shall be stored there.
 Load images are created to correspond to the target files to be loaded in the
data warehouse database.
 Mapping functions shall be provided, which shall map the source system
records to the target warehouse.
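One such mapping function, assigning warehouse surrogate keys to source natural keys, might look like this sketch (the key values are hypothetical):

```python
# Surrogate-key registry: each distinct source natural key
# gets exactly one warehouse key, assigned on first sight
key_map = {}

def to_surrogate(natural_key):
    """Map a source-system key to a stable warehouse surrogate key."""
    if natural_key not in key_map:
        key_map[natural_key] = len(key_map) + 1
    return key_map[natural_key]

# The same source record always maps to the same warehouse row
load_image = [(to_surrogate(pid), pid) for pid in ["HW-07", "UL-03", "HW-07"]]
```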
3.2 Functional Requirements
Following are some basic requirements described briefly.
3.2.1 Data Profiling
PIMS.DP.F.0010:
Identify the Data Type of the columns.
PIMS.DP.F.0020:
Identify the Maximum Length of the columns.
PIMS.DP.F.0030:
Identify the Null Rule of the columns.
PIMS.DP.F.0040:
Identify the Unique Rule of the columns.
PIMS.DP.F.0050:
Identify the relationships between the Tables.
3.2.2 Warehouse Schema Generation
PIMS.WSG.F.0010:
Support for a star schema (a logical structure that has a fact table in the center,
surrounded by dimension tables) shall be provided for the warehouse schema design.
PIMS.WSG.F.0020:
Functions shall be provided so that the user can transform the source data into the
target system
PIMS.WSG.F.0030:
Generated schema shall be implemented in Microsoft Visual Studio R2 as warehouse
objects such as facts, dimension tables.
3.2.3 Data Extraction
PIMS.DE.F.0010:
The source systems for a data warehouse are typically OLTP system which shall be
Microsoft Visual Studio R2.
PIMS.DE.F.0020:
The data from the OLTP system shall be extracted into files.
PIMS.DE.F.0030:
Data has to be extracted for each incremental load as well as for the one-time initial
full load.
PIMS.DE.F.0040:
Extracted data shall be further processed before loading into the warehouse.
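The distinction between the one-time initial full load and subsequent incremental loads can be sketched with a watermark, i.e. the timestamp of the last load. The source rows and field layout below are hypothetical:

```python
from datetime import datetime

# Hypothetical source rows: (record_id, modified_at) pairs.
SOURCE = [
    (1, datetime(2012, 1, 10)),
    (2, datetime(2012, 2, 5)),
    (3, datetime(2012, 3, 1)),
]

def extract(last_load=None):
    """Full load when last_load is None; otherwise only rows changed since."""
    if last_load is None:
        return SOURCE                                   # one-time initial full load
    return [r for r in SOURCE if r[1] > last_load]      # incremental load

full = extract()
incremental = extract(last_load=datetime(2012, 2, 1))
print(len(full), len(incremental))
```

After each run the watermark would be advanced to the time of the load, so the next incremental extraction picks up only new and changed records.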
3.2.4 Data Transformation
PIMS.DT.F.0010:
Quality of the data is to be improved.
PIMS.DT.F.0020:
Transformation functions provided shall transform the data format.
PIMS.DT.F.0030:
Standardization of data for different sources shall be done.
PIMS.DT.F.0040:
Selection takes place at the beginning of the whole process of data transformation.
Either whole records or parts of several records can be selected from the source
system.
PIMS.DT.F.0050:
Splitting/ Joining includes the types of data manipulation needed to perform on the
selected parts of the source records.
PIMS.DT.F.0060:
Conversion covers a wide variety of rudimentary conversions of single fields, for two
primary reasons: to standardize data extracted from different sources, and to make the
fields usable and understandable to the user.
PIMS.DT.F.0070:
For summarization, the transformation function is used for summarizing the facts.
PIMS.DT.F.0080:
Enrichment is the rearrangement and simplification of individual fields to make them
more useful for the data warehouse.
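The transformation types listed above (selection, splitting, conversion/standardization and summarization) might look roughly as follows on hypothetical records; the field names and code values are illustrative, not taken from the implemented system:

```python
# Hypothetical extracted records from one source system.
rows = [
    {"name": "Tahir Ayoub", "sex": "m", "price": "1500"},
    {"name": "Faraz Ahmed", "sex": "M", "price": "2500"},
]

# Selection: keep only the fields needed by the warehouse.
selected = [{k: r[k] for k in ("name", "sex", "price")} for r in rows]

# Splitting: break one source field into two target fields.
for r in selected:
    r["first_name"], r["last_name"] = r.pop("name").split(" ", 1)

# Conversion/standardization: one code set and one data type across sources.
for r in selected:
    r["sex"] = {"m": "Male", "f": "Female"}[r["sex"].lower()]
    r["price"] = float(r["price"])

# Summarization: aggregate the fact before loading.
total_price = sum(r["price"] for r in selected)
print(total_price)  # 4000.0
```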
3.2.5 Data Loading
PIMS.DL.F.0010:
Load images are created to correspond to the target files to be loaded in the data
warehouse database.
PIMS.DL.F.0020:
Identification of source system’s table fields mapping to warehouse table.
PIMS.DL.F.0030:
Loading must be efficient.
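A minimal sketch of load images and mapping functions, assuming a hypothetical source-to-target field mapping (the Cl_-prefixed source fields follow the City lab naming used later in this report; the target column names are illustrative):

```python
# Hypothetical mapping: source-system field -> warehouse column.
MAPPING = {"Cl_P_Id": "Patient_Key", "Cl_Total_Price": "Price"}

def build_load_image(source_record):
    """Create a load image: the record reshaped to match the target table."""
    return {target: source_record[source] for source, target in MAPPING.items()}

warehouse_rows = []

def load(records):
    """Apply load images to the warehouse store, reporting rows applied."""
    images = [build_load_image(r) for r in records]
    warehouse_rows.extend(images)
    return len(images)

applied = load([{"Cl_P_Id": 7, "Cl_Total_Price": 1200.0}])
print(applied, warehouse_rows)
```

Tracking the count of rows applied is what lets the loader confirm that all records reached the warehouse, as the requirement demands.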
3.3 Nonfunctional Requirements
3.3.1 Performance Requirements
This system shall be used throughout the development cycle of the ETL, so it must be
efficient. Since it shall be communicating with SQL Server 2008 R2 and heavy
resources are required for processing, it must utilize the hardware resources
optimally.
3.3.2 Safety Requirements
The process of testing a test case may take a very long time, so it is important to keep
track of intermediate results; in case of any failure, the work already done is not
lost.
3.3.3 Reliability Requirements
The process involves different phases and the data after each phase should be secure
and reliable.
3.4 External Interface Requirements
3.4.1 User Interface
The users of this application shall be professionals as well as normal users, so the user
interface has to be comprehensive, giving the user the ability to control everything
and to extract the required information as easily and quickly as possible.
3.4.2 Hardware Resources
The application requires heavy processing resources. Latest hardware resources are
required for the efficient and effective working of the application.
3.4.3 Hardware Interfaces
The software shall not interact with any hardware; it shall only use the operating
systems services for connecting to the SQL Database.
3.5 Use Case Specifications
3.5.1 Connect RDBMS User
Pre-Condition: Required Schema and User must Exist
Description: Connect to an existing schema using a user name and
password.
Actor: User
Success Scenario: User provides the user name and password.
User is connected to the schema with the user name provided.
User can view the required details of the schema.
Alternate Scenarios: None
Post-Condition: The details of the RDBMS should be loaded for the viewing
purpose.
3.5.2 Connect Data Warehouse Schema
Pre-Condition: A proper Data Warehouse schema should exist.
Description: Open a previously created Data Warehouse with the provided
user name and password.
Actor: User
Success Scenario: User provides the user name and password.
User is connected to the schema with the user name provided.
User can view the details of the schema.
Alternate Scenarios: None
Post-Condition: The details of the Data Warehouse should be loaded.
3.5.3 Connect database User
Pre-Condition: The user and the schema to be connected must exist
Description: Connect to an existing database user with required schema.
Actor: User
Success Scenario: User should be connected to the required schema with the user
rights assigned to the user.
Alternate Scenarios: None.
Post-Condition: The details of the schema are populated for the connected user
to further operations required.
3.5.4 Load Relational Database Model
Pre-Condition: Relational Database model must exist.
User must have the rights to connect to the database.
Description: Load an existing RDBMS for the required operations.
Actor: User
Success Scenario: User connects the relational database by specifying the user
name and password.
The entire ERD Model of the RDBMS is loaded.
Relationships between these tables are populated.
User can view the details of the tables.
User can perform the desired operations.
Alternate Scenarios: None
Post-Condition: None
3.5.5 Identify Table Names
Pre-Condition: Database must exist.
The tables must exist in the database.
Description: Identify all the tables in the database.
Actor: User
Success Scenario: User selects to view the tables of the Database.
All the table names in the database are loaded after
transformation.
Alternate Scenarios: None
Post-Condition: None
3.5.6 Identify Columns with Data Types
Pre-Condition: The tables must be specified for which the columns are to be
populated.
Description: Identify the column names, with the data type of each
column, for the specified table names.
Actor: User
Success Scenario: User selects to view the complete database.
All the required table column names and their data types are
populated.
Alternate Scenarios: None.
Post-Condition: None.
3.5.7 Identify Relationships between Tables
Pre-Condition: All table names and the column names of each table must be
populated to identify the relationships.
Description: Identify relationships between the tables.
Actor: User
Success Scenario: User selects to view the complete database design.
All the relationships are loaded with the details of the Primary
Keys and Foreign Keys of the complete database.
Alternate Scenarios: None.
Post-Condition: None.
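The three use cases above (identify table names, columns with data types, and relationships) can be sketched against a database catalog. SQLite is used here only as a stand-in for the SQL Server source system, and the two sample tables are illustrative:

```python
import sqlite3

# SQLite stands in for the source RDBMS; the two tables are illustrative.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Department (Dept_Id INTEGER PRIMARY KEY, Dept_Name TEXT);
CREATE TABLE Staff (
    Staff_Id INTEGER PRIMARY KEY,
    Staff_Name TEXT NOT NULL,
    Dept_Id INTEGER REFERENCES Department(Dept_Id)
);
""")

# Identify table names.
tables = [r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]

# Identify columns with data types.
columns = {t: [(c[1], c[2]) for c in con.execute(f"PRAGMA table_info({t})")]
           for t in tables}

# Identify relationships via foreign keys: (table, from-column, referenced table, to-column).
fks = [(t, fk[3], fk[2], fk[4]) for t in tables
       for fk in con.execute(f"PRAGMA foreign_key_list({t})")]

print(tables)
print(fks)
```

SQL Server would expose the same information through its INFORMATION_SCHEMA views instead of PRAGMA statements.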
3.5.8 Load Warehouse Schema
Pre-Condition: The Warehouse Schema must be created before and the fact
and dimension tables must exist.
Description: Load an existing Data Warehouse Schema for cubes and
reports.
Actor: User
Success Scenario: User loads the existing data warehouse schema.
The details of the schema are loaded.
The Details are displayed to the user.
Alternate Scenarios: None.
Post-Condition: None.
3.5.9 Map Columns
Pre-Condition: A complete Data Warehouse schema must exist with all the
facts and dimensions properly defined.
Description: The Mapping of columns between the columns of Data
Warehouse Facts/Dimensions and the columns of Tables from
an RDBMS.
Actor: User
Success Scenario: User maps each column of the facts and dimension tables to
the relational database model columns as required.
The mapping between the columns is maintained for the ETL
operations.
All the mappings are stored in the file.
Alternate Scenarios: Load an existing Data Warehouse schema with the already
defined mappings.
Post-Condition: None.
3.5.10 Extract Data from RDBMS
Pre-Condition: All the mappings should be done for the effective and efficient
load.
Description: Extract Data from the Database to be loaded in the Data
Warehouse.
Actor: User
Success Scenario: User selects to extract the data from the source system.
The data is extracted in accordance with the
target system.
The extracted data is saved in the SQL server 2008.
On the Completion of extractions of data from the source, the
data is ready for the transformation or loading.
Alternate Scenarios: None.
Post-Condition: None.
3.5.11 Transform Extracted Data
Pre-Condition: All the data must be extracted from the source system.
Description: Transform the extracted data, as required, before loading the
data into the target Warehouse.
Actors: User.
Success Scenario: Select the transformation functions to be performed
before loading the data into the Data Warehouse.
User specifies the transformation functions to be performed.
User selects to apply the transformation functions.
Alternate Scenarios: If the user does not provide any transformation function, the
data shall, in some cases, be loaded without any changes.
Post-Condition: None.
3.5.12 Load Data in Warehouse
Pre-Condition: All the data to be loaded should be maintained in the SQL
server 2008.
Description: Loading the Extracted Data from different ODS’s to the Target
Data Warehouse.
Actor: User
Success Scenario: User selects to load the data into the target system.
The transformations to be performed are applied to the
extracted data.
After the successful transformation of data the data is loaded
to the target data warehouse.
All the changes are saved.
Alternate Scenarios: None
Post-Condition: Prompt for the Successful ETL.
Chapter 4
The Design
4 The Design
In the design phase the architecture is established. This phase starts with the
requirement document delivered by the requirement phase and maps the requirements
into architecture. The architecture defines the components, their interfaces and
behaviors. The deliverable design document is the architecture. The design document
describes a plan to implement the requirements. This phase represents the "how"
phase. Details on computer programming languages and environments, machines,
packages, application architecture, distributed architecture layering, memory size,
platform, algorithms, data structures, global type definitions, interfaces, and many
other engineering details are established. The design may include the usage of
existing components.
4.1 Modules
The current system can easily be divided into seven modules. These modules are quite
independent of each other and have very simple and well-defined interfaces between
them, which plays a very important role in their successful working. The basic
modules in our system are as follows.
4.1.1 Connectivity
The functionality of this module shall be to provide communication between the
database and its client. Connectivity shall provide the interface for this
communication, offering a full operational environment in order to achieve efficient
interaction within the system.
4.1.2 RDBMS Details
The detailed schema of the relational database model shall be loaded with the help of
this module. This module loads:
 Table Name
 Columns Names
 Data Types
 Constraints
 Relationships
 Path Details of Database Traversals.
4.1.3 Schema Generation
This module shall provide its users a very easy way to create the schema of the Data
Warehouse. Schema Generator provides the functionality for defining:
 Facts
 Dimensions
4.1.4 Column Mappings
This module shall provide the functionality to map the columns from relational
database model with the Data Warehouse Facts or Dimension Tables. The
modifications in the columns shall also be managed in this module.
4.1.5 Extraction
The procedure for data extraction shall be defined with the help of the provided
mappings. This module shall extract the data from the relational database in
accordance with the defined procedures, using the mappings for effective
extraction.
4.1.6 Transformation
Different transformation functions shall be provided. The user shall select the type of
transformation and the desired output of the transformation. These transformation
procedures shall be maintained for the preload transformations.
4.1.7 Loading
A loading mechanism shall be defined to incorporate the defined transformations. The
data shall be transformed and loaded in a single step: initially the data shall be
transformed according to the specified functions, and then it shall be loaded into the
designed Data Warehouse schema.
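Under these definitions, the extraction, transformation and loading modules chain into a single pipeline. The sketch below is a simplified, hypothetical illustration of that flow, not the project's actual code:

```python
def extract(source):
    """Extraction module: pull raw rows from the relational source."""
    return list(source)

def transform(rows, functions):
    """Transformation module: apply each selected function in order."""
    for fn in functions:
        rows = [fn(r) for r in rows]
    return rows

def load(rows, warehouse):
    """Loading module: the transformed data is applied in a single step."""
    warehouse.extend(rows)
    return len(rows)

# Hypothetical end-to-end run over an in-memory 'source' and 'warehouse'.
source = [{"sex": "m"}, {"sex": "F"}]
warehouse = []
standardize = lambda r: {"sex": {"m": "Male", "f": "Female"}[r["sex"].lower()]}
loaded = load(transform(extract(source), [standardize]), warehouse)
print(loaded, warehouse)
```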
Chapter 5
UML Structural Diagram
5 UML Structural Diagram
5.1 Class Diagram
5.1.1 City lab
[Class diagram; all attributes are prefixed Cl_. Classes: Branches, Department, Staff,
Test, Test_Category, Equipment, Report, Analysis, Shift, Supplier, SupplierInvoice,
Doctor, ReferedDoctor, OrderRequest, OrderDetail, Patient, PatientInvoice,
PatientInvoiceDetail, Disease, and the test-detail classes Sub_Test_CBC, Sub_CBC_DC,
Sub_CBC_AV, Sub_Urine_PC, Sub_Urine_CE, Sub_Urine_ME, Sub_Cholesterol,
Sub_Triglycerides, Sub_LFT, Sub_Ant_HCV and Sub_Test_Dengue.]
Figure 7 City lab
5.1.2 Health ways
[Class diagram; all attributes are prefixed Hw_. Classes: Branches, Department, Staff,
Test, Test_Category, Equipment, Report, Analysis, Shift, Supplier, SupplierInvoice,
Doctor, ReferedDoctor, OrderRequest, OrderDetail, Patient, PatientInvoice,
PatientInvoiceDetail, Disease, and the test-detail classes Sub_Test_CBC, Sub_CBC_DC,
Sub_CBC_AV, Sub_Urine_PC, Sub_Urine_CE, Sub_Urine_ME, Sub_Cholesterol,
Sub_Triglycerides, Sub_LFT, Sub_Ant_HCV and Sub_Test_Dengue.]
Figure 8 Health ways
5.1.3 Clinic
[Class diagram; all attributes are prefixed C_. Classes: Branch, Dr, Patients, Invoice,
Prescription and Prescription_Detail.]
Figure 9 Clinic
5.1.4 CMH hospital
[Class diagram. Classes: Branches, Department, Staff, Test, Test_Category, Equipment,
Report, Report_Analysis, Shift, Supplier, SupplierInvoice, Doctor, Dr_Assistant,
Orders, OrderDetail, Patient, PatientInvoice, Disease, Pharmacy, MedicanDetail,
Registery, InPatient, Out_Patient, Ward, Room, Bed, Nurse, Insurance, BloodBank,
Blood, Lab, Prescription, PrescriptionDetail, and the test-detail classes Sub_Test_CBC,
Sub_CBC_DC, Sub_CBC_AV, Sub_Urine_PC, Sub_Urine_CE, Sub_Urine_ME,
Sub_Cholesterol, Sub_Triglycerides, Sub_LFT, Sub_Ant_HCV and Sub_Test_Dengue.]
Figure 10 CMH hospital
5.1.5 Urwah lab
[Class diagram; all attributes are prefixed Ur_. Classes: Branches, Department, Staff,
Test, Test_Category, Equipment, Report, Analysis, Shift, Supplier, SupplierInvoice,
Doctor, ReferedDoctor, OrderRequest, OrderDetail, Patient, PatientInvoice,
PatientInvoiceDetail, Disease, and the test-detail classes Sub_Test_CBC, Sub_CBC_DC,
Sub_CBC_AV, Sub_Urine_PC, Sub_Urine_CE, Sub_Urine_ME, Sub_Cholesterol,
Sub_Triglycerides, Sub_LFT, Sub_Ant_HCV and Sub_Test_Dengue.]
Figure 11 Urwah lab
5.2 Object diagram
[Object diagram. Objects: PIMS : PIMS; pat : Patient; hosp : Hospital;
lab : LabClinics; h_lab : Healthways_lab; u_lab : Urwah_Lab; c_lab : City_lab;
stf : staff; presc : Prescription; doc : Doctor; lab_rep : LabClinic_Reporters;
rep : Reports; Test : Lab tests; Ans : Analysis.]
Figure 12 Object diagram
5.3 Component Diagram
There shall be four different database components from which data shall be extracted,
transformed and loaded (ETL) into a single data warehouse (data mart) component.
[Component diagram: the four operational database components ODB-1 through
ODB-4 feed the ETL component, which loads the Data Warehouse (Data Mart)
component.]
Figure 13 Component Diagram
5.4 Deployment Diagram
Deployment shall be divided into four levels: a Database Server running SQL Server to
maintain the database, a Data Warehouse Server running SQL Server to maintain the
Data Warehouse, an Application Server maintaining the Web Application, and Client
Workstations that view the Web Application through its interface.
[Deployment diagram: Database Server (SQL Server, maintains the database); Data
Warehouse Server (SQL Server, maintains the Data Warehouse); Application Server
(Web Application, Visual Studio); Client Workstation (interface to the Web
Application).]
Figure 14 Deployment Diagram
5.5 Composite Structure Diagram
The Administrator sends a request to the Reporting Manager; the request then passes
through the SQL Server, the ETL tool, the report tools and the Query Manager, and the
report is shown in the provided interface.
[Composite structure diagram: Administrator (checking design and structure) ->
Reporting Manager -> SQL Server -> ETL Tool -> DWH Manager -> Query Manager ->
Report Tools -> Report.]
Figure 15 Composite Structure Diagram
5.6 Package diagram
The user either registers or logs in through the interface; after a successful login, the
user views reports produced with the help of data mining over the OLAP process.
The Data Mart (PIMS) stores all information about patients, which arrives through the
ETL process from four different Operational Data Stores.
[Package diagram: Application package (User, Interface, Login, Register, Report,
Data Mining, OLAP); Data Warehouse package containing the Data Mart (PIMS), fed
by ETL from Operational Data Store-1 through Operational Data Store-4.]
Figure 16 Package diagram
Chapter 6
UML Behavior Diagrams
6 UML Behavior Diagrams
6.1 Use Case Diagram
Figure 17 Use Case Diagram
6.2 Activity Diagram
6.2.1 Create new project
[Activity diagram: the user creates a new project, opens a new connection to the
ODS's, sets the project information and global values, and loads the project.]
Figure 18 Activity Diagram (Create new project)
6.2.2 Open existing project
[Activity diagram: the user opens a project, checks and opens the connections to the
ODS's, sets the project information and global values, and loads the project.]
Figure 19 Activity Diagram (Open existing project)
6.2.3 Close project
[Activity diagram: the user finalizes the changes, saves and commits them, closes the
connection, and the project is closed.]
Figure 20 Close project
6.2.4 Create mapping
[Activity diagram: the user creates a new mapping, gets the source and target details,
provides the mapping, then validates and finalizes the mappings.]
Figure 21 Activity Diagram (Create mapping)
6.2.5 Load RDBMS
[Activity diagram: the user connects to the database, loads the table names, column
names and relationships, identifies the relationships, and populates the database
details.]
Figure 22 Activity Diagram (Load RDBMS)
6.3 State Machine diagram
6.3.1 Report
[State machine: a query to generate a report is issued, the required data is processed
and sent, the defined rules are applied, and the report is generated.]
Figure 23 State Machine Diagram (Report)
6.3.2 ETL
[State machine: the project is loaded, its information and metadata are checked, and
then Extract, Transform and Load are performed against the DWH.]
Figure 24 State Machine Diagram (ETL)
Chapter 7
UML Interaction Diagrams
7 UML Interaction Diagrams
7.1 Sequence Diagram
A sequence diagram shows an interaction arranged in time sequence. In particular, it
shows the instances participating in the interaction by their "lifelines" and the stimuli
they exchange, arranged in time sequence.
7.1.1 Create New Project
[Sequence diagram over MainForm, NewProject and Connect: CreateNewProject,
NewConnection, OpenConnection/OpenedConnection, SetProjectInfo,
SetGlobalValues, LoadProject.]
Figure 25 Sequence Diagram (Create New Project)
7.1.2 Open Existing Project
[Sequence diagram over MainForm, Project and Connect: OpenProject,
NewConnection, OpenConnection/OpenedConnection, LoadProjectInfo,
LoadGlobalValues, LoadProject.]
Figure 26 Sequence Diagram (Open Existing Project)
7.1.3 Close Project
[Sequence diagram over MainForm, Projects and Connect: CloseProject,
FinalizeChanges, SaveChanges, CommitChanges, CloseConnection, ChangesSaved,
Reset.]
Figure 27 Sequence Diagram (Close Project)
7.1.4 Load RDBMS Details
[Sequence diagram over MainForm, RDBDetail and RDBRelationships: LoadDBDetails,
ConnectDB, LoadTableNames, LoadColumnNames, LoadRelationships,
IdentifyRelationships, RelationshipDetails, PopulateDBDetails.]
Figure 28 Sequence Diagram (Load RDBMS Details)
7.1.5 Create Schema
[Sequence diagram over MainForm, CreateStar, CreateSchema, CreateFact and
CreateDimension: LoadSchemaCreator, InitializeSchemaDetails,
CreateSchema/NewSchema, CreateNewFact and CreateNewDimension (each via
DefineTable and SetTableDetails), LinkFactsAndDimensions, FinalizeSchema,
SaveSchema, PopulateSchemaDetails.]
Figure 29 Sequence Diagram (Create Schema)
7.1.6 Create Mappings
[Sequence diagram over MainForm, TableMappings, Source and Target:
CreateMappings, NewMapping, GetSourceDetails/SourceDetails,
GetTargetDetails/TargetDetails, ProvideMappings, ValidateMappings,
FinalizeMappings, SavedMappings.]
Figure 30 Sequence Diagram (Create Mappings)
7.1.7 Data Extraction
[Sequence diagram over DataExtractor, Mappings and Source: NewExtraction,
LoadMappingDetails/MappingDetails, CreateExtractionProcedures, ExtractData,
CreateDataFiles/DataFiles, SaveFiles.]
Figure 31 Sequence Diagram (Data Extraction)
7.1.8 Data Transformations
[Sequence diagram over MainForm, DataTransformation and
TransformationFunctions: Initialize, NewTransformation, SelectTransformation,
SetTransformationDetails, GetExtractedData/ExtractedData, InitTransformData,
TransformationDetails, ApplyTransformations, TransformedData,
TransformationLog.]
Figure 32 Sequence Diagram (Data Transformations)
Data Loading
[Sequence diagram over DataLoader, TransformedData and Target:
GetTransformedData/TransformedData, NewLoad, LoadData, InitializeTarget,
SendData, SaveData, SavedDataLog.]
Figure 33 Sequence Diagram (Data Loading)
7.1.9 Report Generation
The doctor shall select specific criteria to generate a report and send them to the data
warehouse manager, which shall give an acknowledgement and pass the request to the
report generation tool; the tool shall generate the report according to the criteria set
by the doctor.
[Sequence diagram over Doctor, Web Interface, Data Warehouse Manager and Report
Generation Tool: criteria to generate report, acknowledge, process required data,
apply defined rules, send data, send report.]
Figure 34 Sequence Diagram (Report Generation)
7.1.10 ETL
Data coming from the operational data stores shall be extracted by the extract
manager, then sent to the transform manager, which sets the data into a standard
format, then to the cleaning manager to retain only the useful information, then to the
load manager, and finally it shall be loaded into the data warehouse.
[Sequence diagram over ODS, Extract Manager, Transform Manager, Cleaning
Manager, Load Manager and Data Warehouse: Req()/Get_data(), Extract_data(),
Extracted Data, Transform Data, Transformed Data, Clean(), Clean Data, Load to
DWH.]
Figure 35 Sequence Diagram (ETL)
7.2 Communication Diagram
7.2.1 ETL
The data source responds to the ETL request; the data shall then be extracted by the
extract manager, sent to the transform manager, which sets it into a standard format,
then to the cleaning manager to retain only the useful information, then to the load
manager, and finally it shall be loaded into the data warehouse.
[Communication diagram: the ETL Tool issues Request() to the Data Source
(1. Respond to Request), then 2. Extract() via the Extract Manager, 3. Transform Data
via the Transform Manager, 4. Cleaning Data via the Clean Manager, and 5. Load Data
via the Load Manager into the Data Warehouse.]
Figure 36 Communication Diagram (ETL)
7.2.2 Report
The user requests to generate or to view a report; depending on the user's need, the
request shall be sent to the reporting tool to generate the required report, or the
required report shall be shown via the reporting manager.
[Communication diagram: the User issues Request() to the Web Application
(1. GenerateReport / 2. ViewReport); the Reporting Tool 1.1 generates the required
report (1.1.1 Report Generated) and the Report Manager 2.1 views the required report
(2.1.1 See Specific Report).]
Figure 37 Communication Diagram (Report)
7.3 Interaction Overview
7.3.1 ETL
The extract manager extracts the data, which is then sent to the transform manager to
be set into a standard format, then to the cleaning manager to retain only the useful
information, then to the load manager, and finally it shall be loaded into the Data
Warehouse.
[Interaction overview: Extract Manager -> Transform Manager -> Cleaning Manager ->
Load Manager -> Data Warehouse.]
Figure 38 Interaction Overview (ETL)
7.3.2 Warehouse Interaction
The doctor views the patient history through the query manager, which analyzes the
patient history; the reporting tool then suggests a prescription depending on the
analysis of that history.
[Interaction overview: Doctor -> View Patient History (Query Manager) -> Analyze
Patient History -> Prescription (Reporting Tool).]
Figure 39 Interaction Overview (Warehouse Interaction)
Web Diagrams
7.3.3 Access Model
(Access model: from the User Menu, visitors navigate to the Home, SignIn, SignUp, Contact Us, and About Us pages. Authorized staff reach the Staff Menu, from which staff can register members, upload reports, view reports, and change passwords; authorized users reach the User Menu, from which they can view reports, change their password, and sign out.)
Figure 40 Web Diagrams (Access Model)
Chapter 8
Implementation
8 Implementation
8.1 System Implementation
This chapter discusses the project in detail and the implementation of the Patient Information and Monitoring System based on a data warehouse. The implementation of the system is done in two parts:
• Back-end software (SQL Server 2008 R2)
• GUI
8.2 Back-End Software: SQL Server 2008 R2
The back end is further subdivided into two parts:
• SSIS
• SSAS
8.2.1 Snowflake Schema
Figure 41 Snowflake Schema
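What makes this a snowflake (rather than a star) schema is that dimensions are normalized into further tables. The sketch below illustrates the idea with a patient dimension snowflaked into a separate city table; every table and column name here is illustrative, not the project's actual schema shown in Figure 41, and SQLite stands in for SQL Server:

```python
import sqlite3

# Illustrative snowflake-style schema: the patient dimension references a
# normalized city table, which a star schema would instead denormalize
# into dim_patient itself.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_city    (city_id INTEGER PRIMARY KEY, city_name TEXT);
CREATE TABLE dim_patient (patient_id INTEGER PRIMARY KEY, name TEXT,
                          city_id INTEGER REFERENCES dim_city(city_id));
CREATE TABLE dim_doctor  (doctor_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE visit_fact  (patient_id INTEGER REFERENCES dim_patient(patient_id),
                          doctor_id  INTEGER REFERENCES dim_doctor(doctor_id),
                          visit_date TEXT, charges REAL);
INSERT INTO dim_city VALUES (1, 'Islamabad');
INSERT INTO dim_patient VALUES (1, 'Ali', 1);
INSERT INTO dim_doctor VALUES (1, 'Dr. Khan');
INSERT INTO visit_fact VALUES (1, 1, '2011-05-01', 500.0);
""")

# An analytical query walks the normalized chain fact -> patient -> city.
row = con.execute("""
    SELECT c.city_name, SUM(f.charges)
    FROM visit_fact f
    JOIN dim_patient p ON f.patient_id = p.patient_id
    JOIN dim_city    c ON p.city_id    = c.city_id
    GROUP BY c.city_name
""").fetchone()
print(row)
```

The extra join through the normalized dimension is the cost of the snowflake design; its benefit is reduced redundancy in the dimension tables.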
PIMS Data Warehouse
80
8.2.2 ETL SSIS
First comes the ETL part, for which we have used SQL Server Integration Services (SSIS). Data must be loaded into the data warehouse regularly so that the warehouse can serve its purpose of facilitating business analysis. To do this, data from one or more operational systems needs to be extracted and copied into the warehouse. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for extraction, transformation, and loading. The acronym is perhaps too simplistic, because it omits the transportation phase and implies that each phase of the process is distinct. We use ETL to refer to the entire broad process, including data loading, rather than to three well-defined steps.
What happens during the ETL process? During extraction, the desired data is identified and extracted from many different sources, including database systems and applications. Very often it is not possible to identify the specific subset of interest, so more data than necessary has to be extracted and the relevant data is identified at a later point. Depending on the source system's capabilities (for example, operating system resources), some transformations may take place during extraction. The size of the extracted data varies from hundreds of kilobytes up to gigabytes, depending on the source system and the business situation. The same is true for the time delta between two (logically) identical extractions: the interval may range from days or hours down to minutes or near real time. Web server log files, for example, can easily grow to hundreds of megabytes in a very short period.
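One common way to bound what is extracted on each run, as described above, is an incremental extract that pulls only rows recorded since the previous run. The sketch below shows the idea; the table, columns, and timestamps are all hypothetical, and SQLite stands in for the operational source (in the project itself this step is an SSIS data flow):

```python
import sqlite3

# Hypothetical operational source table with a recording timestamp.
src = sqlite3.connect(":memory:")
src.executescript("""
CREATE TABLE lab_result (result_id INTEGER, patient_id INTEGER,
                         recorded_at TEXT);
INSERT INTO lab_result VALUES
  (1, 1, '2012-01-01'), (2, 1, '2012-02-01'), (3, 2, '2012-03-01');
""")

def incremental_extract(conn, last_run):
    # Extraction step: pull only rows recorded after the previous ETL run,
    # instead of re-reading the whole table every time.
    cur = conn.execute(
        "SELECT result_id, patient_id, recorded_at FROM lab_result "
        "WHERE recorded_at > ?", (last_run,))
    return cur.fetchall()

new_rows = incremental_extract(src, '2012-01-15')
print(new_rows)  # only results recorded after the last run
```

The `last_run` watermark would normally be persisted by the ETL tool between runs; here it is passed in directly for simplicity.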
8.2.3 Overview of PIMS ETL
Figure 42 Overview of PIMS ETL
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report
PIMS_Final_Report

More Related Content

Viewers also liked

Project 1 integration march 2015
Project 1 integration march 2015Project 1 integration march 2015
Project 1 integration march 2015
apongmalik
 
w/ DATUM INTERIORS
w/ DATUM INTERIORSw/ DATUM INTERIORS
w/ DATUM INTERIORS
Cyril Enghog
 
Sa buhatan ay may silbi
Sa buhatan ay may silbiSa buhatan ay may silbi
Sa buhatan ay may silbi
maynard23
 
Scanned from a Xerox Multifunction Device-1
Scanned from a Xerox Multifunction Device-1Scanned from a Xerox Multifunction Device-1
Scanned from a Xerox Multifunction Device-1
Roger Struble
 

Viewers also liked (15)

Report
ReportReport
Report
 
Project 1 integration march 2015
Project 1 integration march 2015Project 1 integration march 2015
Project 1 integration march 2015
 
Paul Resume (1)
Paul Resume (1)Paul Resume (1)
Paul Resume (1)
 
w/ DATUM INTERIORS
w/ DATUM INTERIORSw/ DATUM INTERIORS
w/ DATUM INTERIORS
 
Proposal environment
Proposal environmentProposal environment
Proposal environment
 
Sa buhatan ay may silbi
Sa buhatan ay may silbiSa buhatan ay may silbi
Sa buhatan ay may silbi
 
Final digital marketing pp
Final digital marketing ppFinal digital marketing pp
Final digital marketing pp
 
Scanned from a Xerox Multifunction Device-1
Scanned from a Xerox Multifunction Device-1Scanned from a Xerox Multifunction Device-1
Scanned from a Xerox Multifunction Device-1
 
Know more about MediSecure Electronic Prescription software
Know more about MediSecure Electronic Prescription software Know more about MediSecure Electronic Prescription software
Know more about MediSecure Electronic Prescription software
 
Calculated MOC 1
Calculated MOC 1Calculated MOC 1
Calculated MOC 1
 
Monster Power CV Search
Monster Power CV SearchMonster Power CV Search
Monster Power CV Search
 
Franz liszt
Franz lisztFranz liszt
Franz liszt
 
Cnc final assignment
Cnc final assignmentCnc final assignment
Cnc final assignment
 
Revista decoração
Revista decoraçãoRevista decoração
Revista decoração
 
YR_SV
YR_SVYR_SV
YR_SV
 

Similar to PIMS_Final_Report

TresW CP Summary 2015 PUBLIC
TresW CP Summary 2015 PUBLICTresW CP Summary 2015 PUBLIC
TresW CP Summary 2015 PUBLIC
Nidal El-Ayache
 
comspace technology profile
comspace technology profilecomspace technology profile
comspace technology profile
Wao Wamola
 

Similar to PIMS_Final_Report (20)

A treatise on SAP logistics information reporting
A treatise on SAP logistics information reportingA treatise on SAP logistics information reporting
A treatise on SAP logistics information reporting
 
Acetech company profile
Acetech company profileAcetech company profile
Acetech company profile
 
TresW CP Summary 2015 PUBLIC
TresW CP Summary 2015 PUBLICTresW CP Summary 2015 PUBLIC
TresW CP Summary 2015 PUBLIC
 
SAP BODS 4.2
SAP BODS 4.2 SAP BODS 4.2
SAP BODS 4.2
 
Hospital management System (asp.net with c#)Project report
Hospital management System (asp.net with c#)Project reportHospital management System (asp.net with c#)Project report
Hospital management System (asp.net with c#)Project report
 
Blockchain and LMS: A Proof-of-Concept
Blockchain and LMS: A Proof-of-ConceptBlockchain and LMS: A Proof-of-Concept
Blockchain and LMS: A Proof-of-Concept
 
Assi 3 tm
Assi 3 tmAssi 3 tm
Assi 3 tm
 
Proposal with sdlc
Proposal with sdlcProposal with sdlc
Proposal with sdlc
 
comspace technology profile
comspace technology profilecomspace technology profile
comspace technology profile
 
Banking Management System Project documentation
Banking Management System Project documentationBanking Management System Project documentation
Banking Management System Project documentation
 
Email Infrastructure: Open Source vs. Commercial MTAs
Email Infrastructure: Open Source vs. Commercial MTAsEmail Infrastructure: Open Source vs. Commercial MTAs
Email Infrastructure: Open Source vs. Commercial MTAs
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
Online Helpdesk System
Online Helpdesk SystemOnline Helpdesk System
Online Helpdesk System
 
Microsoft India - System Center Controlling Costs and Driving Agility Whitepaper
Microsoft India - System Center Controlling Costs and Driving Agility WhitepaperMicrosoft India - System Center Controlling Costs and Driving Agility Whitepaper
Microsoft India - System Center Controlling Costs and Driving Agility Whitepaper
 
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdf
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdfOSPX - Professional PostgreSQL Certification Scheme v20201111.pdf
OSPX - Professional PostgreSQL Certification Scheme v20201111.pdf
 
Spreadsheet server
Spreadsheet serverSpreadsheet server
Spreadsheet server
 
2015 siguccs itsm panel
2015 siguccs itsm panel2015 siguccs itsm panel
2015 siguccs itsm panel
 
Future of data center
Future of data centerFuture of data center
Future of data center
 
SplunkLive! Zurich 2018: Integrating Metrics and Logs
SplunkLive! Zurich 2018: Integrating Metrics and LogsSplunkLive! Zurich 2018: Integrating Metrics and Logs
SplunkLive! Zurich 2018: Integrating Metrics and Logs
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 

PIMS_Final_Report

  • 1. PIMS Data Warehouse COMSATS Institute of Information Technology, Islamabad Patient Information and Monitoring System Using Data Warehousing By Tahir Ayoub SP08-BCS-052 Faraz Ahmed SP08--BCS-015 Supervisor: Muhammad Mustafa Khattak Bachelor of Computer Science (2008-2012) The candidate confirms that the work submitted is his own and appropriate credit has been given where reference has been made to the work of others
  • 2. PIMS Data Warehouse 2 DECLARATION We hereby declare that this software, neither as a whole nor as a part hereof has been copied out from any source. It is further declared that we have developed this Software and the accompanied report entirely on the basis of our personal efforts made under the sincere guidance of our seniors and teachers. If any part of this report is proved to be copied out or found to be reported, we shall standby the consequences. No portion of the work presented in this report has been submitted in support of any other degree or qualification of this or any other university or institute of learning. Tahir Ayoub SP08-BCS-052 Faraz Ahmed SP08-BCS-015
  • 3. PIMS Data Warehouse 3 CERTIFICATE OF APPROVAL It is to certify that the final year project of BS (CS) “PATIENT INFORMATION AND MONITORING SYSTEM USING DATA WAREHOUSE” was developed by “Tahir Ayoub (CIIT/SP08-BCS-052)” and “Faraz Ahmed (CIIT/SP08-BCS-015)” under the supervision of “Muhammad Mustafa Khattak” and that in their opinion; it is fully adequate, in scope and quality for the degree of Bachelors of Science in Computer Sciences. --------------------------------------- Supervisor --------------------------------------- External Examiner --------------------------------------- Head of Department (Department of Computer Science)
  • 4. PIMS Data Warehouse 4 EXECUTIVE SUMMARY There are a number of reasons for which the migration form Relational Database Management System (RDBMS) to Data Warehouse is required. Data warehouse is an informational environment that:  Provides an integrated and total view of enterprise.  Make the enterprise current and historical information easily available for decision making.  Make decision-support transactions possible without hindering operational systems.  Renders the organization’s information consistent.  Presents a flexible and interactive source of strategic information.  This is a solution for a user with the prior knowledge of data warehouse design concepts. The Warehouse will provide support for intelligent user to create data warehouse schema from an existing OLTP systems consisting of a relational database, which will be MS Notepad, MS Excel, MS Access, SQL server 2008. The target system i.e., the warehouse will also be implemented in SQL Server 2008 r2.
  • 5. PIMS Data Warehouse 5 ACKNOWLEDGMENT ALLAH the Almighty! We are thankful to you, for giving us with the courage to take, and for your infinite help to complete this project. Without your help we would have never been able to complete this project. Thanks to all the teachers for guiding us throughout our stay at this university and all the friends for providing us a beautiful company which we will never forget. And last but not the least, thanks to our parents because without their love, affection, and prayers for us; our studies and this project were not achievable. -------------------------------- -------------------------------- Tahir Ayoub Faraz Ahmed
  • 6. PIMS Data Warehouse 6 Abbreviations ODS Operational Data source SSIS SQL Server Integration Services SSAS SQL Analysis Services SSRS SQL Reporting Services DWH Data Warehouse OLAP Online Analysis Process OLTP Online Analytical Process ETL Extract Transform Load
  • 7. PIMS Data Warehouse 7 Contents 1.Introduction........................................................................................................ 13 1.1 Brief .................................................................................................................................13 1.2 Relevance to Course Modules............................................................................................. 13 1.3 Project Background............................................................................................................ 14 1.4 Literature Review .............................................................................................................. 14 1.4.1 Area of Knowledge....................................................................................................... 14 1.4.2 Decision Support Systems (DSS)................................................................................... 15 1.4.3 Data Warehouse ........................................................................................................... 16 1.4.4 Development Lifecycle .................................................................................................16 1.4.5 Data Warehouse SDLC.................................................................................................18 1.4.6 Classical SDLC............................................................................................................ 18 1.4.7 Overview of ETL.......................................................................................................... 18 1.4.8 Major Functions ........................................................................................................... 19 1.5 Methodology and Software Life Cycle ................................................................................ 20 2 Problem Definition.............................................................................................. 
22 2.1 Purpose............................................................................................................................. 22 2.2 Product Functions .............................................................................................................. 22 2.3 Proposed Architecture........................................................................................................ 23 2.3.1 Basics of Data Warehouse and ETL............................................................................... 23 2.3.2 Data Warehouse Architectures....................................................................................... 25 2.3.3 Basic Data Warehouse Architecture............................................................................... 25 Figure 3 Basic Data Warehouse Architecture.............................................................................. 25 2.3.4 Data Warehouse Architecture with Staging Area ............................................................ 26 Figure 4 Data Warehouse Architecture with Staging Area........................................................... 26 2.3.5 Data Warehouse Architecture with Staging Area and Data Marts..................................... 26 Figure 5 Data Warehouse Architecture with Staging Area and Data Marts....................................26 2.3.6 Data Warehouse Modeling ............................................................................................ 27 2.3.7 ETL Operation ............................................................................................................. 27 3 Requirements Analysis........................................................................................ 32 3.1 Project Overview............................................................................................................... 32 3.1.1 Data Profiling............................................................................................................... 
32 3.1.2 Warehouse Schema Generation ..................................................................................... 32 3.1.3 Data Extraction............................................................................................................. 32 3.1.4 Data Transformation..................................................................................................... 32 3.1.5 Data Loading................................................................................................................ 32 3.2 Functional Requirements ....................................................................................................33 3.2.1 Data Profiling............................................................................................................... 33 3.2.2 Warehouse Schema Generation ..................................................................................... 33 3.2.3 Data Extraction............................................................................................................. 33 3.2.4 Data Transformation..................................................................................................... 34 3.2.5 Data Loading................................................................................................................ 35 3.3 Nonfunctional Requirements .............................................................................................. 35 3.3.1 Performance Requirements............................................................................................ 35 3.3.2 Safety Requirements ..................................................................................................... 35 3.3.3 Reliability Requirements............................................................................................... 35 3.4 External Interface Requirements ......................................................................................... 
35 3.4.1 User Interface............................................................................................................... 35 3.4.2 Hardware Resources..................................................................................................... 35 3.4.3 Hardware Interfaces...................................................................................................... 36 3.5 Use Case Specifications ..................................................................................................... 36 3.5.1 Connect RDBMS User..................................................................................................36 3.5.2 Connect Data Warehouse Schema.................................................................................. 36 3.5.3 Connect database User..................................................................................................37 3.5.4 Load Relational Database Model................................................................................... 37 3.5.5 Identify Table Names....................................................................................................38
  • 8. PIMS Data Warehouse 8 3.5.6 Identify Columns with Data Types................................................................................. 38 3.5.7 Identify Relationships between Tables ........................................................................... 39 3.5.8 Load Warehouse Schema.............................................................................................. 39 3.5.9 Map Columns............................................................................................................... 40 3.5.10 Extract Data from RDBMS........................................................................................... 41 3.5.11 Transform Extracted Data ............................................................................................. 41 3.5.12 Load Data in Warehouse............................................................................................... 42 4 The Design.......................................................................................................... 44 4.1 Modules............................................................................................................................ 44 4.1.1 Connectivity................................................................................................................. 44 4.1.2 RDBMS Details ........................................................................................................... 44 4.1.3 Schema Generation....................................................................................................... 44 4.1.4 Column Mappings ........................................................................................................ 45 4.1.5 Extraction .................................................................................................................... 45 4.1.6 Transformation............................................................................................................. 
45 4.1.7 Loading ....................................................................................................................... 45 5 UML Structure Diagram....................................................................................... 45 5.1 Class Diagram................................................................................................................... 45 5.1.1 City lab........................................................................................................................ 45 Figure 7 City lab....................................................................................................................... 45 5.1.2 Health ways ................................................................................................................. 46 5.1.3 Clinic........................................................................................................................... 47 5.1.4 CMH hospital............................................................................................................... 48 5.1.5 Urwah lab.................................................................................................................... 49 5.2 Object diagram .................................................................................................................. 50 5.3 Component Diagram.......................................................................................................... 51 5.4 Deployment Diagram......................................................................................................... 52 5.5 Composite Structure Diagram............................................................................................. 53 5.6 Package diagram................................................................................................................ 54 6 UML Behavior Diagrams...................................................................................... 
56 6.1 Use Case Diagram ............................................................................................................. 56 6.2 Activity Diagram ............................................................................................................... 57 6.2.1 Create new project........................................................................................................ 57 6.2.2 Open existing project....................................................................................................57 6.2.3 Close project................................................................................................................ 58 Figure 20 Close project............................................................................................................. 58 6.2.4 Create mapping ............................................................................................................ 58 6.2.5 Load RDBMS.............................................................................................................. 59 6.3 State Machine diagram....................................................................................................... 60 6.3.1 Report.......................................................................................................................... 60 6.3.2 ETL............................................................................................................................. 61 7 UML Interaction Diagrams................................................................................... 63 7.1 Sequence Diagram............................................................................................................. 63 7.1.1 Create NewProject....................................................................................................... 
63 7.1.2 Open Existing Project ...................................................................................................64 7.1.3 Close Project................................................................................................................ 65 7.1.4 Load RDBMS Details ...................................................................................................66 7.1.5 Create Schema.............................................................................................................. 67 7.1.6 Create Mappings .......................................................................................................... 69 7.1.7 Data Extraction............................................................................................................. 70 7.1.8 Data Transformations....................................................................................................71 7.1.9 Report Generation ........................................................................................................ 73 7.1.10 ETL............................................................................................................................. 74 7.2 Communication Diagram ...................................................................................................75 7.2.1 ETL............................................................................................................................. 75 7.2.2 Report.......................................................................................................................... 75
  • 9. PIMS Data Warehouse 9 7.3 Interaction Overview.......................................................................................................... 76 7.3.1 ETL............................................................................................................................. 76 7.3.2 Warehouse Interaction ..................................................................................................76 7.3.3 Access Model............................................................................................................... 77 8 Implementation.................................................................................................. 79 8.1 System Implementation...................................................................................................... 79 8.2 Back end software SQL server r2........................................................................................ 79 8.2.1 Snowflake Schema ....................................................................................................... 79 Figure 41 Snowflake Schema ....................................................................................................79 8.2.2 ETL SSIS..................................................................................................................... 80 8.2.3 Overview of PIMS ETL................................................................................................ 80 8.2.4 General Overview of PIMS City Lab ETL...................................................................... 81 8.2.5 General Overview of PIMS Clinics ETL ........................................................................ 81 8.2.6 General Overview of PIMS CMH Hospital ETL............................................................. 
82 8.2.7 General Overview of PIMS Fact table ETL ....................................................................83 8.2.8 General Overview of PIMS Heath ways ETL .................................................................84 8.2.9 General Overview of PIMS Urwah Lab ETL..................................................................85 8.2.10 General Overview of PIMS ETL of date format.............................................................. 86 8.2.11 PIMS ETL of date format.............................................................................................. 86 8.3 SSAS................................................................................................................................ 88 8.3.1 Overview of OLAP Cube.............................................................................................. 88 8.3.2 Overview of OLAP Cube drill down.............................................................................. 89 8.4 General Overview of Dimensions ....................................................................................... 90 8.4.1 Overview of Date time dimensions ................................................................................ 90 8.4.2 Date time English month calculation.............................................................................. 91 8.4.3 Overview of Doctor Dimension ..................................................................................... 92 8.5 Graphical User Interface..................................................................................................... 93 8.5.1 Overview of SSRS report 1........................................................................................... 93 8.5.2 SSRS report 1 month wise............................................................................................. 94 8.5.3 SSRS report 2 total Patients in different cities w.r.t year.................................................. 
95 8.5.4 SSRS report 3 disease wise ........................................................................................... 96 8.5.5 SSRS report 3a Total patients affected by disease in different years.................................97 8.5.6 SSRS report 3c tabular disease report............................................................................. 98 8.5.7 SSRS report 4 Patient Detail.......................................................................................... 99 8.5.8 SSRS report 5 Dr. Detail............................................................................................. 100 8.5.9 SSRS report 5a Dr. Detail .......................................................................................... 101 8.5.10 SSRS report 5b Patient Detail...................................................................................... 101 8.5.11 SSRS report 5c Report Patient Detail........................................................................... 102 9 Testing and Evaluation.......................................................................................104 9.1 Testing............................................................................................................................ 104 9.1.1 Black Box Testing ...................................................................................................... 104 9.2 Testing of PIMS Data warehouse...................................................................................... 105 9.3 Test Cases....................................................................................................................... 106 10 Future Work...............................................................................................111 11 References.................................................................................................113
Table of Figures
Figure 1 Methodology and Software Life Cycle .......... 20
Figure 2 Contrasting OLTP and Data Warehousing Environments .......... 24
Figure 3 Basic Data Warehouse Architecture .......... 25
Figure 4 Data Warehouse Architecture with Staging Area .......... 26
Figure 5 Data Warehouse Architecture with Staging Area and Data Marts .......... 26
Figure 6 ETL Operation .......... 28
Figure 7 City lab .......... 45
Figure 8 Healthways .......... 46
Figure 9 Clinic .......... 47
Figure 10 CMH hospital .......... 48
Figure 11 Urwah lab .......... 49
Figure 12 Object diagram .......... 50
Figure 13 Component Diagram .......... 51
Figure 14 Deployment Diagram .......... 52
Figure 15 Composite Structure Diagram .......... 53
Figure 16 Package diagram .......... 54
Figure 17 Use Case Diagram .......... 56
Figure 18 Activity Diagram (Create new project) .......... 57
Figure 19 Activity Diagram (Open existing project) .......... 57
Figure 20 Close project .......... 58
Figure 21 Activity Diagram (Create mapping) .......... 58
Figure 22 Activity Diagram (Load RDBMS) .......... 59
Figure 23 State Machine Diagram (Report) .......... 60
Figure 24 State Machine Diagram (ETL) .......... 61
Figure 25 Sequence Diagram (Create New Project) .......... 63
Figure 26 Sequence Diagram (Open Existing Project) .......... 64
Figure 27 Sequence Diagram (Close Project) .......... 65
Figure 28 Sequence Diagram (Load RDBMS Details) .......... 66
Figure 29 Sequence Diagram (Create Schema) .......... 68
Figure 30 Sequence Diagram (Create Mappings) .......... 69
Figure 31 Sequence Diagram (Data Extraction) .......... 70
Figure 32 Sequence Diagram (Data Transformations) .......... 71
Figure 33 Sequence Diagram (Data Loading) .......... 72
Figure 34 Sequence Diagram (Report Generation) .......... 73
Figure 35 Sequence Diagram (ETL) .......... 74
Figure 36 Communication Diagram (ETL) .......... 75
Figure 37 Communication Diagram (Report) .......... 75
Figure 38 Interaction Overview (ETL) .......... 76
Figure 39 Interaction Overview (Warehouse Interaction) .......... 76
Figure 40 Web Diagrams (Access Model) .......... 77
Figure 41 Snowflake Schema .......... 79
Figure 42 Overview of PIMS ETL .......... 80
Figure 43 General Overview of PIMS City Lab ETL .......... 81
Figure 44 General Overview of PIMS Clinics ETL .......... 81
Figure 45 General Overview of PIMS CMH Hospital ETL .......... 82
Figure 46 General Overview of PIMS Fact table ETL .......... 83
Figure 47 General Overview of PIMS Healthways ETL .......... 84
Figure 48 General Overview of PIMS Urwah Lab ETL .......... 85
Figure 49 General Overview of PIMS ETL of date format .......... 86
Figure 50 PIMS ETL of date format .......... 87
Figure 51 Overview of OLAP Cube .......... 88
Figure 52 Overview of OLAP Cube drill down .......... 89
Figure 53 Overview of Date time dimensions .......... 90
Figure 54 Date time English month calculation .......... 91
Figure 55 Overview of Doctor Dimension .......... 92
Figure 56 Overview of SSRS report 1 .......... 93
Figure 57 SSRS report 1 month wise .......... 94
Figure 58 SSRS report 2 total Patients in different cities w.r.t year .......... 95
Figure 59 SSRS report 3 disease wise .......... 96
Figure 60 SSRS report 3a Total patients affected by disease in different years .......... 97
Figure 61 SSRS report Total revenue of CMH Hospitals yearly .......... 98
Figure 62 SSRS report 4 Patient Detail .......... 99
Figure 63 SSRS report 5 Dr. Detail .......... 100
Figure 64 SSRS report 5b Patient Detail .......... 101
Figure 65 SSRS report 5c Report Patient Detail .......... 102
1. Introduction

1.1 Brief

This documentation includes a detailed description of the "Patient Information and Monitoring System Using Data Warehousing". It covers all the phases of system development, including requirement analysis, design, implementation, and testing. The aim of this project is to build a patient information and monitoring system using data warehousing. Worldwide, the healthcare industry is looking for technology that can aid in establishing online clinical repositories, enabling rapid access to shared information that can help find cures for prevalent medical conditions. The system facilitates not only hospitals and doctors but also the government sector in making critical decisions; our project shall help them in decision making.

The project is divided into several parts.

First come the different ODSs; our project has five, namely:
• Healthways.xlsx
• Urwahlab.xlsx
• CMH Hospitals (SQL database)
• Clinics.txt
• City_Lab.accdb

Second comes the ETL part:
• The tool used for ETL is SSIS, which extracts data from the ODSs, transforms it to our standard formats, and loads it into the DWH.

Third, after the ETL we built OLAP cubes:
• The tool used for this was SSAS.

Fourth, after building the OLAP cubes we created reports (with queries against both the OLAP cubes and the DWH):
• The tool used to build the reports is SSRS.

Finally, these reports are presented to front-end users; the tool used for this is ASP.NET.

1.2 Relevance to Course Modules

The "Database" course provided us with basic knowledge about databases, which is one of the basic requirements of our project. "Human Computer Interaction" helped us design a user-friendly GUI and reports. The most important is the
internet/tutorials component, through which we learned a great deal about data warehousing; this knowledge proved very beneficial during the project.

1.3 Project Background

Health care has become one of the most important service industries and is undergoing rapid structural transformations. Yet healthcare remains a paper-intensive, minimally automated and digitized industry: CBS MarketWatch reported that an estimated 90% of all patient information remains on paper. Worldwide, the healthcare industry is looking for technology that can aid in establishing online clinical repositories, enabling rapid access to shared information that can help find cures for prevalent medical conditions. As there is no such concept of a DWH in Pakistan's healthcare sector, we are building this project, and the basic theme behind it is a patient information and monitoring system using data warehousing. It facilitates not only hospitals and doctors but also the government sector in making critical decisions; our project shall help them in decision making.

1.4 Literature Review

The origins of DSS processing hark back to the very early days of computers and information systems. It is interesting that decision support system (DSS) processing developed out of a long and complex evolution of information technology, and its evolution continues today. The data warehouse architecture has evolved throughout the history of the different stages of information processing. The information contained in a warehouse flows from the same operational systems that could not be directly used to produce strategic information. The data warehouse user, also called the DSS analyst, is a businessperson first and foremost, and a technician second.
The primary job of the DSS analyst is to define and discover information used in corporate decision making. To develop a complete understanding, this chapter starts by explaining the data warehouse and the basics of ETL operations.

1.4.1 Area of Knowledge

This project is mainly concerned with the design of a healthcare data warehouse and the ETL operations required to populate it. Complete and clear knowledge of decision support systems (DSS) and data warehouse design is essential for understanding this project.
1.4.2 Decision Support Systems (DSS)

By the mid-1970s, online transaction processing (OLTP) made faster access to data possible, opening whole new vistas for business processing. The computer could now be used for tasks not previously possible, including driving reservation systems, bank teller systems, manufacturing control systems, and the like. Throughout this period, organizations accumulated growing amounts of data in their operational databases. Now that such systems are commonplace, organizations are focusing on ways to use operational data to support decision making, as a means of gaining competitive advantage.

Business executives have become desperate for information to stay competitive and improve the bottom line. Although operational systems provide information to run the day-to-day operations, they cannot be readily used to make strategic decisions. Businesses, therefore, are compelled to turn to new ways of getting strategic information. IT departments have been attempting to provide information to the key business personnel in their companies for making strategic decisions. Sometimes an IT department could produce ad hoc reports from a single application. In most cases, the reports would need data from multiple systems, requiring the rewriting of existing programs to create intermediary files that could be used to produce ad hoc reports. Most such attempts by IT in the past ended in failure: the users could not clearly define what they wanted in the first place, and once they saw the first set of reports, they wanted more data in different formats. The chain continued.
This was mainly because of the very nature of the process of making strategic decisions. We have been trying all along to provide strategic information from the operational systems. Information needed for strategic decision making has to be available in an interactive manner: the user must be able to query online, get results, and query some more, and the information must be in a format suitable for analysis. If we need the ability to provide strategic information, we must get it from altogether different types of systems. For example, the following queries cannot be answered by a simple operational system, as it holds only current operational data:
• How profitable shall the company be next quarter?
• Who are the top ten customers during the last six months?
• What was the profit last month, and how much did it differ from the profit of the same month during the last three years?
• What is the relationship between the total annual revenue generated by each branch office and the total number of sales staff assigned to each branch office?
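A query like the second one above is easy to express once historical data is kept in a queryable store. The sketch below illustrates this with SQLite and a hypothetical sales table (table name, columns, and the cutoff date are all illustrative, not part of the project's schema):

```python
import sqlite3

# Hypothetical historical sales rows; a warehouse would hold years of these.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Alice", "2012-01-15", 500.0),
     ("Bob",   "2012-02-10", 300.0),
     ("Alice", "2012-03-05", 200.0),
     ("Carol", "2011-06-01", 900.0)],  # older than six months; excluded below
)

# "Who are the top ten customers during the last six months?" -- an
# analytical query over history, aggregated rather than record-at-a-time.
top = conn.execute(
    """SELECT customer, SUM(amount) AS total
       FROM sales
       WHERE sale_date >= date('2012-04-01', '-6 months')
       GROUP BY customer
       ORDER BY total DESC
       LIMIT 10"""
).fetchall()
print(top)  # [('Alice', 700.0), ('Bob', 300.0)]
```

An OLTP system could run this query too, but only over the few weeks of data it retains; the warehouse keeps the history that makes the answer meaningful.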
Operational systems support the business processes of the company. They are used to watch how the business runs, and strategic decisions are then made to improve the business. The concept of the data warehouse is deemed the solution for a system capable of supporting decision making by receiving data from multiple data sources.

1.4.3 Data Warehouse

A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates the analysis workload from the transaction workload and enables an organization to consolidate data from several sources. In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

A data warehouse is an informational environment that:
• Provides an integrated and total view of the enterprise.
• Makes the enterprise's current and historical information easily available for decision making.
• Makes decision-support transactions possible without hindering operational systems.
• Renders the organization's information consistent.
• Presents a flexible and interactive source of strategic information.

The users of the data warehouse environment have a completely different approach to using the system. Unlike operational users, who have a straightforward approach to defining their requirements, the data warehouse user operates in a mindset of discovery.
The end user of the data warehouse says, "Give me what I say I want, and then I can tell you what I really want."

1.4.4 Development Lifecycle

Operational data is usually application oriented and, as a consequence, unintegrated, whereas data warehouse data must be integrated. Other major differences also exist between the operational level of data and processing and the data warehouse level of data and processing. The development life cycles of these systems differ profoundly. The operational environment is supported by the classical systems development life cycle (the SDLC). The SDLC is often called the "waterfall" development approach because the different activities are specified and one activity, upon its completion, spills down into the next activity and triggers its start.
The development of the data warehouse operates under a very different life cycle, sometimes called the CLDS (the reverse of the SDLC). The classical SDLC is driven by requirements. The CLDS is almost exactly the reverse: it starts with data. Once the data is in hand, it is integrated and then tested to see what bias there is to the data, if any. Programs are then written against the data, the results of the programs are analyzed, and finally the requirements of the system are understood. The CLDS is usually called a "spiral" development methodology.

The classical SDLC does not work in the world of the DSS analyst. The SDLC assumes that requirements are known at the start of design (or at least can be discovered). In the world of the DSS analyst, though, new requirements usually are the last thing to be discovered in the DSS development life cycle. The DSS analyst starts with existing requirements, but factoring in new requirements is almost an impossibility. A very different development life cycle is therefore associated with the data warehouse.
1.4.5 Data Warehouse Life Cycle (CLDS)
• Implement warehouse
• Integrate data
• Test for bias
• Program against data
• Design DSS system
• Analyze results
• Understand requirements

1.4.6 Classical SDLC
• Requirements gathering
• Analysis
• Design
• Programming
• Testing
• Integration
• Implementation

The CLDS is a classic data-driven development life cycle, while the SDLC is a classic requirements-driven development life cycle.

1.4.7 Overview of ETL

Data needs to be loaded into the data warehouse regularly so that it can serve its purpose of facilitating business analysis. To do this, data from one or more operational systems must be extracted and copied into the warehouse. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for extraction, transformation, and loading. The acronym is perhaps too simplistic, because it omits the transportation phase and implies that each of the phases of the process is distinct; we refer to the entire process, including data loading, as ETL, and treat it as one broad process rather than three well-defined steps. The methodology and tasks of ETL have been well known for many years and are not necessarily unique to data warehouse environments: a wide variety of proprietary applications and database systems form the IT backbone of any enterprise, and data has to be shared between applications or systems in order to integrate them, giving at least two applications the same picture of the world. This data sharing was mostly addressed by mechanisms similar to what is now called ETL. Data warehouse environments face the same challenge, with the additional burden that they not only have to exchange but also to integrate, rearrange, and consolidate data over many systems, thereby providing a new unified information base for business
intelligence. Additionally, the data volume in data warehouse environments tends to be very large.

What happens during the ETL process? During extraction, the desired data is identified and extracted from many different sources, including database systems and applications. Very often, it is not possible to identify the specific subset of interest, so more data than necessary has to be extracted, and the identification of the relevant data is done at a later point in time. Depending on the source system's capabilities (for example, operating system resources), some transformations may take place during this extraction process. The size of the extracted data varies from hundreds of kilobytes up to gigabytes, depending on the source system and the business situation. The same is true for the time delta between two (logically) identical extractions: the time span may vary between days/hours and minutes to near real time. Web server log files, for example, can easily grow to hundreds of megabytes in a very short period of time.

1.4.8 Major Functions

The basic purpose of the project is to build a healthcare system that helps not only doctors and patients but also the government sector to take major decisions in the health department. The main features provided by the software use ETL to:
• Create a definition of a data warehouse.
• Configure the definitions for a physical instance of the data warehouse.
• Validate the set of definitions and their configurations.
• Create and populate the data warehouse instance.
• Perform data transformations.
• Deploy and initially load the data warehouse instance.
• Maintain the physical instance by conditionally refreshing it.

ETL supports the design of relational database schemas, ETL processes, and end-user tool environments through the client. Source systems play an important role in an ETL solution.
Instead of creating metadata manually, ETL provides integrated components that import the relevant information into its repository. To ensure the quality and completeness of the data in the repository, ETL provides extensive validation within the repository. Validation helps to keep a complex system in an accurate and coherent state.
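The kind of repository validation described above can be sketched as follows. The mapping structure and all column names here are illustrative assumptions, not the metadata model of SSIS or of this project:

```python
# Hypothetical mapping metadata: source column -> target column.
source_columns = {"patient_id", "visit_date", "fee"}
target_columns = {"patient_key", "date_key", "fee_amount"}
mapping = {"patient_id": "patient_key",
           "visit_date": "date_key",
           "fee": "fee_amount"}

def validate(mapping, src, tgt):
    """Return a list of problems; an empty list means the mapping is coherent."""
    errors = []
    for s, t in mapping.items():
        if s not in src:
            errors.append(f"unknown source column: {s}")
        if t not in tgt:
            errors.append(f"unknown target column: {t}")
    # Every target column should be fed by some source column.
    for t in tgt - set(mapping.values()):
        errors.append(f"unmapped target column: {t}")
    return errors

print(validate(mapping, source_columns, target_columns))  # []
```

Running such checks before each load catches broken mappings early, before bad rows reach the warehouse.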
1.5 Methodology and Software Life Cycle

Figure 1 Methodology and Software Life Cycle
Chapter 2 Problem Definition
2 Problem Definition

2.1 Purpose

As there is no such concept of a data warehouse in Pakistan's healthcare sector, we are building this project, and the basic theme behind it is a patient information and monitoring system using data warehousing. Worldwide, the healthcare industry is looking for technology that can aid in establishing online clinical repositories, enabling rapid access to shared information that can help find cures for prevalent medical conditions. The system facilitates not only hospitals and doctors but also the government sector in making critical decisions; our project shall help them in decision making. Health care has become one of the most important service industries and is undergoing rapid structural transformations, yet it remains a paper-intensive, minimally automated and digitized industry: CBS MarketWatch reported that an estimated 90% of all patient information remains on paper.

2.2 Product Functions

A PIMS based on a data warehouse not only contains historical data but also helps specific users in decision making.
• It helps users take decisions in the healthcare department.
• It helps users decide which patient-related cities to focus on for disease-cure purposes.
• The system provides not only graphical reports but also drill-down tabular reports to help understand healthcare issues.
• The system not only calculates the spread of a disease in the required cities but also calculates it with respect to time (by year, quarter, month, and day).
• The system tells doctors which patient age groups with a given disease to focus on.
• The system also tells users how much revenue they generate from each patient per year, quarter, month, day, hour, minute, and second.
• The system not only informs doctors how many patients they are handling but also tells them each patient's gender, previous reports, disease status, and so on.
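The revenue roll-up along the time hierarchy described in the list above can be sketched as a simple aggregation. The visit records and fee values are hypothetical, and only the year/quarter/month levels are shown:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical visit records: (patient, timestamp, fee in rupees).
visits = [
    ("P001", "2011-02-14 09:30", 1500),
    ("P002", "2011-05-20 11:00", 2000),
    ("P001", "2012-02-01 10:15", 1200),
]

def rollup(records):
    """Aggregate revenue at each level of the date hierarchy."""
    totals = defaultdict(int)
    for _, ts, fee in records:
        dt = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        quarter = (dt.month - 1) // 3 + 1
        totals[(dt.year,)] += fee                           # year level
        totals[(dt.year, f"Q{quarter}")] += fee             # quarter level
        totals[(dt.year, f"Q{quarter}", dt.month)] += fee   # month level
    return dict(totals)

totals = rollup(visits)
print(totals[(2011,)])       # 3500
print(totals[(2011, "Q1")])  # 1500
```

In the actual system this aggregation is done by the SSAS cube along its date dimension; the sketch only shows the shape of the computation.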
2.3 Proposed Architecture

2.3.1 Basics of Data Warehouse and ETL

2.3.1.1 What is a Data Warehouse?

A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth:
• Subject Oriented
• Integrated
• Nonvolatile
• Time Variant

2.3.1.2 Subject Oriented

Data warehouses are designed to help you analyze data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter (sales, in this case) makes the data warehouse subject oriented.

2.3.1.3 Integrated

Integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure. When they achieve this, they are said to be integrated.

2.3.1.4 Nonvolatile

Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.

2.3.1.5 Time Variant

In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse's focus on change over time is what is meant by the term time variant.
2.3.1.6 Contrasting OLTP and Data Warehousing Environments

Figure 2 Contrasting OLTP and Data Warehousing Environments

Data warehouses and OLTP systems have very different requirements. Here are some examples of differences between typical data warehouses and OLTP systems:

2.3.1.7 Workload

Data warehouses are designed to accommodate ad hoc queries. You might not know the workload of your data warehouse in advance, so a data warehouse should be optimized to perform well for a wide variety of possible query operations. OLTP systems support only predefined operations; your applications might be specifically tuned or designed to support only these operations.

2.3.1.8 Data modifications

A data warehouse is updated on a regular basis by the ETL process (run nightly, weekly, monthly, or yearly) using bulk data modification techniques. The end users of a data warehouse do not directly update the data warehouse. In OLTP systems, end users routinely issue individual data modification statements to the database. The OLTP database is always up to date and reflects the current state of each business transaction.

2.3.1.9 Schema design

Data warehouses often use denormalized or partially denormalized schemas (such as a star schema) to optimize query performance. OLTP systems often use fully normalized schemas to optimize update/insert/delete performance and to guarantee data consistency.
2.3.1.10 Typical operations

A typical data warehouse query scans thousands or millions of rows. For example, "Find the total sales for all customers last month." A typical OLTP operation accesses only a handful of records. For example, "Retrieve the current order for this customer."

2.3.1.11 Historical data

Data warehouses usually store many months or years of data. This is to support historical analysis. OLTP systems usually store data from only a few weeks or months; the OLTP system stores only as much historical data as is needed to successfully meet the requirements of the current transaction.

2.3.2 Data Warehouse Architectures

Data warehouses and their architectures vary depending upon the specifics of an organization's situation. Three common architectures are:
• Data Warehouse Architecture (Basic).
• Data Warehouse Architecture (with a Staging Area).
• Data Warehouse Architecture (with a Staging Area and Data Marts).

2.3.3 Basic Data Warehouse Architecture

Figure 3 Basic Data Warehouse Architecture
2.3.4 Data Warehouse Architecture with Staging Area

Figure 4 Data Warehouse Architecture with Staging Area

2.3.5 Data Warehouse Architecture with Staging Area and Data Marts

Figure 5 Data Warehouse Architecture with Staging Area and Data Marts
2.3.6 Data Warehouse Modeling

One question that very often arises at data warehousing presentations is: which data modeling tool is best for data warehousing? The answer is simple: your brain. While all the various data modeling tools have their pros and cons, none of them is so intrinsically better than the rest for data warehousing as to rate a recommendation. For example, none of the current data modeling tools cleanly diagrams or records any metadata regarding how facts and aggregates might use partitioning and/or materialized views. For data warehousing, the physical data model is useful merely as a roadmap for the ETL programmers. The real physical object implementation is far too complex for modeling tools to handle.

Some basic steps for transforming an OLTP model into a star schema design are:
• Denormalize lookup relationships.
• Denormalize parent/child relationships.
• Create and populate a time dimension.
• Create hierarchies of data within dimensions.
• Consider using surrogate or meaningless keys.

In dimensional modeling of a data warehouse, there are generally only two kinds of tables:

2.3.6.1 Dimensions

Dimensions are relatively small, denormalized lookup tables containing business-descriptive columns that end users reference to define their restriction criteria for ad-hoc business intelligence queries.

2.3.6.2 Facts

Facts are extremely large tables whose primary keys are formed from the concatenation of the columns that are foreign keys referencing related dimension tables. Facts also possess numerically additive, non-key columns utilized to satisfy calculations required by end-user ad-hoc business intelligence queries. The key point is that, to be successful, fact table implementations must accommodate the different requirements.

2.3.7 ETL Operation

ETL involves three major operations:
• Data Extraction.
• Transformation.
• Loading.
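The dimension and fact tables described in 2.3.6 can be sketched in miniature. All table and column names below are illustrative, not the project's actual PIMS schema:

```python
import sqlite3

# A minimal star schema: small denormalized dimensions, and a fact table
# whose primary key concatenates foreign keys into the dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_visit  (
    patient_key INTEGER REFERENCES dim_patient,
    date_key    INTEGER REFERENCES dim_date,
    fee         REAL,   -- numerically additive, non-key measure
    PRIMARY KEY (patient_key, date_key)
);
""")
conn.execute("INSERT INTO dim_patient VALUES (1, 'P001', 'Islamabad')")
conn.execute("INSERT INTO dim_date VALUES (20120101, 2012, 1)")
conn.execute("INSERT INTO fact_visit VALUES (1, 20120101, 1500.0)")

# A typical ad-hoc query: restrict through the dimensions, sum the measure.
total = conn.execute(
    """SELECT SUM(f.fee) FROM fact_visit f
       JOIN dim_patient p ON f.patient_key = p.patient_key
       JOIN dim_date d    ON f.date_key = d.date_key
       WHERE p.city = 'Islamabad' AND d.year = 2012"""
).fetchone()[0]
print(total)  # 1500.0
```

Note how end users filter only through dimension columns (city, year) while the fact table supplies the additive numbers, which is exactly the division of labor the two table kinds are designed for.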
Figure 6 ETL Operation

2.3.7.1 Data Extraction

Extraction is the operation of extracting data from a source system for further use in a data warehouse environment. This is the first step of the ETL process. After the extraction, this data can be transformed and loaded into the data warehouse. The source systems for a data warehouse are typically transaction processing applications. For example, one of the source systems for a sales analysis data warehouse might be an order entry system that records all of the current order activities.

Designing and creating the extraction process is often one of the most time-consuming tasks in the ETL process and, indeed, in the entire data warehousing process. The source systems might be very complex and poorly documented, making it difficult to determine which data needs to be extracted. The data normally has to be extracted not just once, but several times in a periodic manner, to supply all changed data to the warehouse and keep it up to date. Moreover, the source system typically cannot be modified, nor can its performance or availability be adjusted, to accommodate the needs of the data warehouse extraction process. These are important considerations for extraction and ETL in general. The discussion here assumes that the data warehouse team has already identified the data that shall be extracted, and covers common techniques used for extracting data from source databases. Designing this process means making decisions about the following two main aspects:
• Which extraction method do I choose? This influences the source system, the transportation process, and the time needed for refreshing the warehouse.
• How do I provide the extracted data for further processing? This influences the transportation method and the need for cleaning and transforming the data.
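One common answer to the first question is incremental (delta) extraction: each run pulls only the rows modified since the previous run's watermark, so periodic extractions keep the warehouse up to date without re-reading the whole source. A minimal sketch, with illustrative field names and an assumed modification timestamp on each source row:

```python
from datetime import datetime

# Hypothetical source rows, each carrying a last-modified timestamp.
source_rows = [
    {"id": 1, "modified": "2012-05-01 09:00", "value": "a"},
    {"id": 2, "modified": "2012-05-02 14:30", "value": "b"},
    {"id": 3, "modified": "2012-05-03 08:15", "value": "c"},
]

def extract_delta(rows, last_run):
    """Return only the rows changed after the previous extraction's watermark."""
    cutoff = datetime.strptime(last_run, "%Y-%m-%d %H:%M")
    return [r for r in rows
            if datetime.strptime(r["modified"], "%Y-%m-%d %H:%M") > cutoff]

delta = extract_delta(source_rows, "2012-05-02 00:00")
print([r["id"] for r in delta])  # [2, 3]
```

The watermark approach works only if the source reliably stamps modifications; where it does not, full extraction with change comparison in the staging area is the fallback.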
2.3.7.2 Data Transformation Data transformations are often the most complex and, in terms of processing time, the most costly part of the ETL process. They can range from simple data conversions to
  • 29. PIMS Data Warehouse 29 extremely complex data scrubbing techniques. Many, if not all, data transformations can occur within a database, although transformations are often implemented outside the database (for example, on flat files) as well. 2.3.7.3 Data Loading Data is loaded into a data warehouse in two fundamental ways: a record at a time through a language interface, or en masse with a utility. As a rule, loading data by means of a utility is much faster. In addition, indexes must be efficiently built at the same time the data is loaded; in some cases, the loading of the indexes may be deferred in order to spread the workload evenly. As the volume of data to be loaded becomes an issue, the load is often parallelized. When this happens, the data being loaded is divided into several job streams; once the input data is divided, each job stream is executed independently of the others. In doing so, the elapsed time needed for loading is reduced (roughly speaking) by a factor of the number of job streams. Another related approach to the efficient loading of very large amounts of data is staging the data prior to loading. As a rule, large amounts of data are gathered into a buffer area before being processed by extract/transform/load (ETL) software. The staged data is merged, perhaps edited, summarized, and so forth, before it passes into the ETL layer. Staging of data is needed only where the amount of data is large and the complexity of processing is high. 2.3.7.4 Data Profiling Data profiling is not a glamorous task, nor is it something that you can do once and forget about. Proper data profiling methodology must become a standard part of both your business and IT infrastructure, to allow you to diagnose the health of your systems. Today, many organizations attempt to conduct data profiling tasks manually. With very few columns and minimal rows to profile, this may be practical.
But organizations today have thousands of columns and millions (or billions) of records. Profiling this data manually would require an inordinate amount of human intervention and would still be error-prone and subjective. In practice, your organization needs a data profiling tool that can automatically process hundreds or thousands of columns across many data sources. Data profiling in practice consists of three distinct phases:  Initial profiling and data assessment.  Integration of profiling into automated processes.  Handoff of profiling results to data quality and data integration processes.
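The first phase, initial profiling and data assessment, amounts to deriving column properties from the data itself. A minimal Python sketch of per-column profiling (the property names mirror the data-type, length, null-rule, and unique-rule checks discussed in this report; the function is illustrative, not part of any profiling product):

```python
def profile_column(values):
    """Derive basic column properties from a column's values:
    inferred data type, maximum length, null rule, and unique rule."""
    non_null = [v for v in values if v is not None]
    types = {type(v).__name__ for v in non_null}
    return {
        "data_type": types.pop() if len(types) == 1 else "mixed",
        "max_length": max((len(str(v)) for v in non_null), default=0),
        "nullable": len(non_null) < len(values),       # any NULLs observed?
        "unique": len(set(non_null)) == len(non_null), # no duplicates?
    }

stats = profile_column(["Islamabad", "Lahore", None, "Karachi"])
# {'data_type': 'str', 'max_length': 9, 'nullable': True, 'unique': True}
```

Running this over every column of every source table yields the raw material that the later automation and handoff phases consume.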
  • 30. PIMS Data Warehouse 30 The most effective data management tools can address all of these initiatives. Data analysis reporting alone is just a small part of your overall data initiative. The results from data profiling serve as the foundation for data quality and data integration initiatives. Look for a data profiling solution that allows you to construct data correction, validation, and verification routines directly from the profiling reports. This shall help you combine the data inspection and correction phases, streamlining your data management process. 2.3.7.5 Assumptions and Dependencies 2.3.7.5.1 Assumptions  The software can be used round the clock.  The requirements may change over time. 2.3.7.5.2 Dependencies  The user should have domain knowledge of computers.  The DWH should be maintained from time to time. 2.3.7.6 Project Deliverables  Executable in running condition.  Detailed final draft (report). 2.3.7.7 Operating Environment 2.3.7.7.1 Software  Operating system: Windows XP/7  Dreamweaver, Visual Studio  Backend data mart 2.3.7.7.2 Web Browsers  Opera  Mozilla Firefox  Microsoft Internet Explorer  Apple Safari  Google Chrome
  • 31. PIMS Data Warehouse 31 Chapter 3 Requirements Analysis
  • 32. PIMS Data Warehouse 32 3 Requirements Analysis The analysis phase defines the requirements of the system, independent of how those requirements shall be accomplished. This phase defines the problem that the end user is trying to solve. The deliverable at the end of this phase is a requirements document. Ideally, this document states in a clear and precise fashion what is to be built. This analysis represents the "what" phase. The requirements document tries to capture the requirements from the end user's perspective by defining goals and interactions at a level removed from the implementation details. 3.1 Project Overview The product shall provide the following functionality regarding ETL operations: 3.1.1 Data Profiling  Analyzing the column properties.  Analyzing the relationships. 3.1.2 Warehouse Schema Generation The user shall provide an OLTP source system, from which a target warehouse schema shall be generated; the target schema shall be a star schema. 3.1.3 Data Extraction  The data from the OLTP system shall be extracted into files.  Extracted data shall be further processed before loading into the warehouse. 3.1.4 Data Transformation The extracted data is raw and cannot be placed in the data warehouse without enriching it.  The extracted data shall be processed within the staging area according to the required format.  Its quality shall be improved, and  It shall be made ready to be loaded into the data warehouse. 3.1.5 Data Loading This process again is quite cumbersome and shall require special techniques and methods so that all the records are applied successfully to the data warehouse.  The data prepared after transformation shall be applied to the data warehouse database and shall be stored there.  Load images are created to correspond to the target files to be loaded in the data warehouse database.
  • 33. PIMS Data Warehouse 33  Mapping functions shall be provided, which shall map the source system records to the target warehouse. 3.2 Functional Requirements Following are some basic requirements, described briefly. 3.2.1 Data Profiling PIMS.DP.F.0010: Identify the data type of the columns. PIMS.DP.F.0020: Identify the maximum length of the columns. PIMS.DP.F.0030: Identify the null rule of the columns. PIMS.DP.F.0040: Identify the unique rule of the columns. PIMS.DP.F.0050: Identify the relationships between the tables. 3.2.2 Warehouse Schema Generation PIMS.WSG.F.0010: Support for a star schema (a logical structure that has a fact table in the center, surrounded by dimension tables) shall be provided for the warehouse schema design. PIMS.WSG.F.0020: Functions shall be provided so that the user can transform the source data into the target system. PIMS.WSG.F.0030: The generated schema shall be implemented in Microsoft SQL Server 2008 R2 as warehouse objects such as fact and dimension tables. 3.2.3 Data Extraction PIMS.DE.F.0010: The source systems for a data warehouse are typically OLTP systems, which here shall be hosted in Microsoft SQL Server 2008 R2.
  • 34. PIMS Data Warehouse 34 PIMS.DE.F.0020: The data from the OLTP system shall be extracted into files. PIMS.DE.F.0030: Data has to be extracted for each incremental load as well as for the one-time initial full load. PIMS.DE.F.0040: Extracted data shall be further processed before loading into the warehouse. 3.2.4 Data Transformation PIMS.DT.F.0010: The quality of the data is to be improved. PIMS.DT.F.0020: The transformation functions provided shall transform the data format. PIMS.DT.F.0030: Standardization of data from different sources shall be done. PIMS.DT.F.0040: Selection takes place at the beginning of the whole process of data transformation. Either whole records or parts of several records can be selected from the source system. PIMS.DT.F.0050: Splitting/joining includes the types of data manipulation that need to be performed on the selected parts of the source records. PIMS.DT.F.0060: Conversion includes a wide variety of rudimentary conversions of single fields, for two primary reasons: to standardize the data extracted from different sources, and to make the fields usable and understandable to the user. PIMS.DT.F.0070: For summarization, the transformation function is used for summarizing the facts. PIMS.DT.F.0080: Enrichment is the rearrangement and simplification of individual fields to make them more useful for the data warehouse.
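Two of the transformation types above, conversion and splitting, can be sketched in Python. The gender codes and name layout are hypothetical examples of the kind of inconsistencies found across the four source systems:

```python
def standardize_gender(value):
    """Conversion: map differing source gender codes onto one standard form."""
    mapping = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}
    return mapping.get(str(value).strip().lower(), "Unknown")

def split_name(full_name):
    """Splitting: break a single full-name field into first/last parts."""
    parts = full_name.strip().split(" ", 1)
    return parts[0], (parts[1] if len(parts) > 1 else "")

standardize_gender("M")        # "Male"
split_name("Tahir Ayoub")      # ("Tahir", "Ayoub")
```

Each such function is small and pure, which makes the transformation step easy to test in isolation before it is wired into the ETL run.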
  • 35. PIMS Data Warehouse 35 3.2.5 Data Loading PIMS.DL.F.0010: Load images are created to correspond to the target files to be loaded in the data warehouse database. PIMS.DL.F.0020: Identification of the mapping from the source system's table fields to the warehouse tables. PIMS.DL.F.0030: Loading must be efficient. 3.3 Nonfunctional Requirements 3.3.1 Performance Requirements This shall be a very important system in the ETL development cycle, so it must be efficient. As it shall be communicating with SQL Server 2008 R2, and heavy resources are required for processing, it must utilize the hardware resources optimally. 3.3.2 Safety Requirements Testing a test case may take a very long time, and it is important to keep track of intermediate results so that, in case of any failure, the work already done is not lost. 3.3.3 Reliability Requirements The process involves different phases, and the data after each phase should be secure and reliable. 3.4 External Interface Requirements 3.4.1 User Interface The users of this application shall be professionals as well as ordinary users, so the user interface has to be comprehensive, providing the user the ability to control everything and to extract the required information easily and as quickly as possible. 3.4.2 Hardware Resources The application requires heavy processing resources. Up-to-date hardware is required for the efficient and effective working of the application.
  • 36. PIMS Data Warehouse 36 3.4.3 Hardware Interfaces The software shall not interact with any hardware; it shall only use the operating system's services for connecting to the SQL database. 3.5 Use Case Specifications 3.5.1 Connect RDBMS User Pre-Condition: The required schema and user must exist. Description: Connect to an existing schema using a user name and password. Actor: User Success Scenario: User provides the user name and password. User is connected to the schema with the user name provided. User can view the required details of the schema. Alternate Scenarios: None Post-Condition: The details of the RDBMS should be loaded for viewing. 3.5.2 Connect Data Warehouse Schema Pre-Condition: A proper Data Warehouse schema should exist. Description: Open a previously created Data Warehouse with the provided user name and password. Actor: User Success Scenario: User provides the user name and password. User is connected to the schema with the user name provided. User can view the details of the schema.
  • 37. PIMS Data Warehouse 37 Alternate Scenarios: None Post-Condition: The details of the Data Warehouse should be loaded. 3.5.3 Connect Database User Pre-Condition: The user and the schema to be connected must exist. Description: Connect an existing database user to the required schema. Actor: User Success Scenario: User is connected to the required schema with the rights assigned to that user. Alternate Scenarios: None. Post-Condition: The details of the schema are populated so the connected user can perform the further operations required. 3.5.4 Load Relational Database Model Pre-Condition: The relational database model must exist. The user must have the rights to connect to the database. Description: Load an existing RDBMS for the required operations. Actor: User Success Scenario: User connects to the relational database by specifying the user name and password. The entire ERD model of the RDBMS is loaded. Relationships between the tables are populated. User can view the details of the tables.
  • 38. PIMS Data Warehouse 38 User can perform the desired operations. Alternate Scenarios: None Post-Condition: None 3.5.5 Identify Table Names Pre-Condition: The database must exist. The tables must exist in the database. Description: Identify all the tables in the database. Actor: User Success Scenario: User selects to view the tables of the database. All the table names in the database are loaded after transformation. Alternate Scenarios: None Post-Condition: None 3.5.6 Identify Columns with Data Types Pre-Condition: The tables must be specified for which the columns are to be populated. Description: Identify the column names, with the data type of each column, for the specified tables. Actor: User Success Scenario: User selects to view the complete database.
  • 39. PIMS Data Warehouse 39 All the required table column names and their data types are populated. Alternate Scenarios: None. Post-Condition: None. 3.5.7 Identify Relationships between Tables Pre-Condition: The table names and column names for each table must be populated in order to identify the relationships. Description: Identify relationships between the tables. Actor: User Success Scenario: User selects to view the complete database design. All the relationships are loaded with the details of the primary keys and foreign keys of the complete database. Alternate Scenarios: None. Post-Condition: None. 3.5.8 Load Warehouse Schema Pre-Condition: The warehouse schema must have been created previously, and the fact and dimension tables must exist. Description: Load an existing Data Warehouse schema for cubes and reports. Actor: User Success Scenario: User loads the existing data warehouse schema.
  • 40. PIMS Data Warehouse 40 The details of the schema are loaded. The details are displayed to the user. Alternate Scenarios: None. Post-Condition: None. 3.5.9 Map Columns Pre-Condition: A complete Data Warehouse schema must exist with all the facts and dimensions properly defined. Description: Map the columns of the Data Warehouse facts/dimensions to the columns of tables from the RDBMS. Actor: User Success Scenario: User maps each column of the fact and dimension tables to the relational database model columns as required. The mapping between the columns is maintained for the ETL operations. All the mappings are stored in a file. Alternate Scenarios: Load an existing Data Warehouse schema with already defined mappings. Post-Condition: None.
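The stored mapping in this use case is essentially a source-to-target column dictionary. A Python sketch, using a few Cl_-prefixed column names in the style of the City lab schema (the target names on the right are hypothetical warehouse columns):

```python
# Source column -> warehouse column (illustrative names only).
column_map = {
    "Cl_P_Id": "source_patient_id",
    "Cl_P_FName": "first_name",
    "Cl_P_LName": "last_name",
    "Cl_P_Gender": "gender",
}

def apply_mapping(source_row, mapping):
    """Rename source columns to their warehouse targets; columns with no
    mapping are dropped, mirroring the stored mapping file."""
    return {tgt: source_row[src] for src, tgt in mapping.items() if src in source_row}

row = apply_mapping(
    {"Cl_P_Id": 7, "Cl_P_FName": "Ali", "Cl_P_LName": "Khan",
     "Cl_P_Gender": "M", "Cl_P_Phone": "051-1234"},
    column_map,
)
# {'source_patient_id': 7, 'first_name': 'Ali', 'last_name': 'Khan', 'gender': 'M'}
```

Keeping the mapping as data rather than code is what lets the alternate scenario, reloading a schema with already defined mappings, work without any reprogramming.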
  • 41. PIMS Data Warehouse 41 3.5.10 Extract Data from RDBMS Pre-Condition: All the mappings should be defined for an effective and efficient load. Description: Extract data from the database to be loaded into the Data Warehouse. Actor: User Success Scenario: User selects to extract the data from the source system. The data is extracted in accordance with the target system. The extracted data is saved in SQL Server 2008. On completion of extraction from the source, the data is ready for transformation or loading. Alternate Scenarios: None. Post-Condition: None. 3.5.11 Transform Extracted Data Pre-Condition: All the data must be extracted from the source system. Description: Transform the extracted data, as required, before loading the data into the target warehouse. Actors: User. Success Scenario: Select the transformation functions required to be performed before loading the data into the Data Warehouse. User specifies the transformation functions to be performed. User selects to apply the transformation functions.
  • 42. PIMS Data Warehouse 42 Alternate Scenarios: If the user does not provide any transformation function, in some cases the data shall be loaded without any changes. Post-Condition: None. 3.5.12 Load Data in Warehouse Pre-Condition: All the data to be loaded should be maintained in SQL Server 2008. Description: Load the extracted data from the different ODSs into the target Data Warehouse. Actor: User Success Scenario: User selects to load the data into the target system. The transformations to be performed are applied to the extracted data. After the successful transformation of the data, the data is loaded into the target data warehouse. All the changes are saved. Alternate Scenarios: None Post-Condition: Prompt for the successful ETL.
  • 44. PIMS Data Warehouse 44 4 The Design In the design phase the architecture is established. This phase starts with the requirements document delivered by the requirements phase and maps the requirements onto an architecture. The architecture defines the components, their interfaces, and their behaviors. The deliverable design document is the architecture: it describes a plan to implement the requirements. This phase represents the "how" phase. Details of programming languages and environments, machines, packages, application architecture, distributed architecture layering, memory size, platform, algorithms, data structures, global type definitions, interfaces, and many other engineering details are established. The design may include the usage of existing components. 4.1 Modules The current system can be divided into seven modules. These modules are quite independent of each other and have simple, well-defined interfaces between each other, which plays a very important role in their successful working. The basic modules in our system are as follows. 4.1.1 Connectivity The functionality of this module shall be to provide the communication between the database and its client. Connectivity shall provide the interface for communication, providing a fully operational environment in order to achieve efficient interaction within the system. 4.1.2 RDBMS Details The detailed schema of the relational database model shall be loaded with the help of this module. This module loads:  Table names  Column names  Data types  Constraints  Relationships  Path details of database traversals. 4.1.3 Schema Generation This module shall provide its users a very easy way to create the schema of the Data Warehouse. The schema generator provides the functionality for defining:  Facts  Dimensions
  • 45. PIMS Data Warehouse 45 4.1.4 Column Mappings This module shall provide the functionality to map the columns from the relational database model to the Data Warehouse fact or dimension tables. Modifications to the columns shall also be managed in this module. 4.1.5 Extraction The procedure for data extraction shall be defined with the help of the provided mappings. This module shall extract the data from the relational database in accordance with the defined procedures for effective extraction using the mappings. 4.1.6 Transformation Different transformation functions shall be provided. The user shall select the type of transformation and the desired output of the transformation. These transformation procedures shall be maintained for the pre-load transformations. 4.1.7 Loading The loading mechanism shall be defined to incorporate the defined transformations. The data shall be transformed and loaded in a single step: initially the data shall be transformed according to the specified functions, and then it shall be loaded into the designed Data Warehouse schema.
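The single-step transform-and-load described in 4.1.7 can be sketched as a small pipeline in Python (the warehouse target is modeled here as a plain list; in the actual system it would be a SQL Server table):

```python
def etl_load(extracted_rows, transform, warehouse):
    """Transform and load in a single step: each extracted row passes
    through the selected transformation function and is appended to the
    target, so no intermediate transformed copy is materialized."""
    loaded = 0
    for row in extracted_rows:
        warehouse.append(transform(row))
        loaded += 1
    return loaded

target = []
count = etl_load(
    [{"name": "  tahir "}, {"name": "FARAZ"}],
    lambda r: {"name": r["name"].strip().title()},
    target,
)
# count == 2; target == [{'name': 'Tahir'}, {'name': 'Faraz'}]
```

Streaming rows through the transformation one at a time keeps memory use flat, which matters once the four operational data stores are loaded together.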
  • 46. PIMS Data Warehouse Chapter 5 UML Structural Diagram
  • 47. PIMS Data Warehouse 45 5 UML Structure Diagram 5.1 Class Diagram 5.1.1 City lab The City lab class diagram (Figure 7) models the lab's operational schema; every attribute carries the Cl_ prefix. Its classes are Branches, Department, Staff, Shift, Doctor, ReferedDoctor, Patient, Test, Test_Category, Equipment, Report, Analysis, Disease, PatientInvoice, PatientInvoiceDetail, Supplier, SupplierInvoice, OrderRequest, and OrderDetail, together with the per-test-category subtypes Sub_Test_CBC, Sub_CBC_DC, Sub_CBC_AV, Sub_Urine_PC, Sub_Urine_CE, Sub_Urine_ME, Sub_Cholesterol, Sub_Triglycerides, Sub_LFT, Sub_Ant_HCV, and Sub_Test_Dengue (each holding Test_Category_Id, Test_Name, and Reference_Value). Figure 7 City lab
  • 48. PIMS Data Warehouse 46 5.1.2 Health ways The Health ways class diagram (Figure 8) mirrors the City lab schema, with every attribute carrying the Hw_ prefix. Its classes are Branches, Department, Staff, Shift, Doctor, ReferedDoctor, Patient, Test, Test_Category, Equipment, Report, Analysis, Disease, PatientInvoice, PatientInvoiceDetail, Supplier, SupplierInvoice, OrderRequest, and OrderDetail, together with the test-category subtypes Sub_Test_CBC, Sub_CBC_DC, Sub_CBC_AV, Sub_Urine_PC, Sub_Urine_CE, Sub_Urine_ME, Sub_Cholesterol, Sub_Triglycerides, Sub_LFT, Sub_Ant_HCV, and Sub_Test_Dengue. Figure 8 Health ways
  • 49. PIMS Data Warehouse 47 5.1.3 Clinic The Clinic class diagram (Figure 9) is the smallest of the four sources: Branch (C_Id, C_Address, C_Phone, C_Fax, C_City); Dr (C_Dr_Id, C_Dr_Name, C_Dr_Specialization, C_Dr_Gender, C_Address, C_Dr_Phone, C_Id, C_Password); Patients (C_P_Id, C_Id, C_P_FName, C_P_LName, C_P_Gender, C_P_Address, C_P_Phone, C_P_Mobile, C_R_Dr_Id); Invoice (C_P_Invoice_No, C_S_Id, C_P_Id, C_Status, C_Total_Price); Prescription (C_Pre_Id, C_Dr_Id, C_Pat_Id, C_Pre_Date); and Prescription_Detail (C_Pre_Id, C_Med_Id, C_Med_Name, C_Med_Dossage). Figure 9 Clinic
  • 50. PIMS Data Warehouse 48 5.1.4 CMH hospital The CMH hospital class diagram (Figure 10) is the largest source schema. Besides the laboratory classes shared with the labs (Branches, Department, Staff, Shift, Doctor, Patient, Test, Test_Category with its Sub_* subtypes, Equipment, Report, Report_Analysis, Disease, PatientInvoice, Supplier, SupplierInvoice, Orders, and OrderDetail), it adds hospital-specific classes: Pharmacy, Registery, InPatient, Out_Patient, Ward, Room, Bed, Nurse, Dr_Assistant, MedicanDetail, Insurance, BloodBank, Blood, Lab, Prescription, and PrescriptionDetail. Figure 10 CMH hospital
  • 51. PIMS Data Warehouse 49 5.1.5 Urwah lab The Urwah lab class diagram (Figure 11) mirrors the City lab schema, with every attribute carrying the Ur_ prefix. Its classes are Branches, Department, Staff, Shift, Doctor, ReferedDoctor, Patient, Test, Test_Category, Equipment, Report, Analysis, Disease, PatientInvoice, PatientInvoiceDetail, Supplier, SupplierInvoice, OrderRequest, and OrderDetail, together with the test-category subtypes Sub_Test_CBC, Sub_CBC_DC, Sub_CBC_AV, Sub_Urine_PC, Sub_Urine_CE, Sub_Urine_ME, Sub_Cholesterol, Sub_Triglycerides, Sub_LFT, Sub_Ant_HCV, and Sub_Test_Dengue. Figure 11 Urwah lab
  • 52. PIMS Data Warehouse 50 5.2 Object diagram The object diagram (Figure 12) shows the PIMS instance linked to instances of Patient, Hospital, LabClinics (with its Healthways_lab, Urwah_Lab, and City_lab branches), staff, Prescription, Doctor, LabClinic_Reporters, Reports, Lab tests, and Analysis. Figure 12 Object diagram
  • 53. PIMS Data Warehouse 51 5.3 Component Diagram There shall be four different database components (ODB-1 to ODB-4) from which data shall be extracted, transformed, and loaded (ETL) into a single data warehouse component (data mart). Figure 13 Component Diagram
  • 54. PIMS Data Warehouse 52 5.4 Deployment Diagram Deployment shall be divided into four levels: the Database Server maintains SQL Server; the Data Warehouse Server maintains the data warehouse in SQL Server; the Application Server hosts the web application; and the Client Workstation views the web application through its interface. Figure 14 Deployment Diagram
  • 55. PIMS Data Warehouse 53 5.5 Composite Structure Diagram The administrator sends a request to the reporting manager; the request then passes through SQL Server, the ETL tool, and the report tools to the query manager, and the resulting report is shown in the provided interface. Figure 15 Composite Structure Diagram
  • 56. PIMS Data Warehouse 54 5.6 Package diagram Through the interface, the user either registers or logs in; after a successful login the user views reports produced by OLAP over the warehouse data. The data mart (PIMS) stores all information about patients, which arrives through the ETL process from four different Operational Data Stores. Figure 16 Package diagram
  • 57. PIMS Data Warehouse 55 Chapter 6 UML Behavior Diagrams
  • 58. PIMS Data Warehouse 56 6 UML Behavior Diagrams 6.1 Use Case Diagram Figure 17 Use Case Diagram
6.2 Activity Diagram
6.2.1 Create new project
The user creates a new project and a new connection to the operational data stores; once the connection is open, the project information and global values are set and the project is loaded.
Figure 18 Activity Diagram (Create new project)
6.2.2 Open existing project
The user opens an existing project; its connections to the operational data stores are checked and opened, the project information and global values are set, and the project is loaded.
Figure 19 Activity Diagram (Open existing project)
6.2.3 Close project
The user closes the project: changes are finalized, saved, and committed, and the connection is closed.
Figure 20 Activity Diagram (Close project)
6.2.4 Create mapping
The user creates a new mapping by getting the source and target details, providing the mapping, and then validating and finalizing it.
Figure 21 Activity Diagram (Create mapping)
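The mapping activity (get source details, get target details, provide the mapping, validate, finalize) can be sketched as follows. The column lists and the validation rule (every target column must be covered) are illustrative assumptions.

```python
def create_mapping(source_columns, target_columns, proposed):
    """Validate a proposed source->target column mapping and return it
    once every entry refers to real columns and every target is covered."""
    for src, tgt in proposed.items():
        if src not in source_columns:
            raise ValueError(f"unknown source column: {src}")
        if tgt not in target_columns:
            raise ValueError(f"unknown target column: {tgt}")
    missing = set(target_columns) - set(proposed.values())
    if missing:
        raise ValueError(f"unmapped target columns: {sorted(missing)}")
    return proposed  # finalized mapping

# Hypothetical source (ODS) and target (data mart) columns.
source_cols = ["pid", "pname", "dob"]
target_cols = ["patient_id", "name"]

mapping = create_mapping(source_cols, target_cols,
                         {"pid": "patient_id", "pname": "name"})
```

An incomplete or misspelled mapping fails validation instead of being finalized, which corresponds to the [No] branch looping back in the activity diagram.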
6.2.5 Load RDBMS
The user connects to the operational database; table names, column names, and relationships are loaded, relationships are identified, and the database details are populated.
Figure 22 Activity Diagram (Load RDBMS)
6.3 State Machine Diagram
6.3.1 Report
A query is issued to generate a report; the required data is processed and sent, the defined rules are applied, and the report is generated.
Figure 23 State Machine Diagram (Report)
6.3.2 ETL
The ETL process loads the project, gets its information, checks the metadata, and then performs the extract, transform, and load steps into the data warehouse.
Figure 24 State Machine Diagram (ETL)
Chapter 7 UML Interaction Diagrams
7 UML Interaction Diagrams
7.1 Sequence Diagram
A sequence diagram shows an interaction arranged in time sequence. In particular, it shows the instances participating in the interaction by their “lifelines” and the stimuli they exchange, arranged in time sequence.
7.1.1 Create New Project
Figure 25 Sequence Diagram (Create New Project)
7.1.2 Open Existing Project
Figure 26 Sequence Diagram (Open Existing Project)
7.1.3 Close Project
Figure 27 Sequence Diagram (Close Project)
7.1.4 Load RDBMS Details
Figure 28 Sequence Diagram (Load RDBMS Details)
7.1.5 Create Schema
7.1.6 Create Mappings
Figure 30 Sequence Diagram (Create Mappings)
7.1.7 Data Extraction
Figure 31 Sequence Diagram (Data Extraction)
7.1.8 Data Transformations
Figure 32 Sequence Diagram (Data Transformations)
Data Loading
Figure 33 Sequence Diagram (Data Loading)
7.1.9 Report Generation
The doctor shall select specific criteria to generate a report and send the request to the data warehouse manager, which shall acknowledge it and forward the request to the report generation tool; the tool shall then generate the report according to the criteria set by the doctor.
Figure 34 Sequence Diagram (Report Generation)
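The criteria-driven query can be sketched as a filter over warehouse rows; the row fields and the example criteria are illustrative assumptions.

```python
def generate_report(warehouse_rows, criteria):
    """Return the rows matching every criterion the doctor selected
    (the 'apply defined rules' step of the sequence diagram)."""
    return [row for row in warehouse_rows
            if all(row.get(field) == value for field, value in criteria.items())]

# Hypothetical warehouse rows for two patients.
rows = [
    {"patient_id": "P-1", "test": "CBC", "year": 2011},
    {"patient_id": "P-1", "test": "LFT", "year": 2012},
    {"patient_id": "P-2", "test": "CBC", "year": 2012},
]
report = generate_report(rows, {"patient_id": "P-1", "year": 2012})
```

Each criterion narrows the result set, so the doctor receives only the rows that satisfy every selected condition.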
7.1.10 ETL
Data coming from the operational data stores shall be extracted by the extract manager, sent to the transform manager (which sets the data into a standard format), passed to the cleaning manager (which retains only useful information), handed to the load manager, and finally loaded into the data warehouse.
Figure 35 Sequence Diagram (ETL)
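The manager chain (extract, transform, clean, load) can be sketched as a pipeline of small functions. The field names and the cleaning rule (drop rows without a patient id) are illustrative assumptions.

```python
def extract(ods_rows):
    # Extract Manager: pull raw rows out of the operational data store.
    return list(ods_rows)

def transform(rows):
    # Transform Manager: set the data into a standard format.
    return [{"patient_id": r.get("pid"),
             "name": (r.get("name") or "").strip().title()}
            for r in rows]

def clean(rows):
    # Cleaning Manager: keep only useful rows (here: those with a patient id).
    return [r for r in rows if r["patient_id"]]

def load(rows, warehouse):
    # Load Manager: append the cleaned rows to the data warehouse table.
    warehouse.extend(rows)
    return warehouse

warehouse = []
ods = [{"pid": "P-1", "name": "  ali "}, {"pid": None, "name": "bad row"}]
load(clean(transform(extract(ods))), warehouse)
```

Composing the calls in that order mirrors the hand-off between managers in the sequence diagram: each stage only sees the output of the previous one.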
7.2 Communication Diagram
7.2.1 ETL
The data source responds to the ETL tool's request; data is then extracted by the extract manager, sent to the transform manager to be set into a standard format, passed to the cleaning manager to retain only useful information, handed to the load manager, and finally loaded into the data warehouse.
Figure 36 Communication Diagram (ETL)
7.2.2 Report
Depending on the user's need, a request to generate or view a report is sent through the web application to the reporting tool, which generates the required report, or the required report is viewed through the reporting manager.
Figure 37 Communication Diagram (Report)
7.3 Interaction Overview
7.3.1 ETL
The extract manager extracts the data, which is sent to the transform manager to be set into a standard format, then to the cleaning manager to retain only useful information, then to the load manager, and finally loaded into the data warehouse.
Figure 38 Interaction Overview (ETL)
7.3.2 Warehouse Interaction
The doctor views the patient history through the query manager, which analyzes it; the reporting tool then suggests a prescription based on that analysis.
Figure 39 Interaction Overview (Warehouse Interaction)
Web Diagrams
7.3.3 Access Model
The access model shows the navigation available to staff and users: a home page with sign-up, sign-in, contact-us, and about-us pages; after authorization, a staff menu (register member, upload report) and a user menu, with shared pages to view reports, change the password, and sign out.
Figure 40 Web Diagrams (Access Model)
Chapter 8 Implementation
8 Implementation
8.1 System Implementation
This chapter discusses the project in detail and the implementation of the Patient Information and Monitoring System based on a data warehouse. The implementation of the system is done in two parts:
 Back end software: SQL Server R2
 GUI
8.2 Back end software: SQL Server R2
The back end is subdivided into two parts, namely:
 SSIS
 SSAS
8.2.1 Snowflake Schema
Figure 41 Snowflake Schema
8.2.2 ETL SSIS
For the ETL part we have used SQL Server Integration Services (SSIS). Data needs to be loaded into the data warehouse regularly so that it can serve its purpose of facilitating business analysis. To do this, data from one or more operational systems needs to be extracted and copied into the warehouse. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for extraction, transformation, and loading. The acronym ETL is perhaps too simplistic, because it omits the transportation phase and implies that each of the other phases of the process is distinct. We refer to the entire process, including data loading, as ETL: it is a broad process, not three well-defined steps. What happens during the ETL process? During extraction, the desired data is identified and extracted from many different sources, including database systems and applications. Very often it is not possible to identify the specific subset of interest, so more data than necessary has to be extracted and the identification of the relevant data is done at a later point in time. Depending on the source system's capabilities (for example, operating system resources), some transformations may take place during this extraction process. The size of the extracted data varies from hundreds of kilobytes up to gigabytes, depending on the source system and the business situation. The same is true for the time delta between two (logically) identical extractions: the time span may vary from days or hours down to minutes or near real time. Web server log files, for example, can easily grow to hundreds of megabytes in a very short period of time.
8.2.3 Overview of PIMS ETL
Figure 42 Overview of PIMS ETL
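The time-delta idea above can be sketched as an incremental extraction that pulls only the rows changed since the previous run. The `modified` field and the timestamps are illustrative assumptions, not part of the PIMS schemas.

```python
from datetime import datetime

def incremental_extract(source_rows, last_extract_time):
    """Pull only rows modified after the previous extraction, so each
    run carries the delta rather than the full source table."""
    return [r for r in source_rows if r["modified"] > last_extract_time]

# Hypothetical source rows, each stamped with its last-modified time.
rows = [
    {"patient_id": "P-1", "modified": datetime(2012, 1, 1)},
    {"patient_id": "P-2", "modified": datetime(2012, 3, 1)},
]
delta = incremental_extract(rows, datetime(2012, 2, 1))
```

Shortening the interval between runs shrinks each delta, which is how an extraction schedule moves from daily batches toward near real time.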