DATA WAREHOUSING
CHSI PROJECT MODULE
Team 1:
Abhinav Garg (11761380)
Tanu Srivastav (11772446)
Tejbeer Chhabra (11756746)
Table of Contents
Executive Summary
MDX Queries and their output
DMX Queries: Mining Model:
Appendix
Executive Summary
As data volumes grow, the ability to store data and retrieve information from it becomes
crucial. On-Line Analytical Processing (OLAP) technology lets us not only store large amounts
of temporal data but also perform several business intelligence operations on it. The CHSI
data provided has more than five and a half million records. The aim of the project is to find
meaningful insights that could bring innovation to rural health systems.
1. OLAP Cube and Queries
By executing MDX queries on the OLAP cube, we found several interesting facts and figures
of business importance. We discovered how formulary type affects charge quantity, which
year made the most profit on medication, which care settings are profitable, what the
infusion time is for various IV-route medications, and which discontinue reasons occur
most frequently. These findings could be of managerial importance to the hospital
administration. As one of the objectives of our project was to learn MDX, we used several
MDX functions and keywords in our queries, including TOPCOUNT, CROSSJOIN, NON EMPTY,
HEAD, FILTER, and SUBSET.
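The set operations named above can be illustrated outside of MDX. The following is a minimal Python sketch of what each one does to a set of tuples; it is not one of the project's queries, and the care settings, years, and profit values are hypothetical placeholders rather than CHSI data.

```python
from itertools import product

# Hypothetical cell values keyed by (care setting, year); not CHSI data.
profit = {
    ("Inpatient", 2010): 120.0,
    ("Inpatient", 2011): 95.0,
    ("Outpatient", 2010): 40.0,
    ("Emergency", 2011): 210.0,
}

def crossjoin(set_a, set_b):
    """CROSSJOIN: every combination of members from two sets."""
    return list(product(set_a, set_b))

def non_empty(tuples, cells):
    """NON EMPTY: drop tuples that have no cell value."""
    return [t for t in tuples if cells.get(t) is not None]

def topcount(tuples, n, cells):
    """TOPCOUNT: the n tuples with the largest value, in descending order.
    Assumes every tuple has a value (apply non_empty first)."""
    return sorted(tuples, key=lambda t: cells[t], reverse=True)[:n]

def filter_set(tuples, predicate):
    """FILTER: keep only tuples that satisfy a condition."""
    return [t for t in tuples if predicate(t)]

def head(tuples, n):
    """HEAD: the first n tuples of a set."""
    return tuples[:n]

def subset(tuples, start, count):
    """SUBSET: count tuples beginning at a zero-based start position."""
    return tuples[start:start + count]

settings = ["Inpatient", "Outpatient", "Emergency"]
years = [2010, 2011]

cells = non_empty(crossjoin(settings, years), profit)
top2 = topcount(cells, 2, profit)
above_50 = filter_set(cells, lambda t: profit[t] > 50)
```

In an actual MDX query these operations are evaluated by the cube engine against the cube's dimensions and measures; the sketch only mirrors their set semantics.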
2. Data Mining
We built a mining structure using DMX queries as well as Analysis Services in Visual
Studio. Data mining models can be built on this structure, which allow us to predict
the discontinue reason for a medication. This could give us information about the
effectiveness of the medication. We implemented three data mining algorithms: Decision
Tree, Neural Network, and Regression.
MDX Queries and their output
DMX Queries: Mining Model:
1. Creating Mining Structure
2. Creating Mining Model
3. Variable Importance:
The most important variable was found to be Infusion Time. This suggests that the
discontinuation of a medication can be predicted from its infusion time.
4. Lift Chart Model comparison:
From the model comparison, we can say that the models fit the data set well and are of
reasonably similar strength.
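A lift chart compares how well each model concentrates the target outcome among its highest-scored cases, relative to selecting cases at random. As a rough illustration of the quantity behind such a comparison, here is a minimal Python sketch; the scores and labels are invented for the example and are not the project's results.

```python
def cumulative_lift(scores, labels, fraction):
    """Lift of the top `fraction` of cases when ranked by predicted score.

    A value of 1.0 means no better than random selection.
    Assumes `labels` contains at least one positive (1) case.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate

# Hypothetical predicted probabilities and true binary outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

lift_25 = cumulative_lift(scores, labels, 0.25)
lift_50 = cumulative_lift(scores, labels, 0.50)
lift_all = cumulative_lift(scores, labels, 1.0)
```

Two models of similar strength would show similar lift values at each population fraction; by construction, the lift over the whole population is always 1.0.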
Appendix
1. Partitions
Data partitioning is based on Date ID. We created four partitions for the entire dataset.
2. Aggregations
3. Calculated Measures:
1. Since we have Unit Cost and Unit Price, we created a calculated measure, “NET_PROFIT”,
defined as Unit Price – Unit Cost.
2. For data modeling, we created a calculated measure called “Target_Discontinue”. It is
binary coded: 1 when the discontinue reason reflects a positive change in the patient’s
health, and 0 otherwise.
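The two calculated measures can be sketched in Python as follows. This is only an illustration of the definitions above, not the cube's actual calculation script: the set of reasons treated as "positive" is a hypothetical placeholder, since the report does not list which discontinue reasons were coded as 1.

```python
def net_profit(unit_price, unit_cost):
    """NET_PROFIT: Unit Price minus Unit Cost."""
    return unit_price - unit_cost

# Hypothetical labels; the report does not enumerate the positive reasons.
POSITIVE_REASONS = {"Patient Improved", "Therapy Completed"}

def target_discontinue(reason):
    """Target_Discontinue: 1 if the discontinue reason reflects a positive
    change in the patient's health, otherwise 0."""
    return 1 if reason in POSITIVE_REASONS else 0

example_profit = net_profit(12.5, 8.0)
example_positive = target_discontinue("Patient Improved")
example_other = target_discontinue("Adverse Reaction")
```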
