
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976-6480 (Print), ISSN 0976-6499 (Online), Volume 5, Issue 6, June (2014), pp. 08-14 © IAEME

BUILDING AGGREGATES IN THE DATA WAREHOUSE: A CASE STUDY OF BIRTH, DECEASED AND PROPERTY REGISTRATION E-GOVERNANCE DATA

Pushpal Desai
(M.Sc. (I.T.) Programme, VNSGU, Surat, India)

ABSTRACT

In this paper, the concept of aggregates in the data warehouse is discussed. A method for creating aggregates in the data warehouse is proposed, and its implementation using Microsoft SQL Server Integration Services is described. The results obtained from the aggregates are presented. They indicate that querying aggregates can be far more efficient than querying the base fact table of the data warehouse.

Keywords: Aggregates, Data Warehouse, Microsoft SQL Server Integration Services.

I. INTRODUCTION

An aggregate is a supplemental data structure that helps make things go faster in the data warehouse [3]. Aggregates are a very important part of any data warehouse implementation. An aggregate is a number calculated from amounts in many detail records. It is often the sum of many numbers, although it can also be derived using other arithmetic operations or even from a count of the number of items in a group [1]. An aggregate is a value formed by combining values from a given dimension or set of dimensions to create a single value [1]. By implementing aggregates in the data warehouse, we can store summarized data derived from the detailed data available in the OLTP systems. Once the aggregates are created, retrieving information from them is much more efficient than retrieving it from the detailed data [1].

There are several advantages to creating aggregates in the data warehouse. Typically, aggregates contain fewer rows than the base tables.
Therefore, when an end user executes a query against an aggregate fact table instead of the base fact table of the data warehouse, the response time is much shorter. Aggregates are thus very effective in improving query performance in the data warehouse [2]. Typically, a data warehouse contains a large amount of data with millions of records. In a data warehouse environment, several users try to execute complex queries against the data warehouse, which may take a lot of time. The use of pre-calculated aggregates can greatly improve the query execution time and the efficiency of the data warehouse [4].

II. METHODOLOGY

The Aggregate transformation allows us to combine information from multiple records of the source data and convert it into a single value [1].

Figure 1: The proposed methodology to create Aggregates

To create an aggregate, we first specify the source data and then select the input columns from it. Next, we specify the operation to apply to each input column; the possible operations include “group by”, “minimum”, “maximum”, “sum”, “average”, “count” and “count distinct”. After specifying these settings, we can create the aggregate in the data warehouse and store it for future analysis tasks by the management. The proposed methodology is depicted in Figure 1.

The aggregate transformations were implemented on different data sets by considering common business requirements. SQL Server Integration Services provides the Aggregate transformation for developing such aggregates [1]. For example, for the “Birth Data”, an aggregate of “Average Birth Weight” grouped by the “RegistrationYear”, “ReligionID” and “Sex” fields was developed. Figure 2 shows the settings of this Aggregate transformation in SQL Server Integration Services.
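The grouping behind the “Average Birth Weight” aggregate can be sketched outside SSIS. The following minimal example uses Python's standard sqlite3 module instead of the Aggregate transformation, so it illustrates the idea rather than the implementation used in this paper. The table names BirthFact and BirthWeightAggregate, the column BirthWeight and all sample rows are hypothetical.

```python
import sqlite3

def build_birth_weight_aggregate(conn):
    """Store the average birth weight per (RegistrationYear, ReligionID, Sex)."""
    conn.execute("DROP TABLE IF EXISTS BirthWeightAggregate")
    conn.execute(
        """
        CREATE TABLE BirthWeightAggregate AS
        SELECT RegistrationYear,
               ReligionID,
               Sex,
               AVG(BirthWeight) AS AvgBirthWeight
        FROM BirthFact
        GROUP BY RegistrationYear, ReligionID, Sex
        """
    )

# Hypothetical detail rows: (RegistrationYear, ReligionID, Sex, BirthWeight in kg)
sample_rows = [
    (2010, 1, "M", 3.0),
    (2010, 1, "M", 3.5),
    (2010, 1, "F", 2.5),
    (2011, 2, "F", 3.0),
    (2011, 2, "F", 2.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BirthFact ("
    "RegistrationYear INTEGER, ReligionID INTEGER, Sex TEXT, BirthWeight REAL)"
)
conn.executemany("INSERT INTO BirthFact VALUES (?, ?, ?, ?)", sample_rows)

build_birth_weight_aggregate(conn)

# Five detail rows collapse into three aggregate rows, one per group.
aggregate_rows = conn.execute(
    "SELECT RegistrationYear, ReligionID, Sex, AvgBirthWeight "
    "FROM BirthWeightAggregate "
    "ORDER BY RegistrationYear, ReligionID, Sex"
).fetchall()
for row in aggregate_rows:
    print(row)
```

The aggregate is stored as its own table, so later queries can read it without touching the detail rows.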
Figure 2: Average Birth Weight Aggregate transformation using SSIS

Similarly, an aggregate for “Average Deceased Age” was developed, considering the “Registration Year”, “Deceased Religion” and “Deceased Sex” fields. The settings for the “Deceased Age Aggregate” transformation are shown in Figure 3.

Figure 3: Average Deceased Age Aggregate Transformation using SSIS

Similarly, an aggregate for the “Property Database” was developed, giving the average “Property Age” across wards and property types. The settings for the property age aggregate transformation are shown in Figure 4.
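The three aggregates differ only in their grouping fields and in the measured column, so the same pattern can be written once and reused. The sketch below is a plain-Python illustration of that pattern, not the SSIS component itself. The operation names follow the methodology, but their mapping to Python functions is an assumption, and the field names and property records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Operation names follow the methodology; mapping them to these
# Python functions is an assumption made for illustration.
OPERATIONS = {
    "minimum": min,
    "maximum": max,
    "sum": sum,
    "average": mean,
    "count": len,
    "count distinct": lambda values: len(set(values)),
}

def aggregate(rows, group_by, column, operation):
    """Group dict rows by the group_by fields and reduce `column` per group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in group_by)
        groups[key].append(row[column])
    reducer = OPERATIONS[operation]
    return {key: reducer(values) for key, values in groups.items()}

# Hypothetical property records; names and values are illustrative only.
property_rows = [
    {"WardNumber": 1, "PropertyType": "Residential", "PropertyAge": 10},
    {"WardNumber": 1, "PropertyType": "Residential", "PropertyAge": 20},
    {"WardNumber": 1, "PropertyType": "Commercial", "PropertyAge": 5},
    {"WardNumber": 2, "PropertyType": "Residential", "PropertyAge": 30},
]

average_age = aggregate(
    property_rows, ["WardNumber", "PropertyType"], "PropertyAge", "average"
)
for key, value in sorted(average_age.items()):
    print(key, value)
```

Changing the last argument to "count" or "maximum" yields a different aggregate over the same groups, which mirrors choosing a different operation for an input column.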
Figure 4: Property Age Aggregate transformation using SQL Server Integration Services

III. RESULTS

The execution of the SQL Server Integration Services package on the Birth Data source records generated 151 records. The execution flow and the result are shown in Figure 5 and Figure 6 respectively.

Figure 5: Execution flow of Child’s Birth Weight Aggregate transformation

This aggregate summarizes the data for the “Average Child Birth Weight” attribute, grouped by Gender, Year and Religion. Hence, whenever Average Child Birth Weight data is required, the query can be executed against the aggregate. Such a query is very efficient because the aggregate contains only 151 records and its execution does not touch the base fact table.
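To make the efficiency argument concrete, the following sketch compares the same question asked of a detail table and of its aggregate. It again uses Python's sqlite3 module as a stand-in for the data warehouse. All table names, column names and generated rows are hypothetical, and no timings are claimed; the point is only that both queries return the same value while the aggregate holds far fewer rows.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BirthFact ("
    "RegistrationYear INTEGER, ReligionID INTEGER, Sex TEXT, BirthWeight REAL)"
)

# Hypothetical detail data: 12 groups with 50 generated records each.
detail_rows = [
    (year, religion, sex, 2.5 + 0.25 * (n % 5))
    for year, religion, sex in product((2010, 2011), (1, 2, 3), ("M", "F"))
    for n in range(50)
]
conn.executemany("INSERT INTO BirthFact VALUES (?, ?, ?, ?)", detail_rows)

# Pre-calculate the aggregate once, as the methodology describes.
conn.execute(
    """
    CREATE TABLE BirthWeightAggregate AS
    SELECT RegistrationYear, ReligionID, Sex,
           AVG(BirthWeight) AS AvgBirthWeight
    FROM BirthFact
    GROUP BY RegistrationYear, ReligionID, Sex
    """
)

key = (2010, 2, "F")
where = "WHERE RegistrationYear = ? AND ReligionID = ? AND Sex = ?"

# The same question, answered from the detail rows and from the aggregate.
base_answer = conn.execute(
    f"SELECT AVG(BirthWeight) FROM BirthFact {where}", key
).fetchone()[0]
aggregate_answer = conn.execute(
    f"SELECT AvgBirthWeight FROM BirthWeightAggregate {where}", key
).fetchone()[0]

base_count = conn.execute("SELECT COUNT(*) FROM BirthFact").fetchone()[0]
aggregate_count = conn.execute(
    "SELECT COUNT(*) FROM BirthWeightAggregate"
).fetchone()[0]

print(base_answer, aggregate_answer)
print(base_count, "detail rows versus", aggregate_count, "aggregate rows")
```

The query against the aggregate is a lookup of one stored row, whereas the query against the detail table has to read and average every matching record.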
Figure 6: Result of Child’s Birth Weight Aggregate transformation

Similarly, we executed the SSIS package for creating the “Average Deceased Age” attribute. The execution flow and its result are shown in Figure 7 and Figure 8 respectively.

Figure 7: Execution of Deceased Age Aggregate transformation

This aggregate is grouped by Gender, Religion and Year. It can be used efficiently whenever Average Deceased Age information is required.
Figure 8: Result of Deceased Age Aggregate transformation

The execution of the SSIS package for “Average Property Age” resulted in 768 rows from the 1,471,859 records stored in the base fact table.

Figure 9: Execution of Property Age Aggregate transformation

This aggregate is grouped by other important fields such as Property Type and Ward Number. It can therefore be used very efficiently whenever “Average Property Age” is required, as the query executes against only 768 records.
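The size of the reduction follows directly from the two row counts. The short calculation below assumes that the base-table count printed above denotes 1471859 records; the helper name reduction_factor is hypothetical.

```python
def reduction_factor(base_rows: int, aggregate_rows: int) -> float:
    """Return how many base-table rows correspond to one aggregate row on average."""
    if aggregate_rows <= 0:
        raise ValueError("aggregate_rows must be positive")
    return base_rows / aggregate_rows

# Row counts reported for the property data; 1471859 is an assumed
# reading of the base-table figure given in the text.
property_base_rows = 1471859
property_aggregate_rows = 768

property_factor = reduction_factor(property_base_rows, property_aggregate_rows)
print(f"about {property_factor:.0f} base rows per aggregate row")
```

Under that assumption, each aggregate row stands in for roughly 1,900 base records, which is why a query against the aggregate has so much less data to read.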
Figure 10: Result of Property Age Aggregate transformation

IV. CONCLUSION

The results clearly indicate that aggregates are a crucial part of any data warehouse deployment. The deployment and use of aggregates greatly improves the efficiency of data warehouse queries. The practical implementation indicates that queries executed against aggregates are highly efficient because aggregates contain far fewer records than the base fact tables.

V. ACKNOWLEDGEMENT AND LIMITATIONS

All results are based on data provided by the municipal corporation for research purposes only. Hence, the results may change if data warehouse concepts are applied to actual data sets.

VI. REFERENCES

[1] Brian Larson, Delivering Business Intelligence with Microsoft SQL Server 2008.
[2] Paulraj Ponniah, Data Warehousing Fundamentals: A Comprehensive Guide for IT Professionals, Wiley India Edition.
[3] Christopher Adamson, The Complete Reference: Star Schema, Tata McGraw-Hill Edition.
[4] Ashok Kumar Verma, Effect of cube on query performance in data warehouse, International Journal of Advanced Research in IT and Engineering, Vol. 2, No. 6, June 2013, ISSN: 2278-6244.
[5] Kuldeep Deshpande and Dr. Bhimappa Desai, “A Critical Study of Requirement Gathering and Testing Techniques for Datawarehousing”, International Journal of Information Technology and Management Information Systems (IJITMIS), Volume 5, Issue 1, 2014, pp. 60-71, ISSN Print: 0976-6405, ISSN Online: 0976-6413.
[6] Prof. Manas Kumar Sanyal, Sudhangsu Das and Sajal Bhadra, “Cloud Computing-A New Way to Roll Out E-Governance Projects in India”, International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 2, 2013, pp. 61-72, ISSN Print: 0976-6367, ISSN Online: 0976-6375.