Data compression for Large Multidimensional Data Warehouses
 


This presentation was prepared for the presentation of the thesis titled "Data Compression for Large Multidimensional Data Warehouses", which was done in partial fulfillment of the undergraduate course in the Dept. of CSE, KUET, Bangladesh.

    Presentation Transcript

    • Data Compression for Large Multidimensional Data Warehouses
      Supervisor:
      Dr. K.M. Azharul Hasan
      Associate Professor,
      Head of the Department,
      Department of CSE, KUET
      Presented by:
      Abdullah Al Mahmud, Roll: 0507006
      Md. Mushfiqur Rahman, Roll: 0507029
      These slides were prepared by Abdullah Al Mahmud for the presentation of the thesis, which was done in partial fulfillment of the undergraduate degree at Khulna University of Engineering & Technology (KUET), Bangladesh.
    • Presentation Layout
      • Objectives
      • Existing Compression Schemes
      • Traditional Extendible Array
      • Proposed Compression Scheme
      • EXCS (Extendible Array Based Compression Scheme)
      • Comparative Analysis
      • Conclusion
      Abdullah Al Mahmud, Student ID: 0507006, Dept. of CSE, KUET, Bangladesh
    • Objectives
      • Data compression technology:
      • reduces the effective price of logical data storage capacity
      • improves query performance
      • Multidimensional arrays are widely used in scientific research.
      • Efficient compression of multidimensional arrays makes it possible to handle the large multidimensional data sets of data warehouses.
    • Existing Compression Schemes (1/3)
      • Bitmap compression
      • Run Length Encoding
      • Header compression
      • Compressed Column Storage
      • Compressed Row Storage
    • Existing Compression Schemes (2/3)
      Figure: (a) A sparse array. (b) The CRS scheme
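The figure for this slide is not reproduced in the transcript. As a stand-in, here is a minimal Python sketch of Compressed Row Storage on a small made-up array. It assumes the usual three-vector layout with 0-based indices; the vector names RO, CO and VL follow the later EXCS slide, while the function names and the sample array are illustrative and not taken from the thesis.

```python
def crs_compress(matrix):
    """Compress a 2-D list into CRS vectors (RO, CO, VL), skipping zeros."""
    ro, co, vl = [0], [], []
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                co.append(col)    # column index of each non-zero element
                vl.append(value)  # the non-zero value itself
        ro.append(len(vl))        # running count of non-zeros after this row
    return ro, co, vl


def crs_get(ro, co, vl, r, c):
    """Look up element (r, c); entries that were not stored are zero."""
    for k in range(ro[r], ro[r + 1]):
        if co[k] == c:
            return vl[k]
    return 0


# Made-up 3x4 sparse array (not the one from the slide's figure).
sparse = [
    [0, 5, 0, 0],
    [0, 0, 0, 0],
    [7, 0, 0, 9],
]
ro, co, vl = crs_compress(sparse)
print(ro, co, vl)  # [0, 1, 1, 3] [1, 0, 3] [5, 7, 9]
```

Row r owns the slice `ro[r]:ro[r + 1]` of CO and VL, which is why inserting one new non-zero shifts every later RO entry.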
    • Existing Compression Schemes (3/3)
      • Classical methods cannot support updates without completely readjusting runs.
      • They focus on compressing sparse arrays
      • They do not support extendibility
    • Traditional Extendible Array
      • TEA supports dynamic extension of dimension size.
      Figure 1: TEA Construction And Access
      Example from the figure: for position <1,3>, H1[1] < H2[3], giving Address of Cell = Address1[3] + 1 = 10. The History Counter in the figure runs from 0 to 5.
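The address computation on this slide can be sketched in Python for the two-dimensional case. This is a minimal illustration of the general history-table idea, not the thesis implementation: the class name, the table names `hist` and `addr`, the 0-based dimension numbering and the order of extensions are all assumptions. Each extension appends one sub-array, records the current history counter value and the sub-array's first address, and a cell belongs to whichever of its two indices was created later.

```python
class ExtendibleArray2D:
    """Illustrative 2-D extendible array with history and address tables."""

    def __init__(self):
        self.history_counter = 0
        self.hist = [[0], [0]]  # hist[d][i]: counter value when index i of dim d appeared
        self.addr = [[0], [0]]  # addr[d][i]: first address of that sub-array
        self.size = [1, 1]
        self.cells = [None]     # flat storage; existing cells never move

    def extend(self, dim):
        """Grow dimension `dim` by one by appending a new sub-array."""
        other = 1 - dim
        self.history_counter += 1
        self.hist[dim].append(self.history_counter)
        self.addr[dim].append(len(self.cells))
        self.cells.extend([None] * self.size[other])
        self.size[dim] += 1

    def address(self, i, j):
        """Flat address of cell <i, j>: the later-created index owns the cell."""
        if self.hist[0][i] > self.hist[1][j]:
            return self.addr[0][i] + j
        return self.addr[1][j] + i


# Assumed extension order: dimensions 1, 0, 1, 0, 1 (history counter ends at 5).
tea = ExtendibleArray2D()
for dim in (1, 0, 1, 0, 1):
    tea.extend(dim)
print(tea.address(1, 3))  # 10
```

With this assumed extension order, the sketch yields address 10 for position <1,3>, the same value as the slide's example.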
    • Proposed Compression Scheme
      • Multidimensional arrays are important for sparse array operations
      • Extendibility of multidimensional arrays
      • A compression technique that can work on multidimensional extendible arrays
      • Our proposed compression scheme is EXCS (Extendible array based Compression Scheme)
    • Extendible array based Compression Scheme (EXCS) 1/3
      • We implemented the multidimensional extendible array in secondary memory
      • We considered three dimensions (n = 3) in our experiments
      • The sub-arrays are kept distinct so that each can be stored individually in secondary memory
    • Extendible array based Compression Scheme (EXCS) 2/3
      • The sub-arrays have n-1 (= 2) dimensions
      • A large number of sub-arrays are generated for compression
      • Sub-arrays are taken as input dynamically
      • Only the maximum number of sub-arrays needs to be given
    • Extendible array based Compression Scheme (EXCS) 3/3
      • Each sub-array is compressed individually
      • The compression technique used is similar to CRS
      • The compressed elements are written to secondary memory as the RO, CO, VL of subarray_1, subarray_2, …, subarray_N
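One way to picture the three steps on this slide is the Python sketch below, which compresses each two-dimensional sub-array separately and appends its RO, CO and VL vectors to a byte stream. The record layout (two length fields followed by little-endian 32-bit integers), the function names and the integer-only sample values are assumptions made for illustration; the slides do not specify the on-disk format used in the thesis.

```python
import io
import struct


def crs_compress(sub_array):
    """CRS-style compression of one 2-D sub-array into (RO, CO, VL)."""
    ro, co, vl = [0], [], []
    for row in sub_array:
        for col, value in enumerate(row):
            if value != 0:
                co.append(col)
                vl.append(value)
        ro.append(len(vl))
    return ro, co, vl


def write_sub_arrays(stream, sub_arrays):
    """Write RO, CO, VL of each sub-array in order; return bytes written."""
    written = 0
    for sub_array in sub_arrays:
        ro, co, vl = crs_compress(sub_array)
        # Hypothetical record layout: len(RO), len(VL), then RO, CO, VL as int32.
        record = struct.pack("<2i", len(ro), len(vl))
        record += struct.pack(f"<{len(ro)}i", *ro)
        record += struct.pack(f"<{len(co)}i", *co)
        record += struct.pack(f"<{len(vl)}i", *vl)
        written += stream.write(record)
    return written


def read_sub_array(stream):
    """Read back the next (RO, CO, VL) record from the stream."""
    n_ro, n_vl = struct.unpack("<2i", stream.read(8))
    ro = list(struct.unpack(f"<{n_ro}i", stream.read(4 * n_ro)))
    co = list(struct.unpack(f"<{n_vl}i", stream.read(4 * n_vl)))
    vl = list(struct.unpack(f"<{n_vl}i", stream.read(4 * n_vl)))
    return ro, co, vl


# Two made-up 2x2 sub-arrays; BytesIO stands in for a file in secondary memory.
buffer = io.BytesIO()
total = write_sub_arrays(buffer, [[[0, 3], [0, 0]], [[0, 0], [4, 6]]])
buffer.seek(0)
first = read_sub_array(buffer)
print(total, first)
```

Because each record is self-describing, a sub-array added by a later extension can simply be appended without rewriting the records already stored.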
    • Performance Measurement
      • Performance is evaluated against two key factors of the compression schemes:
      • Data Density
      • Length of Dimension / Number of Data
      • compression ratio = compressed data / original data
      • space savings = 1 - compression ratio
      • Space savings are reported in percent
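The two formulas on this slide can be checked with a short Python sketch. It assumes sizes are counted in stored elements rather than bytes, and that a CRS-style layout stores one RO entry per row plus one, and one CO and one VL entry per non-zero; the thesis may measure sizes differently. All function names are illustrative.

```python
def compression_ratio(compressed_size, original_size):
    """compression ratio = compressed data / original data."""
    return compressed_size / original_size


def space_savings_percent(compressed_size, original_size):
    """space savings = 1 - compression ratio, expressed in percent."""
    return (1 - compression_ratio(compressed_size, original_size)) * 100


def crs_element_count(rows, nonzeros):
    """Elements stored by a CRS-style layout (assumed counting rule)."""
    return (rows + 1) + 2 * nonzeros


# Made-up case: a 10x10 array at 20% density has 20 non-zero elements.
original = 10 * 10
compressed = crs_element_count(rows=10, nonzeros=20)  # 11 + 40 = 51
print(f"{space_savings_percent(compressed, original):.1f}%")  # 49.0%
```

Under this counting rule the savings shrink as density grows, since every additional non-zero costs two stored elements.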
    • Comparative Analysis (1/4)
      Figure: Comparison with fixed density = 20% (x-axis: No. of data)
    • Comparative Analysis (2/4)
      Figure: Comparison with fixed density = 25% (x-axis: No. of data)
    • Comparative Analysis (3/4)
      Figure: Comparison with fixed no. of data = 64 (x-axis: Density of data)
    • Comparative Analysis (4/4)
      Figure: Comparison with fixed no. of data = 4096 (x-axis: Density of data)
    • Performance Measurement
      • Extendibility of arrays
      • Using multidimensional arrays
      • Extendibility toward any dimension
      • EXCS allows dynamic extension of arrays.
      • In the analysis, data can be extended up to n dimensions
      • Performance is good when the number of data is large
    • Conclusion
      • Our proposed compression scheme has been tested experimentally on data of up to 3 dimensions
      • In future, the experiments can be extended to compress n-dimensional data.
      • EXCS is effective for large multidimensional data warehouses