A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP

by Daniel Lemire on Nov 11, 2008

A data warehouse cannot materialize all possible views, so we must estimate view sizes quickly, accurately, and reliably to determine the best candidates for materialization. Many available techniques for view-size estimation make particular statistical assumptions, and their error can be large. By comparison, unassuming probabilistic techniques are slower, but they estimate very large view sizes accurately and reliably using little memory. We compare five unassuming hashing-based view-size estimation techniques, including Stochastic Probabilistic Counting and LogLog Probabilistic Counting. Our experiments show that only Generalized Counting, Gibbons-Tirthapura, and Adaptive Counting provide universally tight estimates irrespective of the size of the view; of those, only Adaptive Counting remains constantly fast as we increase the memory budget.
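To make the idea of hashing-based view-size estimation concrete, here is a minimal Python sketch in the style of Gibbons-Tirthapura adaptive sampling. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names (`hash64`, `estimate_view_size`), the choice of BLAKE2b as the hash function, and the default `capacity` are all hypothetical. The view size is taken to be the number of distinct tuples among the projected rows.

```python
import hashlib


def hash64(row):
    """Map a row (any object with a stable repr) to a 64-bit integer.

    Assumption: BLAKE2b stands in for the hash family; the paper's choice may differ.
    """
    digest = hashlib.blake2b(repr(row).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def estimate_view_size(rows, capacity=1024):
    """Estimate the number of distinct rows using at most `capacity` stored hashes.

    A hash is kept only if its `level` lowest bits are all zero, so each distinct
    row survives with probability 2**-level. When the sample overflows, the level
    is raised and the sample is pruned accordingly.
    """
    level = 0
    sample = set()
    for row in rows:
        h = hash64(row)
        if h % (1 << level) == 0:
            sample.add(h)
            while len(sample) > capacity:
                level += 1
                sample = {x for x in sample if x % (1 << level) == 0}
    # Each retained hash represents about 2**level distinct rows.
    return len(sample) * (1 << level)


if __name__ == "__main__":
    # 30,000 fact rows projected onto a view with 10,000 distinct tuples.
    view_rows = ((i % 10000, "x") for i in range(30000))
    print(estimate_view_size(view_rows, capacity=1024))
```

Memory use is bounded by `capacity` regardless of the view size, which is the trade-off the abstract refers to: a larger memory budget tightens the estimate, while a view smaller than `capacity` is counted exactly (barring hash collisions).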

Tags

olap dolap2007

Usage Rights

© All Rights Reserved

A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP — Presentation Transcript