SlideShare a Scribd company logo
1 of 9
 FAST RAQ: A FAST APPROACH TO RANGE
AGGREGATE QUERIES IN BIG DATA ENVIRONMENTS
Range-aggregate queries are to apply a certain aggregate function on all tuples
within given query ranges.
Existing approaches to range-aggregate queries are insufficient to quickly provide
accurate results in big data environments.
In this paper, we propose Fast RAQ—a fast approach to range-aggregate queries in
big data environments.
Fast RAQ first divides big data into independent partitions with a balanced
partitioning algorithm, and then generates a local estimation sketch for each partition.
When a range-aggregate query request arrives, Fast RAQ obtains the result directly
by summarizing local estimates from all partitions.
Fast RAQ has Oð1Þ time complexity for data updates and Oð N P_B Þ time
complexity for range-aggregate queries, where N is the number of distinct
tuples for all dimensions, P is the partition number, and B is the bucket number
in the histogram.
We implement the Fast RAQ approach on the Linux platform, and evaluate its
performance with about 10 billion data records.
Experimental results demonstrate that Fast RAQ provides range-aggregate
query results within a time period two orders of magnitude lower than that of
Hive, while the relative error is less than 3 percent within the given confidence
interval.
 Range-aggregate queries are to apply a
certain aggregate function on all tuples within
given query ranges.
 Existing approaches to range-aggregate
queries are insufficient to quickly provide
accurate results in big data environments
 In this paper, we propose Fast RAQ—a fast approach to range-aggregate queries in big
data environments.
 Fast RAQ first divides big data into independent partitions with a balanced partitioning
algorithm, and then generates a local estimation sketch for each partition.
 When a range-aggregate query request arrives, Fast RAQ obtains the result directly by
summarizing local estimates from all partitions.
 Fast RAQ has Oð1Þ time complexity for data updates and Oð N P_B Þ time complexity for
range-aggregate queries, where N is the number of distinct tuples for all dimensions, P
is the partition number, and B is the bucket number in the histogram.
 We implement the Fast RAQ approach on the Linux platform, and evaluate its
performance with about 10 billion data records.
In this paper, we propose Fas RAQ—a new approximate answering
approach that acquires accurate estimations quickly for range-aggregate
queries in big data environments.
Fast RAQ has Oð1Þ time complexity for data updates and Oð N Þ time
complexity for ad-hoc range-aggregate queries.
If the ratio of edge-bucket cardinality (h PB ) is small enough, Fast RAQ
even has Oð1Þ time complexity for range aggregate queries.
 We believe that Fast RAQ provides a good starting point for developing
real-time answering methods for big data analysis.
 There are also some interesting directions for our future work.
 First, Fast RAQ can solve the 1:n format range aggregate queries
problem, i.e., there is one aggregation column and n index columns in a
record.
 We plan to investigate how our solution can be extended to the case of
m:n format problem, i.e., there are m aggregation columns and n index
columns in a same record.
 Second, Fast RAQ is now running in homogeneous environments.
 We will further explore how Fast RAQ can be applied in heterogeneous
context or even as a tool to boost the performance of data analysis in
DBaas.
[1] P. Mika and G. Tummarello, “Web semantics in the clouds,” IEEE Intell. Syst., vol. 23,
no. 5, pp. 82–87, Sep./Oct. 2008.
[2] T. Preis, H. S. Moat, and E. H. Stanley, “Quantifying trading behavior in financial
markets using Google trends,” Sci. Rep., vol. 3, p. 1684, 2013.
[3] H. Choi and H. Varian, “Predicting the present with Google trends,” Econ. Rec., vol.
88, no. s1, pp. 2–9, 2012.
[4] C.-T. Ho, R. Agrawal, N. Megiddo, and R. Srikant,, “Range queries in OLAP data
cubes,” ACM SIGMOD Rec., vol. 26, no. 2, pp. 73–88, 1997.
[5] G. Mishne, J. Dalton, Z. Li, A. Sharma, and J. Lin, “Fast data in the era of big data:
Twitter’s real-time related query suggestion architecture,” in Proc. ACM SIGMOD Int.
Conf. Manage. Data, 2013, pp. 1147–1158.
Fast raq a fast approach to range aggregate queries in big data environments

More Related Content

What's hot

A Novel Approach for Clustering Big Data based on MapReduce
A Novel Approach for Clustering Big Data based on MapReduce A Novel Approach for Clustering Big Data based on MapReduce
A Novel Approach for Clustering Big Data based on MapReduce
IJECEIAES
 
Magellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Magellan-Spark as a Geospatial Analytics Engine by Ram SriharshaMagellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Magellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Spark Summit
 

What's hot (20)

Big data analytics_7_giants_public_24_sep_2013
Big data analytics_7_giants_public_24_sep_2013Big data analytics_7_giants_public_24_sep_2013
Big data analytics_7_giants_public_24_sep_2013
 
IEEE Datamining 2016 Title and Abstract
IEEE  Datamining 2016 Title and AbstractIEEE  Datamining 2016 Title and Abstract
IEEE Datamining 2016 Title and Abstract
 
Distributed computing abstractions_data_science_6_june_2016_ver_0.4
Distributed computing abstractions_data_science_6_june_2016_ver_0.4Distributed computing abstractions_data_science_6_june_2016_ver_0.4
Distributed computing abstractions_data_science_6_june_2016_ver_0.4
 
A Novel Approach for Clustering Big Data based on MapReduce
A Novel Approach for Clustering Big Data based on MapReduce A Novel Approach for Clustering Big Data based on MapReduce
A Novel Approach for Clustering Big Data based on MapReduce
 
7
77
7
 
PERFORMANCE EVALUATION OF SQL AND NOSQL DATABASE MANAGEMENT SYSTEMS IN A CLUSTER
PERFORMANCE EVALUATION OF SQL AND NOSQL DATABASE MANAGEMENT SYSTEMS IN A CLUSTERPERFORMANCE EVALUATION OF SQL AND NOSQL DATABASE MANAGEMENT SYSTEMS IN A CLUSTER
PERFORMANCE EVALUATION OF SQL AND NOSQL DATABASE MANAGEMENT SYSTEMS IN A CLUSTER
 
Magellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Magellan-Spark as a Geospatial Analytics Engine by Ram SriharshaMagellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
Magellan-Spark as a Geospatial Analytics Engine by Ram Sriharsha
 
29
2929
29
 
Geospatial data
Geospatial dataGeospatial data
Geospatial data
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streams
 
Database novelty detection
Database novelty detectionDatabase novelty detection
Database novelty detection
 
Introducing Novel Graph Database Cloud Computing For Efficient Data Management
Introducing Novel Graph Database Cloud Computing For Efficient Data ManagementIntroducing Novel Graph Database Cloud Computing For Efficient Data Management
Introducing Novel Graph Database Cloud Computing For Efficient Data Management
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014
 
Icbai 2018 ver_1
Icbai 2018 ver_1Icbai 2018 ver_1
Icbai 2018 ver_1
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
Workshop on Real-time & Stream Analytics IEEE BigData 2016
Workshop on Real-time & Stream Analytics IEEE BigData 2016Workshop on Real-time & Stream Analytics IEEE BigData 2016
Workshop on Real-time & Stream Analytics IEEE BigData 2016
 
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
 
Shortest path estimation for graph
Shortest path estimation for graphShortest path estimation for graph
Shortest path estimation for graph
 
Data Imputation by Soft Computing
Data Imputation by Soft ComputingData Imputation by Soft Computing
Data Imputation by Soft Computing
 
Mining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTMining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDT
 

Similar to Fast raq a fast approach to range aggregate queries in big data environments

IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
ijceronline
 
Network Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine KangNetwork Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine Kang
Eugine Kang
 
Big Data on Implementation of Many to Many Clustering
Big Data on Implementation of Many to Many ClusteringBig Data on Implementation of Many to Many Clustering
Big Data on Implementation of Many to Many Clustering
paperpublications3
 

Similar to Fast raq a fast approach to range aggregate queries in big data environments (20)

Spatial approximate string search
Spatial approximate string searchSpatial approximate string search
Spatial approximate string search
 
JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string search
JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string searchJAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string search
JAVA 2013 IEEE CLOUDCOMPUTING PROJECT Spatial approximate string search
 
Recognition and Detection of Real-Time Objects Using Unified Network of Faste...
Recognition and Detection of Real-Time Objects Using Unified Network of Faste...Recognition and Detection of Real-Time Objects Using Unified Network of Faste...
Recognition and Detection of Real-Time Objects Using Unified Network of Faste...
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Network Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine KangNetwork Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine Kang
 
Data mining projects topics for java and dot net
Data mining projects topics for java and dot netData mining projects topics for java and dot net
Data mining projects topics for java and dot net
 
Big Data on Implementation of Many to Many Clustering
Big Data on Implementation of Many to Many ClusteringBig Data on Implementation of Many to Many Clustering
Big Data on Implementation of Many to Many Clustering
 
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
Empirical Analysis of Radix Sort using Curve Fitting Technique in Personal Co...
 
C0312023
C0312023C0312023
C0312023
 
Fast top k path-based relevance query on massive graphs
Fast top k path-based relevance query on massive graphsFast top k path-based relevance query on massive graphs
Fast top k path-based relevance query on massive graphs
 
how can implement a multidimensional Data Warehouse using NoSQL
how can implement a multidimensional Data Warehouse using NoSQLhow can implement a multidimensional Data Warehouse using NoSQL
how can implement a multidimensional Data Warehouse using NoSQL
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and Hadoop
 
Distributed approximate spectral clustering for large scale datasets
Distributed approximate spectral clustering for large scale datasetsDistributed approximate spectral clustering for large scale datasets
Distributed approximate spectral clustering for large scale datasets
 
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...
An Optimized Parallel Algorithm for Longest Common Subsequence Using Openmp –...
 
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string search
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string searchJAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string search
JAVA 2013 IEEE NETWORKSECURITY PROJECT Spatial approximate string search
 
Spatial approximate string search
Spatial approximate string searchSpatial approximate string search
Spatial approximate string search
 
Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)Survey on classification algorithms for data mining (comparison and evaluation)
Survey on classification algorithms for data mining (comparison and evaluation)
 
Modified Pure Radix Sort for Large Heterogeneous Data Set
Modified Pure Radix Sort for Large Heterogeneous Data Set Modified Pure Radix Sort for Large Heterogeneous Data Set
Modified Pure Radix Sort for Large Heterogeneous Data Set
 
A Parallel Algorithm Template for Updating Single-Source Shortest Paths in La...
A Parallel Algorithm Template for Updating Single-Source Shortest Paths in La...A Parallel Algorithm Template for Updating Single-Source Shortest Paths in La...
A Parallel Algorithm Template for Updating Single-Source Shortest Paths in La...
 
Experimenting With Big Data
Experimenting With Big DataExperimenting With Big Data
Experimenting With Big Data
 

More from Nexgen Technology

More from Nexgen Technology (20)

MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CH...
     MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CH...     MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CH...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CH...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHENN...
  MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHENN...  MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHENN...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHENN...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...    MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHE...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHE...
 
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHENNA...
 MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHENNA... MECHANICAL PROJECTS IN PONDICHERRY,   2020-21  MECHANICAL PROJECTS IN CHENNA...
MECHANICAL PROJECTS IN PONDICHERRY, 2020-21 MECHANICAL PROJECTS IN CHENNA...
 
Ieee 2020 21 vlsi projects in pondicherry,ieee vlsi projects in chennai
Ieee 2020 21 vlsi projects in pondicherry,ieee  vlsi projects  in chennaiIeee 2020 21 vlsi projects in pondicherry,ieee  vlsi projects  in chennai
Ieee 2020 21 vlsi projects in pondicherry,ieee vlsi projects in chennai
 
Ieee 2020 21 power electronics in pondicherry,Ieee 2020 21 power electronics
Ieee 2020 21 power electronics in pondicherry,Ieee 2020 21 power electronics Ieee 2020 21 power electronics in pondicherry,Ieee 2020 21 power electronics
Ieee 2020 21 power electronics in pondicherry,Ieee 2020 21 power electronics
 
Ieee 2020 -21 ns2 in pondicherry, Ieee 2020 -21 ns2 projects,best project cen...
Ieee 2020 -21 ns2 in pondicherry, Ieee 2020 -21 ns2 projects,best project cen...Ieee 2020 -21 ns2 in pondicherry, Ieee 2020 -21 ns2 projects,best project cen...
Ieee 2020 -21 ns2 in pondicherry, Ieee 2020 -21 ns2 projects,best project cen...
 
Ieee 2020 21 ns2 in pondicherry,best project center in pondicherry,final year...
Ieee 2020 21 ns2 in pondicherry,best project center in pondicherry,final year...Ieee 2020 21 ns2 in pondicherry,best project center in pondicherry,final year...
Ieee 2020 21 ns2 in pondicherry,best project center in pondicherry,final year...
 
Ieee 2020 21 java dotnet in pondicherry,final year projects in pondicherry,pr...
Ieee 2020 21 java dotnet in pondicherry,final year projects in pondicherry,pr...Ieee 2020 21 java dotnet in pondicherry,final year projects in pondicherry,pr...
Ieee 2020 21 java dotnet in pondicherry,final year projects in pondicherry,pr...
 
Ieee 2020 21 iot in pondicherry,final year projects in pondicherry,project ce...
Ieee 2020 21 iot in pondicherry,final year projects in pondicherry,project ce...Ieee 2020 21 iot in pondicherry,final year projects in pondicherry,project ce...
Ieee 2020 21 iot in pondicherry,final year projects in pondicherry,project ce...
 
Ieee 2020 21 blockchain in pondicherry,final year projects in pondicherry,bes...
Ieee 2020 21 blockchain in pondicherry,final year projects in pondicherry,bes...Ieee 2020 21 blockchain in pondicherry,final year projects in pondicherry,bes...
Ieee 2020 21 blockchain in pondicherry,final year projects in pondicherry,bes...
 
Ieee 2020 -21 bigdata in pondicherry,project center in pondicherry,best proje...
Ieee 2020 -21 bigdata in pondicherry,project center in pondicherry,best proje...Ieee 2020 -21 bigdata in pondicherry,project center in pondicherry,best proje...
Ieee 2020 -21 bigdata in pondicherry,project center in pondicherry,best proje...
 
Ieee 2020 21 embedded in pondicherry,final year projects in pondicherry,best...
Ieee 2020 21  embedded in pondicherry,final year projects in pondicherry,best...Ieee 2020 21  embedded in pondicherry,final year projects in pondicherry,best...
Ieee 2020 21 embedded in pondicherry,final year projects in pondicherry,best...
 

Recently uploaded

Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
KarakKing
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Recently uploaded (20)

Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 

Fast raq a fast approach to range aggregate queries in big data environments

  • 1.  FAST RAQ: A FAST APPROACH TO RANGE AGGREGATE QUERIES IN BIG DATA ENVIRONMENTS
  • 2. Range-aggregate queries are to apply a certain aggregate function on all tuples within given query ranges. Existing approaches to range-aggregate queries are insufficient to quickly provide accurate results in big data environments. In this paper, we propose Fast RAQ—a fast approach to range-aggregate queries in big data environments. Fast RAQ first divides big data into independent partitions with a balanced partitioning algorithm, and then generates a local estimation sketch for each partition. When a range-aggregate query request arrives, Fast RAQ obtains the result directly by summarizing local estimates from all partitions.
  • 3. Fast RAQ has Oð1Þ time complexity for data updates and Oð N P_B Þ time complexity for range-aggregate queries, where N is the number of distinct tuples for all dimensions, P is the partition number, and B is the bucket number in the histogram. We implement the Fast RAQ approach on the Linux platform, and evaluate its performance with about 10 billion data records. Experimental results demonstrate that Fast RAQ provides range-aggregate query results within a time period two orders of magnitude lower than that of Hive, while the relative error is less than 3 percent within the given confidence interval.
  • 4.  Range-aggregate queries are to apply a certain aggregate function on all tuples within given query ranges.  Existing approaches to range-aggregate queries are insufficient to quickly provide accurate results in big data environments
  • 5.  In this paper, we propose Fast RAQ—a fast approach to range-aggregate queries in big data environments.  Fast RAQ first divides big data into independent partitions with a balanced partitioning algorithm, and then generates a local estimation sketch for each partition.  When a range-aggregate query request arrives, Fast RAQ obtains the result directly by summarizing local estimates from all partitions.  Fast RAQ has Oð1Þ time complexity for data updates and Oð N P_B Þ time complexity for range-aggregate queries, where N is the number of distinct tuples for all dimensions, P is the partition number, and B is the bucket number in the histogram.  We implement the Fast RAQ approach on the Linux platform, and evaluate its performance with about 10 billion data records.
  • 6. In this paper, we propose Fas RAQ—a new approximate answering approach that acquires accurate estimations quickly for range-aggregate queries in big data environments. Fast RAQ has Oð1Þ time complexity for data updates and Oð N Þ time complexity for ad-hoc range-aggregate queries. If the ratio of edge-bucket cardinality (h PB ) is small enough, Fast RAQ even has Oð1Þ time complexity for range aggregate queries.  We believe that Fast RAQ provides a good starting point for developing real-time answering methods for big data analysis.  There are also some interesting directions for our future work.
  • 7.  First, Fast RAQ can solve the 1:n format range aggregate queries problem, i.e., there is one aggregation column and n index columns in a record.  We plan to investigate how our solution can be extended to the case of m:n format problem, i.e., there are m aggregation columns and n index columns in a same record.  Second, Fast RAQ is now running in homogeneous environments.  We will further explore how Fast RAQ can be applied in heterogeneous context or even as a tool to boost the performance of data analysis in DBaas.
  • 8. [1] P. Mika and G. Tummarello, “Web semantics in the clouds,” IEEE Intell. Syst., vol. 23, no. 5, pp. 82–87, Sep./Oct. 2008. [2] T. Preis, H. S. Moat, and E. H. Stanley, “Quantifying trading behavior in financial markets using Google trends,” Sci. Rep., vol. 3, p. 1684, 2013. [3] H. Choi and H. Varian, “Predicting the present with Google trends,” Econ. Rec., vol. 88, no. s1, pp. 2–9, 2012. [4] C.-T. Ho, R. Agrawal, N. Megiddo, and R. Srikant,, “Range queries in OLAP data cubes,” ACM SIGMOD Rec., vol. 26, no. 2, pp. 73–88, 1997. [5] G. Mishne, J. Dalton, Z. Li, A. Sharma, and J. Lin, “Fast data in the era of big data: Twitter’s real-time related query suggestion architecture,” in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2013, pp. 1147–1158.