A session from the Qubole Best Practice Webinar Series, “Big Data Secrets from the Pros”. It covers how to make Apache Hive queries run faster by:
a. Laying out data better on HDFS via partitioning and bucketing
b. Designing test queries by using block and bucket sampling before running the queries on large datasets
c. Using bucket map joins and parallel processing to run queries faster
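
The three techniques above can be sketched in HiveQL roughly as follows. This is a minimal illustration, not material taken from the webinar: the tables (page_views, users), their columns, the bucket count of 32, and the sample date are all hypothetical, and the exact settings needed may vary with the Hive version in use.

```sql
-- Hypothetical tables and columns, for illustration only.

-- (a) Layout: partition by date so queries can skip whole directories,
--     and bucket by user_id so rows with the same key land in the same file.
CREATE TABLE page_views (
  user_id BIGINT,
  url     STRING
)
PARTITIONED BY (view_date STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS
STORED AS ORC;

-- Assumed to be bucketed on the same key, with a compatible bucket count.
CREATE TABLE users (
  user_id BIGINT,
  country STRING
)
CLUSTERED BY (user_id) INTO 32 BUCKETS
STORED AS ORC;

-- Older Hive releases may also need this before INSERTs so that data is
-- actually written into buckets (assumption: depends on the Hive version).
-- SET hive.enforce.bucketing = true;

-- (b) Test on a sample before running against the full dataset.
-- Bucket sampling: read 1 of the 32 buckets of a single partition.
SELECT url, COUNT(*) AS hits
FROM page_views TABLESAMPLE(BUCKET 1 OUT OF 32 ON user_id)
WHERE view_date = '2014-01-01'
GROUP BY url;

-- Block sampling: read roughly 1 percent of the input, at block granularity.
SELECT COUNT(*)
FROM page_views TABLESAMPLE(1 PERCENT);

-- (c) Bucket map join plus parallel execution of independent stages.
SET hive.optimize.bucketmapjoin = true;
SET hive.exec.parallel = true;

-- The MAPJOIN hint may be ignored by default on newer Hive versions.
SELECT /*+ MAPJOIN(u) */ u.country, COUNT(*) AS views
FROM page_views v
JOIN users u ON v.user_id = u.user_id
GROUP BY u.country;
```

A bucket map join generally requires both tables to be bucketed on the join key, with bucket counts that are multiples of one another, which is why both hypothetical tables above use the same CLUSTERED BY column and bucket count.
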
Visit www.qubole.com for more information.