The State of the Hive Market as presented to the Hive User Group meetup in Palo Alto on March 17, 2014.
http://www.meetup.com/Hive-User-Group-Meeting/events/168271742/
3. Hive Server 1 vs Hive Server 2 Performance Comparison

Test Dataset
Columns: 124 columns (a mixture of STRING, INT, and DOUBLE columns)
Rows: 12933
Dataset Size (bytes): 10,841,585

Server              Driver              Fetch Time (seconds)  Bytes Sent to Hive Server  Bytes Received from Hive Server
Hive Server 1       Simba Hive ODBC     1.703                 362,376                    11,822,547
Hive Server 2       Simba Hive ODBC     291.398               338,756,110                756,232,360
Hive Server 2       Cloudera Hive JDBC  293                   329,590,070                734,135,000
HS2 / HS1 (ratio)   Simba Hive ODBC     17111%                93482%                     6397%

Test Dataset
Columns: 10 string columns of 102 ASCII characters each + 1 string column of 4 ASCII characters
Rows: 20000
Dataset Size (bytes): 20,700,000

Server              Driver              Fetch Time (seconds)  Bytes Sent to Hive Server  Bytes Received from Hive Server
Hive Server 1       Simba Hive ODBC     1.514                 1,029,867                  22,889,786
Hive Server 2       Simba Hive ODBC     43.651                50,465,452                 131,409,423
Hive Server 2       Cloudera Hive JDBC  47                    51,216,888                 131,415,614
HS2 / HS1 (ratio)   Simba Hive ODBC     2883%                 4900%                      574%
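
As a sanity check on the string-only test dataset (10 string columns of 102 ASCII characters plus one of 4, 20000 rows), the following Python sketch recomputes the dataset size and compares it with the bytes each driver received. The one-byte field delimiter and one-byte newline are assumptions, not something the slide states; they are chosen because they reproduce the quoted 20,700,000 bytes. The "wire overhead" factor is an illustrative metric introduced here, not one reported in the deck.

```python
# String-only test dataset from the slide.
ROWS = 20_000
COLUMNS = 11
payload_per_row = 10 * 102 + 4           # 1024 bytes of character data

# Assumption (not stated on the slide): one delimiter byte between
# adjacent fields and one newline byte at the end of each row.
framing_per_row = (COLUMNS - 1) + 1      # 10 delimiters + 1 newline

dataset_size = ROWS * (payload_per_row + framing_per_row)
print(dataset_size)                      # 20700000, the size on the slide

# Bytes received from the Hive server, copied from the slide.
bytes_received = {
    "HS1 Simba Hive ODBC": 22_889_786,
    "HS2 Simba Hive ODBC": 131_409_423,
    "HS2 Cloudera Hive JDBC": 131_415_614,
}

# Illustrative metric: bytes on the wire per byte of raw dataset.
wire_overhead = {name: n / dataset_size for name, n in bytes_received.items()}

for name, factor in wire_overhead.items():
    print(f"{name}: {factor:.2f}x the dataset size")
```

Under these assumptions, Hive Server 1 transfers roughly the dataset itself, while both Hive Server 2 drivers receive several times that volume for the same rows.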
4.
Configuration         Average Fetch Time (seconds)  Average Data Size Sent to Hive Server (MB)  Average Data Size Received from Hive Server (MB)
HS1-V3                39.165                        4.731                                       517.762
HS2-V3                292.812                       9.923                                       1307.691
HS2-V6                74.715                        5.578                                       695.352
HS2-V3/HS1-V3 ratio   747.63%                       209.75%                                     252.57%
HS2-V6/HS1-V3 ratio   190.77%                       117.92%                                     134.30%
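
The ratio rows are each measurement divided by the HS1-V3 baseline, expressed as a percentage. A minimal Python sketch of that arithmetic follows; the names `ratio_percent` and `measurements` are hypothetical, and the inputs are the rounded averages shown on the slide. Recomputing from those rounded values agrees with the slide's percentages to within a few hundredths of a percentage point, which would be consistent with the slide having used unrounded averages (an assumption).

```python
def ratio_percent(value: float, baseline: float) -> float:
    """Return value as a percentage of baseline, rounded to 2 decimals."""
    return round(value / baseline * 100, 2)


# (average fetch time in seconds, MB sent, MB received), from the slide.
measurements = {
    "HS1-V3": (39.165, 4.731, 517.762),
    "HS2-V3": (292.812, 9.923, 1307.691),
    "HS2-V6": (74.715, 5.578, 695.352),
}

baseline = measurements["HS1-V3"]

# One tuple of three percentages per non-baseline configuration.
ratios = {
    name: tuple(ratio_percent(v, b) for v, b in zip(values, baseline))
    for name, values in measurements.items()
    if name != "HS1-V3"
}

for name, (fetch, sent, received) in ratios.items():
    print(f"{name}/HS1-V3: fetch {fetch}%, sent {sent}%, received {received}%")
```

Read this way, a ratio of 100% would mean parity with HS1-V3; HS2-V6 comes much closer to that than HS2-V3 on all three measures.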