La big datacamp2014_vikram_dixit
Big Data Camp LA 2014, Hive 0.13: An upgrade in performance, scaling, security and multi-tenancy by Vikram Dixit of Hortonworks




  • Speaker note: With Hive and Stinger we are focused on enabling the SQL ecosystem, and to do that we have put Hive on a clear roadmap to SQL compliance. That includes adding critical datatypes, like character and date types, as well as implementing common SQL semantics seen in most databases.
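As a minimal sketch of the datatype additions the note mentions (the table and column names here are invented for illustration): CHAR arrived in Hive 0.13, DATE in Hive 0.12, and DECIMAL gained precision/scale support in 0.13.

```sql
-- Hypothetical table showing the newer SQL datatypes.
CREATE TABLE orders (
  order_code CHAR(10),       -- fixed-length character type (new in 0.13)
  order_date DATE,           -- date type (added in 0.12)
  amount     DECIMAL(10,2)   -- decimal with explicit precision and scale
);
```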

Presentation Transcript

  • © Hortonworks Inc. 2013. Hive 0.13: An Upgrade in Performance, Scaling, Security and Multi-tenancy. Vikram Dixit (vikram@apache.org)
  • Hive – SQL on Hadoop
    – Open source Apache project, started by Facebook in 2009
    – Tools to enable easy data extract/transform/load (ETL)
    – Works with structured, unstructured, and semi-structured data
    – Access to files stored either directly in Apache HDFS™ or in other data storage systems such as Apache HBase™
    – Query execution via MapReduce/Tez
    – Metadata sharing via HCatalog allows your Pig scripts to work with Hive tables
  • Using Hive Effectively
    – Understand Hive's use case: the current focus is on making it a fast analytics engine that scales; transactions are coming
    – Understand the storage mechanism that is right for you:
      • ORC File – highest compression; metadata is used to enable faster reads
      • Parquet – intermediate compression, fast reads
      • RC File – legacy; most widely used, but suffers in performance
      • Text – easiest to use, but lowest in performance
    – Use the right execution engine: Tez is the recommended engine for performance; MapReduce is chosen by default in cases where Tez cannot yet run the query
    – Use the right configuration flags: many optimizations are turned on by default, but some are not and need tuning for your cluster, because a good default is hard to come up with
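Putting those recommendations together, a minimal HiveQL sketch might look like this (the table and column names are invented; the two SET flags and the `STORED AS ORC` clause are the real Hive 0.13 settings the deck refers to):

```sql
-- Pick the recommended execution engine for this session.
SET hive.execution.engine=tez;

-- ORC gives the highest compression and carries metadata for faster reads.
CREATE TABLE store_sales_orc (
  ss_item_sk  BIGINT,
  ss_quantity INT
)
STORED AS ORC;

-- Load from an existing (e.g. text-format) table into the ORC table.
INSERT OVERWRITE TABLE store_sales_orc
SELECT ss_item_sk, ss_quantity FROM store_sales;
```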
  • What's New in Hive 0.13: Speed and Scale
    – Speed
      • Hive on Tez
      • Broadcast joins, bucket map joins
      • Vectorized query processing
      • Split elimination for ORC files
      • Parquet file format support
    – Scale
      • Smaller hash tables, allowing more scalable map joins
      • More scalable dynamic partition loads
  • What's New in Hive 0.13: SQL and More
    – More SQL improvements
      • SQL standard authorization
      • CHAR support, DECIMAL improvements
      • Permanent UDFs
      • Streaming ingest from Flume for ACID capability
    – Additional improvements
      • HiveServer2 improvements
      • HCatalog parity with Hive data types
      • JDBC improvements, e.g. job cancel and asynchronous execution
    – Even more goodies
      • Mavenization
      • Parallel test framework
      • Lots of documentation
  • Stinger Project (announced February 2013): Batch AND Interactive SQL-in-Hadoop
    – A broad, community-based effort to drive the next generation of Hive
    – Goals:
      • Speed – improve Hive query performance by 100x to allow for interactive query times (seconds)
      • SQL – support the broadest range of SQL semantics for analytic applications running against Hadoop
      • Scale – the only SQL interface to Hadoop designed for queries that scale from TB to PB… all IN Hadoop
    – Hive 0.11, May 2013: base optimizations, SQL analytic functions, ORCFile (a modern file format)
    – Hive 0.12, October 2013: VARCHAR and DATE types, ORCFile predicate pushdown, advanced optimizations, performance boosts via YARN
    – Hive 0.13, April 2014: Hive on Apache Tez, query service, buffer cache, cost-based optimizer (Optiq), vectorized processing
  • SPEED: Increasing Hive Performance
    – Key highlights
      • Tez: new execution engine
      • Vectorized query processing
      • Startup time improvement
      • Statistics to accelerate query execution
      • Cost-based optimizer, Optiq (missed the cut)
    – Interactive query times across ALL use cases
      • Simple and advanced queries in seconds
      • Integrates seamlessly with existing tools
      • Currently a >100x improvement in just nine months
    – Elements of fast SQL execution
      • Query planner / cost-based optimizer with statistics
      • Query startup
      • Query execution
      • I/O path
  • Apache Tez ("Speed")
    – Replaces MapReduce as the primitive for Pig, Hive, Cascading, etc.
      • Smaller latency for interactive queries
      • Higher throughput for batch queries
      • 22 contributors: Hortonworks (13), Facebook, Twitter, Yahoo, Microsoft
    – A YARN ApplicationMaster runs a DAG of Tez tasks
    – Each task has pluggable Input, Processor, and Output components: Tez Task = <Input, Processor, Output>
  • Hive-on-MR vs. Hive-on-Tez
    SELECT g1.x, g1.avg, g2.cnt
    FROM (SELECT a.x, AVG(a.y) AS avg FROM a GROUP BY a.x) g1
    JOIN (SELECT b.x, COUNT(b.y) AS cnt FROM b GROUP BY b.x) g2
    ON (g1.x = g2.x)
    ORDER BY avg;
    – (The slide diagrams the two execution plans: with MapReduce, each GROUP BY, JOIN, and ORDER BY stage is a separate job that writes its intermediate results to HDFS; with Tez, the same stages form a single DAG.)
    – Tez avoids unnecessary writes to HDFS (HIVE-4660)
  • Shuffle Join
    SELECT ss.ss_item_sk, ss.ss_quantity, inv.inv_quantity_on_hand
    FROM inventory inv JOIN store_sales ss
    ON (inv.inv_item_sk = ss.ss_item_sk);
    – (The slide diagrams the Hive-on-MR and Hive-on-Tez execution plans for this join.)
  • Broadcast Join
    – Similar to a map join, without the need to build a hash table on the client
    – Works with any level of sub-query nesting
    – Uses statistics to determine whether it is applicable
    – How it works:
      • The broadcast result set is computed in parallel on the cluster
      • Join processors are spun up in parallel
      • The broadcast set is streamed to the join processors
      • The join processors build a hash table
      • The other relation is joined against the hash table
    – Tez handles:
      • Best parallelism
      • Best data transfer of the hashed relation
      • Best scheduling to avoid latencies
  • Broadcast Join: Hive-on-MR vs. Hive-on-Tez
    SELECT ss.ss_item_sk, ss.ss_quantity, inv.inv_quantity_on_hand
    FROM store_sales ss JOIN inventory inv
    ON (inv.inv_item_sk = ss.ss_item_sk);
    – Hive-on-MR: the inventory scan runs as a single local map task, and the store_sales scan-and-join mappers read the inventory hash table as a side file via HDFS
    – Hive-on-Tez: the inventory scan runs on the cluster (potentially with more than one mapper), and a broadcast edge streams its output directly to the store_sales scan-and-join tasks
  • Dynamically Partitioned Hash Join
    – Kicks in when the large table is bucketed
      • Either a statically bucketed table, or bucketed dynamically as part of query processing
      • Enabled via set hive.convert.join.bucket.mapjoin.tez = true; (use 0.13.1)
    – Uses a custom edge to match the partitioning on the smaller table
    – Allows a hash join in cases where a broadcast would be too large
    – Tez gives us the option of building custom edges and vertex managers
      • Fine-grained control over how the data is replicated and partitioned
      • Scheduling and the actual data transfer are handled by Tez
  • Dynamically Partitioned Hash Join: Hive-on-MR vs. Hive-on-Tez
    SELECT ss.ss_item_sk, ss.ss_quantity, inv.inv_quantity_on_hand
    FROM store_sales ss JOIN inventory inv
    ON (inv.inv_item_sk = ss.ss_item_sk);
    – Hive-on-MR: the inventory scan runs as a single local map task, and the store_sales scan-and-join mappers read the inventory hash table as a side file via HDFS
    – Hive-on-Tez: the inventory scan runs on the cluster (potentially with more than one mapper); a custom edge routes the outputs of the previous stage to the correct mappers of the next stage, and a custom vertex reads both inputs with no side-file reads
  • Dynamically Partitioned Hash Join vs. MR Bucket Map Join
    – The plans look very similar to a map join, but the way things work changes between MR and Tez.
    – Hive-on-MR (bucket map join):
      • Not dynamically partitioned; both tables need to be bucketed by the join key
      • A local task that generates the hash table writes n files, corresponding to the n buckets
      • The number of mappers for the join must equal the number of buckets
      • Each of these mappers reads the corresponding bucket file of the local task to perform the join
    – Hive-on-Tez:
      • Only one of the sides needs to be bucketed; the other side is dynamically bucketed
      • Also works if neither side is explicitly bucketed, but another operation forced bucketing in the pipeline (traits)
      • No writing to HDFS
      • There can be more mappers than buckets, but splits do not span multiple buckets
      • The dynamically bucketed mappers have as many outputs as there are buckets, and custom Tez routing ensures these outputs reach the right mappers
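A sketch of a setup where this join can kick in, assuming hypothetical TPC-DS-style table names (the bucket count of 32 is arbitrary; the SET flag is the one the slide gives, for Hive 0.13.1 on Tez):

```sql
SET hive.execution.engine=tez;
SET hive.convert.join.bucket.mapjoin.tez=true;   -- flag from the slide (use 0.13.1)

-- Only one side of the join needs to be bucketed on the join key.
CREATE TABLE inventory_bucketed (
  inv_item_sk          BIGINT,
  inv_quantity_on_hand INT
)
CLUSTERED BY (inv_item_sk) INTO 32 BUCKETS
STORED AS ORC;

-- store_sales is bucketed dynamically on ss_item_sk during query processing.
SELECT ss.ss_item_sk, ss.ss_quantity, inv.inv_quantity_on_hand
FROM store_sales ss
JOIN inventory_bucketed inv ON (inv.inv_item_sk = ss.ss_item_sk);
```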
  • Bulk Inner Loop: Vectorization
    – Avoid Writable objects and use primitive int/long, which allows efficient JIT code for primitive types
    – Generate per-type loops and avoid runtime type checks
    – The generated classes look like:
      • LongColEqualDoubleColumn
      • LongColEqualLongColumn
      • LongColEqualLongScalar
    – Avoid duplicate operations on repeated values (isRepeating and hasNulls)
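Vectorized query processing is a session flag in Hive 0.13; in this release it applies to scans over ORC-backed tables. A minimal sketch (the table name is the hypothetical one used above):

```sql
-- Enable the vectorized execution path for this session.
SET hive.vectorized.execution.enabled = true;

-- A scan-plus-filter-plus-aggregate query like this is the kind of
-- bulk inner loop the vectorized operators are built for.
SELECT COUNT(*) FROM store_sales_orc WHERE ss_quantity > 10;
```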
  • ORC: ZeroCopy and Caching
    – Use the memory-mapped I/O path in HDFS (HDFS in-memory cache)
    – ORC reads can start deserializing early; there is no blocking read() call
    – Allow OS read-ahead to kick in
    – Use buffer-cache pages without copying them
    – Avoid wasting heap space on ORC stripes
    – Decompress directly from mapped buffers (fast JNI code for the SNAPPY decompressors)
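The zero-copy read path is opt-in; as far as I know it is controlled by the flag below in Hive 0.13, assuming the underlying Hadoop release exposes the zero-copy read API:

```sql
-- Read ORC data directly from memory-mapped HDFS buffers, skipping the copy
-- into Hive's own heap buffers.
SET hive.exec.orc.zerocopy = true;
```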
  • Scaling
    – Reduced the size of map-join hash tables
      • Around a hundred bytes were being used to store a single integer map-join key
      • HIVE-6430 reduced hash table sizes by 60-70% in many cases
      • This allows more efficient use of memory, and hence more tables fit in memory
    – The large number of open record writers in ORC has been reduced to just one (HIVE-6455)
      • In a multi-insert scenario, performance is now much better and many more inserts can be done in parallel
  • TPC-DS 10 TB Query Times (Tuned Queries)
    – Data: 10 TB loaded into ORCFile using defaults
    – Hardware: 20 nodes, each with 24 CPU threads, 64 GB RAM, and 5 local SATA drives
    – Queries: TPC-DS queries with partition filters added
  • Security
    – The old authorization model was based on grant/revoke, but it was incomplete (e.g. anybody could run a GRANT statement) and did not try to follow the standard
    – Why follow the standard?
      • A lot of thought has been put into the standard, which is important for security
      • It's a standard!
    – Hive should have built-in authorization
      • Easy to use; no additional components to manage
      • New features keep getting added that need authorization
      • The life cycle of objects should be synced with the authorization policy
  • Managing Privileges
    – Grant/revoke a privilege on an object to/from a user or role
    – SHOW GRANT statement: view privilege grants by user/role name and/or object name
    – Privileges: INSERT, SELECT, DELETE, UPDATE, ALL
    – Privileges for some actions are based on object ownership
      • Table/view ownership: most ALTER commands, DROP
      • Database ownership: CREATE TABLE, DROP DATABASE
    – URI privileges are based on file permissions
    – See https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+based+hive+authorization#SQLStandardBasedHiveAuthorization-Configuration
    – Use Hive 0.13.1, which fixes the issues listed under "known issues" in the wiki doc above
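A short sketch of the SQL-standard authorization statements described above (the role, user, and table names are invented; role management requires the admin role):

```sql
SET ROLE admin;                                      -- switch into the admin role

CREATE ROLE analyst;
GRANT ROLE analyst TO USER alice;                    -- membership grant

GRANT SELECT ON TABLE store_sales TO ROLE analyst;   -- object privilege grant
SHOW GRANT ROLE analyst ON TABLE store_sales;        -- inspect the grants
```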
  • HiveServer2 Improvements
    – HiveServer2 now supports Thrift over HTTP, with Kerberos/LDAP authentication on HTTP; HTTPS is also supported
    – HiveServer2 can keep sessions alive between different JDBC queries
      • The new security model helps: all secure queries run as the "hive" user
      • Ideal for short exploratory queries
      • Uses the same JARs (no download per task)
      • Even better JIT performance beyond the first query
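As a rough sketch of how the HTTP transport is switched on (a hive-site.xml fragment; the port and path values shown are the usual defaults, not something this deck specifies):

```xml
<!-- hive-site.xml: serve Thrift over HTTP instead of the binary transport -->
<property>
  <name>hive.server2.transport.mode</name>
  <value>http</value>
</property>
<property>
  <name>hive.server2.thrift.http.port</name>
  <value>10001</value>
</property>
<property>
  <name>hive.server2.thrift.http.path</name>
  <value>cliservice</value>
</property>
```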
  • Other Improvements
    – Insert/update/delete semantics and streaming ingest from Flume (HIVE-5317)
      • A transaction manager has been added; support is for the ORC file format only at this time
    – Lots of UDF support via permanent functions: no need to ADD JAR for the most commonly used UDFs; ideally, an admin adds the permanent (trusted) functions
    – Parquet is a supported storage format: https://cwiki.apache.org/confluence/display/Hive/Parquet
    – HCatalog now supports all the data types supported in Hive
    – Hive is now mavenized (thanks, Brock Noland!)
    – A parallel test framework means unit testing happens faster and changes get in faster
    – Lots of new documentation for all the new features (thanks, Lefty!)
    – Bottom line: Hive 0.13 is the fastest, most feature-rich version of Hive so far
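A permanent UDF is registered once in the metastore and is then usable from any session without ADD JAR. A hedged sketch (the function name, Java class, and JAR path are all hypothetical):

```sql
-- Registered once by an admin; persists in the metastore across sessions.
CREATE FUNCTION mydb.normalize_sku AS 'com.example.udf.NormalizeSku'
  USING JAR 'hdfs:///libs/example-udfs.jar';

-- Later sessions can call it directly, with no ADD JAR step:
SELECT mydb.normalize_sku(ss_item_sk) FROM store_sales LIMIT 10;
```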
  • Future Work
    – Lots more improvements coming up
    – Speed
      • Sort-merge bucket map join in Tez
      • Total ordering of data
      • Skew joins
      • Cost-based optimizer
    – Security
      • Authorizing permanent UDF access
      • Authorizing SHOW GRANT
      • Support for HDFS ACLs in URI permission checks (new in Hadoop 2.4)
      • More SQL syntax support, e.g. revoking just the admin option on a role
    – Multi-tenancy
      • Sticky HiveServer2 sessions for improved performance in a multi-tenant environment
      • Improved scheduling in a multi-tenant environment
  • Questions?