Hive acid-updates-strata-sjc-feb-2015
1. © Hortonworks Inc. 2015
Hive 0.14 Does ACID
February 2015
Page 1
Alan Gates
gates@hortonworks.com
@alanfgates
2. History
• Hive could only update whole partitions
  - INSERT OVERWRITE rewrote an entire partition
  - Forced daily or even hourly partitions
  - Could add files to a partition directory, but there was no file compaction
• What about concurrent readers?
  - OK for inserts, but overwrite caused races
  - There is a ZooKeeper lock manager, but…
• No way to delete or update rows
• No INSERT INTO T VALUES…
  - Breaks some tools
3. Why is ACID Critical?
• Hadoop and Hive have always…
  - Worked without ACID
  - Perceived as a tradeoff for performance
• But your data isn't static
  - It changes daily, hourly, or faster
  - Ad hoc solutions require a lot of work
  - Managing change makes the user's life better
• Do or Do Not, There is NO Try
4. Use Cases
• NOT OLTP!!!
• Updating a Dimension Table
  - Changing a customer's address
• Delete Old Records
  - Remove records for compliance
• Update/Restate Large Fact Tables
  - Fix problems after they are in the warehouse
• Streaming Data Ingest
  - A continual stream of data coming in
  - Typically from Flume or Storm
• NOT OLTP!!!
5. New SQL in Hive 0.14
• New DML
  - INSERT INTO T VALUES(1, 'fred', ...);
  - UPDATE T SET x = 5[, ...] WHERE ...
  - DELETE FROM T WHERE ...
  - Supports partitioned and non-partitioned tables; the WHERE clause can specify a partition, but this is not required
• Restrictions
  - Table must have a format that extends AcidInputFormat
    - currently ORC
  - Table must be bucketed and not sorted
    - can use 1 bucket, but this will restrict write parallelism
  - Table must be marked transactional
    - create table T(...) clustered by (a) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
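The restrictions above read as a checklist. The following is a minimal Python sketch, assuming a hypothetical `TableSpec` description of a table (it is not a Hive API), that reports which of the listed restrictions a table definition would break.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TableSpec:
    """Hypothetical description of a Hive table; not part of any Hive API."""
    file_format: str                                  # e.g. "orc", "textfile"
    num_buckets: int = 0                              # 0 means not bucketed
    sorted_by: List[str] = field(default_factory=list)
    tblproperties: Dict[str, str] = field(default_factory=dict)


def acid_violations(t: TableSpec) -> List[str]:
    """Return the Hive 0.14 ACID restrictions (as listed on the slide) that t breaks."""
    problems = []
    if t.file_format.lower() != "orc":
        problems.append("format must extend AcidInputFormat (currently only ORC)")
    if t.num_buckets < 1:
        problems.append("table must be bucketed")
    if t.sorted_by:
        problems.append("table must not be sorted")
    if t.tblproperties.get("transactional", "").lower() != "true":
        problems.append("table must be marked 'transactional'='true'")
    return problems


# Mirrors the slide's example: 2 buckets, stored as orc, marked transactional.
ok = TableSpec("orc", num_buckets=2, tblproperties={"transactional": "true"})
# A plain text table with no buckets and no transactional property.
bad = TableSpec("textfile")

print(acid_violations(ok))
for problem in acid_violations(bad):
    print("violation:", problem)
```

The first table passes every check; the second fails on format, bucketing, and the transactional property.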
6. Why Not HBase?
• Good
  - Handles compactions for us
  - Already has a similar data model with LSM
• Bad
  - No cross-row transactions
  - Would require us to write a transaction manager over HBase; doable, but not less work
  - HFile is column-family based rather than columnar
  - HBase is focused on point lookups and range scans
  - Warehousing requires full scans
7. Design
• HDFS Does Not Allow Arbitrary Writes
  - Store changes as delta files
  - Stitched together by client on read
• Writes get a Transaction ID
  - Sequentially assigned by Metastore
• Reads get Committed Transactions
  - Provides snapshot consistency
  - No locks required
  - Provide a snapshot of data from start of query
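To make the snapshot idea concrete, here is a minimal Python sketch. `ToyMetastore` and `read` are illustrative names invented for this example, not Hive classes. It assumes each writer's changes land in a delta keyed by its transaction ID, and that a query captures the set of committed transactions once, when it starts.

```python
import itertools


class ToyMetastore:
    """Toy stand-in for the metastore's transaction bookkeeping (illustrative only)."""

    def __init__(self):
        self._next_id = itertools.count(1)   # sequentially assigned transaction ids
        self.open = set()
        self.committed = set()

    def begin(self) -> int:
        txn = next(self._next_id)
        self.open.add(txn)
        return txn

    def commit(self, txn: int) -> None:
        self.open.remove(txn)
        self.committed.add(txn)

    def snapshot(self) -> frozenset:
        """Committed transactions as of now; a query captures this once at start."""
        return frozenset(self.committed)


def read(deltas, snapshot):
    """Stitch deltas together, keeping only rows written by transactions in the snapshot."""
    rows = []
    for txn in sorted(deltas):
        if txn in snapshot:
            rows.extend(deltas[txn])
    return rows


ms = ToyMetastore()
deltas = {}                      # transaction id -> rows in that writer's delta

t1 = ms.begin()
deltas[t1] = ["a", "b"]
ms.commit(t1)

t2 = ms.begin()
deltas[t2] = ["c"]               # written, but not yet committed

snap = ms.snapshot()             # a query starts here
ms.commit(t2)                    # this commit lands while the query is running

print(read(deltas, snap))            # the running query sees only t1's rows
print(read(deltas, ms.snapshot()))   # a query started now also sees t2's rows
```

The reader never takes a lock: it filters by the snapshot it captured, so later commits cannot change what it sees.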
10. Input and Output Formats
• Created new AcidInput/OutputFormat
  - Unique key is transaction, bucket, row
• Reader returns correct version of row based on transaction state
• Also Added Raw API for Compactor
  - Provides previous events as well
• ORC implements new API
  - Extends records with change metadata
  - Add operation (d, u, i), transaction and key
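One way to picture how a reader picks the correct version of a row is to replay change events in transaction order. The sketch below is an assumption-laden model in Python: the `Event` tuple and `current_rows` function are invented for illustration, and the real ORC record layout is not reproduced here.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Set, Tuple

Key = Tuple[int, int, int]       # (original transaction, bucket, row id)


class Event(NamedTuple):
    op: str                      # 'i' insert, 'u' update, 'd' delete
    txn: int                     # transaction that wrote this event
    key: Key                     # identifies the row the event applies to
    value: Optional[str]         # row payload; None for deletes


def current_rows(events: Iterable[Event], committed: Set[int]) -> Dict[Key, Optional[str]]:
    """Replay the visible events in transaction order and return the surviving rows."""
    rows: Dict[Key, Optional[str]] = {}
    for ev in sorted(events, key=lambda e: e.txn):
        if ev.txn not in committed:
            continue                     # not visible to this reader
        if ev.op == "d":
            rows.pop(ev.key, None)
        else:                            # 'i' and 'u' both set the latest value
            rows[ev.key] = ev.value
    return rows


events = [
    Event("i", 1, (1, 0, 0), "fred"),
    Event("i", 1, (1, 0, 1), "wilma"),
    Event("u", 2, (1, 0, 0), "frederick"),
    Event("d", 3, (1, 0, 1), None),
]

before_delete = current_rows(events, committed={1, 2})
after_delete = current_rows(events, committed={1, 2, 3})
print(before_delete)   # the update is visible, the delete is not
print(after_delete)    # the delete is now visible too
```

A compactor would need the full event history rather than only the surviving rows, which is presumably why the slide mentions a separate raw API.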
11. Distributing the Work
• Need to split buckets for MapReduce
  - Need to split base and deltas the same way
  - Use key ranges
  - Use indexes
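The point of splitting base and deltas "the same way" is that a changed row must reach the same task as its base row. A rough Python sketch, assuming a simplified integer row key and hypothetical boundary values:

```python
from bisect import bisect_right
from typing import Dict, List, Sequence, Tuple

Record = Tuple[int, str]     # (row key, payload); a simplification of the real key


def split_by_key_range(records: Sequence[Record], boundaries: List[int]) -> Dict[int, List[Record]]:
    """Assign each record to a split by key range.

    Split 0 holds keys below boundaries[0]; split i holds keys in
    [boundaries[i-1], boundaries[i]); the last split holds everything else.
    """
    splits: Dict[int, List[Record]] = {i: [] for i in range(len(boundaries) + 1)}
    for key, payload in records:
        splits[bisect_right(boundaries, key)].append((key, payload))
    return splits


base = [(0, "a"), (40, "b"), (99, "c"), (150, "d")]
delta = [(40, "b2"), (150, "d2")]      # changes to rows 40 and 150
boundaries = [50, 100]                 # three splits: <50, 50..99, >=100

base_splits = split_by_key_range(base, boundaries)
delta_splits = split_by_key_range(delta, boundaries)

# Each task reads split i of the base together with split i of the delta,
# so a changed row always meets its base row in the same task.
for i in range(len(boundaries) + 1):
    print(i, base_splits[i], delta_splits[i])
```

Because both files are cut at the same boundaries, no task needs to look outside its own key range to resolve a row.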
12. Transaction Manager
• Existing lock managers
  - In memory: not durable
  - ZooKeeper: requires additional components to install, administer, etc.
• Locks need to be integrated with transactions
  - commit/rollback must atomically release locks
• We sort of have this database lying around which has ACID characteristics (metastore)
• Transactions and locks stored in metastore
• Uses metastore DB to provide unique, ascending ids for transactions and locks
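The argument for keeping transactions and locks in one relational database can be illustrated with a toy example. The sketch below uses Python's built-in `sqlite3` as a stand-in; the `txns` and `locks` tables are invented for illustration and are not the real metastore schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE txns  (id INTEGER PRIMARY KEY AUTOINCREMENT, state TEXT NOT NULL);
    CREATE TABLE locks (id INTEGER PRIMARY KEY AUTOINCREMENT,
                        txn_id INTEGER NOT NULL REFERENCES txns(id),
                        resource TEXT NOT NULL,
                        kind TEXT NOT NULL);
""")


def open_txn() -> int:
    """Start a transaction; the database hands out a unique, ascending id."""
    with db:   # one database transaction, committed on success
        return db.execute("INSERT INTO txns (state) VALUES ('open')").lastrowid


def acquire(txn_id: int, resource: str, kind: str) -> int:
    """Record a lock held by txn_id (compatibility checks omitted in this sketch)."""
    with db:
        cur = db.execute(
            "INSERT INTO locks (txn_id, resource, kind) VALUES (?, ?, ?)",
            (txn_id, resource, kind))
        return cur.lastrowid


def finish(txn_id: int, state: str) -> None:
    """Commit or abort: the state change and the lock release happen atomically."""
    with db:   # both statements commit together, or neither does
        db.execute("UPDATE txns SET state = ? WHERE id = ? AND state = 'open'",
                   (state, txn_id))
        db.execute("DELETE FROM locks WHERE txn_id = ?", (txn_id,))


t1 = open_txn()
t2 = open_txn()
acquire(t1, "db.T", "semi-shared")
acquire(t2, "db.T", "shared")
finish(t1, "committed")

remaining = db.execute("SELECT txn_id FROM locks").fetchall()
print(t1 < t2)      # ids are ascending
print(remaining)    # only t2's lock is left
```

Because the state update and the lock deletion share one database transaction, a crash between them cannot leave a committed transaction still holding locks.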
13. Transaction Model
• In Hive 0.14 DML statements are auto-commit
  - Working on adding BEGIN, COMMIT, ROLLBACK
• Snapshot isolation
  - Reader will see consistent data for the duration of his/her query
  - May extend to other isolation levels in the future
• Current transactions can be displayed using new SHOW TRANSACTIONS statement
14. Locking Model
• Three types of locks
  - shared
  - semi-shared (can co-exist with shared, but not other semi-shared)
  - exclusive
• Operations require different locks
  - SELECT, INSERT: shared
  - UPDATE, DELETE: semi-shared
  - DROP, INSERT OVERWRITE: exclusive
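The lock rules above amount to a small compatibility check. A minimal Python sketch follows, assuming "exclusive" has its usual meaning of co-existing with no other lock (the slide does not spell this out); the function names are illustrative.

```python
SHARED, SEMI_SHARED, EXCLUSIVE = "shared", "semi-shared", "exclusive"

# Lock each operation needs, as listed on the slide.
LOCK_FOR = {
    "SELECT": SHARED,
    "INSERT": SHARED,
    "UPDATE": SEMI_SHARED,
    "DELETE": SEMI_SHARED,
    "DROP": EXCLUSIVE,
    "INSERT OVERWRITE": EXCLUSIVE,
}


def compatible(a: str, b: str) -> bool:
    """True if locks a and b may be held on the same resource at the same time."""
    if EXCLUSIVE in (a, b):
        return False             # assumed: exclusive co-exists with nothing
    if a == b == SEMI_SHARED:
        return False             # only one semi-shared holder at a time
    return True                  # shared/shared and shared/semi-shared


def can_run_together(op1: str, op2: str) -> bool:
    """True if the two operations could hold their locks concurrently."""
    return compatible(LOCK_FOR[op1], LOCK_FOR[op2])


for pair in [("SELECT", "UPDATE"), ("INSERT", "INSERT"),
             ("UPDATE", "DELETE"), ("SELECT", "DROP")]:
    print(pair, can_run_together(*pair))
```

Under these rules readers and inserters never block each other or a single updater, while two concurrent UPDATE or DELETE statements on the same resource conflict.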
15. Compactor
• Each transaction (or batch of transactions in streaming ingest) creates a new delta file
• Too many files = unhappy NameNode
• Need a way to
  - Collect many deltas into one delta: minor compaction
  - Rewrite base and delta to new base: major compaction
16. Minor Compaction
• Run when there are 10 or more deltas (configurable)
• Results in base + 1 delta
Before:
/hive/warehouse/purchaselog/ds=201403311000/base_0028000
/hive/warehouse/purchaselog/ds=201403311000/delta_0028001_0028100
/hive/warehouse/purchaselog/ds=201403311000/delta_0028101_0028200
/hive/warehouse/purchaselog/ds=201403311000/delta_0028201_0028300
/hive/warehouse/purchaselog/ds=201403311000/delta_0028301_0028400
/hive/warehouse/purchaselog/ds=201403311000/delta_0028401_0028500
After:
/hive/warehouse/purchaselog/ds=201403311000/base_0028000
/hive/warehouse/purchaselog/ds=201403311000/delta_0028001_0028500
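The directory listing above suggests how the merged delta is named: it spans the lowest and highest transaction IDs of the deltas it replaces. The Python sketch below models only that naming, with the delta-count threshold as a parameter; `minor_compact` is a hypothetical helper, and no file contents are merged.

```python
import re
from typing import List

DELTA = re.compile(r"^delta_(\d+)_(\d+)$")


def minor_compact(names: List[str], threshold: int = 10) -> List[str]:
    """Replace all delta_<min>_<max> entries with one delta spanning their range.

    Returns the names unchanged if there are fewer than `threshold` deltas.
    """
    deltas = [m for m in map(DELTA.match, names) if m]
    if len(deltas) < threshold:
        return list(names)
    lo = min(deltas, key=lambda m: int(m.group(1))).group(1)   # keeps zero padding
    hi = max(deltas, key=lambda m: int(m.group(2))).group(2)
    others = [n for n in names if not DELTA.match(n)]
    return others + [f"delta_{lo}_{hi}"]


before = [
    "base_0028000",
    "delta_0028001_0028100",
    "delta_0028101_0028200",
    "delta_0028201_0028300",
    "delta_0028301_0028400",
    "delta_0028401_0028500",
]

# The slide's listing has only five deltas, so the threshold is lowered here
# to reproduce its "after" state; with the default of 10 nothing would happen.
after = minor_compact(before, threshold=5)
print(after)
```

The base directory is left untouched, which matches the "base + 1 delta" result stated on the slide.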
17. Major Compaction
• Run when deltas are 10% the size of base (configurable)
• Results in new base
Before:
/hive/warehouse/purchaselog/ds=201403311000/base_0028000
/hive/warehouse/purchaselog/ds=201403311000/delta_0028001_0028100
/hive/warehouse/purchaselog/ds=201403311000/delta_0028101_0028200
/hive/warehouse/purchaselog/ds=201403311000/delta_0028201_0028300
/hive/warehouse/purchaselog/ds=201403311000/delta_0028301_0028400
/hive/warehouse/purchaselog/ds=201403311000/delta_0028401_0028500
After:
/hive/warehouse/purchaselog/ds=201403311000/base_0028500
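Taken together, the two thresholds quoted on the slides (10 or more deltas, deltas at 10% of the base size) suggest a simple decision rule. The Python sketch below is one possible reading: checking the size ratio before the delta count is an assumption, as are the function names. It also models the naming of the new base seen in the listing above.

```python
import re
from typing import List, Optional

BASE = re.compile(r"^base_(\d+)$")
DELTA = re.compile(r"^delta_(\d+)_(\d+)$")


def major_compact(names: List[str]) -> List[str]:
    """Fold the base and every delta into a new base named after the highest transaction id."""
    ids = [m.group(1) for m in map(BASE.match, names) if m]
    ids += [m.group(2) for m in map(DELTA.match, names) if m]
    if not ids:
        return list(names)
    newest = max(ids, key=int)                       # keeps zero padding
    rest = [n for n in names if not (BASE.match(n) or DELTA.match(n))]
    return rest + [f"base_{newest}"]


def choose_compaction(num_deltas: int, delta_bytes: int, base_bytes: int,
                      min_deltas: int = 10, ratio: float = 0.10) -> Optional[str]:
    """Pick a compaction type from the two thresholds quoted on the slides.

    Assumption: the size ratio is checked first, so major wins when both apply.
    """
    if base_bytes > 0 and delta_bytes >= ratio * base_bytes:
        return "major"
    if num_deltas >= min_deltas:
        return "minor"
    return None


before = ["base_0028000"] + [
    f"delta_00{lo}_00{lo + 99}" for lo in range(28001, 28500, 100)
]
print(major_compact(before))
print(choose_compaction(num_deltas=12, delta_bytes=5, base_bytes=1000))
print(choose_compaction(num_deltas=3, delta_bytes=200, base_bytes=1000))
print(choose_compaction(num_deltas=3, delta_bytes=5, base_bytes=1000))
```

Both thresholds are described as configurable, so they are parameters here rather than constants.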
18. Compactor Continued
• Metastore thrift server will schedule and execute compactions
  - No need for user to schedule
  - User can initiate via new ALTER TABLE COMPACT statement
• No locking required, compactions run at same time as select and DML
  - Compactor aware of readers, does not remove old files until readers have finished with them
• Current compactions can be viewed via new SHOW COMPACTIONS statement
19. Application: Streaming Ingest
• Data is flowing in from generators in a stream
• Without streaming ingest, you have to add it to Hive in batches, often every hour
  - Thus your users have to wait an hour before they can see their data
• New interface in hive.hcatalog.streaming lets applications write small batches of records and commit them
  - Users can now see data within a few seconds of it arriving from the data generators
• Available for Apache Flume in HDP 2.1 and Storm in HDP 2.2
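The "small batches, then commit" pattern can be sketched in a few lines of Python. The `BatchingWriter` class below is hypothetical and only models the idea; it is not the hive.hcatalog.streaming API, whose classes and methods are not shown in these slides.

```python
from typing import Callable, List


class BatchingWriter:
    """Hypothetical writer that buffers records and commits them in small batches.

    This models the idea on the slide only; it is not the
    hive.hcatalog.streaming interface.
    """

    def __init__(self, commit: Callable[[List[str]], None], batch_size: int = 3):
        self._commit = commit
        self._batch_size = batch_size
        self._buffer: List[str] = []

    def write(self, record: str) -> None:
        """Buffer a record and commit once a full batch has accumulated."""
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Commit whatever is buffered, e.g. on a timer, so data never waits long."""
        if self._buffer:
            self._commit(self._buffer)
            self._buffer = []


visible: List[List[str]] = []        # committed batches, i.e. what readers can see
writer = BatchingWriter(visible.append, batch_size=3)

for i in range(7):
    writer.write(f"event-{i}")

print(len(visible))    # two full batches are visible; one record is still buffered
writer.flush()
print(len(visible))    # the partial batch is now visible as well
```

Smaller batches shorten the delay before data becomes visible but produce more deltas, which is where the compactor comes in.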
20. Configuration
• On the client
  hive.support.concurrency=true
  hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
  hive.enforce.bucketing=true
• On the metastore server
  hive.compactor.initiator.on=true
  hive.compactor.worker.threads=1 # or more
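Hive settings like these are usually placed in a hive-site.xml file. As a convenience sketch, the Python below renders the two property sets from the slide in the Hadoop-style `<configuration>`/`<property>` layout; the helper name `to_hive_site` is invented, and where each file lives depends on the installation.

```python
import xml.etree.ElementTree as ET
from typing import Dict

CLIENT = {
    "hive.support.concurrency": "true",
    "hive.txn.manager": "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
    "hive.enforce.bucketing": "true",
}
METASTORE = {
    "hive.compactor.initiator.on": "true",
    "hive.compactor.worker.threads": "1",    # or more
}


def to_hive_site(props: Dict[str, str]) -> str:
    """Render properties as <configuration><property><name/><value/></property>... XML."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):                # pretty-printing needs Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


client_xml = to_hive_site(CLIENT)
metastore_xml = to_hive_site(METASTORE)
print(client_xml)
print(metastore_xml)
```

Keeping the client and metastore settings in separate dictionaries mirrors the slide's split between the two sides.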
21. Phases of Development
• Phase 1, Hive 0.13
  - Transaction and new lock manager
  - ORC file support
  - Automatic and manual compaction
  - Snapshot isolation
  - Streaming ingest via Flume
• Phase 2, Hive 0.14
  - INSERT … VALUES, UPDATE, DELETE
• Phase 3, Hive 1.2(?)
  - Add support for only some columns in insert
    - INSERT into T (a, b) select c, d from U;
  - BEGIN, COMMIT, ROLLBACK
• Future (all speculative based on user feedback)
  - Integration with HCatalog
  - Versioned or point in time queries
  - Streaming ingest of updates and deletes
  - Additional isolation levels such as dirty read or read committed
  - MERGE
22. Conclusion
• JIRA: https://issues.apache.org/jira/browse/HIVE-5317
• Adds ACID semantics to Hive
• Uses SQL standard commands
  - INSERT, UPDATE, DELETE
• Provides scalable read and write access