Successfully reported this slideshow.
Your SlideShare is downloading. ×

Fine-Grained, Secure and Efficient Data Provenance on Blockchain Systems

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Loading in …3
×

Check these out next

1 of 22 Ad

Fine-Grained, Secure and Efficient Data Provenance on Blockchain Systems

Download to read offline

Download paper from PVLDB:
http://www.vldb.org/pvldb/vol12/p975-ruan.pdf

The success of Bitcoin and other cryptocurrencies bring enormous interest to blockchains. A blockchain system implements a tamper-evident ledger for recording transactions that modify some global states. The system captures entire evolution history of the states. The management of that history, also known as data provenance or lineage, has been studied extensively in database systems. However, querying data history in existing blockchains can only be done by replaying all transactions. This approach is applicable to large-scale, offline analysis, but is not suitable for online transaction processing.

We present LineageChain, a fine-grained, secure and efficient provenance system for blockchains. LineageChain exposes provenance information to smart contracts via simple and elegant interfaces, thereby enabling a new class of blockchain applications whose execution logics depend on provenance information at runtime. LineageChain captures provenance during contract execution, and efficiently stores it in a Merkle tree. LineageChain provides a novel skip list index designed for supporting efficient provenance query processing. We have implemented LineageChain on top of Hyperledger and a blockchain-optimized storage system called ForkBase. Our extensive evaluation of LineageChain demonstrates its benefits to the new class of blockchain applications, its efficient query, and its small storage overhead.

Download paper from PVLDB:
http://www.vldb.org/pvldb/vol12/p975-ruan.pdf

The success of Bitcoin and other cryptocurrencies bring enormous interest to blockchains. A blockchain system implements a tamper-evident ledger for recording transactions that modify some global states. The system captures entire evolution history of the states. The management of that history, also known as data provenance or lineage, has been studied extensively in database systems. However, querying data history in existing blockchains can only be done by replaying all transactions. This approach is applicable to large-scale, offline analysis, but is not suitable for online transaction processing.

We present LineageChain, a fine-grained, secure and efficient provenance system for blockchains. LineageChain exposes provenance information to smart contracts via simple and elegant interfaces, thereby enabling a new class of blockchain applications whose execution logics depend on provenance information at runtime. LineageChain captures provenance during contract execution, and efficiently stores it in a Merkle tree. LineageChain provides a novel skip list index designed for supporting efficient provenance query processing. We have implemented LineageChain on top of Hyperledger and a blockchain-optimized storage system called ForkBase. Our extensive evaluation of LineageChain demonstrates its benefits to the new class of blockchain applications, its efficient query, and its small storage overhead.

Advertisement
Advertisement

More Related Content

Similar to Fine-Grained, Secure and Efficient Data Provenance on Blockchain Systems (20)

More from Qian Lin (13)

Advertisement

Recently uploaded (20)

Fine-Grained, Secure and Efficient Data Provenance on Blockchain Systems

  1. 1. Fine-Grained, Secure and Efficient Data Provenance on Blockchain Systems Pingcheng RUAN, Gang CHEN, Tien Tuan Anh DINH, Qian LIN, Beng Chin OOI, Meihui ZHANG
  2. 2. Blockchain Is a Class of Database Blockchain Distributed Database Database Bitcoin Distributed Transactional Systems
  3. 3. Blockchain Basics • P2P network – Asynchronous transaction • Byzantine environment – Mutual distrusting setup • Distributed ledger – Smart contract • Inherent provenance-preserving – ONLY for offline analytical query Contract Txn 1 Txn 2 Token.Transfer( A, B, 10) Token.Transfer( C, D, 20)Token.Transfer( A, B, 20)
  4. 4. Motivation Expose provenance information to smart contracts both Efficiently Securely Enabler for provenance- dependent smart contracts Enrich the transaction semantics
  5. 5. Provenance-dependent Contracts • Previous transfer precondition: – Enough balance from the sender – CURRENT STATE ONLY • New transfer precondition: – Historical balance > threshold – Recipient not transacted with certain blacklisted addresses recently – HISTORICAL STATE & PROVENANCE INFO
  6. 6. Workarounds Workaround 1: •Dump every thing into current state •Effort-needed, expensive, error-prune Workaround 2: • Offline analytics + Online transactions • Break of serializability • Transaction-ordering attacks Workaround 3: • Minimum system instrumentation • NOT protocol level (e.g., Hyperledger Fabric v1.0+) • Data tampering Holistic Approach: • Protocol-level enhancement  Secure • Performance-aware  Efficient Account1_v1: 10 Account1_v2: 20 Account1_v3: 15 Account2_v2: 12
  7. 7. Challenges • With clearly-defined transformation semantics • E.g • Map and reduce in Hadoop • Select, join and aggregation in SQL NO standardized operations • Tamper evidence • Integrity proof Byzantine environment • Gas mechanism • Verifier’s dilemma Ever-growing ledger
  8. 8. Block Structure Block Header Prev Hash hash Txn Digest State Digest PoW Nonce Txn List
  9. 9. Enhancement Basis (Merkle Tree Variants) • Limitation – Latest State only • Tamper evidence – Succinct digest (root hash) – Integrity proof (access path) Block Header Account Address and Assoicated Balance in Global State: 0xABC: 10 0xABCD: 15 0xABCE: 20 0xBC: 25 Previous Block Hash Transaction MPT Root Hash Nonce Receipt MPT Root Hash State MPT Root Hash = H(Z) nilA: H(X) B: H(Y)Z BC: H(V) C: 25X Y 10D: 15 E: 20V <Updated Chaincode ID>_<Key>: ccid1_k1 ccid2_k1 ccid3_k1 ccid3_k2 Block Header State Root Hash Previous Block Hash Transaction Root Hash G E F A B C ccid1_k1 ccid2_k1 ccid3_k1 ccid3_k2 D Bucket List (a) (b) Merkle Patricia Trie Merkle Bucket Tree
  10. 10. LineageChain Overview
  11. 11. Application Layer • Provenance specification – User-defined input-output dependency • Provenance query handler – Hist(stateID, [blockNum])  (val, blkStart, txnID) – Backward(stateID, blkNum)  List<(depStateID, depBlkNum)> – Forward(stateID, blkNum)  List<(depStateID, depBlkNum)> InputID1 InputID2 OutputID1 OutputID2 OutputID3 Backward Dependency Forward Dependency
  12. 12. Application Layer Recipient -> Sender
  13. 13. Execution Layer • Receive – Contract invocation context – Provenance specification • Compute – Transaction results – Concrete dependency • Prepare Merkle DAG – Introduce one layer of direction – Hash reference to encode provenance backward dependency
  14. 14. Execution Layer • Forward tracking – Problem: Undecided forward dependency during state update • Solution – Lazily store forward dependency on the successor state entry
  15. 15. Storage Layer • Problem – Efficient version-based (historical) query for a state ID • Solution: – Deterministic Append-only Skip List – Hash-based reference After appending versions 12 and 16
  16. 16. Evaluation • MICRO benchmarking (vs. flat storage) – Preference to recent version query (with DASL) – More efficient BFS enabled by backtrack (with ForkBase) • MACRO benchmarking (applied to Hyperledger Fabric v0.6 and v1.3) – Negligible runtime overhead o Tiny proportion of latency – Negligible storage overhead o >70% of space for blocks o 25% for historical states o 2~4% for DASL indexes and hash pointers
  17. 17. Performance of Provenance Query • vs. Workaround 2 – Compute data provenance offline and conditionally trigger online transaction
  18. 18. Micro Performance of Provenance Query • vs. Workaround 1 – Dump everything into the current state • vs. Workaround 3 – Use Hyperledger Fabric’s built-in HistoryDB
  19. 19. Runtime Overhead • Transaction processing Hyperledger Fabric v0.6 Hyperledger Fabric v1.3
  20. 20. Storage Overhead
  21. 21. Conclusion • LineageChain – Enabler for provenance-dependent blockchain applications – Protocol-level enhancement w.r.t. efficiency and security – Negligible performance and storage overhead • Key designs – User-defined dependency specification – Merkle DAG with dependency tracking – DASL index to accelerate data provenance query – Adoption in Hyperledger Fabric (v0.6 & v1.3)
  22. 22. ThankYou!

×