Fine-Grained, Secure and Efficient Data
Provenance on Blockchain Systems
Pingcheng RUAN, Gang CHEN, Tien Tuan Anh DINH,
Qian LIN, Beng Chin OOI, Meihui ZHANG
Blockchain Is a Class of Database
[Figure: diagram relating Database, Distributed Database, Distributed Transactional Systems, Blockchain and Bitcoin]
Blockchain Basics
• P2P network
– Asynchronous transaction
• Byzantine environment
– Mutually distrusting setup
• Distributed ledger
– Smart contract
• Inherently provenance-preserving
– ONLY for offline analytical query
[Figure: contract invocations carried by Txn 1 and Txn 2]
Token.Transfer(A, B, 10)
Token.Transfer(C, D, 20)
Token.Transfer(A, B, 20)
Motivation
• Expose provenance information to smart contracts, both efficiently and securely
• Enabler for provenance-dependent smart contracts
• Enrich the transaction semantics
Provenance-dependent Contracts
• Previous transfer precondition:
– Enough balance from the sender
– CURRENT STATE ONLY
• New transfer precondition:
– Historical balance > threshold
– Recipient has not transacted with certain blacklisted addresses recently
– HISTORICAL STATE & PROVENANCE INFO
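The difference between the two preconditions can be sketched in a few lines of Python. This is only an illustration: the `Ledger` class, the function names, and the reading of "historical balance > threshold" as "every balance since a given block stays above the threshold" are assumptions, not part of the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy in-memory ledger that keeps every committed balance (hypothetical)."""
    # history[account] is a list of (block_number, balance) in block order
    history: dict = field(default_factory=dict)

    def commit(self, account: str, block: int, balance: int) -> None:
        self.history.setdefault(account, []).append((block, balance))

    def current(self, account: str) -> int:
        versions = self.history.get(account, [])
        return versions[-1][1] if versions else 0

    def balances_since(self, account: str, from_block: int) -> list:
        return [bal for blk, bal in self.history.get(account, []) if blk >= from_block]


def can_transfer_old(ledger: Ledger, sender: str, amount: int) -> bool:
    # Previous precondition: looks at the current state only.
    return ledger.current(sender) >= amount


def can_transfer_new(ledger: Ledger, sender: str, amount: int,
                     threshold: int, from_block: int) -> bool:
    # New precondition (assumed reading): enough balance now, and every
    # balance recorded since from_block stayed above the threshold.
    recent = ledger.balances_since(sender, from_block)
    return (can_transfer_old(ledger, sender, amount)
            and bool(recent)
            and all(bal > threshold for bal in recent))
```

The second check cannot be answered from the latest balance alone, which is why the contract needs access to historical state.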
Workarounds
Workaround 1:
• Dump everything into current state
• Requires developer effort, expensive, error-prone
Workaround 2:
• Offline analytics + Online transactions
• Break of serializability
• Transaction-ordering attacks
Workaround 3:
• Minimum system instrumentation
• NOT protocol level (e.g., Hyperledger Fabric v1.0+)
• Data tampering
Holistic Approach:
• Protocol-level enhancement → Secure
• Performance-aware → Efficient
Account1_v1: 10
Account1_v2: 20
Account1_v3: 15
Account2_v2: 12
Challenges
• NO standardized operations
– Unlike systems with clearly-defined transformation semantics
– E.g., map and reduce in Hadoop
– E.g., select, join and aggregation in SQL
• Byzantine environment
– Tamper evidence
– Integrity proof
• Ever-growing ledger
– Gas mechanism
– Verifier’s dilemma
Block Structure
Block Header
– Prev Hash
– Txn Digest
– State Digest
– PoW Nonce
Txn List
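As a rough illustration of how these header fields chain blocks together, here is a minimal Python sketch. The field names, the JSON-based `digest` helper and the flat state hash are assumptions made for brevity; real systems use Merkle roots for the digests, and proof-of-work mining is omitted.

```python
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 over a canonical JSON encoding (an illustrative choice)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def make_block(prev_hash, txns, state, nonce=0):
    header = {
        "prev_hash": prev_hash,         # links to the previous block header
        "txn_digest": digest(txns),     # commits to the transaction list
        "state_digest": digest(state),  # commits to the latest state only
        "nonce": nonce,                 # PoW nonce, not mined in this sketch
    }
    return {"header": header, "txns": txns}


def chain_is_valid(chain) -> bool:
    for i, block in enumerate(chain):
        if block["header"]["txn_digest"] != digest(block["txns"]):
            return False  # transaction list was altered
        if i > 0 and block["header"]["prev_hash"] != digest(chain[i - 1]["header"]):
            return False  # link to the previous header is broken
    return True
```

Note that the state digest covers only the state after the block, which is the limitation the following slides address.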
Enhancement Basis (Merkle Tree Variants)
• Limitation
– Latest State only
• Tamper evidence
– Succinct digest (root hash)
– Integrity proof (access path)
(a) Merkle Patricia Trie
– Block header: Previous Block Hash, Transaction MPT Root Hash, Receipt MPT Root Hash, State MPT Root Hash = H(Z), Nonce
– Account addresses and associated balances in global state: 0xABC: 10, 0xABCD: 15, 0xABCE: 20, 0xBC: 25
– Trie nodes: Z (value nil; A: H(X), B: H(Y)), X (BC: H(V)), Y (C: 25), V (value 10; D: 15, E: 20)
(b) Merkle Bucket Tree
– Block header: Previous Block Hash, Transaction Root Hash, State Root Hash
– Updated keys of the form <Updated Chaincode ID>_<Key>: ccid1_k1, ccid2_k1, ccid3_k1, ccid3_k2
– Tree nodes A to G over a bucket list
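The two tamper-evidence properties, a succinct root digest and an integrity proof along the access path, can be sketched with a plain binary Merkle tree. This is deliberately simpler than either a Merkle Patricia Trie or a Merkle Bucket Tree; the function names and the leaf encoding are illustrative assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the single root."""
    levels = []
    level = [h(leaf) for leaf in leaves]
    while True:
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]  # duplicate the last node on odd levels
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof; compare with the digest."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

A verifier holding only the root hash from the block header can check one state entry with a proof whose length is logarithmic in the number of entries. As the slide notes, the root still commits to the latest state only.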
LineageChain Overview
Application Layer
• Provenance specification
– User-defined input-output dependency
• Provenance query handler
– Hist(stateID, [blockNum]) → (val, blkStart, txnID)
– Backward(stateID, blkNum) → List<(depStateID, depBlkNum)>
– Forward(stateID, blkNum) → List<(depStateID, depBlkNum)>
[Figure: inputs (InputID1, InputID2) and outputs (OutputID1, OutputID2, OutputID3) linked by backward and forward dependencies]
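The semantics of the three queries can be illustrated with a small in-memory store. This is a sketch under assumptions: the `ProvenanceStore` class and its `put` method are hypothetical, dependencies are resolved to the latest committed version of each input, and at most one version per state per block is expected.

```python
import bisect
from collections import defaultdict


class ProvenanceStore:
    """Toy store answering Hist, Backward and Forward (hypothetical)."""

    def __init__(self):
        self._versions = defaultdict(list)  # state_id -> [(blk_start, val, txn_id)]
        self._backward = defaultdict(list)  # (state_id, blk) -> [(dep_id, dep_blk)]
        self._forward = defaultdict(list)   # (state_id, blk) -> [(dep_id, dep_blk)]

    def put(self, state_id, blk, val, txn_id, deps=()):
        """Commit a version; deps are the state IDs read to produce it."""
        for dep_id in deps:
            _, dep_blk, _ = self.hist(dep_id)  # latest version of the input
            self._backward[(state_id, blk)].append((dep_id, dep_blk))
            self._forward[(dep_id, dep_blk)].append((state_id, blk))
        self._versions[state_id].append((blk, val, txn_id))

    def hist(self, state_id, blk=None):
        """Return (val, blkStart, txnID) of the version visible at blk."""
        versions = self._versions.get(state_id, [])
        if blk is None:
            idx = len(versions) - 1
        else:
            idx = bisect.bisect_right([v[0] for v in versions], blk) - 1
        if idx < 0:
            raise KeyError(f"no version of {state_id} at block {blk}")
        blk_start, val, txn_id = versions[idx]
        return val, blk_start, txn_id

    def backward(self, state_id, blk):
        _, blk_start, _ = self.hist(state_id, blk)
        return list(self._backward.get((state_id, blk_start), []))

    def forward(self, state_id, blk):
        _, blk_start, _ = self.hist(state_id, blk)
        return list(self._forward.get((state_id, blk_start), []))
```

Backward answers "which state versions produced this one", Forward answers "which state versions were derived from this one".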
Application Layer
Recipient -> Sender
Execution Layer
• Receive
– Contract invocation context
– Provenance specification
• Compute
– Transaction results
– Concrete dependency
• Prepare Merkle DAG
– Introduce one layer of indirection
– Hash reference to encode provenance backward dependency
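One way to picture hash references that encode backward dependencies is the sketch below. The entry layout (`state_id`, `blk`, `val`, `prev`, `deps`) and the JSON hashing are assumptions for illustration, not the actual LineageChain format.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


def make_entry(state_id, blk, val, prev_hash=None, dep_hashes=()):
    """A state entry that embeds the hashes of the entries it depends on."""
    return {
        "state_id": state_id,
        "blk": blk,
        "val": val,
        "prev": prev_hash,           # previous version of the same state
        "deps": sorted(dep_hashes),  # backward dependencies, by hash
    }


def verify(entry_h: str, store: dict) -> bool:
    """Check an entry and, recursively, everything it references."""
    entry = store.get(entry_h)
    if entry is None or entry_hash(entry) != entry_h:
        return False
    refs = list(entry["deps"])
    if entry["prev"] is not None:
        refs.append(entry["prev"])
    return all(verify(ref, store) for ref in refs)
```

Because each entry's hash covers the hashes of its dependencies, changing any ancestor invalidates every entry derived from it.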
Execution Layer
• Forward tracking
– Problem: Undecided forward dependency during state update
• Solution
– Lazily store forward dependency on the successor state entry
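The lazy scheme can be sketched as follows, assuming a simplified model in which transactions always read the latest version of their inputs. The `ForwardTracker` class and its field names are hypothetical.

```python
class ForwardTracker:
    """Toy state store that seals forward dependencies lazily (hypothetical)."""

    def __init__(self):
        self.entries = {}    # (state_id, blk) -> entry dict, never rewritten
        self.latest = {}     # state_id -> blk of the latest version
        self.successor = {}  # (state_id, blk) -> (state_id, next_blk)
        self.pending = {}    # state_id -> forward deps of the latest version

    def update(self, state_id, blk, val, inputs=()):
        # The new version is a forward dependency of each input's latest version.
        for inp in inputs:
            if inp not in self.latest:
                raise KeyError(f"unknown input state: {inp}")
            self.pending[inp].append((state_id, blk))
        prev_blk = self.latest.get(state_id)
        # The predecessor is now superseded, so its forward list is final:
        # store it on the successor entry instead of rewriting the old entry.
        sealed = tuple(self.pending.get(state_id, ()))
        self.entries[(state_id, blk)] = {
            "val": val, "prev_blk": prev_blk, "fwd_of_prev": sealed,
        }
        if prev_blk is not None:
            self.successor[(state_id, prev_blk)] = (state_id, blk)
        self.latest[state_id] = blk
        self.pending[state_id] = []

    def forward(self, state_id, blk):
        if self.latest.get(state_id) == blk:
            return list(self.pending[state_id])  # still open, not sealed yet
        succ = self.successor[(state_id, blk)]
        return list(self.entries[succ]["fwd_of_prev"])
```

The design choice shown here is that an already-written entry is never modified, so any hash computed over it stays valid.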
Storage Layer
• Problem
– Efficient version-based (historical) query for a state ID
• Solution:
– Deterministic Append-only Skip List
– Hash-based reference
After appending
versions 12 and 16
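A simplified skip list in this spirit is sketched below. It is not the exact DASL definition: the promotion rule (a node joins level i when its version falls in a different bucket of width base^i than that level's current tail), the `max_levels` cap and the digest format are assumptions chosen to keep the structure deterministic and append-only.

```python
import hashlib


class Node:
    def __init__(self, version, value, prev):
        self.version = version
        self.value = value
        self.prev = prev  # prev[i]: previous node in the level-i list, or None
        refs = ",".join(p.digest if p else "-" for p in prev)
        # Hash-based reference: the digest covers the referenced digests.
        self.digest = hashlib.sha256(f"{version}|{value}|{refs}".encode()).hexdigest()


class AppendOnlySkipList:
    def __init__(self, base=2, max_levels=8):
        self.base = base
        self.tails = [None] * max_levels  # last node of each level list

    def append(self, version, value):
        last = self.tails[0]
        if last is not None and version <= last.version:
            raise ValueError("versions must be strictly increasing")
        prev = []
        for level, tail in enumerate(self.tails):
            width = self.base ** level
            if tail is not None and tail.version // width == version // width:
                break  # same bucket as this level's tail: stop promoting
            prev.append(tail)
        node = Node(version, value, prev)
        for level in range(len(prev)):
            self.tails[level] = node
        return node

    def find(self, version):
        """Return the latest node with node.version <= version, or None."""
        node = self.tails[0]
        while node is not None and node.version > version:
            # Take the highest-level pointer that does not jump past the target.
            for cand in reversed(node.prev):
                if cand is not None and cand.version >= version:
                    node = cand
                    break
            else:
                node = node.prev[0]  # immediate predecessor is the answer
        return node
```

The search starts from the newest node, so queries for recent versions touch few nodes, while older versions are reached through the higher-level pointers.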
Evaluation
• MICRO benchmarking (vs. flat storage)
– Favors queries on recent versions (with DASL)
– More efficient BFS enabled by backtrack (with ForkBase)
• MACRO benchmarking (applied to Hyperledger Fabric v0.6 and v1.3)
– Negligible runtime overhead
o Tiny proportion of latency
– Negligible storage overhead
o >70% of space for blocks
o 25% for historical states
o 2~4% for DASL indexes and hash pointers
Performance of Provenance Query
• vs. Workaround 2
– Compute data provenance offline and conditionally trigger online transaction
Micro Performance of Provenance Query
• vs. Workaround 1
– Dump everything into the current state
• vs. Workaround 3
– Use Hyperledger Fabric’s built-in HistoryDB
Runtime Overhead
• Transaction processing
[Figure panels: Hyperledger Fabric v0.6; Hyperledger Fabric v1.3]
Storage Overhead
Conclusion
• LineageChain
– Enabler for provenance-dependent blockchain applications
– Protocol-level enhancement w.r.t. efficiency and security
– Negligible performance and storage overhead
• Key designs
– User-defined dependency specification
– Merkle DAG with dependency tracking
– DASL index to accelerate data provenance query
– Adoption in Hyperledger Fabric (v0.6 & v1.3)
Thank You!
