Blockchain Safety and Smart Contract Simulation
Tezos Japan, DaiLambda, Inc.
Jun FURUSE/古瀬淳, Singapore, NUS, 2019-10-31, ASIASIM 2019
What is Blockchain?
Blockchain is DB
It is hard computer science!
Ledger style
Distributed
Open/permissionless network
Ledger Style DB
DB by accumulation of modification history:
$S_0 \xrightarrow{B_1} S_1 \xrightarrow{B_2} S_2 \xrightarrow{B_3} S_3 \xrightarrow{B_4} \cdots$
$S_0 = \emptyset$
$S_1 = B_1(S_0)$
$S_n = B_n(S_{n-1}) = B_n(B_{n-1}(\cdots(B_1(S_0))\cdots))$
A block $B$ consists of basic write operations $o$:
$B_n = [o_{n1}, \ldots, o_{nm}]$
$S_n = B_n(S_{n-1}) = o_{nm}(\cdots(o_{n1}(S_{n-1}))\cdots)$
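The two folds above can be sketched in OCaml. This is only an illustration under stated assumptions: the state is modelled as an association list, and the names `state`, `set`, `apply_block` and `replay` are hypothetical, not taken from any blockchain implementation.

```ocaml
(* Toy ledger: the state is a key/value association list (an assumption). *)
type state = (string * int) list

(* A basic write operation o maps a state to a new state. *)
type operation = state -> state

(* A block B_n = [o_n1; ..; o_nm] is an ordered list of operations. *)
type block = operation list

(* Hypothetical write operation: bind [key] to [value]. *)
let set key value : operation =
  fun s -> (key, value) :: List.remove_assoc key s

(* S_n = o_nm (.. (o_n1 S_{n-1}) ..) *)
let apply_block (s : state) (b : block) : state =
  List.fold_left (fun acc o -> o acc) s b

(* S_n = B_n (.. (B_1 S_0) ..), starting from the empty state S_0. *)
let replay (chain : block list) : state =
  List.fold_left apply_block [] chain

let chain = [ [ set "alice" 10 ]; [ set "bob" 5; set "alice" 7 ] ]
let final = replay chain
```

Replaying the same chain always yields the same state, which is what lets every node rebuild the DB from the block history alone.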
Distributed DB
Multiple nodes, each carries a replica of the DB:
Fault tolerance
Load balancing
Distributed DB: conflicts
How to resolve conflicts?
[Figure: two nodes receive conflicting writes, x = A and x = B; the replicas cannot tell whether x = A or x = B should win.]
Consensus Algorithm
Algorithm to resolve conflicts between nodes:
Termination
All non-faulty nodes eventually decide on a value.
Agreement
All nodes that decide do so on the same value.
Validity
The decided value must have been proposed by some node.
Bad news: in an asynchronous network where even one node may crash, no deterministic algorithm guarantees all three [FLP85].
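The three properties can be written as predicates over the outcome of a finished run. This is a minimal sketch, assuming a run is summarised as one optional decision per non-faulty node; the names `decisions`, `termination`, `agreement` and `validity` are hypothetical. Checking a snapshot only approximates termination, which is a statement about what eventually happens.

```ocaml
(* One entry per non-faulty node: Some v if it decided v, None otherwise. *)
type 'v decisions = 'v option list

(* The values that were actually decided. *)
let decided (ds : 'v decisions) : 'v list = List.filter_map Fun.id ds

(* Termination (on a snapshot): every non-faulty node has decided. *)
let termination (ds : 'v decisions) : bool = List.for_all Option.is_some ds

(* Agreement: all nodes that decided chose the same value. *)
let agreement (ds : 'v decisions) : bool =
  match decided ds with
  | [] -> true
  | v :: rest -> List.for_all (fun w -> w = v) rest

(* Validity: every decided value was proposed by some node. *)
let validity ~(proposed : 'v list) (ds : 'v decisions) : bool =
  List.for_all (fun v -> List.mem v proposed) (decided ds)
```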
Good news: Paxos
Consensus algorithm which may not terminate
but is OK in practice.
Nodes vote to reach an agreement:
Safe
Agreement and validity are always assured.
Not always Live
Termination is not assured. It may repeat votes infinitely.
Only for closed networks with fixed participants.
Closed Network DB
If you are happy with a closed world,
a normal distributed DB with Paxos is your friend.
No need for “Blockchain”.
Open Network DB
Permissionless public network,
where anyone can join by running a node:
Decentralized: no central servers, therefore neutral.
Distributed with redundancy: almost impossible to hack.
Low cost, compared to the existing social infrastructure.
Problem of Open Network Consensus
Consensus can be controlled or blocked
by a Sybil attack: the attacker produces fake
identities to dominate the network.
Satoshi Nakamoto and Bitcoin
Preventing Sybil attacks (unlimited identity forgery):
Require a real-world resource:
computation power (PoW)
Rewards: incentive to provide the
resource and behave honestly.
The rewards are recorded as tokens
in the DB.
Birth of cryptocurrency.
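A toy version of the proof-of-work idea: each block proposal has to come with a nonce that is costly to find but cheap to check, so forging many identities does not help without real computation. This sketch assumes a hex-prefix difficulty rule and uses MD5 from the OCaml standard library `Digest` module only to stay dependency-free; Bitcoin itself uses double SHA-256 and a numeric target. The names `mine` and `verify` are hypothetical.

```ocaml
(* True when [hex] starts with [difficulty] '0' characters. *)
let has_leading_zeros difficulty hex =
  String.length hex >= difficulty
  && String.sub hex 0 difficulty = String.make difficulty '0'

(* Hex digest of the block data concatenated with the nonce (MD5, toy only). *)
let digest data nonce =
  Digest.to_hex (Digest.string (data ^ string_of_int nonce))

(* Costly part: try nonces 0, 1, 2, .. until the digest meets the difficulty. *)
let mine ~difficulty data =
  let rec search nonce =
    if has_leading_zeros difficulty (digest data nonce) then nonce
    else search (nonce + 1)
  in
  search 0

(* Cheap part: a single hash checks the claimed nonce. *)
let verify ~difficulty data nonce =
  has_leading_zeros difficulty (digest data nonce)
```

Each extra leading zero multiplies the expected mining work by 16, while verification stays a single hash.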
Smart Contracts
DB can carry not only account balance,
but also any data, even program code.
Code attached to an account is executed when it
receives a transaction.
Automatic execution of financial and non-financial contracts.
A distributed application framework towards neutral,
public infrastructure, without any central authority.
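A minimal sketch of "code attached to an account is executed when it receives a transaction", assuming an account holds a balance, an integer storage and optional code. The record fields and the function `receive` are hypothetical and much simpler than any real smart contract platform.

```ocaml
(* An account: balance, storage, and optional code.
   The code maps (received amount, current storage) to the new storage. *)
type account = {
  balance : int;
  storage : int;
  code : (int -> int -> int) option;
}

(* Receiving a transaction credits the balance and, if code is attached,
   runs it to update the storage. *)
let receive (acc : account) (amount : int) : account =
  let balance = acc.balance + amount in
  match acc.code with
  | None -> { acc with balance }
  | Some f -> { acc with balance; storage = f amount acc.storage }

(* A plain account and a contract that counts incoming transactions. *)
let wallet = { balance = 0; storage = 0; code = None }
let counter = { balance = 0; storage = 0; code = Some (fun _ n -> n + 1) }
```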
Safety Requirements of Blockchain
Consequences of introducing token rewards
to an open distributed DB:
Speculative money flows in.
The system must be open source.
Security holes are immediate targets for theft.
A blockchain must have a very high level of security.
Safety of Smart Contracts
Bugs turn smart contracts into automatic stealing
machines.
Past incidents around smart contracts:
The DAO (3.6M ETH, 15% of circulation, 50M USD stolen)
Parity vulnerability #1 (150K ETH, 32M USD stolen)
Parity vulnerability #2 (513K ETH, 160M USD frozen)
High level of security is required here too.
Tezos Blockchain
3rd generation blockchain tech.
LPoS (Liquid Proof of Stake)
Eco-friendly, robust Sybil attack prevention
On chain governance
Protocol upgrades by on-chain proposals and voting
Safety first design
Proving safety properties by formal verification
Formal Verification
Mathematically prove good/bad software properties.
A proven property holds for all cases.
Methods: type systems, model checking, theorem provers.
Tests are important, but not fully trustworthy:
only a few corner cases can be checked.
No guarantee for the untested cases.
Tezos and Formal Verification
Protocol
Written in OCaml, good match with FV
Smart contract VM
Statically typed, purely functional stack VM,
with structured data such as pairs, lists, maps, etc.
Smart contract safety
Modeling of smart contract interpreter using Coq,
and correctness proofs of smart contracts using it.
Cryptographic primitives
Correctness proofs in the F* language.
Ledger file system
With correctness proofs.
Blockchain and Simulation
Some safety properties can be proven, but how safe a system is
cannot always be proven. Simulation is also important:
Consensus
P2P network
Smart contract
etc.
Writing to the DB
Sending an operation o to a node:
Transactions
Account creations, etc.
Operations spread into the P2P network.
[Figure: operations O1 and O2 propagate from node to node across the P2P network.]
Block creation
A block proposer chooses operations found in its node
and forms a block of them:
$B_n = [o_{n1}, \ldots, o_{nm}]$
The proposer releases the block to the network.
[Figure: a node holding O1 and O2 forms the block B1 = [O1, O2]; B then spreads to every node in the network.]
Smart Contract Simulation
A contract is a function:
$c : \mathrm{param} \to \mathrm{operation}$
Emulate the execution of $c(p)$ on top of $B_{n-1}$
by applying a block that contains only the call:
$B_n = [c(p)]$
$S_n = B_n(S_{n-1}) = c(p)(S_{n-1})$
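The simulation step can be sketched as applying a one-element block to the current state. This is an illustration only: the record `state`, the contract `c` (which adds its parameter to a counter) and the function `simulate` are hypothetical.

```ocaml
(* Hypothetical contract storage. *)
type state = { counter : int; balance : int }

(* An operation maps a state to a new state. *)
type operation = state -> state

(* Applying a block applies its operations in order. *)
let apply_block (s : state) (b : operation list) : state =
  List.fold_left (fun acc o -> o acc) s b

(* c : param -> operation. Here the call adds p to the counter. *)
let c (p : int) : operation = fun s -> { s with counter = s.counter + p }

(* Simulation: S_n = [c p] (S_{n-1}) = c p (S_{n-1}). *)
let simulate (s_prev : state) (p : int) : state = apply_block s_prev [ c p ]
```

Because the node never broadcasts this block, the simulation has no effect on the real chain.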
Simulation Can Differ from Reality
In reality, other operations can run before the call:
Simulation:
$B_n = [c(p)]$
$S_n = c(p)(S_{n-1})$
The call may be included not in $B_n$ but in $B_{n+1}$:
$S_{n+1} = c(p)(S_n)$ where $S_n = B_n(S_{n-1}) \neq S_{n-1}$
A malicious block proposer can insert his own operation $o'$ before
your call in $B_n$ to change its behaviour:
$S_n = [o', c(p)](S_{n-1}) = c(p)(o'(S_{n-1}))$
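A small, made-up example of the divergence: the caller simulates a purchase at the current price, but a proposer places a price change in front of the call. The contract `c`, the inserted operation `o'` and the numbers are all hypothetical.

```ocaml
(* Hypothetical storage: a unit price and the total paid so far. *)
type state = { price : int; paid : int }
type operation = state -> state

let apply_block (s : state) (b : operation list) : state =
  List.fold_left (fun acc o -> o acc) s b

(* c p: buy p units at whatever the price is when the call runs. *)
let c (p : int) : operation = fun s -> { s with paid = s.paid + (p * s.price) }

(* o': the operation inserted before the call; it doubles the price. *)
let o' : operation = fun s -> { s with price = s.price * 2 }

let s_prev = { price = 10; paid = 0 }

(* What the caller simulated: B_n = [c 3]. *)
let simulated = apply_block s_prev [ c 3 ]

(* What actually ran: B_n = [o'; c 3]. *)
let actual = apply_block s_prev [ o'; c 3 ]
```

Here the simulation predicts a payment of 30 while the real block charges 60, although the call itself is unchanged.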
Mitigations
Fail the call when the real state differs from the simulated one?
Doing this for every call does not scale.
For specific contracts, this behaviour can be useful, and
it can be implemented with a counter.
Or prove that the contract is immune to permutations
of operations, e.g. voting.
Or repeat simulations over variations of states:
Lattice model
Monte Carlo
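The counter-based guard can be sketched as follows, assuming the contract stores a counter that every successful call increments and that the caller passes the counter value seen during simulation. The names `guarded_call` and `expected_counter` are hypothetical.

```ocaml
(* Hypothetical storage: a call counter plus the contract's value. *)
type state = { counter : int; value : int }

(* The call carries the counter observed during simulation. If another
   operation touched the contract in between, the counters differ and
   the call fails instead of running on an unexpected state. *)
let guarded_call ~(expected_counter : int) (delta : int) (s : state)
    : (state, string) result =
  if s.counter <> expected_counter then Error "state changed since simulation"
  else Ok { counter = s.counter + 1; value = s.value + delta }
```

A failed call can simply be simulated again and resubmitted, which is acceptable for a specific contract but expensive if imposed on every call.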
Gas cost
Contracts are executed on each node.
Unrestricted execution, such as an infinite loop, easily
results in DDoS attacks.
To prevent DDoS, smart contract executions must pay a fee,
called gas, proportional to the amount of computation.
VM for Smart Contracts
Nodes and smart contracts run in various conditions
(CPU architecture, CPU model, memory, etc.).
To define and count gas, smart contracts are modeled
over a stack VM and interpreted:
PUSH, ADD, SUB, IF, ..
The gas cost is defined for each VM opcode.
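Per-opcode gas accounting can be illustrated with a tiny stack interpreter. This is a sketch under stated assumptions: the three opcodes and their costs are made up for the example and are unrelated to the real Tezos VM or its gas table.

```ocaml
(* A tiny stack VM with three opcodes. *)
type instr = Push of int | Add | Sub

(* Hypothetical gas cost of each opcode. *)
let cost = function Push _ -> 1 | Add -> 2 | Sub -> 2

exception Out_of_gas

(* Run [prog] on [stack] with a gas budget. Each opcode is charged before
   it executes. Returns the final stack and the remaining gas. *)
let rec run (gas : int) (stack : int list) (prog : instr list) : int list * int =
  match prog with
  | [] -> (stack, gas)
  | i :: rest ->
    let gas = gas - cost i in
    if gas < 0 then raise Out_of_gas;
    let stack =
      match (i, stack) with
      | Push n, s -> n :: s
      | Add, a :: b :: s -> (b + a) :: s
      | Sub, a :: b :: s -> (b - a) :: s
      | _ -> failwith "stack underflow"
    in
    run gas stack rest
```

Because the budget strictly decreases with every opcode, any program stops after a bounded number of steps on every node, whatever hardware the node runs on.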
How to Decide Gas Cost?
Hard to deduce. Too many variables:
CPU characteristics
Interpreter performance
GC timing
Argument sizes and structures
Simulations come in. (by Ilias Garnier)
Gas Cost Simulation in Tezos #1
Guess a formula for the gas cost of each opcode with
several unknown parameters:
ADD for arbitrary precision ints
Complexity: O(max(size(arg1), size(arg2)))
Formula: x + y ∗ max(size(arg1), size(arg2))
GET for map (by balanced tree)
Complexity: O(size(key) ∗ log(size(map)))
Formula: x + y ∗ size(key) ∗ log(size(map))
Gas Cost Simulation in Tezos #2-1
Run each opcode under various conditions to get
good approximations of the parameters:
ADD for arbitrary precision ints
Formula: x + y ∗ max(size(arg1), size(arg2))
Cost function: 300 + 0.07 ∗ max(size(arg1), size(arg2))
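One way to estimate the parameters x and y from benchmark runs is ordinary least squares on (size, time) samples. The slides do not say which fitting method was used, so this is an assumed technique; the function `fit` and the sample data are hypothetical, with the data generated to lie exactly on the line 300 + 0.07 ∗ n.

```ocaml
(* Ordinary least squares for the model t = x + y * n.
   [samples] is a list of (n, t) pairs: operand size and measured time.
   Needs at least two samples with distinct sizes. *)
let fit (samples : (float * float) list) : float * float =
  let m = float_of_int (List.length samples) in
  let sum f = List.fold_left (fun acc p -> acc +. f p) 0. samples in
  let sn = sum fst and st = sum snd in
  let snn = sum (fun (n, _) -> n *. n) in
  let snt = sum (fun (n, t) -> n *. t) in
  (* Slope y and intercept x from the normal equations. *)
  let y = ((m *. snt) -. (sn *. st)) /. ((m *. snn) -. (sn *. sn)) in
  let x = (st -. (y *. sn)) /. m in
  (x, y)

(* Synthetic samples lying on t = 300 + 0.07 * n. *)
let samples = [ (100., 307.); (1000., 370.); (10000., 1000.) ]
let x, y = fit samples
```

Real measurements are noisy, so the fitted values would only approximate the underlying cost, and a safety margin may be added on top.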
Gas Cost Simulation in Tezos #2-2
Run each opcode under various conditions to get
good approximations of the parameters:
GET for map (by balanced tree)
Formula: x + y ∗ size(key) ∗ log(size(map))
Defined cost: 1000 + 0.07 ∗ size(key) ∗ log(size(map))
Gas Cost Simulation in Tezos #3
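Boiled down to gas code in the VM interpreter:
(* Cost of ADD: a constant plus a term linear in the larger operand's byte size. *)
let add i1 i2 =
  atomic_step_cost
    (51 + (Compare.Int.max (int_bytes i1) (int_bytes i2) / 62))

(* Cost of map GET: a key-size factor times log2 of the map cardinality. *)
let map_access key (module Box) =
  let map_card = snd Box.boxed in
  let key_bytes = size_of_comparable Box.key_ty key in
  atomic_step_cost ((1 + (key_bytes / 70)) * log2 map_card)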
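The integer arithmetic of these two cost functions can be tried out in isolation. The sketch below is a stand-alone approximation, not the Tezos code: it takes byte sizes and the map cardinality as plain integers, returns a bare integer instead of calling `atomic_step_cost`, and uses a hypothetical `log2` defined as the floor of the base-2 logarithm, which may differ from the helper used in the interpreter.

```ocaml
(* Hypothetical stand-in: floor of log base 2, with log2 n = 0 for n <= 1. *)
let log2 (n : int) : int =
  let rec go acc n = if n <= 1 then acc else go (acc + 1) (n / 2) in
  go 0 n

(* ADD: constant 51 plus one unit per 62 bytes of the larger operand. *)
let add_cost (bytes1 : int) (bytes2 : int) : int =
  51 + (max bytes1 bytes2 / 62)

(* GET on a map: (1 + key_bytes / 70) scaled by log2 of the cardinality. *)
let map_access_cost ~(key_bytes : int) ~(map_card : int) : int =
  (1 + (key_bytes / 70)) * log2 map_card
```

Integer division keeps the cost computation itself cheap and deterministic, at the price of rounding the fitted floating-point coefficients.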
Conclusion
Blockchain: open distributed ledger DB.
Rewards are required to drive consensus.
Safety: Tezos pushes formal verification.
Simulations play important roles in many places.
