Blockchains are fault-tolerant open distributed databases with stored procedures. Because they are permissionless and need an internal currency to achieve data replication, they are prime targets for hackers. Formal verification techniques such as theorem proving are required to achieve financial-grade security in blockchains. This presentation shows the challenges in certifying blockchain components.
Uploaded on behalf of Diego.
2. Tezos seen from the application side
[Diagram: a desktop or web application (language independent) calls a Tezos node in the Tezos network by Remote Procedure Call (JSON over HTTP); the node exposes smart contracts as functions / stored procedures.]
For the application developer, a blockchain like Tezos is a
web service whose functions are called through a web API.
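As a minimal sketch of what "JSON over HTTP" can look like from the application side, the snippet below builds a query URL and decodes a canned response. The node address, the contract address and the helper names are hypothetical, and the path layout is an assumption about the node's RPC scheme that should be checked against the node's own documentation. No network call is made.

```python
import json
from urllib.parse import quote

NODE_URL = "http://localhost:8732"  # hypothetical node address


def balance_url(node_url: str, contract: str) -> str:
    """Build the URL of a balance query (the path layout is an assumption)."""
    return (f"{node_url}/chains/main/blocks/head"
            f"/context/contracts/{quote(contract)}/balance")


def parse_balance(body: str) -> int:
    """Decode the JSON body; the amount is assumed to arrive as a JSON string."""
    return int(json.loads(body))


url = balance_url(NODE_URL, "KT1ExampleContractAddress")  # placeholder address
canned_response = '"1500000"'  # stand-in for the HTTP response body
balance = parse_balance(canned_response)
```

In a real application the body would come from an HTTP client (for example `urllib.request.urlopen(url).read()`), which is why any language with an HTTP and JSON library can play the role of the application.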
3. Tezos seen from the core developer side
P2P layer
Distributed computation
Distributed database
Virtual machine
Contracts
Languages
[Diagram annotations: two calls, x = f (3) and x = f (4), with f(3) -> 3 and f(4) -> 4; which one is written, x = 3 or x = 4 ?]
Safe and fast computation of f
Validation that the business logic of f is correct
Expression of business logic of f
For the core developer, Tezos is a very complex piece of software.
4. Challenges in the P2P layer
Communication protocols are difficult to code
Error-prone code: the Tezos network was down for a couple
of hours because messages containing nothing but zeros were sent
Attacks: the system needs to be protected against
impersonation, message deletion and DDoS
Optimization: protocols need to be optimized for the type
of messages that are sent, to reduce bandwidth
The challenge is to automatically generate optimized code from a
specification proven correct.
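To illustrate why hand-written message decoding is error prone, here is a minimal sketch of a defensive decoder. The wire format (a 2-byte big-endian length prefix followed by the payload), the size limit and all names are hypothetical and are not the Tezos protocol; the rule rejecting all-zero payloads is only an illustration and does not describe the actual incident.

```python
import struct

MAX_PAYLOAD = 1024  # hypothetical upper bound on payload size


class InvalidMessage(Exception):
    """Raised when a frame violates the (hypothetical) wire format."""


def encode(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 2-byte big-endian integer."""
    if not 0 < len(payload) <= MAX_PAYLOAD:
        raise InvalidMessage("payload size out of range")
    return struct.pack(">H", len(payload)) + payload


def decode(frame: bytes) -> bytes:
    """Validate a frame and return its payload, rejecting malformed input."""
    if len(frame) < 2:
        raise InvalidMessage("frame shorter than the length prefix")
    (length,) = struct.unpack(">H", frame[:2])
    if length == 0 or length > MAX_PAYLOAD:
        raise InvalidMessage("declared length out of range")
    payload = frame[2:]
    if len(payload) != length:
        raise InvalidMessage("declared length does not match payload")
    if not any(payload):
        raise InvalidMessage("payload contains nothing but zero bytes")
    return payload
```

Every check above is a property that a generated decoder could inherit from a verified specification instead of relying on the programmer to remember it.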
6. Distributed computation
Broadcast the function calls, elect a leader to compute and
broadcast the results ?
[Diagram: the calls f (3) and f (4) propagate between the nodes of the network; the requests are x = f (3) and x = f (4).]
7. Distributed computation
Broadcast the function calls and compute on every node ?
[Diagram: the calls f (3) and f (4) propagate between the nodes of the network; the requests are x = f (3) and x = f (4).]
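A minimal sketch of the "compute on every node" option, under the assumption that f is deterministic and that each call performs x = f(arg). The `Node` class and the delivery orders are hypothetical; f is the identity, matching f(3) -> 3 and f(4) -> 4.

```python
from typing import List, Optional


def f(n: int) -> int:
    """Stand-in for the contract function: f(3) -> 3 and f(4) -> 4."""
    return n


class Node:
    """A node that recomputes every broadcast call locally."""

    def __init__(self) -> None:
        self.x: Optional[int] = None

    def apply(self, calls: List[int]) -> None:
        # Each call performs x = f(arg); the last call applied wins.
        for arg in calls:
            self.x = f(arg)


calls = [3, 4]

# Same delivery order on every node: all replicas end in the same state.
same_order = [Node() for _ in range(3)]
for node in same_order:
    node.apply(calls)

# One node receives the calls in the opposite order: the replicas diverge.
mixed_order = [Node() for _ in range(3)]
for i, node in enumerate(mixed_order):
    node.apply(calls if i < 2 else list(reversed(calls)))

states_same = {node.x for node in same_order}
states_mixed = {node.x for node in mixed_order}
```

Recomputing everywhere avoids trusting a leader, but the replicas only agree if they also agree on the order of the calls, which is the consensus problem.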
8. Distributed computation
Challenges of distributed computations
Understand the consequences of each algorithm on the system
(liveness, latency, etc.)
Formally prove the properties of the algorithms and their
variants
Automatically generate the code and its dependencies with
other layers (e.g. P2P layer for information broadcast) from a
specification
9. Distributed database
The state of the database needs to be consistent across all
nodes: one of the transactions (x = 3 or x = 4) needs to be
discarded. The consensus algorithm makes sure that the network
converges on a single value.
x = 3 or x = 4 ?
10. Distributed database
Some results about consensus algorithms
Deterministic consensus in an asynchronous network where even one
node may crash is impossible [FLP 85] ⇒ consensus "algorithms" may not terminate.
Crash and hack tolerance ⇒ nodes need to "vote" and use
some form of majority rule to reach consensus
Paxos [Lamport 90] works well for closed networks : flexible,
variants for crash and hack tolerance, formally verified
Open networks are subject to sybil attacks : the majority rule
can be tricked by creating fake nodes controlled by an attacker
Nakamoto idea [2008]: in open networks, replace majority rule
by majority of computing power rule
Participating in the network consensus costs money
(computing power to create the data to be agreed upon)
Reaching consensus is rewarded with money (coins directly in
the data to be agreed upon)
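The difference between the two majority rules can be sketched with a toy tally. All node names and computing-power figures are made up for illustration, and `winner` is a hypothetical helper, not part of any consensus implementation.

```python
from collections import defaultdict


def winner(votes):
    """Return the value with the highest total weight.

    `votes` is an iterable of (value, weight) pairs.
    """
    totals = defaultdict(int)
    for value, weight in votes:
        totals[value] += weight
    return max(totals, key=totals.get)


# (node name, proposed value, computing power); all numbers are made up.
honest = [("h1", 3, 10), ("h2", 3, 10), ("h3", 3, 10)]
# The attacker spreads a total power of 5 over five fake identities.
sybils = [(f"s{i}", 4, 1) for i in range(5)]
nodes = honest + sybils

# One node, one vote: creating identities is free, so the attacker wins.
one_node_one_vote = winner((value, 1) for _, value, _ in nodes)

# Votes weighted by computing power: fake identities add no weight.
by_computing_power = winner((value, power) for _, value, power in nodes)
```

With these numbers the per-node tally elects the attacker's value 4 (five votes against three), while the power-weighted tally keeps the honest value 3 (30 against 5).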
11. Distributed database
Nakamoto conjecture
Consensus algorithm in an open network ⇒ internal currency
We wanted a consensus algorithm for an open distributed
database. We ended up with a currency and all the problems that
go with it.
All the hackers of the world will try to attack the system
All the speculators of the world will try to speculate on the
system coin
The challenges are to understand distributed consensus algorithms,
formally prove their properties and generate the corresponding
code from a formal specification.
12. Blockchains
Blockchain
Open distributed hack-tolerant databases with stored procedures
are called blockchains.
Until someone finds better consensus algorithms, replication of the
database across the network requires an internal currency to
reward nodes for reaching consensus.
The challenges are
Financial-systems grade security
Design a consensus algorithm that doesn’t require creating a
currency
13. Virtual machine
Poorly designed virtual machines are attack vectors for
blockchains.
EVM                 WASM                Michelson
256-bit ints        32 / 64-bit ints    Infinite precision ints
No data structures  No data structures  Persistent sets, maps, lists
Side effects        Side effects        No side effects
Purpose made        Standard            Purpose made
Ethereum            Dfinity             Tezos
The challenge is to understand the semantics of programs in
bytecode of unknown origin and prove properties about them,
so that users can verify that the third-party program they are about to call
does what it claims to do.
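The integer row of the table has direct consequences for program semantics. The sketch below contrasts addition on fixed-width 256-bit words with addition on arbitrary-precision integers; it models only the arithmetic, not either virtual machine, and the function names are hypothetical.

```python
WORD_BITS = 256
MODULUS = 1 << WORD_BITS


def add_wrapping(a: int, b: int) -> int:
    """Addition on fixed-width 256-bit words: the result wraps around."""
    return (a + b) % MODULUS


def add_unbounded(a: int, b: int) -> int:
    """Addition on arbitrary-precision integers: the result is exact."""
    return a + b


max_word = MODULUS - 1
wrapped = add_wrapping(max_word, 1)   # wraps around to 0
exact = add_unbounded(max_word, 1)    # exactly 2**256
```

A proof about a program that adds balances has to account for the wrap-around in the first model, whereas in the second the mathematical sum and the computed sum coincide.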
14. Contracts
Blockchains are slow by nature. As a result, everything that can
be computed off-chain should be computed off-chain.
Bad
Contract : x -> sqrt x
App : () -> f 9
Good
Contract : x y -> if x * x = y then x else fail
App : () -> f (sqrt 9) 9
The better contract only checks a multiplication instead of
computing a square root with Newton's method.
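The two variants can be sketched in Python as follows. The function names and the `ContractFailure` exception are hypothetical, the contracts are ordinary functions rather than on-chain code, and `math.isqrt` stands in for the Newton iteration mentioned on the slide.

```python
import math


class ContractFailure(Exception):
    """Models the contract aborting (the slide's `fail`)."""


def contract_bad(y: int) -> int:
    """Computes the root on-chain: every node repeats the expensive work."""
    return math.isqrt(y)


def contract_good(x: int, y: int) -> int:
    """Only checks the candidate root: one multiplication, one comparison."""
    if x * x == y:
        return x
    raise ContractFailure("x is not a square root of y")


def app(y: int) -> int:
    """Off-chain application: computes the root, then calls the contract."""
    x = math.isqrt(y)
    return contract_good(x, y)


result = app(9)
```

The contract does not need to trust the application: a wrong candidate such as `contract_good(2, 9)` is rejected by the check.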
15. Contracts
There are many computations that are faster to verify than to
compute
Sorting an array
Computing an algebraic number
Solving a SAT problem
Solving an NP-complete problem
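The SAT case can be sketched as follows: checking a proposed assignment takes time linear in the size of the formula, while the naive search below may try every assignment. The formula encoding and the function names are assumptions made for this illustration.

```python
from itertools import product

# A CNF formula is a list of clauses; a clause is a list of non-zero ints.
# Literal k means "variable k is true", -k means "variable k is false".


def verify(cnf, assignment):
    """Check an assignment in time linear in the size of the formula."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def solve(cnf, n_vars):
    """Brute-force search: tries up to 2**n_vars assignments."""
    for values in product([False, True], repeat=n_vars):
        assignment = dict(enumerate(values, start=1))
        if verify(cnf, assignment):
            return assignment
    return None


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
cnf = [[1, -2], [2, 3], [-1, -3]]
solution = solve(cnf, 3)          # off-chain: expensive search
accepted = verify(cnf, solution)  # on-chain: cheap check
```

Only `verify` would need to run on-chain; `solve` can be replaced by any solver, since its answer is checked rather than trusted.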
The challenges are
Convert programs for the blockchain into a fast off-chain
computation and a slow on-chain verification.
Transform the blockchain VM into the minimal language
that allows verifying the properties of the programs we are
interested in.
16. Languages
The best language to describe the business logic in a
smart-contract depends on what you believe a smart-contract
should be and who is going to write it.
Ethereum                        Tezos
Universal distributed computer  Automate simple (legal) contracts
Written in JavaScript / Python  Written in DSLs
By any developer                By specialized developers
The challenges are
Ensure contract semantics correspond to user expectation
Prove contract-specific properties
Certify their compilation end-to-end
Design DSLs for specific contract types
17. Cryptography
Financial-system grade security is supported by cryptography
Certified cryptographic primitives (HACL* in F*)
Signed messages
Zero-knowledge proofs
Anonymous transactions
Interoperability with other systems
18. Zero-knowledge proofs
Zero-knowledge proofs allow proving that a given computation was
performed, without disclosing the details of the computation
The computation is mapped to a mathematically equivalent problem
(e.g. an instance of SAT) whose solution can be verified but cannot
easily be traced back to the original problem
Current state of the art: zk-SNARKs and zk-STARKs
Allows interoperation of existing systems with the blockchain,
because existing systems can generate a proof that the operations
they performed were properly done without revealing details.
The proof can be checked by the blockchain before synchronizing
accounts.
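To give a feel for proving something without disclosing it, here is a toy version of the Schnorr identification protocol, a classic interactive proof of knowledge that is zero-knowledge against an honest verifier. It is not a zk-SNARK or a zk-STARK, it proves knowledge of a secret exponent rather than the correctness of a general computation, and the group parameters are far too small to be secure; all names are hypothetical.

```python
import secrets

# Toy parameters, far too small to be secure: G has prime order Q modulo P.
P, Q, G = 23, 11, 2


def commit():
    """Prover: pick a random nonce r and publish t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def respond(r, secret, challenge):
    """Prover: answer the challenge; r masks the secret in the response."""
    return (r + challenge * secret) % Q


def verify(public, t, challenge, s):
    """Verifier: accept iff G^s == t * public^challenge (mod P)."""
    return pow(G, s, P) == (t * pow(public, challenge, P)) % P


secret = 7                   # known only to the prover
public = pow(G, secret, P)   # published

r, t = commit()
challenge = secrets.randbelow(Q)   # chosen by the verifier
s = respond(r, secret, challenge)
ok = verify(public, t, challenge, s)
```

The verifier only ever sees `t`, `challenge` and `s`, yet the check succeeds exactly when the response was built from the exponent behind `public`; a tampered response such as `(s + 1) % Q` is rejected.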
19. Zero-knowledge proofs
The challenges are
End-to-end certified zero-knowledge proofs
zk-STARKs and zk-SNARKs use complex computer algebra
running on clusters of computers
Domain-specific zero-knowledge proofs
zk-STARKs and zk-SNARKs are generic and very complex
Can domain-specific zero-knowledge proofs be simpler?