Mike Slinn discusses how machine learning could be used with smart contracts in the future. In 10 years, ML will likely be commonly used with smart contracts to do things like detect fraud, optimize transactions, and provide automated customer service. However, ML computation would need to be done off-chain due to the significant resources required. Oracles could also incorporate ML to provide information to smart contracts. While Solidity is currently the main language for Ethereum smart contracts, it has security and other issues. Better options for smart contract languages may exist in the future.
1. Smart Contracts that Learn
(Extended Version)
Mike Slinn
June 29, 2018
IBM Ottawa
2. About Mike Slinn
• Distinguished engineer
• Contributor to Ethereum Java and Scala libraries
• Operates ScalaCourses.com
• Author of EmpathyWorks (artificial personality)
• Expert witness
• Twitter: mslinn
3. Key Facts about Mike Slinn
• Focuses on generating business value by
applying people, process and technology
• Wrote 3 books on distributed computing
• Created hundreds of online lectures on
advanced computing concepts
• Uses many computer languages (“polyglot”)
5. The most popular language or framework used
to write smart contracts in 10 years probably
has not been conceived of yet.
-- Mike Slinn, March 3, 2018 --
6. Machine Learning & Smart Contracts
• Machine learning (ML) will be commonly used
with smart contracts in 10 years.
• Highly specialized languages will evolve to
define and verify contracts in various industries.
• ML will have features added to guarantee
deterministic results for decentralized apps.
• Security will surely have to improve
7. Espionage
• In 10 years' time, corporate and nation-state
espionage will include:
o Sabotaging training data
o Spoofing smart contract events
o Seeding smart contract templates with security
weaknesses
o … and lots more …
10. Confession
On-chain smart contracts cannot actually learn
However, these can learn:
• User interfaces for on-chain distributed applications
(ĐApps)
• Oracles
• Gateway contracts
• Off-chain smart contracts (systems integrated via
json-rpc, IPC and other mechanisms)
12. Beware Non-Deterministic Behavior
• Blockchain requires determinism for consensus
• An ML-driven product may not have deterministic
behavior and may produce counter-intuitive
results
• A personalized recommender system may
produce different results for a user action after
learning additional preferences
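A minimal sketch of the recommender bullet above, assuming a hypothetical ToyRecommender that simply suggests the category a user has picked most often. It is not a real recommender; it only shows that the same request can return a different answer once the model has learned something new, which is the behavior consensus cannot tolerate.

```python
from collections import Counter


class ToyRecommender:
    """Hypothetical recommender: suggests the most frequently chosen category."""

    def __init__(self, default: str) -> None:
        self.default = default
        self.seen = Counter()

    def learn(self, category: str) -> None:
        # Each observed preference changes the internal (learned) state.
        self.seen[category] += 1

    def recommend(self) -> str:
        if not self.seen:
            return self.default
        # Highest count wins; ties are broken alphabetically so the
        # result depends only on the learned state.
        return min(self.seen, key=lambda c: (-self.seen[c], c))


rec = ToyRecommender(default="books")
before = rec.recommend()   # answer before any learning
rec.learn("music")
after = rec.recommend()    # same request, different learned state
```

Two nodes whose copies of the model had seen different histories would return different values for the same call, so they could not agree on the output.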
13. ML is Done Offchain
ML is currently centralized offchain, because
significant computation and storage are
required
• Centralized, offchain processes can respond
to onchain events and initiate activity via
json-rpc or IPC to Ethereum clients
• ML likely not decentralized for several years
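As a rough illustration of how an offchain process might talk to an Ethereum client over json-rpc, the sketch below builds request bodies for two standard methods, eth_getLogs and eth_blockNumber, and parses a block-number response. The helper names (rpc_request, get_logs_request, parse_block_number) are made up for this example, and actually posting the body to a client such as geth is left out so the code runs without a node.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator


def rpc_request(method: str, params: list) -> str:
    """Serialize a JSON-RPC 2.0 request body."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    )


def get_logs_request(contract_address: str, from_block: int, to_block: int) -> str:
    """Body for eth_getLogs: ask for events emitted by one contract."""
    log_filter = {
        "address": contract_address,
        "fromBlock": hex(from_block),  # quantities are hex-encoded strings
        "toBlock": hex(to_block),
    }
    return rpc_request("eth_getLogs", [log_filter])


def parse_block_number(response_body: str) -> int:
    """eth_blockNumber returns the chain height as a hex string."""
    return int(json.loads(response_body)["result"], 16)
```

An offchain ML service could poll with these requests, feed the returned events to its model, and then send a transaction back through the same client.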
14. ‘(De)centralization’ is Misleading
• Decentralization uses consensus before
output is accepted from multiple EVMs, which
then leads to transactions.
• Centralization does not use consensus; note
that the web3 definition of centralization
includes federated systems and distributed
applications that employ DNS with
geolocation routing and failover.
16. Logical Clocks
• Distributed systems experience effects akin to
Einstein’s relativity: because signals travel no
faster than light, nodes can observe the same
events in different orders
• If ML systems were enhanced with logical clocks
that tag each learned state with a global
timestamp value, they could become
deterministic.
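One way to picture the idea is a Lamport-style logical clock combined with snapshots of the learned state. The sketch below is an assumption-laden toy: VersionedModel is hypothetical, and its "learning" is just a running mean. The point is only that asking for the state as of a given logical timestamp always returns the same answer, no matter how much the model has learned since.

```python
from bisect import bisect_right


class LamportClock:
    """Logical clock: counts events instead of reading wall-clock time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Local event: advance by one.
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        # Message from another node: jump past whichever clock is ahead.
        self.time = max(self.time, remote_time) + 1
        return self.time


class VersionedModel:
    """Hypothetical model; the learned state is a running mean of inputs."""

    def __init__(self) -> None:
        self.clock = LamportClock()
        self.total = 0.0
        self.n = 0
        self.times = []      # logical timestamps, strictly increasing
        self.snapshots = []  # learned state recorded at each timestamp

    def learn(self, value: float) -> int:
        t = self.clock.tick()
        self.total += value
        self.n += 1
        self.times.append(t)
        self.snapshots.append(self.total / self.n)
        return t

    def state_at(self, t: int) -> float:
        """Learned state as of logical time t (latest snapshot at or before t)."""
        i = bisect_right(self.times, t)
        if i == 0:
            raise ValueError("no state learned at or before that time")
        return self.snapshots[i - 1]
```

A caller that pins its query to a timestamp, for example model.state_at(2), gets a reproducible result even after later calls to learn.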
18. Dogma Is Bad For Business
• The degree of centralization is mostly a
business decision
• System integration strategies
o ML systems can indirectly interact with the blockchain
using json-rpc or IPC to an Ethereum client such as
geth or Parity
o Native apps can combine ML with blockchain
• Solidity is suboptimal
19. Issues With Solidity:
• Primitive type system.
• Compiler bugs (more surely exist).
• Few software tools available.
• Expensive to work with
o Hard to hire for
o Low productivity
o High risk
• Very expensive to maintain.
• Shelf life for this technology will be short.
20. Avoid Solidity If Possible
• Write the smart contract in the language of your
choice, and use json-rpc calls as desired.
• Resulting code will be well understood by all.
• Audits will be more reliable.
• Costs will be much less using a common
language instead of Solidity.
• Talent will be much easier to find.
21. Ethereum Is Not Symmetric
• Onchain smart contracts are distributed and
use consensus
• Offchain smart contracts (using Ethereum
clients) are currently centralized and so do
not use consensus
• This will likely evolve over the next several
years
22. ChainLink (1/2)
• Wraps existing singleton services and
presents them as decentralized oracles
• Run by a for-profit organization
• See smartcontract.com
23. ChainLink (2/2)
• Does not address the motivations for
decentralization:
o Fault tolerance (no single point of failure)
o Attack resistance
o Collusion resistance
• Wrapping a singleton and presenting it in a
decentralized manner does not make the
singleton decentralized
24. Transpiling
• Process of converting a program written in one
language into another language.
• Solidity could be transpiled to json-rpc calls from
node.js and JVM languages (Java, Scala,
JRuby, Jython, Groovy, etc).
• … but don’t bother because you endure all the
problems with Solidity and get none of the
benefits of native contracts
26. Self-Optimizing Contracts
• Optimize transactions for greatest margin,
minimal waste, constant deal flow, or other
criteria
• Results would improve over time
• This is an onchain example
28. Fraudulent Event Detection
• Smart contracts currently act on all events
• Fraud detection often employs machine
learning
• Incorporating ML into smart contracts could
make them resistant to fraud
• This is an onchain example
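A small sketch of the filtering idea: score each incoming event before acting on it, instead of acting on everything. A simple z-score over past amounts stands in here for a trained model, and the class name, threshold, and history size are all assumptions made for illustration. Where such a filter would run (gateway contract, oracle, or client) is left open.

```python
from statistics import mean, pstdev


class AmountAnomalyFilter:
    """Hypothetical filter: rejects amounts far from the history seen so far."""

    def __init__(self, threshold: float = 3.0, min_history: int = 5) -> None:
        self.threshold = threshold      # allowed distance in standard deviations
        self.min_history = min_history  # accept everything until this many samples
        self.history = []

    def is_suspicious(self, amount: float) -> bool:
        if len(self.history) < self.min_history:
            return False  # not enough data to judge yet
        mu = mean(self.history)
        sigma = pstdev(self.history)
        if sigma == 0:
            return amount != mu
        return abs(amount - mu) / sigma > self.threshold

    def observe(self, amount: float) -> bool:
        """Return True if the event should be acted on.

        Only accepted events are added to the history, so a rejected
        outlier does not shift the baseline.
        """
        if self.is_suspicious(amount):
            return False
        self.history.append(amount)
        return True
```

Feeding it amounts such as 10, 12, 11, 9, 10, 11 and then 500 would accept the first six and reject the last.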
30. Automated Customer Service Agents
• Chatbots and voice interfaces (Alexa)
• Much more natural to use
• Can be built into devices
• This is an onchain example
32. Medical Diagnosis Expert System
• Smart contract mediates access to an expert system
(oracle, incorporates machine learning)
• Accepts anonymous patient data
• Passes data to an expert system that performs analysis
• Returns diagnostic results
• Charges for the service
• This is an offchain example
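The flow on this slide can be sketched in a few lines. DiagnosisGateway below is a hypothetical stand-in for the mediating contract, and toy_expert_system is a placeholder rule rather than a real model; account names, the fee, and the data fields are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DiagnosisGateway:
    """Hypothetical stand-in for the contract that mediates access."""

    expert_system: Callable[[Dict[str, float]], str]  # the offchain oracle
    fee: int
    balances: Dict[str, int] = field(default_factory=dict)
    revenue: int = 0

    def deposit(self, account: str, amount: int) -> None:
        self.balances[account] = self.balances.get(account, 0) + amount

    def diagnose(self, account: str, patient_data: Dict[str, float]) -> str:
        # Accepts anonymous data: only an account id, no patient identity.
        if self.balances.get(account, 0) < self.fee:
            raise ValueError("insufficient balance for the service fee")
        result = self.expert_system(patient_data)  # analysis happens offchain
        self.balances[account] -= self.fee         # charge once a result exists
        self.revenue += self.fee
        return result


def toy_expert_system(data: Dict[str, float]) -> str:
    """Placeholder rule for illustration only; not a medical model."""
    return "fever" if data.get("temperature_c", 0.0) >= 38.0 else "no finding"
```

In a real deployment the call to expert_system would cross the chain boundary through an oracle, which is where the learning would live.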
34. Supply Chain
• Native application (JVM, .NET, C++, whatever)
uses json-rpc to interact with the blockchain
• Solidity is not required
• This is an offchain example
37. Traditional Contracts…
• Outline the terms of a relationship
• According to a specific jurisdiction
• So that the specified government can enforce
the terms
38. Smart Contracts…
• Enable rule-based autonomous actions in
response to events.
• Work within and between organizations and
the rest of society, world-wide.
39. Smart Contracts Threaten Tradition
• Enforce a relationship with cryptographic
code
• Without regard to the jurisdiction of any
government
40. Smart Contracts …
• Also known as cryptocontracts
• Are computer programs
• Directly control the transfer of blockchain-based
digital currencies or assets
• Define the rules and penalties for an agreement
• Might automatically enforce those obligations
41. System Integration
• Collecting inputs and outputs to smart
contracts requires system integration.
• Approaches vary according to input sources,
output destinations, volumes of data, required
reliability, fairness (near-constant latency),
etc.
• Beware of introducing single points of failure
42. Smart Contract Capabilities
Manage relationship between parties by:
• Maintaining virtual ledgers
• Reading/writing arbitrary on-chain data
• Reading/writing off-chain data
• Forwarding events to other contracts
• Acting as a software library
44. Oracles
• Smart contracts need oracles to resolve details
that cannot be precisely known at the time the
contract is written.
• Oracles provide reference information for smart
contracts.
• An oracle is usually a (singleton) REST API
connected to a data source.
• Using oracles generally decreases security
46. Ethereum Virtual Machine (EVM)
• Deterministic.
• Each Ethereum node has an EVM instance.
• EVM has a similar execution model to both the
Java and the .NET virtual machines.
• All these VMs are stack machines executing
bytecode.
• EVM adds storage and its bytecode is somewhat
more suited for contracts.
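To make "stack machine executing bytecode" concrete, here is a toy interpreter for four opcodes. The opcode values match the EVM's (STOP 0x00, ADD 0x01, MUL 0x02, PUSH1 0x60) and arithmetic wraps at 256 bits as on the EVM, but everything else is simplified: there is no gas, memory, or storage, so treat it as a sketch of the execution model rather than an EVM.

```python
STOP, ADD, MUL, PUSH1 = 0x00, 0x01, 0x02, 0x60
WORD = 2 ** 256  # stack words are 256 bits wide


def run(bytecode: bytes) -> list:
    """Execute a tiny subset of EVM-style bytecode and return the final stack."""
    stack = []
    pc = 0  # program counter
    while pc < len(bytecode):
        op = bytecode[pc]
        pc += 1
        if op == STOP:
            break
        elif op == PUSH1:
            stack.append(bytecode[pc])  # next byte is the literal to push
            pc += 1
        elif op == ADD:
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) % WORD)
        elif op == MUL:
            a, b = stack.pop(), stack.pop()
            stack.append((a * b) % WORD)
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


# (2 + 3) * 4, written in postfix order as a stack machine expects
program = bytes([PUSH1, 2, PUSH1, 3, ADD, PUSH1, 4, MUL, STOP])
```

Running the same program always produces the same stack, which is the determinism that lets every node reach the same result.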
47. Interesting Smart Contract Languages
All compile to EVM bytecode except Chaincode and Pact
• AxLang
• Chaincode (Hyperledger)
• LLL
• Pact (Kadena)
• Solidity
49. Solidity
• Solidity contracts are difficult to secure.
• Formal verification could help.
• Most Solidity contracts ignore security
recommendations.
• Solidity’s support for types is rather primitive.
50. Solidity and Security
• Contracts written in Solidity are difficult to
make secure.
o Formal verification could help.
• Solidity’s support for types is primitive
53. AxLang
• Smart contract language designed to support
formal verification.
• Cross-compiled Scala DSL for Ethereum
• Designed to scale
• Not yet ready
54. Pact / Kadena
• Functional, interpreted Lisp-like syntax
• Features type inference
• Similar to database stored procedures in an
online transaction processing (OLTP) system
• Not Turing complete
• Runs on the Kadena blockchain
57. Some json-rpc Libraries
• web3.js – for node.js
o Can also transpile Solidity to JavaScript
• web3j – for Java
o Can also transpile Solidity to Java
• web3j-scala – for Scala
o (I wrote this one)
o Idiomatic Scala wrapper around web3j
59. How do Computers Learn?
• Trial and error with feedback
• Training
60. Types of Machine Learning
• Association rule
• Artificial neural networks
• Bayesian networks
• Clustering
• Decision tree
• Deep
• Genetic algorithms
• Inductive logic programming
• Classifier systems
• Reinforcement
• Representation
• Rule-based
• Similarity and metric
• Sparse dictionary
• Support vector machines
61. How Might Smart Contracts Learn?
• “Learning” computation must occur off-chain
• Enforced by the Ethereum fee structure
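A back-of-the-envelope calculation shows why the fee structure pushes learning offchain. The constants below are assumptions for illustration: 20,000 gas to write a new storage slot, 5 gas per MUL, 3 gas per ADD, and a block gas limit of roughly 8 million, which was about the mainnet figure in mid-2018. The model size is also made up.

```python
SSTORE_NEW_SLOT_GAS = 20_000  # assumed cost to write a non-zero value to an empty slot
MUL_GAS = 5                   # assumed cost of one multiplication
ADD_GAS = 3                   # assumed cost of one addition
BLOCK_GAS_LIMIT = 8_000_000   # assumed block gas limit (roughly mid-2018)


def storage_gas(num_weights: int) -> int:
    """Gas to store each model weight in its own storage slot."""
    return num_weights * SSTORE_NEW_SLOT_GAS


def dot_product_gas(length: int) -> int:
    """Arithmetic only: one MUL and one ADD per element.

    Ignores storage reads, loop overhead, and memory, so the real
    cost would be considerably higher.
    """
    return length * (MUL_GAS + ADD_GAS)


def blocks_needed(gas: int) -> int:
    """Number of full blocks required to fit the given amount of gas."""
    return -(-gas // BLOCK_GAS_LIMIT)  # ceiling division


weights = 1_000_000  # hypothetical small model
store_blocks = blocks_needed(storage_gas(weights))
infer_blocks = blocks_needed(dot_product_gas(weights))
```

Under these assumptions, just storing a one-million-weight model would fill 2,500 entire blocks, and a single pass of multiply-adds over those weights would consume a whole block by itself.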
62. Machine Learning Applications
• Automated theorem proving
• Adaptive websites
• Bioinformatics
• Classifying DNA sequences
• Detecting credit-card fraud
• General game playing
• Information retrieval
• Internet fraud detection
• Insurance
• Linguistics
• Medical diagnosis
• Machine perception, including computer
vision and object recognition
• Optimization and metaheuristic
• Recommendation systems
• Robot locomotion (autonomy)
• Sentiment analysis
• Speech and handwriting recognition
• Structural health monitoring
• Syntactic pattern recognition
• User behavior analytics
• Translation
65. Security Cannot Be Retrofitted
• Secure systems can only be designed that
way from the start
o Trying to secure an existing platform can only give
marginal improvements
• Need orders of magnitude of improvements
to smart contract security
o Not possible without a fresh start