Talk at the Ethereum Developer Conference. Presents our approach to building a fully decentralized cloud infrastructure based on the Ethereum blockchain and Desktop Grid middleware.
2. The Promise of Ethereum
• DApps: Distributed Applications running on the Blockchain
How to satisfy compute/data-intensive DApps?
The blockchain offers limited computing resources: storage is expensive, the EVM is slow, transaction latency is high, etc.
3. iEx.ec Objective
• Provide Blockchain-based Distributed Applications with access to the off-chain computing resources they need:
– Computing resources (CPU, GPU, storage)
– Data access (remote storage)
– Applications (compute and/or data-intensive)
– Services (deployed as containers)
4. Global Market for Computing Resources
Low-cost, Secure, On-Demand and Fully Distributed Cloud
Ethereum Blockchain
5. Towards Distributed Cloud Computing
• Benefits of Decentralizing Data-Centers
– Better energy efficiency
– Data closer to the user
• Examples of next-gen Data-centers
a) Rutgers
b) Stimergy
c) Qarnot
• Fog/Edge Computing
5G network -- In-network storage and processing
6. Origin of the Technology: Desktop Grid Computing
Using Idle PCs on the Internet to Execute Parallel Applications:
• Mature technology
• Advanced features: security, virtualization, QoS
• Many applications: Finance, Bio-medical, Chemistry, High Energy Physics, etc.
• European Desktop Grid Infrastructure
• http://desktopgridfederation.org
Book on Desktop Grid Computing. Ed. C. Cérin & G. Fedak, CRC/Chapman and Hall
7. XtremWeb, XtremWeb-HEP, BitDew, SpeQuloS, MapReduce, MPICH-V
[Timeline figure; years shown: 2000, 2001, 2003, 2008, 2010, 2012]
• 1st Internet P2P Global Computing Platform
• Bag-of-Tasks Applications
• Multi-users & multi-applications
• Grid & Cloud
• Highly secure
• Virtualization
• Hybrid public/private Infrastructure
• Parallel computing
• N-faults resilience
• Big Data
• 1st Implementation of MapReduce for Internet Computing
• Large Scale Data Management
• QoS for Best-effort infrastructure
Building Distributed Cloud
>1M€ EU FP7, ANR funding, ≈100 papers published
Tens of users/applications: Finance, HEP, biomedical research…
9. Resource Management on the Blockchain
Resource Provisioning
Market Management Framework
Matchmaking Task/Computing resources
Multi-Criteria Scheduling
Result certification
Verified File transfer
Resource Publication
Resource Ontology
10. E-FAST: E-Services Framework for Knowledge-bAsed Decision SupporT in Finance
Service Oriented Platform:
Integrated, advanced tools to analyze financial market data, high-level services that automatically react to market changes and propose investment alternatives
Data and Computing-Intensive Methods:
Text-mining, Neural Networks and Genetic Algorithms, enhanced by applying relevant findings from the efficient-market theory study.
11. Selling E-FAST using iEx.ec
Customers access E-FAST services, which use iEx.ec for their execution:
Only pay for resources when a service has been sold to a customer
15. Proof-of-Contribution
Ensures that actions that happen off the blockchain lead to correct token transactions on the blockchain
Example: execution of a set of compute-intensive tasks (Bag-of-Tasks)
[Diagram: DApp, Ethereum, iEx.ec sidechain, Distributed Cloud. Steps: transaction; select resources/applications; fetch & execute BoT; results certification]
Feasibility?
• Asynchronous RPC
• GridCoin (http://www.gridcoin.us)
• Ethereum Computation Marketplace (see GitHub)
• Reputation + Result certification (majority voting, spot checking, blacklisting...)
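Result certification by majority voting, as listed on this slide, can be illustrated with a minimal sketch. It assumes each task is replicated on several workers and that each worker reports a hash of its result; the function name `certify_by_majority`, the `quorum` parameter and the worker identifiers are hypothetical and are not taken from iEx.ec or XtremWeb-HEP.

```python
from collections import Counter


def certify_by_majority(results, quorum=0.5):
    """Certify a replicated task result by majority voting.

    results: dict mapping a worker id to the result hash it reported.
    quorum:  fraction of votes the winning result must strictly exceed.
    Returns (certified_result, dissenting_workers); certified_result is
    None when no result gathers enough votes.
    """
    if not results:
        return None, []
    counts = Counter(results.values())
    winner, votes = counts.most_common(1)[0]
    if votes / len(results) <= quorum:
        # No clear majority: nothing is certified, nobody is blamed.
        return None, []
    # Workers that disagree with the majority are candidates for
    # a reputation penalty or blacklisting.
    dissenters = sorted(w for w, r in results.items() if r != winner)
    return winner, dissenters


# Hypothetical run: three replicas of the same task, one faulty worker.
reported = {"worker-1": "0xabc", "worker-2": "0xabc", "worker-3": "0xdef"}
certified, suspects = certify_by_majority(reported)
print(certified, suspects)
```

In a proof-of-contribution setting, a token transaction would only be triggered for the workers that agree with the certified result; that link is an assumption of this sketch, not a description of the deployed system.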
17. Thanks to
Mircea Moca (Universitatea Babeș-Bolyai)
Oleg Lodygensky (IN2P3/CNRS/Univ. Paris XI)
DACA, Wanxiang Blockchain Lab
cryptofr slack team, chaintech, asseth
Editor's Notes
My name is Gilles Fedak. I am a researcher at INRIA, which is the French National Institute for Research in Computer Science.
My research background is in Parallel and Distributed Computing, with a focus on building Distributed Computing Infrastructures based on machines distributed over the Internet.
This is joint work with Prof. Haiwu He, who is with the Chinese Academy of Sciences.
This talk is about how to build a Distributed Cloud based on the Ethereum blockchain.
The goal is also to give some perspective from the infrastructure point of view.
Ethereum allows us to develop distributed applications and systems that run on the blockchain.
And the blockchain gives these applications very nice properties: autonomy, resilience, security, consensus.
These are very important features, and this is going to drastically change the way we design distributed applications.
So with Ethereum come a lot of promises, sometimes advertised as: unstoppable applications, supercomputer.
However, when you actually try to move your existing distributed system to Ethereum, you discover that there’s a great gap between the promises and what you have in terms of computing capabilities. The blockchain offers little storage, limited EVM performance, and high transaction latency.
And that’s really a limitation as soon as you have algorithms that have significant processing requirements or that require data access.
And this gap is even harder to understand considering that there is actually huge computing power provided by the miners’ network. For instance, the Enigma mining farm from Genesis has 14 Pflops peak performance.
Somehow this project is also about giving this computing power back to the applications that need it.
Let’s take advantage of the blockchain and organize a global market for computing resources.
We can think of it as a kind of Airbnb for computing resources.
Everybody would be able to provide or rent computing nodes.
And so that would form a sort of Distributed Cloud, in the sense that you go on the blockchain and you get the resources on demand through smart contracts on a pay-as-you-go basis.
And the good thing is that this idea of a distributed cloud is actually very timely.
At the moment, Cloud Computing relies on extremely centralised data-centers, and this raises a lot of issues.
For instance, in France it is just impossible to set up a new data-center in the Paris area, because of the lack of room and power supply.
So data-centers are now located in remote places where energy is cheap or where there is free cooling, such as Iceland or Tibet.
So the distributed Cloud is about relocating the data-centers in the city, close to the data producers and consumers.
To give you an idea of what distributed data-centers may look like, here are some projects from partners we are working with.
The Parasol project at Rutgers University set up a data-center on the roof of their building: solar panels, batteries, low-power ARM processors, and it is energy autonomous. I’ll talk about Stimergy later. Qarnot Computing proposes the Q.Rad, which is both a server and a heater. It’s the heat generated by 3D rendering that heats your apartment during winter.
And there’s even more to come with the advent of Fog/Edge computing, where there will be in-network storage and processing.
The goal of iExec is to make those machines available on the blockchain.
You get the idea; now, how can we make it happen?
Actually, the technology to build the distributed cloud is already there.
Originally, it was called Desktop Grid Computing. The principle is to use desktop PCs on the Internet, when they are idle, to execute large parallel applications.
Desktop Grid Computing is an idea we have pushed to its extreme limit.
For example, we did parallel computing on the Internet, and the first implementation of MapReduce on the Internet in 2010.
The software that is central for the Distributed Cloud is XtremWeb-HEP, which is the production version developed by Oleg Lodygensky at IN2P3, and BitDew, which does large-scale data management.
Moreover, even if it’s called Desktop Grid Computing, we’re actually not using any desktop PCs. At the moment most of the compute nodes are clusters. It’s just that this technology makes gathering a very large number of nodes distributed on the Internet extremely easy.
The way we are working at the moment is that we take the regular stack, with applications at the top, then resource management and cloud resources. And then we put Ethereum in the middle and we try to see which components we can move to the blockchain.
It’s an experimental approach: learning by doing.
And what we have discovered so far is that some components are really easy to port, that’s the left part of the gauge, and the further you go to the right, the more challenging it gets.
Resource publication is taking the description of the resources and publishing it as a smart contract on the blockchain. Resource provisioning consists of adding a small tag that gives the state of the resources. Matchmaking is a little more tricky: it says that this application, which requires 4 GB of memory, can run on this machine, which provides 16 GB of memory. And then you go on to operations that are much more challenging.
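To make the publication, provisioning and matchmaking steps concrete, here is a minimal off-chain sketch in Python. It assumes a resource description carries only a memory size and an availability tag; the class names, field names and node names are hypothetical, and the real resource ontology would be richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A published resource description (hypothetical fields)."""
    name: str
    memory_gb: int
    available: bool  # the provisioning tag: current state of the resource


@dataclass(frozen=True)
class AppRequirement:
    """What an application needs in order to run (hypothetical fields)."""
    name: str
    min_memory_gb: int


def matches(app, resource):
    """An application can run on a resource that is available and large enough."""
    return resource.available and resource.memory_gb >= app.min_memory_gb


def matchmake(app, resources):
    """Return every published resource that satisfies the application."""
    return [r for r in resources if matches(app, r)]


# The example from the talk: an application needing 4 GB of memory
# can run on a machine providing 16 GB.
published = [
    Resource("node-a", memory_gb=16, available=True),
    Resource("node-b", memory_gb=2, available=True),
    Resource("node-c", memory_gb=32, available=False),
]
app = AppRequirement("demo-app", min_memory_gb=4)
print([r.name for r in matchmake(app, published)])
```

A predicate this simple is the kind of logic that could plausibly live in a smart contract; it is the later steps, such as scheduling, that become too heavy.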
Scheduling is matching a list of tasks to execute with a list of machines. Mircea Moca at BBU proposed an algorithm that is multi-criteria, satisfaction-oriented and pull-based. It’s very nice because it allows you to express strategies such as: I want the fastest execution possible, even if I have to pay for it. The problem is that it’s very memory- and compute-intensive, and it’s just impossible to run that on the blockchain, which basically motivates this work.
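The sketch below only illustrates the general idea of pull-based, multi-criteria scheduling; it is not Mircea Moca's algorithm, whose details are not given here. It assumes two criteria (execution time and cost), per-task preference weights, and linear satisfaction scores bounded by `max_time` and `max_cost`; all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    work_units: float    # abstract amount of computation
    weight_speed: float  # how much the user cares about fast execution
    weight_cost: float   # how much the user cares about low cost


@dataclass
class Worker:
    worker_id: str
    speed: float  # work units processed per second
    price: float  # tokens charged per second


def satisfaction(task, worker, max_time, max_cost):
    """Weighted satisfaction in [0, 1] of running `task` on `worker`."""
    duration = task.work_units / worker.speed
    cost = duration * worker.price
    time_score = max(0.0, 1.0 - duration / max_time)
    cost_score = max(0.0, 1.0 - cost / max_cost)
    return task.weight_speed * time_score + task.weight_cost * cost_score


def pull_task(worker, pending, max_time=100.0, max_cost=100.0):
    """Called when a worker asks for work: hand out the pending task
    whose owner would be most satisfied by this particular worker."""
    if not pending:
        return None
    best = max(pending, key=lambda t: satisfaction(t, worker, max_time, max_cost))
    pending.remove(best)
    return best


# "I want the fastest execution possible, even if I have to pay for it"
urgent = Task("urgent", work_units=100, weight_speed=1.0, weight_cost=0.0)
# The opposite strategy: cheapest execution, speed does not matter.
frugal = Task("frugal", work_units=100, weight_speed=0.0, weight_cost=1.0)

fast_worker = Worker("fast", speed=10.0, price=5.0)
slow_worker = Worker("slow", speed=2.0, price=0.1)

pending = [urgent, frugal]
print(pull_task(fast_worker, pending).task_id)
print(pull_task(slow_worker, pending).task_id)
```

Even this toy version evaluates every pending task on each pull; with many tasks, workers and criteria, the memory and compute cost grows quickly, which is consistent with the point that such scheduling does not fit on-chain.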
In terms of use cases, we’re working with the E-FAST application. E-FAST is a framework for financial analysis.
In particular, E-FAST relies on machine learning, which is typically both compute- and data-intensive, as you have to train your algorithms on a lot of data.
So E-FAST will directly benefit from the computing power provided by iEx.ec when developing their systems.
But more interestingly, E-FAST customers can directly launch the E-FAST service on their own data through the blockchain.
And because blockchain applications are autonomous, E-FAST would directly acquire the computing resources it needs on the blockchain, through the iExec smart contract.
The last step in our use case is to use Stimergy computing resources. Stimergy makes servers that serve as furnaces: the heat of the processors is used to re-warm the water in a building.
By November, we hope to achieve a demo of the first smart contract that can warm a swimming pool as a side effect.
Now, I would like to give a glimpse of the future of iEx.ec, based on those early experiments. I am almost convinced that it might not be a good idea to do everything on Ethereum; instead there should be a sidechain to manage the computations and data transfers.
There are several reasons for that:
- We need a new consensus for off-chain resource utilisation; this is what we call Proof-of-Contribution.
- Some information is needed to ensure the proof-of-contribution, but is totally meaningless with respect to the provisioning contract.
- The workload for this system can be quite different, with transactions that arrive in huge bursts.
- Finally, the notion of consensus can be very different. Some parallel applications tolerate that a fraction of their results is wrong.
In conclusion: infrastructure matters!
Decentralizing the Cloud is also an opportunity to switch to a new model that can be radically different.
And why not a cloud that is energy positive, that produces more energy than it consumes!