Abstract of the Presentation:
Ocean Protocol is a non-profit foundation designing and implementing a decentralized data marketplace. Ocean Protocol aims to break down data silos through equitable, universal, and secure access.
The problem: Large companies have a data advantage in the ongoing AI revolution. With huge amounts of data feeding AI models, there is an incentive to keep this data in-house and gain a competitive advantage. Given that more data often yields better AI, there are solutions which smaller companies and innovators could unlock if they had access to more data.
The solution: Ocean Protocol tackles this challenge with a decentralized marketplace for data and AI services. Data providers are incentivised to open their data, and data consumers will gain seamless access to a new world of data, be it big data from a large corporation, or an aggregate of small datasets.
This talk will demo the features available to Data Engineers, show how you can take part in the decentralized data and AI ecosystem, and present the roadmap to Ocean Protocol’s Ethereum mainnet release.
About the Author:
Marcus Jones is the lead Data Scientist at Ocean Protocol, a blockchain-based decentralized marketplace for data and models supporting the AI ecosystem. Marcus has been hacking on Python and open source projects for over 10 years, starting at the national research agency in Vienna, moving into international project management and consulting, and landing now in Berlin at Ocean Protocol. Marcus is focused on the intersection of Artificial Intelligence and the new decentralized Web 3.0 ecosystem, and how it can empower Data Engineers and Data Scientists.
DN18 | Ocean Protocol – Empowering a Decentralized Marketplace for The New AI Ecosystem | Marcus Jones | Ocean Protocol
1. Ocean Protocol: A decentralized data exchange protocol to unlock data for AI
Data Natives Conference – 22 November 2018
@oceanprotocol
Marcus Jones
Data Scientist
2. Ocean Protocol solves data sharing
for Stakeholders
▪ a decentralized data exchange protocol to unlock
data for AI
▪ uses blockchain technology that allows data to be
shared and transferred in a safe, secure and
transparent manner
▪ enables a decentralized platform and network
connecting providers and consumers of valuable
data, and providing open access for developers to
build services
6. Incentive to hoard and silo data
“It is not who has the best
algorithm that wins. It’s who
has the most data.”
Michele Banko, Eric Brill. Scaling to Very Very Large Corpora for
Natural Language Disambiguation. ACL2001
7. Lots of data and
compute
Lots of AI expertise &
problems to solve
8. Lots of data and
compute
Lots of AI expertise &
problems to solve
… top enterprises
9. Lots of data and
compute
Lots of AI expertise &
problems to solve
… top enterprises … AI startups
10. Lots of data and
compute
Lots of AI expertise &
problems to solve
… top enterprises … AI startups
Have both:
Goog, FB, a handful
$$
22. JupyterLab as front-end
• Write scripts in pure python,
convert to notebook on CI
tool (travis)
• JupyterHub running on AWS
with Kubernetes
• Connects to HTTP endpoints
to test new components and
run integration tests
• Integration with MetaMask for
Ethereum test and mainnet
integration
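The first bullet above (pure Python scripts turned into notebooks on CI) can be sketched as follows. This is only an illustration: the talk does not say which converter is used, so the `# %%` cell marker and the `script_to_notebook` helper are assumptions, and the notebook is written as plain nbformat v4 JSON using the standard library only.

```python
import json


def script_to_notebook(source: str, marker: str = "# %%") -> dict:
    """Split a script on marker lines and wrap each chunk as a notebook code cell.

    The marker convention and this helper are hypothetical, not Ocean Protocol tooling.
    """
    chunks, current = [], []
    for line in source.splitlines():
        if line.strip().startswith(marker):
            # A marker line closes the chunk collected so far.
            if current:
                chunks.append(current)
            current = []
        else:
            current.append(line)
    if current:
        chunks.append(current)

    cells = []
    for chunk in chunks:
        text = "\n".join(chunk).strip("\n")
        if not text:
            continue  # skip chunks that only contained blank lines
        cells.append({
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": text,
        })
    # Minimal notebook envelope in the nbformat v4 layout.
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


script = "# %%\nimport math\n\n# %%\nprint(math.sqrt(16))\n"
notebook = script_to_notebook(script)
notebook_json = json.dumps(notebook, indent=1)  # what a CI step would write to an .ipynb file
```

On a CI tool, a step like this would run over each script and write `notebook_json` to an `.ipynb` file next to it, so that only the scripts need to be kept under version control.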
23. No data escape
1) Receive template generator
2) Generate data and test your pipeline
3) Package your model and send to data owner
4) Receive logs as Proof of Train
5) Independent verifiers check honesty of data owner
6) Receive your trained model
-> No model escape, homomorphic encryption, federated learning, data provenance
See: https://fitchain.io/
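Steps 4 and 5 above (logs as Proof of Train, checked by independent verifiers) can be illustrated with a hash-chained training log. This is a minimal sketch under stated assumptions, not the fitchain or Ocean Protocol mechanism: the `chain_logs` and `verify_logs` helpers, the `"genesis"` seed, and the log fields are all hypothetical.

```python
import hashlib
import json


def chain_logs(entries, seed="genesis"):
    """Link each training-log entry to the previous one with a SHA-256 digest."""
    previous = hashlib.sha256(seed.encode()).hexdigest()
    chained = []
    for entry in entries:
        payload = json.dumps(entry, sort_keys=True)
        digest = hashlib.sha256((previous + payload).encode()).hexdigest()
        chained.append({"entry": entry, "digest": digest})
        previous = digest
    return chained


def verify_logs(chained, seed="genesis"):
    """Recompute the chain; any edited entry breaks every digest from that point on."""
    previous = hashlib.sha256(seed.encode()).hexdigest()
    for record in chained:
        payload = json.dumps(record["entry"], sort_keys=True)
        expected = hashlib.sha256((previous + payload).encode()).hexdigest()
        if record["digest"] != expected:
            return False
        previous = expected
    return True


# Data owner side: emit chained logs while training (values are made up).
logs = chain_logs([
    {"epoch": 1, "loss": 0.9},
    {"epoch": 2, "loss": 0.6},
])

# Verifier side: an untouched log passes, an edited one does not.
honest = verify_logs(logs)
logs[1]["entry"]["loss"] = 0.1  # simulate a data owner rewriting history
tampered = verify_logs(logs)
```

A chain like this only shows that a log was not altered after it was written. Showing that the training actually ran on the claimed data needs more than this sketch covers.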
24. Data incentives
Bring compute to the data
[Diagram: private data and a modeling algorithm f(x) are used to privately train a model; the private model produces model predictions. Data stays behind firewall.]
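The "bring compute to the data" idea can be sketched in a few lines: the consumer ships an algorithm, the data owner runs it locally, and only fitted parameters come back. The `DataOwner` class, its float-only guard, and the `fit_line` algorithm are hypothetical names chosen for illustration and are not part of any Ocean Protocol API.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple[float, float]]


class DataOwner:
    """Holds private rows and only ever returns fitted model parameters."""

    def __init__(self, rows: Rows) -> None:
        self._rows = rows  # stays on the owner's side; never returned

    def train(self, algorithm: Callable[[Rows], Dict[str, float]]) -> Dict[str, float]:
        model = algorithm(self._rows)
        # Hypothetical guard: only a flat dict of floats may leave the firewall.
        if not all(isinstance(value, float) for value in model.values()):
            raise ValueError("model must contain only float parameters")
        return model


def fit_line(rows: Rows) -> Dict[str, float]:
    """Consumer-supplied algorithm: ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


# The consumer never sees these rows, only the returned parameters.
owner = DataOwner([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
model = owner.train(fit_line)
prediction = model["slope"] * 3.0 + model["intercept"]
```

The guard here is deliberately naive, since model parameters can still leak information about the rows they were fitted on. Stronger guarantees need additional techniques.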
25. Why blockchain? Why decentralized?
1. So data owner retains control of their data
“If you don’t have the key, you don’t control the data”
2. Must be a public utility, vs controlled by a single entity.
Anti-pattern: FB
3. Allows for block rewards to incentivize data supply
Add to data commons, be rewarded.
26. How is Ocean Protocol organized?
- Non-profit foundation to ensure open access to the
protocol and platform,
- Promote increased decentralization with time
- 2 contracted companies implementing Ocean Protocol:
- BigchainDB in Berlin
- DEX in Singapore
- Funding: 10% token sale raised 18M EUR in March
27. Roadmap to mainnet release
[Diagram: three stages (1, 2, 3), each labeled Contract: Service, Access, Condition]
1) Plankton: “Tap into Data Science tools”, August 2018
2) Trilobyte: “Signal data assets”, November 2018
3) Tethys: “Promote verified services”, Q2 2019
Ethereum mainnet!
28. Collaboration Opportunities: EXAMPLES
- build a marketplace
- publish open source code
- contribute to technical development of the project
- work together on relevant use cases on data sharing
- examine no-data-escape functionality