IPFS: A Whole New World


ArcBlock Technical Learning Series Presents IPFS.

If there is a missing piece in the current blockchain stack, it is a decentralized, publicly verifiable file system. Ideally, before decentralizing computing, we should decentralize the data. IPFS fills this gap, and it has great potential to push the web toward the true web3: the decentralized web. This talk covers what problem IPFS is trying to solve, how it solves that problem, and how to use IPFS in our applications.

Published in: Software

  1. IPFS: a whole new world. Brought to you by Tyr Chen
  2. Existing issues in the internet
  3. Addressing
  4. Bandwidth & Latency
  5. Collaboration & Offline support
  6. And something horrible…WTF on resiliency?
  7. More horrible…WTF data security?
  8. And the ultimate tragedy…in the history of mankind
  9. What first?
      • web first
      • mobile first
      • data first
      • AI first
      • …
      • …
      • distributed, offline first?
  10. Entering the InterPlanetary File System world!
  11. The big things before IPFS
      • DHT: Distributed Hash Table
      • Kademlia DHT: query is O(log₂ N). Used widely by Gnutella and BitTorrent
      • Coral DSHT: makes storage and bandwidth usage more efficient than Kademlia
      • S/Kademlia DHT: adds PoW to prevent attacks (PoW on node ID generation, …)
      • BitTorrent: incentivized by tit-for-tat and prioritized by rarest block
      • Git: Merkle DAG
  12. DHT
  13. What is DHT? “A distributed hash table (DHT) is a class of a decentralized distributed system that provides a lookup service similar to a hash table: (key, value) pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. Keys are unique identifiers which map to particular values, which in turn can be anything from addresses, to documents, to arbitrary data. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption.”
  14. Chord, Pastry DHT
  15. Node join / leave
  16. BitTorrent / Kademlia DHT
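The Kademlia slides have no code, so here is a minimal sketch of the idea behind its lookups: node IDs and keys share one ID space, and "closeness" is the XOR of two IDs. The helper names (`node_id`, `closest_nodes`, `bucket_index`) and the use of SHA-1 over a text label to produce 160-bit IDs are assumptions for illustration only, not the API of any IPFS or BitTorrent library.

```python
import hashlib
from typing import List


def node_id(label: str) -> int:
    """Derive a 160-bit ID from a text label (hypothetical helper)."""
    return int.from_bytes(hashlib.sha1(label.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the bitwise XOR of two IDs."""
    return a ^ b


def bucket_index(a: int, b: int) -> int:
    """Position of the highest bit in which a and b differ.

    Returns -1 when the IDs are identical. Each step of a lookup moves to
    a node in a closer bucket, which is where the logarithmic bound
    on query hops comes from.
    """
    return xor_distance(a, b).bit_length() - 1


def closest_nodes(key: int, nodes: List[int], k: int = 3) -> List[int]:
    """Return the k known nodes closest to key under XOR distance."""
    return sorted(nodes, key=lambda n: xor_distance(n, key))[:k]


if __name__ == "__main__":
    peers = [node_id(f"node-{i}") for i in range(20)]
    target = node_id("some-file")
    for peer in closest_nodes(target, peers):
        print(f"{peer:040x} bucket={bucket_index(peer, target)}")
```

A real node only knows a small routing table rather than the full peer list, and repeats this "ask the closest known nodes" step over the network until no closer node is found.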
  17. IPFS & BitTorrent
      • Similarity:
          • exchange of data (blocks) in IPFS is inspired by BitTorrent
          • tit-for-tat strategy (if you don’t share, you won’t get)
          • get rare pieces first
      • Difference:
          • separate swarm for each file (BitTorrent), one swarm for all (BitSwap in IPFS)
  18. IPFS & Git (copied from white paper)
      1. Immutable objects represent Files (blob), Directories (tree), and Changes (commit).
      2. Objects are content-addressed, by the cryptographic hash of their contents.
      3. Links to other objects are embedded, forming a Merkle DAG. This provides many useful integrity and workflow properties.
      4. Most versioning metadata (branches, tags, etc.) are simply pointer references, and thus inexpensive to create and update.
      5. Version changes only update references or add objects.
      6. Distributing version changes to other users is simply transferring objects and updating remote references.
  19. Merkle DAG
  20. What are the use cases for Merkle DAG?
  21. IPFS Design
  22. IPFS Core Parts
      • Identities: node identity generation & verification
      • Network: p2p
      • Routing: DHT
      • Exchange: BitSwap
      • Objects: Merkle DAG
      • Files: versioned file system like Git
      • Naming: self-certifying mutable name system
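The Merkle DAG slide is an image in the original deck, so here is a small sketch of the properties listed on the Git slide: objects are addressed by the hash of their contents, and links to other objects are embedded in the parent. The `BlockStore` class and its JSON encoding are hypothetical simplifications; real IPFS objects use a different wire format.

```python
import hashlib
import json


class BlockStore:
    """Toy content-addressed store (hypothetical, not the IPFS format)."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes, links=()) -> str:
        """Store a node and return its address, the SHA-256 of its encoding."""
        node = json.dumps(
            {"data": data.hex(), "links": list(links)}, sort_keys=True
        ).encode()
        key = hashlib.sha256(node).hexdigest()
        self.blocks[key] = node  # identical content maps to the same key
        return key

    def get(self, key: str) -> dict:
        """Fetch a node and verify that it still matches its address."""
        node = self.blocks[key]
        if hashlib.sha256(node).hexdigest() != key:
            raise ValueError("block does not match its address")
        return json.loads(node)


if __name__ == "__main__":
    store = BlockStore()
    readme = store.put(b"hello")
    notes = store.put(b"world")
    root_v1 = store.put(b"", links=[readme, notes])

    # Changing one leaf yields a new leaf and a new root; notes is reused.
    readme_v2 = store.put(b"hello!")
    root_v2 = store.put(b"", links=[readme_v2, notes])
    print(root_v1 != root_v2, len(store.blocks))
```

Because a parent's address covers the addresses of its children, verifying the root is enough to verify everything reachable from it, and unchanged subtrees are shared between versions.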
  23. Exchange: BitSwap
      • upon connecting, peers exchange which blocks they have (have_list) and which blocks they are looking for (want_list)
      • to decide whether it will actually share data, a node applies its BitSwap Strategy
          • based on previous data exchanges between these two peers
      • peers keep track of the amount of data they share (builds credit) and the amount of data they receive (builds debt)
          • this is tracked in the BitSwap Ledger
      • if a peer has credit (shared more than received)
          • our node will send the requested block
      • if a peer has debt, our node may or may not share
          • depending on a deterministic function of the debt, under which the chance of sharing shrinks as the debt grows
      • a data exchange always starts with the exchange of the ledger; if the ledgers are not identical, our node disconnects
  24. BitSwap Ledger
      type Ledger struct {
          owner      NodeId
          partner    NodeId
          bytes_sent int
          bytes_recv int
          timestamp  Timestamp
      }
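To make the "chance of sharing becomes smaller when the debt is bigger" rule concrete, here is a sketch built on the ledger fields above. It uses the debt ratio and sigmoid given in the IPFS white paper, r = bytes_sent / (bytes_recv + 1) and P(send | r) = 1 − 1 / (1 + exp(6 − 3r)); treat the constants as quoted from that paper rather than as what any particular implementation ships. The Python `Ledger` dataclass and function names are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Ledger:
    """Python mirror of the slide's Ledger struct (illustrative only)."""
    owner: str
    partner: str
    bytes_sent: int = 0   # bytes our node has sent to the partner
    bytes_recv: int = 0   # bytes our node has received from the partner
    timestamp: float = 0.0


def debt_ratio(ledger: Ledger) -> float:
    """How much more we have sent than received; +1 avoids division by zero."""
    return ledger.bytes_sent / (ledger.bytes_recv + 1)


def send_probability(r: float) -> float:
    """Probability of sending to a peer with debt ratio r (white paper sigmoid)."""
    return 1 - 1 / (1 + math.exp(6 - 3 * r))


if __name__ == "__main__":
    for sent, recv in [(0, 0), (1000, 999), (2000, 999), (4000, 999)]:
        ledger = Ledger("me", "peer", bytes_sent=sent, bytes_recv=recv)
        r = debt_ratio(ledger)
        print(f"r={r:.2f} P(send)={send_probability(r):.4f}")
```

Under this function a new peer (r = 0) is served almost always, a peer that has taken about twice what it gave is served about half the time, and the probability falls off quickly beyond that.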
  25. BitSwap Spec
      // Additional state kept
      type BitSwap struct {
          ledgers   map[NodeId]Ledger // Ledgers known to this node, inc inactive
          active    map[NodeId]Peer   // currently open connections to other nodes
          need_list []Multihash       // checksums of blocks this node needs
          have_list []Multihash       // checksums of blocks this node has
      }

      type Peer struct {
          nodeid    NodeId
          ledger    Ledger      // Ledger between the node and this peer
          last_seen Timestamp   // timestamp of last received message
          want_list []Multihash // checksums of all blocks wanted by peer
                                // includes blocks wanted by peer’s peers
      }

      // Protocol interface:
      interface Peer {
          open (nodeid :NodeId, ledger :Ledger);
          send_want_list (want_list :WantList);
          send_block (block :Block) -> (complete :Bool);
  26. Files: unixfs
      syntax = "proto2";
      package unixfs.pb;

      message Data {
          enum DataType {
              Raw = 0;
              Directory = 1;
              File = 2;
              Metadata = 3;
              Symlink = 4;
              HAMTShard = 5;
          }
          required DataType Type = 1;
          optional bytes Data = 2;
          optional uint64 filesize = 3;
          repeated uint64 blocksizes = 4;
          optional uint64 hashType = 5;
          optional uint64 fanout = 6;
      }

      message Metadata {
          optional string MimeType = 1;
      }
  27. Naming: add mutability
      • The root address of a node is /ipns/
      • The content it points to can be changed by publishing an IPFS object to this address
      • By publishing, the owner of the node (the person who knows the secret key that was generated with ipfs init) cryptographically signs this “pointer”.
      • This enables other users to verify the authenticity of the object published by the owner.
      • Just like IPFS paths, IPNS paths also start with a hash, followed by a Unix-like path.
      • IPNS records are announced and resolved via the DHT.
  28. IPFS stack
  29. IPFS stack
      • Moving the data easily and efficiently: libp2p
      • Defining the data: IPLD, IPNS
      • Using the data: IPFS app
  30. Concepts
      • CID: content identifier. Based on the content’s cryptographic hash.
      • DNSLink: uses DNS TXT records to map a domain name (e.g. …) to an IPFS address.
      • IPNS: the Inter-Planetary Name System is a system for creating and updating mutable links to IPFS content. An IPFS address changes every time the content changes. A name in IPNS is the hash of a public key.
      • MFS: the Mutable File System lets you treat files as in a normal file system. It takes care of all the work of updating links and hashes when a file changes.
      • Pinning: IPFS nodes treat data like a cache, so if you want something to be retained long-term you can pin it.
      • UnixFS: a data format to represent files and all their links and metadata, loosely based on how files work in Unix.
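The pinning concept above can be pictured with a toy model: blocks a node has fetched sit in a cache that garbage collection may empty, and pinned blocks are exempt. `ToyRepo` and its methods are hypothetical names for this sketch; it does not reflect how the real repository or its garbage collector is implemented.

```python
import hashlib


class ToyRepo:
    """Toy model of a node's local block cache with pinning (illustrative)."""

    def __init__(self):
        self.blocks = {}
        self.pinned = set()

    def cache(self, data: bytes) -> str:
        """Keep a fetched block locally; it stays only until the next gc()."""
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data
        return key

    def pin(self, key: str) -> None:
        """Mark a block as one that garbage collection must keep."""
        if key not in self.blocks:
            raise KeyError(key)
        self.pinned.add(key)

    def unpin(self, key: str) -> None:
        self.pinned.discard(key)

    def gc(self) -> list:
        """Drop every unpinned block and return the removed keys, sorted."""
        removed = sorted(k for k in self.blocks if k not in self.pinned)
        for key in removed:
            del self.blocks[key]
        return removed


if __name__ == "__main__":
    repo = ToyRepo()
    keep = repo.cache(b"important report")
    drop = repo.cache(b"page I browsed once")
    repo.pin(keep)
    print(repo.gc() == [drop], keep in repo.blocks)
```

On a real node the corresponding commands are `ipfs pin add` and `ipfs repo gc`; the model also shows why unpinned data that nobody else holds can disappear from the network.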
  31. Node
  32. IPNS
  33. Exploring IPFS
  34. Start IPFS
      $ ipfs init
      initializing IPFS node at /Users/tchen/.ipfs
      generating 2048-bit RSA keypair...done
      peer identity: QmYu24HbZC3FTMxfKPJFNFM16tNdbMSJYtvfT2Kixe9Qo6
      to get started, enter:

          ipfs cat /ipfs/QmS4ustL54uo8FzR9455qaxZwuMiUhyvMcX9Ba8nUH4uVv/readme

      $ brew services start ipfs
      ==> Successfully started `ipfs` (label: homebrew.mxcl.ipfs)
  35. Add a file
      $ echo 'hello world' | ipfs add
      added QmT78zSuBmuS4z925WZfrqQ1qHaJ56DQaTfyMUF7F8ff5o QmT78zSuBmuS4z925WZfrqQ1qHaJ56DQaTfyMUF7F8ff5o
      12 B / ? [--------------------------------------------------------------------------------------------------------------------------=------
      $ ipfs cat QmT78zSuBmuS4z925WZfrqQ1qHaJ56DQaTfyMUF7F8ff5o
      hello world
  36. IPFS peers
      $ ipfs swarm peers
      /ip4/
      /ip4/
      /ip4/
      /ip4/
      /ip4/
      /ip4/
      /ip4/
      /ip4/
  37. IPFS id
      $ ipfs id
      {
        "ID": "QmYu24HbZC3FTMxfKPJFNFM16tNdbMSJYtvfT2Kixe9Qo6",
        "PublicKey": "CAASpgIwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQC0VFzN0Dv90LqJXTvRoS0G1nhHi6S0mONQ1jftl9QUUv8hTucf1XpWu+VfkSKcoWwr4MZZi5
        "Addresses": [
          "/ip4/",
          "/ip4/",
          "/ip6/::1/tcp/4001/ipfs/QmYu24HbZC3FTMxfKPJFNFM16tNdbMSJYtvfT2Kixe9Qo6",
          "/ip4/",
          "/ip4/"
        ],
        "AgentVersion": "go-ipfs/0.4.18/aefc746",
        "ProtocolVersion": "ipfs/0.1.0"
      }
  38. Providers
      $ ipfs dht findprovs QmT78zSuBmuS4z925WZfrqQ1qHaJ56DQaTfyMUF7F8ff5o
      QmYu24HbZC3FTMxfKPJFNFM16tNdbMSJYtvfT2Kixe9Qo6
      Qmbut9Ywz9YEDrz8ySBSgWyJk41Uvm2QJPhwDJzJyGFsD6
      QmceZ5vcFtpcaPeAa66xnL79xo6fRrAmx6Gp6UZAFdkrDA
      QmfFj5Am7Jw2DZm7s1aVdR5Ti5baT6a8B1ifcxCbya1vw5
      QmVGkGSV25o3AMjcjjnPVb1PqJzrA1PvvhMiV57cMEuExb
      QmaAbZQVFavUub1cvsXP8tfbk7p5i2cRVvimWv5La2E9U8
      QmPDcnTLF5HftAhteRGVhAHnxwNfLzm541W7LL1rpaChy7
      QmPDpBw1xsGvnNmkt9z9NsgNpezKzFNwVPcLcPVh2Weuwv
      QmPMT3ZUATZWqZEiHd4Kjfe6H9UTTYjRJN4kqLvgWhrJpU
      QmPR6Ggp5BTaKtvGJ9rwn3e86dMtzPRGRNmdLG8Yp7bhkP
      QmNRZdPPtYycST9dPUPLQEqMoFq43fRx5pkKYBjYKuw1Fa
      QmNSYDHzei4vn91k1sdF7oEcBHQWVTbibEBtXUpYSdkapX
      QmNTXQyssJ5vMynxdW1jo5EcQq9XARS9nrGJykcYGKYbNH
      QmNdtfrpDhP4yaRnPTUdqynnewBF4p5tUoq6Ngw1XFdmdj
      QmNfFvrEoBThgW7dcPTkWjqAsicH1eq2GHsHrmCx8bMrxv
      QmNjm16wkUUsUoRmr3b8QoQAPZYBwfBHcygkjGbSauTSWu
      QmNkZvJVtf1AfkudwvkSwfV3Ru5hHFvitUugzvYBxtuPT7
      QmNn2QFMrNcHstJVWnaT8XK31xe1e6HLvz7qg29yE2BGkS
      QmNnnkCTY1ZbtusjMAtJ9Rn5arVtaDBjwRkFTf2WdPGmRq
      QmNoBE4qVq7vuNtAMTqLhMYryL2YiWcrVB7eEbw4nYkjpW
  39. Wait a moment, why does everything start with Qm?
      • sha2-256
      • base58
      • multihash
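The three bullets on the "Qm" slide fit together as follows: a multihash prefixes the digest with a hash-function code and a length, and for sha2-256 those two bytes are 0x12 and 0x20; encoding the resulting 34 bytes in base58 always yields a 46-character string beginning with "Qm". The sketch below implements this by hand; the function names are assumptions for illustration, not a library API.

```python
import hashlib

# Bitcoin-style base58 alphabet: no 0, O, I, or l.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58, keeping leading zero bytes as '1'."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def base58_decode(text: str) -> bytes:
    """Inverse of base58_encode."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def sha256_multihash(payload: bytes) -> bytes:
    """Multihash layout: <hash code 0x12><digest length 0x20><digest>."""
    digest = hashlib.sha256(payload).digest()
    return bytes([0x12, len(digest)]) + digest


if __name__ == "__main__":
    encoded = base58_encode(sha256_multihash(b"hello world"))
    print(encoded[:2], len(encoded))
```

This will not reproduce the hash printed by `ipfs add` for the same text, because IPFS hashes the encoded block (the UnixFS wrapper around the data) rather than the raw bytes.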
  40. IPFS use cases
      1. As a mounted global filesystem, under /ipfs and /ipns.
      2. As a mounted personal sync folder that automatically versions, publishes, and backs up any writes.
      3. As an encrypted file or data sharing system.
      4. As a versioned package manager for all software.
      5. As the root filesystem of a Virtual Machine.
      6. As the boot filesystem of a VM (under a hypervisor).
      7. As a database: applications can write directly to the Merkle DAG data model and get all the versioning, caching, and distribution IPFS provides.
      8. As a linked (and encrypted) communications platform.
      9. As an integrity checked CDN for large files (without SSL).
      10. As an encrypted CDN.
      11. On webpages, as a web CDN.
  41. Problems in IPFS
      • Data is not replicated automatically by default
          • you may lose your data if nobody is using or pinning it, see this discussion
          • at the moment it serves as a filesystem cache
          • ipfs cluster allows files to be pinned across a cluster
      • IPFS cluster is not efficient at replication
          • at the moment, either accept it
          • or build your own with an erasure code such as Reed-Solomon
  42. IPFS for private usage
  43. IPFS for blockchain