SlideShare a Scribd company logo
1 of 33
Download to read offline
Module:
Content Addressing
in IPFS
ResNetLab on Tour
Agenda
➔ Content Addressing Motivation
◆ What is it and why it’s good
➔ Content Addressing in IPFS
◆ Chunking
◆ Linking Chunks in Merkle DAGs
◆ Going from Data to Data Structures with IPLD
➔ Anatomy of the IPFS Content Identifier (CID)
Motivation for Content
Addressing
What is it and why it’s good Motivation
Content Addressing in IPFS
Anatomy of a CID
Why location
addressing fails us
● The URL only points to a single copy, stored in a single location.
● If that copy disappears, there is no way to know where other
copies are.
● It is not possible for a user to validate the integrity of the
content
○ A malicious actor can poison DNS, or change the copy’s location, without the end
user noticing.
○ HTTPS is an improvement, but only secures the transport, not the content.
● No request aggregation, resulting in duplication of effort and
bandwidth waste (i.e. no options for multicast in the wild).
Location- vs content-addressing
location addressing:
ask exactly one host:
IP(dns(URL))
content addressing:
ask any host, choose closest host
The address of content (i.e. its fingerprint) is derived from the content
itself and not from the location where it is stored.
/ipfs/QmW98pJrc6FZ6/cat.png
IP:120.1.11.22
http://example.com/cat.png
http://10.20.30.40/cat.png
ipfs://QmW98pJrc6FZ6/cat.png
IP:10.20.30.40
IP:15.25.35.45
location
content
CIDs are:
● the most fundamental ingredient of the IPFS architecture
● used for content addressing
● used to name every piece of data in IPFS
● a hash with some metadata
● self describing
Content
Identifier
CIDv0: QmS4ustL54uo8FzR9455qaxZwuMiUhyvMcX9Ba8nUH4uVv
CIDv1: bafybeibxm2nsadl3fnxv2sxcxmxaco2jl53wpeorjdzidjwf5aqdg7wa6u
CIDs are
Immutable links
Deduplication Identical data can be verified by its address ->
Caching -> saves resources and provides faster access to content
Self-certification Content is authenticated by its address, not by a
Certificate Authority -> Decentralisation
Immutability If the content changes, its address also changes ->
Integrity Checking
Why Immutability?
fingerprint
data
Integrity Checking
hash
==
!=
Integrity Checking
Why Immutability?
😊
👍
😤
👎
a
CONTENT ADDRESSING CONTENT DISCOVERY
& ROUTING
CONTENT EXCHANGE
● Chunking
● Linking Chunks
in Merkle DAGs
● From Data to
Data Structures with IPLD
● Anatomy of the IPFS CID
● Routing & Provider
Records
● DHT-based Routing
● Gossip-based Routing
● Bitswap
● GraphSync
MUTABLE NAMES &
MESSAGE DELIVERY
● Dynamic Data
● IPNS
● PubSub
● CRDTs
IPFS Components
Content Addressing
1 Chunking
2 Linking Chunks
3 Going from Data to Data Structures
Motivation
Content Addressing in IPFS
Anatomy of a CID
a
CONTENT ADDRESSING CONTENT DISCOVERY
& ROUTING
CONTENT EXCHANGE
● Chunking
● Linking Chunks
in Merkle DAGs
● From Data to
Data Structures with IPLD
● Anatomy of the IPFS CID
● Routing & Provider
Records
● DHT-based Routing
● Gossip-based Routing
● Bitswap
● GraphSync
MUTABLE NAMES &
MESSAGE DELIVERY
● Dynamic Data
● IPNS
● PubSub
● CRDTs
IPFS Components
Content Addressing
Chunking
● Deduplication
● Piecewise Transfer
● Random Access
Each chunk is individually addressed and
identified by its own hash
File
Chunked File
Chunking
● Deduplication
● Piecewise Transfer
● Random Access
Optimise storage requirements
Content Addressing
File
Chunked File
● Deduplication
● Piecewise Transfer
● Random Access
Identify errors without having to
fetch the whole file
Discard parts that arrived in error
Chunking
Content Addressing
File
Chunked File
● Deduplication
● Piecewise Transfer
● Random Access
Optimise bandwidth requirements
Fetch the parts you need only
Chunking
Content Addressing
File
Chunked File
a
CONTENT ADDRESSING CONTENT DISCOVERY
& ROUTING
CONTENT EXCHANGE
● Chunking
● Linking Chunks
in Merkle DAGs
● From Data to
Data Structures with IPLD
● Anatomy of the IPFS CID
● Routing & Provider
Records
● DHT-based Routing
● Gossip-based Routing
● Bitswap
● GraphSync
MUTABLE NAMES &
MESSAGE DELIVERY
● Dynamic Data
● IPNS
● PubSub
● CRDTs
IPFS Components
• Files are split in chunks that are connected into a Merkle Directed Acyclic
Graph (Merkle DAG).
• In Merkle DAGs, similarly to Merkle Trees:
• Every chunk is represented as a node.
• The hash of the chunk is the address of the node.
• The address of every node is embedded in its parent, as a link.
• What is not possible with Merkle Trees is for a node to have multiple
parents.
• Each node in the DAG is addressed and can be accessed independently,
making it an entity of its own.
Content Addressing in IPFS
File Chunks:
UnixFS File:
(merkle-link)
(a hash)
(merkle-tree)
0-200 200-350
0-100 100-200 200-300 300-350
Linking Chunks in a DAG
Content Addressing
File Chunks:
UnixFS File:
(merkle-link)
(a hash)
(merkle-tree-dag) - directed acyclic graph
0-200 200-350
0-100 100-200 200-300 300-350
Merkle DAGs are graph data structures where each node is content-addressed
Visit: dag.ipfs.io
Linking Chunks in a DAG
Content Addressing
a
CONTENT ADDRESSING CONTENT DISCOVERY
& ROUTING
CONTENT EXCHANGE
● Chunking
● Linking Chunks
in Merkle DAGs
● From Data to
Data Structures with IPLD
● Anatomy of the IPFS CID
● Routing & Provider
Records
● DHT-based Routing
● Gossip-based Routing
● Bitswap
● GraphSync
MUTABLE NAMES &
MESSAGE DELIVERY
● Dynamic Data
● IPNS
● PubSub
● CRDTs
IPFS Components
• IPLD provides the standards and formats to build data structures
based on Merkle-DAGs.
• With IPLD we go from Data to Data Structures.
• You can think of an IPLD Graph as the tree of folders in your file
system. It starts from the root and splits to directories and files.
• Recall:
• Every point in the tree is a node in the IPLD graph
• Every node in the IPLD graph has its own CID with the root node
having the root CID
• But also: every point in the tree points and is linked to from its
parent node.
InterPlanetary Linked
Data (IPLD)
LOCATION-BASED
ADDRESSING
CONTENT-BASED
ADDRESSING
IPLD-Powered Merkle-DAGs
Content Addressing with Merkle DAGs:
From Data to Data Structures with IPLD
LOCATION-BASED
IDENTIFIER
CONTENT-BASED
IDENTIFIER
IPLD-Powered Merkle-DAGs
http://example.com corresponds to ipfs://Qm<hash>
Only one host can serve this Anyone with the content can serve this
IPLD lets you seamlessly link and traverse various types of content-linked data
a
CONTENT ADDRESSING CONTENT DISCOVERY
& ROUTING
CONTENT EXCHANGE
● Chunking
● Linking Chunks
in Merkle DAGs
● From Data to
Data Structures with IPLD
● Anatomy of the IPFS CID
● Routing & Provider
Records
● DHT-based Routing
● Gossip-based Routing
● Bitswap
● GraphSync
MUTABLE NAMES &
MESSAGE DELIVERY
● Dynamic Data
● IPNS
● PubSub
● CRDTs
IPFS Components
Anatomy of the IPFS CID
Motivation
Content Addressing in IPFS
Anatomy of the IPFS CID
CIDs are:
● used for content addressing
● used to name every piece of data in IPFS
● a hash with some metadata
● self describing
Content
Identifier
Basic Concepts
CIDv0: QmS4ustL54uo8FzR9455qaxZwuMiUhyvMcX9Ba8nUH4uVv
CIDv1: bafybeibxm2nsadl3fnxv2sxcxmxaco2jl53wpeorjdzidjwf5aqdg7wa6u
Self-Certification: Content is authenticated by its address
Immutability: If the content changes, its address also changes
De-duplication: Same data can be identified by its address
Anatomy of a CID
<base>base(<cid-version><multicodec><multihash>)
Multihash A self-describing hash digest.
Multicodec A pre-set number to uniquely identify a format,
or protocol.
Multibase A self-describing base-encoded string.
Anatomy of
a CID
name, tag, code, description
identity, multihash, 0x00, raw binary
ip4, multiaddr, 0x04,
dccp, multiaddr, 0x21,
dnsaddr, multiaddr, 0x38,
protobuf, serialization, 0x50, Protocol Buffers
cbor, serialization, 0x51, CBOR
raw, ipld, 0x55, raw binary
...
Multihash A self-describing hash digest:
● Hash Function (multicodec)
● Hash Digest Length
● Hash Digest
• Multicodec: a pre-set number to uniquely identify a format, or
protocol.
Multibase A self-describing base encoding.
● A multibase prefix.
■ b - base32
■ z - base58
■ f - base16
● Followed by the base encoded data.
Full list: https://github.com/multiformats/multicodec
Anatomy of
a CID
Binary Breakdown
110010010…
1000000000000010
01110000
00000001 00010010
CID Version
Multicodec
Hash
function
sha-256
(0x12)
Length
Multicodec
Actual
Content Hash!
CID-V1
How to
interpret
the data
dag-pb
(0x70) 128 | 2
How to interpret the
CID
CID-V0
or
CID-V1
bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi
CID-V1
QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR
CID-V0
Visit: cid.ipfs.io
<- IPLD encoding ->
<base>base(<cid-version><multicodec><multihash>)
Anatomy of
a CID
Common Misconception
The CID and in particular the Multihash part (hash digest) of the CID applies to
the DAG representation of the file - not the file itself.
CID
file bytes
DAG
representation of
file bytes
Thank you for watching
Reach out in case you have questions or comments!

More Related Content

What's hot

InterPlanetary File System (IPFS)
InterPlanetary File System (IPFS)InterPlanetary File System (IPFS)
InterPlanetary File System (IPFS)Gene Leybzon
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight OverviewJacques Nadeau
 
Apache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsApache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsDataWorks Summit/Hadoop Summit
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseDataWorks Summit
 
Road to NODES - Healthcare Analytics
Road to NODES - Healthcare AnalyticsRoad to NODES - Healthcare Analytics
Road to NODES - Healthcare AnalyticsNeo4j
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsAnton Kirillov
 
Intro to Neo4j
Intro to Neo4jIntro to Neo4j
Intro to Neo4jNeo4j
 
Ceph Introduction 2017
Ceph Introduction 2017  Ceph Introduction 2017
Ceph Introduction 2017 Karan Singh
 
3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and DeltaDatabricks
 
Observability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineageObservability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineageDatabricks
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsAlluxio, Inc.
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowJulien Le Dem
 
Neo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseNeo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseMindfire Solutions
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark InternalsPietro Michiardi
 
Data and AI summit: data pipelines observability with open lineage
Data and AI summit: data pipelines observability with open lineageData and AI summit: data pipelines observability with open lineage
Data and AI summit: data pipelines observability with open lineageJulien Le Dem
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflakeSivakumar Ramar
 

What's hot (20)

InterPlanetary File System (IPFS)
InterPlanetary File System (IPFS)InterPlanetary File System (IPFS)
InterPlanetary File System (IPFS)
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight Overview
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
 
Apache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsApache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop components
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
 
Road to NODES - Healthcare Analytics
Road to NODES - Healthcare AnalyticsRoad to NODES - Healthcare Analytics
Road to NODES - Healthcare Analytics
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
 
Intro to Neo4j
Intro to Neo4jIntro to Neo4j
Intro to Neo4j
 
Ceph Introduction 2017
Ceph Introduction 2017  Ceph Introduction 2017
Ceph Introduction 2017
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta
 
Observability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineageObservability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineage
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
Neo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseNeo4J : Introduction to Graph Database
Neo4J : Introduction to Graph Database
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
 
Data and AI summit: data pipelines observability with open lineage
Data and AI summit: data pipelines observability with open lineageData and AI summit: data pipelines observability with open lineage
Data and AI summit: data pipelines observability with open lineage
 
Ipfs
IpfsIpfs
Ipfs
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflake
 

Similar to IPFS Content Addressing Explained

Webinar: Applying REST to Network Management – An Implementor’s View
Webinar: Applying REST to Network Management – An Implementor’s View Webinar: Applying REST to Network Management – An Implementor’s View
Webinar: Applying REST to Network Management – An Implementor’s View Tail-f Systems
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataGiorgos Santipantakis
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudOntotext
 
Minerva: Drill Storage Plugin for IPFS
Minerva: Drill Storage Plugin for IPFSMinerva: Drill Storage Plugin for IPFS
Minerva: Drill Storage Plugin for IPFSBowenDing4
 
Hops - Distributed metadata for Hadoop
Hops - Distributed metadata for HadoopHops - Distributed metadata for Hadoop
Hops - Distributed metadata for HadoopJim Dowling
 
Spark - The beginnings
Spark -  The beginningsSpark -  The beginnings
Spark - The beginningsDaniel Leon
 
Introduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQLIntroduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQLdatamantra
 
FIWARE Global Summit - NGSI-LD: Modelling, Linking and Utilizing Context Info...
FIWARE Global Summit - NGSI-LD: Modelling, Linking and Utilizing Context Info...FIWARE Global Summit - NGSI-LD: Modelling, Linking and Utilizing Context Info...
FIWARE Global Summit - NGSI-LD: Modelling, Linking and Utilizing Context Info...FIWARE
 
Data Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionData Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionAlluxio, Inc.
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of MetadataJim Dowling
 
Wisely Chen Spark Talk At Spark Gathering in Taiwan
Wisely Chen Spark Talk At Spark Gathering in Taiwan Wisely Chen Spark Talk At Spark Gathering in Taiwan
Wisely Chen Spark Talk At Spark Gathering in Taiwan Wisely chen
 
Mongo presentation conf
Mongo presentation confMongo presentation conf
Mongo presentation confShridhar Joshi
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Oscar Corcho
 
Apache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysApache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysDemi Ben-Ari
 
Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data FabricUsing Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data FabricCambridge Semantics
 

Similar to IPFS Content Addressing Explained (20)

Webinar: Applying REST to Network Management – An Implementor’s View
Webinar: Applying REST to Network Management – An Implementor’s View Webinar: Applying REST to Network Management – An Implementor’s View
Webinar: Applying REST to Network Management – An Implementor’s View
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
 
Minerva: Drill Storage Plugin for IPFS
Minerva: Drill Storage Plugin for IPFSMinerva: Drill Storage Plugin for IPFS
Minerva: Drill Storage Plugin for IPFS
 
Zipkin
ZipkinZipkin
Zipkin
 
Hops - Distributed metadata for Hadoop
Hops - Distributed metadata for HadoopHops - Distributed metadata for Hadoop
Hops - Distributed metadata for Hadoop
 
Spark - The beginnings
Spark -  The beginningsSpark -  The beginnings
Spark - The beginnings
 
Introduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQLIntroduction to Structured Data Processing with Spark SQL
Introduction to Structured Data Processing with Spark SQL
 
FIWARE Global Summit - NGSI-LD: Modelling, Linking and Utilizing Context Info...
FIWARE Global Summit - NGSI-LD: Modelling, Linking and Utilizing Context Info...FIWARE Global Summit - NGSI-LD: Modelling, Linking and Utilizing Context Info...
FIWARE Global Summit - NGSI-LD: Modelling, Linking and Utilizing Context Info...
 
Data Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionData Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage Evolution
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of Metadata
 
NOSQL Overview
NOSQL OverviewNOSQL Overview
NOSQL Overview
 
Wisely Chen Spark Talk At Spark Gathering in Taiwan
Wisely Chen Spark Talk At Spark Gathering in Taiwan Wisely Chen Spark Talk At Spark Gathering in Taiwan
Wisely Chen Spark Talk At Spark Gathering in Taiwan
 
Accessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDSAccessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDS
 
HDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the CloudHDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the Cloud
 
Mongo presentation conf
Mongo presentation confMongo presentation conf
Mongo presentation conf
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?
 
Apache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysApache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - Panorays
 
Big data clustering
Big data clusteringBig data clustering
Big data clustering
 
Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data FabricUsing Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric
 

Recently uploaded

Unidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxUnidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxmibuzondetrabajo
 
Company Snapshot Theme for Business by Slidesgo.pptx
Company Snapshot Theme for Business by Slidesgo.pptxCompany Snapshot Theme for Business by Slidesgo.pptx
Company Snapshot Theme for Business by Slidesgo.pptxMario
 
IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119APNIC
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationLinaWolf1
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书rnrncn29
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书rnrncn29
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predieusebiomeyer
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
TRENDS Enabling and inhibiting dimensions.pptx
TRENDS Enabling and inhibiting dimensions.pptxTRENDS Enabling and inhibiting dimensions.pptx
TRENDS Enabling and inhibiting dimensions.pptxAndrieCagasanAkio
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 

Recently uploaded (11)

Unidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxUnidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptx
 
Company Snapshot Theme for Business by Slidesgo.pptx
Company Snapshot Theme for Business by Slidesgo.pptxCompany Snapshot Theme for Business by Slidesgo.pptx
Company Snapshot Theme for Business by Slidesgo.pptx
 
IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 Documentation
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predi
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 
TRENDS Enabling and inhibiting dimensions.pptx
TRENDS Enabling and inhibiting dimensions.pptxTRENDS Enabling and inhibiting dimensions.pptx
TRENDS Enabling and inhibiting dimensions.pptx
 
Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 

IPFS Content Addressing Explained

  • 2. Agenda ➔ Content Addressing Motivation ◆ What is it and why it’s good ➔ Content Addressing in IPFS ◆ Chunking ◆ Linking Chunks in Merkle DAGs ◆ Going from Data to Data Structures with IPLD ➔ Anatomy of the IPFS Content Identifier (CID)
  • 3. Motivation for Content Addressing What is it and why it’s good Motivation Content Addressing in IPFS Anatomy of a CID
  • 4. Why location addressing fails us ● The URL only points to a single copy, stored in a single location. ● If that copy disappears, there is no way to know where other copies are. ● It is not possible for a user to validate the integrity of the content ○ A malicious actor can poison DNS, or change the copy’s location, without the end user noticing. ○ HTTPS is an improvement, but only secures the transport, not the content. ● No request aggregation, resulting in duplication of effort and bandwidth waste (i.e. no options for multicast in the wild).
  • 5. Location- vs content-addressing location addressing: ask exactly one host: IP(dns(URL)) content addressing: ask any host, choose closest host The address of content (i.e. its fingerprint) is derived from the content itself and not from the location where it is stored.
  • 7. CIDs are: ● the most fundamental ingredient of the IPFS architecture ● used for content addressing ● used to name every piece of data in IPFS ● a hash with some metadata ● self describing Content Identifier CIDv0: QmS4ustL54uo8FzR9455qaxZwuMiUhyvMcX9Ba8nUH4uVv CIDv1: bafybeibxm2nsadl3fnxv2sxcxmxaco2jl53wpeorjdzidjwf5aqdg7wa6u
  • 8. CIDs are Immutable links Deduplication Identical data can be verified by its address -> Caching -> saves resources and provides faster access to content Self-certification Content is authenticated by its address, not by a Certificate Authority -> Decentralisation Immutability If the content changes, its address also changes -> Integrity Checking
  • 11. a CONTENT ADDRESSING CONTENT DISCOVERY & ROUTING CONTENT EXCHANGE ● Chunking ● Linking Chunks in Merkle DAGs ● From Data to Data Structures with IPLD ● Anatomy of the IPFS CID ● Routing & Provider Records ● DHT-based Routing ● Gossip-based Routing ● Bitswap ● GraphSync MUTABLE NAMES & MESSAGE DELIVERY ● Dynamic Data ● IPNS ● PubSub ● CRDTs IPFS Components
  • 12. Content Addressing 1 Chunking 2 Linking Chunks 3 Going from Data to Data Structures Motivation Content Addressing in IPFS Anatomy of a CID
  • 13. a CONTENT ADDRESSING CONTENT DISCOVERY & ROUTING CONTENT EXCHANGE ● Chunking ● Linking Chunks in Merkle DAGs ● From Data to Data Structures with IPLD ● Anatomy of the IPFS CID ● Routing & Provider Records ● DHT-based Routing ● Gossip-based Routing ● Bitswap ● GraphSync MUTABLE NAMES & MESSAGE DELIVERY ● Dynamic Data ● IPNS ● PubSub ● CRDTs IPFS Components
  • 14. Content Addressing Chunking ● Deduplication ● Piecewise Transfer ● Random Access Each chunk is individually addressed and identified by its own hash File Chunked File
  • 15. Chunking ● Deduplication ● Piecewise Transfer ● Random Access Optimise storage requirements Content Addressing File Chunked File
  • 16. ● Deduplication ● Piecewise Transfer ● Random Access Identify errors without having to fetch the whole file Discard parts that arrived in error Chunking Content Addressing File Chunked File
  • 17. ● Deduplication ● Piecewise Transfer ● Random Access Optimise bandwidth requirements Fetch the parts you need only Chunking Content Addressing File Chunked File
  • 18. a CONTENT ADDRESSING CONTENT DISCOVERY & ROUTING CONTENT EXCHANGE ● Chunking ● Linking Chunks in Merkle DAGs ● From Data to Data Structures with IPLD ● Anatomy of the IPFS CID ● Routing & Provider Records ● DHT-based Routing ● Gossip-based Routing ● Bitswap ● GraphSync MUTABLE NAMES & MESSAGE DELIVERY ● Dynamic Data ● IPNS ● PubSub ● CRDTs IPFS Components
  • 19. • Files are split in chunks that are connected into a Merkle Directed Acyclic Graph (Merkle DAG). • In Merkle DAGs, similarly to Merkle Trees: • Every chunk is represented as a node. • The hash of the chunk is the address of the node. • The address of every node is embedded in its parent, as a link. • What is not possible with Merkle Trees is for a node to have multiple parents. • Each node in the DAG is addressed and can be accessed independently, making it an entity of its own. Content Addressing in IPFS
  • 20. File Chunks: UnixFS File: (merkle-link) (a hash) (merkle-tree) 0-200 200-350 0-100 100-200 200-300 300-350 Linking Chunks in a DAG Content Addressing
  • 21. File Chunks: UnixFS File: (merkle-link) (a hash) (merkle-tree-dag) - directed acyclic graph 0-200 200-350 0-100 100-200 200-300 300-350 Merkle DAGs are graph data structures where each node is content-addressed Visit: dag.ipfs.io Linking Chunks in a DAG Content Addressing
  • 22. a CONTENT ADDRESSING CONTENT DISCOVERY & ROUTING CONTENT EXCHANGE ● Chunking ● Linking Chunks in Merkle DAGs ● From Data to Data Structures with IPLD ● Anatomy of the IPFS CID ● Routing & Provider Records ● DHT-based Routing ● Gossip-based Routing ● Bitswap ● GraphSync MUTABLE NAMES & MESSAGE DELIVERY ● Dynamic Data ● IPNS ● PubSub ● CRDTs IPFS Components
  • 23. • IPLD provides the standards and formats to build data structures based on Merkle-DAGs. • With IPLD we go from Data to Data Structures. • You can think of an IPLD Graph as the tree of folders in your file system. It starts from the root and splits to directories and files. • Recall: • Every point in the tree is a node in the IPLD graph • Every node in the IPLD graph has its own CID with the root node having the root CID • But also: every point in the tree points and is linked to from its parent node. InterPlanetary Linked Data (IPLD)
  • 24. LOCATION-BASED ADDRESSING CONTENT-BASED ADDRESSING IPLD-Powered Merkle-DAGs Content Addressing with Merkle DAGs: From Data to Data Structures with IPLD LOCATION-BASED IDENTIFIER CONTENT-BASED IDENTIFIER IPLD-Powered Merkle-DAGs http://example.com corresponds to ipfs://Qm<hash> Only one host can serve this Anyone with the content can serve this IPLD lets you seamlessly link and traverse various types of content-linked data
  • 25. a CONTENT ADDRESSING CONTENT DISCOVERY & ROUTING CONTENT EXCHANGE ● Chunking ● Linking Chunks in Merkle DAGs ● From Data to Data Structures with IPLD ● Anatomy of the IPFS CID ● Routing & Provider Records ● DHT-based Routing ● Gossip-based Routing ● Bitswap ● GraphSync MUTABLE NAMES & MESSAGE DELIVERY ● Dynamic Data ● IPNS ● PubSub ● CRDTs IPFS Components
  • 26. Anatomy of the IPFS CID Motivation Content Addressing in IPFS Anatomy of the IPFS CID
  • 27. CIDs are: ● used for content addressing ● used to name every piece of data in IPFS ● a hash with some metadata ● self describing Content Identifier Basic Concepts CIDv0: QmS4ustL54uo8FzR9455qaxZwuMiUhyvMcX9Ba8nUH4uVv CIDv1: bafybeibxm2nsadl3fnxv2sxcxmxaco2jl53wpeorjdzidjwf5aqdg7wa6u Self-Certification: Content is authenticated by its address Immutability: If the content changes, its address also changes De-duplication: Same data can be identified by its address
  • 28. Anatomy of a CID <base>base(<cid-version><multicodec><multihash>) Multihash A self-describing hash digest. Multicodec A pre-set number to uniquely identify a format, or protocol. Multibase A self-describing base-encoded string.
  • 29. Anatomy of a CID name, tag, code, description identity, multihash, 0x00, raw binary ip4, multiaddr, 0x04, dccp, multiaddr, 0x21, dnsaddr, multiaddr, 0x38, protobuf, serialization, 0x50, Protocol Buffers cbor, serialization, 0x51, CBOR raw, ipld, 0x55, raw binary ... Multihash A self-describing hash digest: ● Hash Function (multicodec) ● Hash Digest Length ● Hash Digest • Multicodec: a pre-set number to uniquely identify a format, or protocol. Multibase A self-describing base encoding. ● A multibase prefix. ■ b - base32 ■ z - base58 ■ f - base16 ● Followed by the base encoded data. Full list: https://github.com/multiformats/multicodec
  • 30. Anatomy of a CID Binary Breakdown 110010010… 1000000000000010 01110000 00000001 00010010 CID Version Multicodec Hash function sha-256 (0x12) Length Multicodec Actual Content Hash! CID-V1 How to interpret the data dag-pb (0x70) 128 | 2 How to interpret the CID CID-V0 or CID-V1 bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi CID-V1 QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR CID-V0 Visit: cid.ipfs.io <- IPLD encoding -> <base>base(<cid-version><multicodec><multihash>)
  • 31.
  • 32. Anatomy of a CID Common Misconception The CID and in particular the Multihash part (hash digest) of the CID applies to the DAG representation of the file - not the file itself. CID file bytes DAG representation of file bytes
  • 33. Thank you for watching Reach out in case you have questions or comments!