An Overview of Peer-to-Peer
Sami Rollins
Outline
• P2P Overview
– What is a peer?
– Example applications
– Benefits of P2P
– Is this just distributed computing?
• P2P Challenges
• Distributed Hash Tables (DHTs)
What is Peer-to-Peer (P2P)?
• Napster?
• Gnutella?
• Most people think of P2P as music sharing
What is a peer?
• Contrasted with
Client-Server model
• Servers are centrally
maintained and
administered
• Client has fewer
resources than a server
What is a peer?
• A peer’s resources are
similar to the resources of
the other participants
• P2P – peers
communicating directly
with other peers and
sharing resources
• Often administered by
different entities
– Compare with DNS
P2P Application Taxonomy
P2P Systems
– Distributed Computing (SETI@home)
– File Sharing (Gnutella)
– Collaboration (Jabber)
– Platforms (JXTA)
Distributed Computing
Collaboration
[Figure: peers exchanging sendMessage / receiveMessage calls]
Platforms
[Figure: platform layer exposing Find Peers … Send Messages primitives to applications such as Gnutella and instant messaging]
P2P Goals/Benefits
• Cost sharing
• Resource aggregation
• Improved scalability/reliability
• Increased autonomy
• Anonymity/privacy
• Dynamism
• Ad-hoc communication
P2P File Sharing
• Centralized
– Napster
• Decentralized
– Gnutella
• Hierarchical
– Kazaa
• Incentivized
– BitTorrent
• Distributed Hash Tables
– Chord, CAN, Tapestry, Pastry
Challenges
• Peer discovery
• Group management
• Search
• Download
• Incentives
Metrics
• Per-node state
• Bandwidth usage
• Search time
• Fault tolerance/resiliency
Centralized
• Napster model
– Server contacted during search
– Peers directly exchange content
• Benefits:
– Efficient search
– Limited bandwidth usage
– No per-node state
• Drawbacks:
– Central point of failure
– Limited scale
[Figure: peers Bob, Alice, Jane, and Judy around a central index server]
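The Napster-style flow above can be sketched in a few lines; the class and peer addresses are illustrative, not Napster's real protocol.

```python
class CentralIndex:
    """Napster-style central index: search is centralized, content moves peer-to-peer."""
    def __init__(self):
        self.index = {}                       # filename -> set of peer addresses

    def register(self, peer, files):
        for f in files:
            self.index.setdefault(f, set()).add(peer)

    def search(self, filename):
        # One round trip to the server: efficient search, limited bandwidth,
        # no per-node state -- but a central point of failure.
        return self.index.get(filename, set())

index = CentralIndex()
index.register("alice:6699", ["song.mp3"])
index.register("bob:6699", ["song.mp3", "talk.mp3"])
print(sorted(index.search("song.mp3")))       # ['alice:6699', 'bob:6699']
```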
Decentralized (Flooding)
• Gnutella model
– Search is flooded to neighbors
– Neighbors are determined
randomly
• Benefits:
– No central point of failure
– Limited per-node state
• Drawbacks:
– Slow searches
– Bandwidth intensive
[Figure: randomly connected peers Bob, Alice, Jane, Judy, and Carl]
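A minimal sketch of the flooding search described above, assuming a hard-coded neighbor graph; the TTL and the message counter illustrate why the approach is bandwidth intensive.

```python
def flood_search(graph, start, target_file, files, ttl=3):
    """Forward the query to all neighbors, decrementing TTL; dedupe by node."""
    hits, frontier, seen, messages = [], [(start, ttl)], {start}, 0
    while frontier:
        node, t = frontier.pop()
        if target_file in files.get(node, set()):
            hits.append(node)
        if t == 0:
            continue
        for nbr in graph.get(node, []):
            messages += 1                     # every forwarded copy costs bandwidth
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, t - 1))
    return hits, messages

graph = {"Bob": ["Alice", "Carl"], "Alice": ["Bob", "Jane"],
         "Jane": ["Alice", "Judy"], "Judy": ["Jane"], "Carl": ["Bob"]}
files = {"Judy": {"song.mp3"}}
hits, msgs = flood_search(graph, "Bob", "song.mp3", files)
print(hits, msgs)                             # ['Judy'] 7
```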
Hierarchical
• Kazaa/new Gnutella model
– Nodes with high bandwidth/long uptime
become supernodes/ultrapeers
– Search requests sent to supernode
– Supernode caches info about attached leaf
nodes
– Supernodes connect to each other (32 in
Limewire)
• Benefits:
– Search faster than flooding
• Drawbacks:
– Many of the same problems as
decentralized
– Reconfiguration when supernode fails
[Figure: supernodes SuperBob, SuperFred, and SuperTed interconnected; leaf nodes Andy, Jane, Judy, Carl, Alex, and Alice attached]
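The two-tier search can be sketched as follows; the Supernode class and names are hypothetical, and real ultrapeer protocols (query routing tables, hash filters) are more involved.

```python
class Supernode:
    def __init__(self, name):
        self.name, self.leaf_index, self.peers = name, {}, []

    def attach_leaf(self, leaf, files):
        for f in files:                       # cache info about attached leaf nodes
            self.leaf_index.setdefault(f, set()).add(leaf)

    def search(self, filename, seen=None):
        seen = seen or set()
        seen.add(self.name)
        hits = set(self.leaf_index.get(filename, set()))
        for sn in self.peers:                 # flood only the supernode overlay
            if sn.name not in seen:
                hits |= sn.search(filename, seen)
        return hits

bob, fred = Supernode("SuperBob"), Supernode("SuperFred")
bob.peers, fred.peers = [fred], [bob]
bob.attach_leaf("Jane", ["song.mp3"])
fred.attach_leaf("Alice", ["song.mp3"])
print(sorted(bob.search("song.mp3")))         # ['Alice', 'Jane']
```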
BitTorrent
[Figure: source, .torrent server, tracker, seed, and peer swarm]
1. Download the .torrent file
2. Get the list of peers and seeds (the swarm) from the tracker
3. Exchange vectors of downloaded content with peers
4. Exchange content with peers
5. Update the tracker with progress
BitTorrent
• Key Ideas
– Break large files into small blocks and download blocks
individually
– Provide incentives for uploading content
• Allow download from peers that provide best upload rate
• Benefits
– Incentives
– Centralized search
– No neighbor state (except the peers in your swarm)
• Drawbacks
– “Centralized” search
• No central repository
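Two of the key ideas, rarest-first block selection and rate-based unchoking, can be sketched as below; this is a simplification (real BitTorrent adds optimistic unchoking, sub-piece requests, and hash checks), and the peer names are illustrative.

```python
def rarest_first(my_blocks, peer_bitfields):
    """Choose the block we lack that the fewest swarm peers hold."""
    counts = {}
    for blocks in peer_bitfields.values():
        for b in blocks:
            counts[b] = counts.get(b, 0) + 1
    candidates = [b for b in counts if b not in my_blocks]
    return min(candidates, key=lambda b: counts[b]) if candidates else None

def unchoke(upload_rates, slots=4):
    """Serve the peers that upload to us fastest -- the incentive mechanism."""
    return sorted(upload_rates, key=upload_rates.get, reverse=True)[:slots]

bitfields = {"p1": [0, 1, 2], "p2": [0, 1], "p3": [0, 2]}
print(rarest_first({0}, bitfields))           # 1 (blocks 1 and 2 tie; 1 seen first)
print(unchoke({"p1": 50, "p2": 10, "p3": 90}, slots=2))  # ['p3', 'p1']
```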
Distributed Hash Tables (DHT)
• Chord, CAN, Tapestry, Pastry
model
– AKA Structured P2P networks
– Provide performance guarantees
– If content exists, it will be found
• Benefits:
– More efficient searching
– Limited per-node state
• Drawbacks:
– Limited fault tolerance unless data is
replicated (redundancy)
[Figure: nodes 001, 012, 212, 305, 332; a query for key 212 is routed through the overlay to node 212]
DHTs: Overview
• Goal: Map key to value
• Decentralized with bounded number of neighbors
• Provide guaranteed performance for search
– If content is in network, it will be found
– Number of messages required for search is bounded
• Provide guaranteed performance for join/leave
– Minimal number of nodes affected
• Suitable for applications like file systems that
require guaranteed performance
Comparing DHTs
• Neighbor state
• Search performance
• Join algorithm
• Failure recovery
CAN
• Associate to each node and item a unique id in an
d-dimensional space
• Goals
– Scales to hundreds of thousands of nodes
– Handles rapid arrival and failure of nodes
• Properties
– Routing table size O(d)
– Guarantees that a file is found in at most d·n^(1/d) steps,
where n is the total number of nodes
Slide modified from another presentation
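CAN's greedy routing can be illustrated with a tiny hand-built overlay; the zone centers and neighbor lists below are assumptions for the example (mirroring the figures that follow), not output of the real zone-splitting protocol.

```python
import math

# Hand-built overlay: zone centers and neighbor lists are illustrative.
centers = {"n1": (1, 2), "n2": (4, 2), "n3": (3, 5), "n4": (5, 5), "n5": (6, 6)}
neighbors = {"n1": ["n2", "n3"], "n2": ["n1", "n4", "n5"],
             "n3": ["n1", "n4"], "n4": ["n2", "n3", "n5"], "n5": ["n2", "n4"]}

def route(start, target):
    """Greedy CAN routing: forward to whichever neighbor is closest
    (Euclidean distance) to the target point."""
    path, node = [start], start
    while True:
        best = min([node] + neighbors[node],
                   key=lambda n: math.dist(centers[n], target))
        if best == node:          # no neighbor is closer: treat node as the owner
            return path
        node = best
        path.append(node)

print(route("n1", (7, 5)))        # ['n1', 'n3', 'n4', 'n5']
```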
CAN Example: Two Dimensional
Space
• Space divided between nodes
• All nodes cover the entire
space
• Each node covers either a
square or a rectangular area of
ratios 1:2 or 2:1
• Example:
– Node n1:(1, 2) first node that
joins  cover the entire space
[Figure: 8×8 coordinate space; n1 owns the entire space]
CAN Example: Two Dimensional
Space
• Node n2:(4, 2) joins
• n2 contacts n1
• n1 splits its area and assigns
half to n2
[Figure: 8×8 coordinate space split vertically between n1 and n2]
CAN Example: Two Dimensional
Space
• Nodes n3:(3, 5) n4:(5, 5) and n5:(6,6) join
• Each new node sends JOIN request to an existing
node chosen randomly
• New node gets neighbor table from existing node
• New and existing nodes update neighbor tables and
neighbors accordingly
– before n5 joins, n4 has neighbors n2 and n3
– n5 adds n4 and n2 to neighborlist
– n2 updated to include n5 in neighborlist
• Only O(2d) nodes are affected
[Figure: 8×8 coordinate space divided among n1–n5]
CAN Example: Two Dimensional
Space
• Bootstrapping - assume CAN has an associated
DNS domain and domain resolves to IP of one or
more bootstrap nodes
• Optimizations - landmark routing
– Ping a landmark server(s) and choose an
existing node based on distance to landmark
[Figure: 8×8 coordinate space divided among n1–n5]
CAN Example: Two Dimensional
Space
• Nodes: n1:(1, 2); n2:(4,2);
n3:(3, 5); n4:(5,5);n5:(6,6)
• Items: f1:(2,3); f2:(5,1);
f3:(2,1); f4:(7,5);
[Figure: 8×8 coordinate space with nodes n1–n5 and items f1–f4]
CAN Example: Two Dimensional
Space
• Each item is stored by the
node who owns its mapping
in the space
[Figure: 8×8 coordinate space with nodes n1–n5 and items f1–f4]
CAN: Query Example
• Forward query to the
neighbor that is closest to the
query id (Euclidean distance)
• Example: assume n1 queries
f4
[Figure: 8×8 coordinate space with nodes n1–n5 and items f1–f4]
CAN: Query Example
• Forward query to the
neighbor that is closest to the
query id
• Example: assume n1 queries
f4
[Figure: 8×8 coordinate space with nodes n1–n5 and items f1–f4]
CAN: Query Example
• Forward query to the
neighbor that is closest to the
query id
• Example: assume n1 queries
f4
[Figure: 8×8 coordinate space with nodes n1–n5 and items f1–f4]
CAN: Query Example
• Content guaranteed to be
found in d·n^(1/d) hops
– Each dimension has n^(1/d) nodes
• Increasing the number of
dimensions reduces path
length but increases number
of neighbors
[Figure: 8×8 coordinate space with nodes n1–n5 and items f1–f4]
Node Failure Recovery
• Detection
– Nodes periodically send refresh messages to neighbors
• Simple failures
– neighbor’s neighbors are cached
– when a node fails, one of its neighbors takes over its zone
• when a node fails to receive a refresh from neighbor, it sets a
timer
• many neighbors may simultaneously set their timers
• when a node’s timer goes off, it sends a TAKEOVER to the
failed node’s neighbors
• when a node receives a TAKEOVER it either (a) cancels its
timer if the zone volume of the sender is smaller than its own
or (b) replies with a TAKEOVER
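The timer race above can be sketched in a few lines; assuming timers fire in order of zone volume, the smallest-volume neighbor claims the failed zone (ties broken arbitrarily here by name). The node names and volumes are hypothetical.

```python
def takeover_winner(neighbor_volumes):
    """Simulate the TAKEOVER timers: each neighbor of the failed node sets a
    timer proportional to its own zone volume, so the smallest zone fires
    first and its TAKEOVER message cancels everyone else's timer."""
    return min(neighbor_volumes, key=lambda n: (neighbor_volumes[n], n))

# Suppose n5 fails; its neighbors n2 and n4 cover zones of volume 16 and 4.
print(takeover_winner({"n2": 16, "n4": 4}))   # n4
```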
Chord
• Each node has m-bit id that is a
SHA-1 hash of its IP address
• Nodes are arranged in a circle
modulo 2^m
• Data is hashed to an id in the same
id space
• Node n stores data with id between
n and n’s predecessor
– 0 stores 4-0
– 1 stores 1
– 3 stores 2-3
[Figure: identifier ring 0–7 with nodes 0, 1, and 3]
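The ring above can be reproduced with a short sketch: chord_id shows the hashing step, and successor shows why node 0 stores ids 4 through 0. The m = 3, nodes {0, 1, 3} setup comes from the figure; the function names are illustrative.

```python
import hashlib

M = 3                                   # id space 0 .. 2^m - 1 = 0 .. 7

def chord_id(name):
    """SHA-1 hash of a node address (or a key), truncated to m bits."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)

def successor(nodes, i):
    """The node that stores id i: the first node clockwise at or after i."""
    ns = sorted(nodes)
    return next((n for n in ns if n >= i), ns[0])

nodes = [0, 1, 3]
owners = [successor(nodes, k) for k in range(2 ** M)]
print(owners)   # [0, 1, 3, 3, 0, 0, 0, 0]: 0 stores 4..0, 1 stores 1, 3 stores 2..3
```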
Chord
• Simple query algorithm:
– Node maintains successor
– To find data with id i, query
successor until successor > i
found
• Running time?
[Figure: identifier ring 0–7 with nodes 0, 1, and 3]
Chord
• In reality, nodes maintain a finger
table with more routing
information
– For a node n, the ith entry in its
finger table is the first node
that succeeds n by at least 2^(i-1)
• Size of finger table?
[Figure: ring 0–7 with nodes 0, 1, 3 and finger tables — node 0: 1→1, 2→3, 4→0; node 1: 2→3, 3→3, 5→0; node 3: 4→0, 5→0, 7→0]
Chord
• In reality, nodes maintain a finger
table with more routing
information
– For a node n, the ith entry in its
finger table is the first node
that succeeds n by at least 2^(i-1)
• Size of finger table?
• O(log N)
[Figure: ring 0–7 with nodes 0, 1, 3 and finger tables — node 0: 1→1, 2→3, 4→0; node 1: 2→3, 3→3, 5→0; node 3: 4→0, 5→0, 7→0]
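The finger tables in the figure can be recomputed directly from the definition; this sketch assumes the m = 3, nodes {0, 1, 3} example, and the table has m = O(log N) entries.

```python
def successor(nodes, i, m=3):
    """First node clockwise at or after id i (mod 2^m)."""
    i %= 2 ** m
    ns = sorted(nodes)
    return next((n for n in ns if n >= i), ns[0])

def finger_table(n, nodes, m=3):
    """Entry i points at successor(n + 2^(i-1)), for i = 1..m."""
    return {n + 2 ** (i - 1): successor(nodes, n + 2 ** (i - 1), m)
            for i in range(1, m + 1)}

for n in [0, 1, 3]:
    print(n, finger_table(n, [0, 1, 3]))
# 0 {1: 1, 2: 3, 4: 0}
# 1 {2: 3, 3: 3, 5: 0}
# 3 {4: 0, 5: 0, 7: 0}
```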
Chord
query(key):
    id = hash(key)
    if id == node id:                  # this node stores the data
        data found
    else if id in finger table:        # a finger points directly at the owner
        data found
    else:
        p = find_predecessor(id)
        n = find_successor(p)          # n stores the data

find_predecessor(id):
    choose n in finger table closest to id
    if n < id <= find_successor(n):
        return n
    else:
        ask n for its finger entry closest to id and recurse
[Figure: ring 0–7 with nodes 0, 1, 3 and finger tables — node 0: 1→1, 2→3, 4→0; node 1: 2→3, 3→3, 5→0; node 3: 4→0, 5→0, 7→0]
Chord
• Running time of
query algorithm?
– Problem size is
halved at each
iteration
[Figure: ring 0–7 with nodes 0, 1, 3 and finger tables — node 0: 1→1, 2→3, 4→0; node 1: 2→3, 3→3, 5→0; node 3: 4→0, 5→0, 7→0]
Chord
• Running time of
query algorithm?
– O(log N)
[Figure: ring 0–7 with nodes 0, 1, 3 and finger tables — node 0: 1→1, 2→3, 4→0; node 1: 2→3, 3→3, 5→0; node 3: 4→0, 5→0, 7→0]
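The query algorithm can be sketched on the example ring; the finger tables are taken from the figure, and the hop count illustrates the O(log N) bound (each hop roughly halves the remaining distance). The helper names are illustrative, not the paper's exact pseudocode.

```python
RING = 8                                        # 2^m with m = 3
SUCC = {0: 1, 1: 3, 3: 0}
FINGERS = {0: [1, 3, 0], 1: [3, 3, 0], 3: [0, 0, 0]}   # from the figure

def between(x, a, b):
    """True if x lies strictly inside the clockwise interval (a, b)."""
    if a == b:
        return x != a                           # interval wraps the whole ring
    return a < x < b if a < b else x > a or x < b

def lookup(n, key):
    """Return (owning node, hops), hopping via fingers."""
    hops = 0
    while not (key == SUCC[n] or between(key, n, SUCC[n])):
        preceding = [f for f in FINGERS[n] if between(f, n, key)]
        # hop to the closest preceding finger (furthest clockwise from n)
        n = max(preceding, key=lambda f: (f - n) % RING) if preceding else SUCC[n]
        hops += 1
    return SUCC[n], hops

print(lookup(1, 6))     # (0, 1): node 1 forwards via finger 3; node 0 owns id 6
```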
Chord
• Join
– initialize
predecessor and
fingers
– update fingers and
predecessors of
existing nodes
– transfer data
[Figure: ring 0–7 after node 6 joins — node 0: 1→1, 2→3, 4→6; node 1: 2→3, 3→3, 5→6; node 3: 4→6, 5→6, 7→0; node 6: 7→0, 0→0, 2→3]
Chord
• Initialize predecessor and finger of
new node n*
– n* contacts existing node in
network n
– n does a lookup of predecessor
of n*
– for each entry in finger table,
look up successor
• Running time - O(m log N)
• Optimization - initialize n* with
finger table of successor
– with high probability, reduces
running time to O(log N)
[Figure: ring 0–7 after node 6 joins — node 0: 1→1, 2→3, 4→6; node 1: 2→3, 3→3, 5→6; node 3: 4→6, 5→6, 7→0; node 6: 7→0, 0→0, 2→3]
Chord
• Update existing nodes
– n* becomes the ith finger of a node p if
• p precedes n* by at least 2^(i-1), and
• the current ith finger of p succeeds n*
– start at the predecessor of n* and
walk backwards
– for i = 1 to m:
• find the predecessor of n* − 2^(i-1)
• update its table and recurse
• Running time O(log² N)
[Figure: ring 0–7 after node 6 joins — node 0: 1→1, 2→3, 4→6; node 1: 2→3, 3→3, 5→6; node 3: 4→6, 5→6, 7→0; node 6: 7→0, 0→0, 2→3]
Chord
• Stabilization
– Goal: handle concurrent joins
– Periodically, ask successor for
its predecessor
– If your successor’s
predecessor isn’t you, update
– Periodically, refresh finger
tables
• Failures
– keep list of r successors
– if successor fails, replace with
next in the list
– finger tables will be corrected
by stabilization algorithm
[Figure: ring 0–7 after node 6 joins — node 0: 1→1, 2→3, 4→6; node 1: 2→3, 3→3, 5→6; node 3: 4→6, 5→6, 7→0; node 6: 7→0, 0→0, 2→3]
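Stabilization can be sketched with a toy Node class (an assumption for illustration, not Chord's full protocol): after node 6 joins knowing only its successor, a few periodic rounds repair the successor and predecessor pointers.

```python
def between(x, a, b):                   # open ring interval (a, b) mod 8
    if a == b:
        return x != a
    return a < x < b if a < b else x > a or x < b

class Node:
    def __init__(self, ident):
        self.id, self.succ, self.pred = ident, self, None

    def stabilize(self):
        """Ask successor for its predecessor; adopt it if it sits between us."""
        x = self.succ.pred
        if x and between(x.id, self.id, self.succ.id):
            self.succ = x               # a node joined between us
        self.succ.notify(self)

    def notify(self, n):
        """Successor learns about a (possibly new) predecessor."""
        if self.pred is None or between(n.id, self.pred.id, self.id):
            self.pred = n

# Ring 0 -> 1 -> 3 -> 0; node 6 joins knowing only its successor, node 0.
n0, n1, n3, n6 = Node(0), Node(1), Node(3), Node(6)
n0.succ, n1.succ, n3.succ = n1, n3, n0
n0.pred, n1.pred, n3.pred = n3, n0, n1
n6.succ = n0
for node in (n6, n3, n6):               # a few stabilization rounds
    node.stabilize()
print(n3.succ.id, n0.pred.id)           # 6 6: node 6 is fully linked in
```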
DHTs – Tapestry/Pastry
• Global mesh
• Suffix-based routing
• Uses underlying network
distance in constructing
mesh
[Figure: global mesh with node ids 13FE, ABFE, 1290, 239E, 73FE, 9990, F990, 993E, 04FE, 43FE]
Comparing Guarantees
           Model              Search      State
Chord      Uni-dimensional    log N       log N
CAN        Multi-dimensional  d·N^(1/d)   2d
Tapestry   Global mesh        log_b N     b·log_b N
Pastry     Neighbor map       log_b N     b·log_b N + b
Remaining Problems?
• Hard to handle highly dynamic
environments
• Usable services
• Methods don’t consider peer characteristics
Measurement Studies
• “Free Riding on Gnutella”
• Most studies focus on Gnutella
• Want to determine how users behave
• Recommendations for the best way to
design systems
Free Riding Results
• Who is sharing what?
• August 2000
The top hosts       Files shared   As percent of whole
333 hosts (1%)      1,142,645      37%
1,667 hosts (5%)    2,182,087      70%
3,334 hosts (10%)   2,692,082      87%
5,000 hosts (15%)   2,928,905      94%
6,667 hosts (20%)   3,037,232      98%
8,333 hosts (25%)   3,082,572      99%
Saroiu et al Study
• How many peers are server-like…client-
like?
– Bandwidth, latency
• Connectivity
• Who is sharing what?
Saroiu et al Study
• May 2001
• Napster crawl
– query index server and keep track of results
– query about returned peers
– don’t capture users sharing unpopular content
• Gnutella crawl
– send out ping messages with large TTL
Results Overview
• Lots of heterogeneity between peers
– Systems should consider peer capabilities
• Peers lie
– Systems must be able to verify reported peer
capabilities or measure true capabilities
Measured Bandwidth
Reported Bandwidth
Measured Latency
Measured Uptime
Number of Shared Files
Connectivity
Points of Discussion
• Is it all hype?
• Should P2P be a research area?
• Do P2P applications/systems have common
research questions?
• What are the “killer apps” for P2P systems?
Conclusion
• P2P is an interesting and useful model
• There are lots of technical challenges to be
solved

More Related Content

Similar to An overview of Peer-to-Peer technology new

Infinispan, a distributed in-memory key/value data grid and cache
 Infinispan, a distributed in-memory key/value data grid and cache Infinispan, a distributed in-memory key/value data grid and cache
Infinispan, a distributed in-memory key/value data grid and cacheSebastian Andrasoni
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016Duyhai Doan
 
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix SchemesOptimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix SchemesHPCC Systems
 
Building day 2 upload Building the Internet of Things with Thingsquare and ...
Building day 2   upload Building the Internet of Things with Thingsquare and ...Building day 2   upload Building the Internet of Things with Thingsquare and ...
Building day 2 upload Building the Internet of Things with Thingsquare and ...Adam Dunkels
 
Fast Single-pass K-means Clusterting at Oxford
Fast Single-pass K-means Clusterting at Oxford Fast Single-pass K-means Clusterting at Oxford
Fast Single-pass K-means Clusterting at Oxford MapR Technologies
 
INC 2004: An Efficient Mechanism for Adaptive Resource Discovery in Grids
INC 2004: An Efficient Mechanism for Adaptive Resource Discovery in GridsINC 2004: An Efficient Mechanism for Adaptive Resource Discovery in Grids
INC 2004: An Efficient Mechanism for Adaptive Resource Discovery in GridsJames Salter
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB
 
A scalabilty and mobility resilient data search system
A  scalabilty and mobility resilient data search systemA  scalabilty and mobility resilient data search system
A scalabilty and mobility resilient data search systemAleesha Noushad
 
CRYPTOGRAPHY AND NETWORK SECURITY
CRYPTOGRAPHY AND NETWORK SECURITYCRYPTOGRAPHY AND NETWORK SECURITY
CRYPTOGRAPHY AND NETWORK SECURITYKathirvel Ayyaswamy
 
Peer-to-Peer Streaming Based on Network Coding Decreases Packet Jitter
Peer-to-Peer Streaming Based on Network Coding Decreases Packet JitterPeer-to-Peer Streaming Based on Network Coding Decreases Packet Jitter
Peer-to-Peer Streaming Based on Network Coding Decreases Packet JitterAlpen-Adria-Universität
 
2018a 1324654jhjkhkhkkjhk
2018a 1324654jhjkhkhkkjhk2018a 1324654jhjkhkhkkjhk
2018a 1324654jhjkhkhkkjhkJasser Kouki
 
Linx88 IPv6 Neighbor Discovery Russell Heilling
Linx88 IPv6 Neighbor Discovery Russell HeillingLinx88 IPv6 Neighbor Discovery Russell Heilling
Linx88 IPv6 Neighbor Discovery Russell HeillingRussell Heilling
 
Moving Toward Deep Learning Algorithms on HPCC Systems
Moving Toward Deep Learning Algorithms on HPCC SystemsMoving Toward Deep Learning Algorithms on HPCC Systems
Moving Toward Deep Learning Algorithms on HPCC SystemsHPCC Systems
 

Similar to An overview of Peer-to-Peer technology new (20)

Infinispan, a distributed in-memory key/value data grid and cache
 Infinispan, a distributed in-memory key/value data grid and cache Infinispan, a distributed in-memory key/value data grid and cache
Infinispan, a distributed in-memory key/value data grid and cache
 
datalink.ppt
datalink.pptdatalink.ppt
datalink.ppt
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
 
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix SchemesOptimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
 
Building day 2 upload Building the Internet of Things with Thingsquare and ...
Building day 2   upload Building the Internet of Things with Thingsquare and ...Building day 2   upload Building the Internet of Things with Thingsquare and ...
Building day 2 upload Building the Internet of Things with Thingsquare and ...
 
Fast Single-pass K-means Clusterting at Oxford
Fast Single-pass K-means Clusterting at Oxford Fast Single-pass K-means Clusterting at Oxford
Fast Single-pass K-means Clusterting at Oxford
 
Dns ppt
Dns pptDns ppt
Dns ppt
 
INC 2004: An Efficient Mechanism for Adaptive Resource Discovery in Grids
INC 2004: An Efficient Mechanism for Adaptive Resource Discovery in GridsINC 2004: An Efficient Mechanism for Adaptive Resource Discovery in Grids
INC 2004: An Efficient Mechanism for Adaptive Resource Discovery in Grids
 
Tutorial 1
Tutorial 1Tutorial 1
Tutorial 1
 
Meeting 4 DNS
Meeting 4   DNSMeeting 4   DNS
Meeting 4 DNS
 
MongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: ShardingMongoDB for Time Series Data: Sharding
MongoDB for Time Series Data: Sharding
 
A scalabilty and mobility resilient data search system
A  scalabilty and mobility resilient data search systemA  scalabilty and mobility resilient data search system
A scalabilty and mobility resilient data search system
 
CRYPTOGRAPHY AND NETWORK SECURITY
CRYPTOGRAPHY AND NETWORK SECURITYCRYPTOGRAPHY AND NETWORK SECURITY
CRYPTOGRAPHY AND NETWORK SECURITY
 
Peer-to-Peer Streaming Based on Network Coding Decreases Packet Jitter
Peer-to-Peer Streaming Based on Network Coding Decreases Packet JitterPeer-to-Peer Streaming Based on Network Coding Decreases Packet Jitter
Peer-to-Peer Streaming Based on Network Coding Decreases Packet Jitter
 
2018a 1324654jhjkhkhkkjhk
2018a 1324654jhjkhkhkkjhk2018a 1324654jhjkhkhkkjhk
2018a 1324654jhjkhkhkkjhk
 
Linx88 IPv6 Neighbor Discovery Russell Heilling
Linx88 IPv6 Neighbor Discovery Russell HeillingLinx88 IPv6 Neighbor Discovery Russell Heilling
Linx88 IPv6 Neighbor Discovery Russell Heilling
 
Ch03
Ch03Ch03
Ch03
 
Moving Toward Deep Learning Algorithms on HPCC Systems
Moving Toward Deep Learning Algorithms on HPCC SystemsMoving Toward Deep Learning Algorithms on HPCC Systems
Moving Toward Deep Learning Algorithms on HPCC Systems
 
Bh eu 05-kaminsky
Bh eu 05-kaminskyBh eu 05-kaminsky
Bh eu 05-kaminsky
 
Bh eu 05-kaminsky
Bh eu 05-kaminskyBh eu 05-kaminsky
Bh eu 05-kaminsky
 

Recently uploaded

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Recently uploaded (20)

Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

An overview of Peer-to-Peer technology new

  • 1. An Overview of Peer-to-Peer Sami Rollins
  • 2. Outline • P2P Overview – What is a peer? – Example applications – Benefits of P2P – Is this just distributed computing? • P2P Challenges • Distributed Hash Tables (DHTs)
  • 3. What is Peer-to-Peer (P2P)? • Napster? • Gnutella? • Most people think of P2P as music sharing
  • 4. What is a peer? • Contrasted with Client-Server model • Servers are centrally maintained and administered • Client has fewer resources than a server
  • 5. What is a peer? • A peer’s resources are similar to the resources of the other participants • P2P – peers communicating directly with other peers and sharing resources • Often administered by different entities – Compare with DNS
  • 6. P2P Application Taxonomy P2P Systems Distributed Computing SETI@home File Sharing Gnutella Collaboration Jabber Platforms JXTA
  • 10. Platforms Find Peers … Send Messages Gnutella Instant Messaging
  • 11. P2P Goals/Benefits • Cost sharing • Resource aggregation • Improved scalability/reliability • Increased autonomy • Anonymity/privacy • Dynamism • Ad-hoc communication
  • 12. P2P File Sharing • Centralized – Napster • Decentralized – Gnutella • Hierarchical – Kazaa • Incentivized – BitTorrent • Distributed Hash Tables – Chord, CAN, Tapestry, Pastry
  • 13. Challenges • Peer discovery • Group management • Search • Download • Incentives
  • 14. Metrics • Per-node state • Bandwidth usage • Search time • Fault tolerance/resiliency
  • 15. Centralized • Napster model – Server contacted during search – Peers directly exchange content • Benefits: – Efficient search – Limited bandwidth usage – No per-node state • Drawbacks: – Central point of failure – Limited scale Bob Alice Jane Judy
  • 16. Decentralized (Flooding) • Gnutella model – Search is flooded to neighbors – Neighbors are determined randomly • Benefits: – No central point of failure – Limited per-node state • Drawbacks: – Slow searches – Bandwidth intensive Bob Alice Jane Judy Carl
  • 17. Hierarchical • Kazaa/new Gnutella model – Nodes with high bandwidth/long uptime become supernodes/ultrapeers – Search requests sent to supernode – Supernode caches info about attached leaf nodes – Supernodes connect to eachother (32 in Limewire) • Benefits: – Search faster than flooding • Drawbacks: – Many of the same problems as decentralized – Reconfiguration when supernode fails SuperBob Andy Jane Judy Carl SuperFred SuperTed Alex Alice
  • 18. BitTorrent Source .torrent server seed tracker 1. Download torrent 2. Get list of peers and seeds (the swarm) 3. Exchange vector of content downloaded with peers 4. Exchange content w/ peers 5. Update w/ progress
  • 19. BitTorrent • Key Ideas – Break large files into small blocks and download blocks individually – Provide incentives for uploading content • Allow download from peers that provide best upload rate • Benefits – Incentives – Centralized search – No neighbor state (except the peers in your swarm) • Drawbacks – “Centralized” search • No central repository
  • 20. Distributed Hash Tables (DHT) • Chord, CAN, Tapestry, Pastry model – AKA Structured P2P networks – Provide performance guarantees – If content exists, it will be found • Benefits: – More efficient searching – Limited per-node state • Drawbacks: – Limited fault-tolerance vs redundancy 001 012 212 305 332 212 ? 212 ?
  • 21. DHTs: Overview • Goal: Map key to value • Decentralized with bounded number of neighbors • Provide guaranteed performance for search – If content is in network, it will be found – Number of messages required for search is bounded • Provide guaranteed performance for join/leave – Minimal number of nodes affected • Suitable for applications like file systems that require guaranteed performance
  • 22. Comparing DHTs • Neighbor state • Search performance • Join algorithm • Failure recovery
• 23. CAN • Associate to each node and each item a unique id in a d-dimensional space • Goals – Scales to hundreds of thousands of nodes – Handles rapid arrival and failure of nodes • Properties – Routing table size O(d) – Guarantees that a file is found in at most d·n^(1/d) steps, where n is the total number of nodes Slide modified from another presentation
• 24. CAN Example: Two Dimensional Space • Space divided between nodes • All nodes together cover the entire space • Each node covers either a square or a rectangle with a 1:2 or 2:1 aspect ratio • Example: – Node n1:(1, 2) is the first node to join and covers the entire space [Figure: 8×8 coordinate space owned entirely by n1]
• 25. CAN Example: Two Dimensional Space • Node n2:(4, 2) joins • n2 contacts n1 • n1 splits its area and assigns half to n2 [Figure: space split in half between n1 and n2]
• 26. CAN Example: Two Dimensional Space • Nodes n3:(3, 5), n4:(5, 5), and n5:(6, 6) join • Each new node sends a JOIN request to an existing node chosen randomly • The new node gets its neighbor table from the existing node • New and existing nodes update their neighbor tables accordingly – before n5 joins, n4 has neighbors n2 and n3 – n5 adds n4 and n2 to its neighbor list – n2 is updated to include n5 in its neighbor list • Only O(2d) nodes are affected [Figure: space divided among n1–n5]
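The zone split performed at join time can be sketched as below. This is a simplified illustration (integer coordinates, one joiner at a time, names are made up): the owner splits its zone in half along its longer side and hands one half to the joiner, which keeps zones square or 1:2/2:1.

```python
# Sketch of a CAN zone split: zone = (x0, x1, y0, y1).
def split_zone(zone):
    """Returns (half kept by the owner, half given to the joiner)."""
    x0, x1, y0, y1 = zone
    if (x1 - x0) >= (y1 - y0):        # split along the longer dimension
        mid = (x0 + x1) // 2
        return (x0, mid, y0, y1), (mid, x1, y0, y1)
    mid = (y0 + y1) // 2
    return (x0, x1, y0, mid), (x0, x1, mid, y1)

whole = (0, 8, 0, 8)                  # n1 initially owns the entire space
n1_zone, n2_zone = split_zone(whole)  # n2 joins: the space is halved
print(n1_zone, n2_zone)               # (0, 4, 0, 8) (4, 8, 0, 8)
```

Because only the split zone's owner and its immediate neighbors must update their tables, a join touches O(2d) nodes, as the slide states.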
• 27. CAN Example: Two Dimensional Space • Bootstrapping – assume the CAN has an associated DNS domain that resolves to the IP of one or more bootstrap nodes • Optimizations – landmark routing – Ping one or more landmark servers and choose an existing node based on distance to the landmarks [Figure: space divided among n1–n5]
• 28. CAN Example: Two Dimensional Space • Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6) • Items: f1:(2, 3); f2:(5, 1); f3:(2, 1); f4:(7, 5) [Figure: nodes n1–n5 and items f1–f4 plotted in the space]
• 29. CAN Example: Two Dimensional Space • Each item is stored by the node that owns the region of the space its id maps to [Figure: each item f1–f4 stored at the node owning its coordinates]
• 30. CAN: Query Example • Forward the query to the neighbor closest to the query id (Euclidean distance) • Example: assume n1 queries f4 [Figure: query routed hop by hop from n1 toward f4]
• 33. CAN: Query Example • Content is guaranteed to be found in d·n^(1/d) hops – each dimension has n^(1/d) nodes • Increasing the number of dimensions reduces path length but increases the number of neighbors [Figure: final hop of the query reaching f4]
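The greedy forwarding rule can be sketched as below. The three zones and the neighbor lists are illustrative, not the exact partition from the slides' figure: at each hop the query moves to the neighbor whose zone centre is closest (Euclidean distance) to the target point, until the owning zone is reached.

```python
import math

# Greedy CAN routing sketch over hand-made zones (x0, x1, y0, y1).
def centre(zone):
    x0, x1, y0, y1 = zone
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def contains(zone, p):
    x0, x1, y0, y1 = zone
    return x0 <= p[0] < x1 and y0 <= p[1] < y1

def route(zones, neighbors, start, target):
    path, node = [start], start
    while not contains(zones[node], target):
        node = min(neighbors[node],
                   key=lambda n: math.dist(centre(zones[n]), target))
        path.append(node)            # one hop closer to the target's zone
    return path

zones = {"A": (0, 4, 0, 8), "B": (4, 8, 0, 4), "C": (4, 8, 4, 8)}
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
print(route(zones, neighbors, "A", (7, 5)))  # ['A', 'C']
```

Each hop crosses one zone boundary toward the target, which is why the path length scales with the number of zones per dimension, n^(1/d).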
• 34. Node Failure Recovery • Detection – nodes periodically send refresh messages to their neighbors • Simple failures – each neighbor's neighbors are cached – when a node fails, one of its neighbors takes over its zone • when a node fails to receive a refresh from a neighbor, it sets a timer • many neighbors may simultaneously set their timers • when a node's timer goes off, it sends a TAKEOVER to the failed node's neighbors • when a node receives a TAKEOVER, it either (a) cancels its timer if the zone volume of the sender is smaller than its own or (b) replies with a TAKEOVER
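The TAKEOVER exchange converges on one winner; a minimal sketch (zones are made up) shows the tie-break: since timers are proportional to zone volume, the neighbor with the smallest zone fires first and claims the dead node's zone, and larger neighbors cancel when they hear its TAKEOVER.

```python
# Who wins the TAKEOVER race for a failed node's zone?
def volume(zone):
    x0, x1, y0, y1 = zone
    return (x1 - x0) * (y1 - y0)

def takeover_winner(neighbor_zones):
    """neighbor_zones: node -> zone of each neighbor of the failed node."""
    # Smallest zone volume => shortest timer => first TAKEOVER sent.
    return min(neighbor_zones, key=lambda n: volume(neighbor_zones[n]))

zones = {"n2": (4, 8, 0, 4), "n3": (0, 4, 4, 8), "n5": (6, 8, 6, 8)}
print(takeover_winner(zones))  # 'n5' -- smallest zone, so smallest timer
```

Preferring the smallest-volume neighbor keeps zones balanced: the merged zone stays as small as possible after the takeover.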
• 35. Chord • Each node has an m-bit id that is a SHA-1 hash of its IP address • Nodes are arranged in a circle modulo 2^m • Data is hashed to an id in the same id space • Node n stores data with ids in (predecessor of n, n] – 0 stores 4-0 – 1 stores 1 – 3 stores 2-3 [Figure: 3-bit identifier circle (ids 0–7) with nodes 0, 1, and 3]
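The placement rule can be sketched on the slides' 3-bit ring with nodes {0, 1, 3}: a key is hashed into the id space and stored at its successor, the first node clockwise at or after its id (wrapping around the ring).

```python
import hashlib

# Chord-style data placement sketch: m = 3, nodes {0, 1, 3}.
M = 3
NODES = sorted([0, 1, 3])

def successor(i):
    for n in NODES:
        if n >= i:
            return n
    return NODES[0]          # wrap around the ring

def node_for_key(key):
    i = int(hashlib.sha1(key.encode()).hexdigest(), 16) % 2**M
    return successor(i)

# ids 2-3 -> node 3; id 1 -> node 1; ids 4-7 and 0 wrap to node 0
print([successor(i) for i in range(8)])  # [0, 1, 3, 3, 0, 0, 0, 0]
```

This reproduces the slide's example: node 0 stores ids 4-0, node 1 stores id 1, node 3 stores ids 2-3.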
• 36. Chord • Simple query algorithm: – Each node maintains only its successor – To find data with id i, forward the query from successor to successor until the first node with id ≥ i is found • Running time? – O(N): in the worst case the query traverses every node [Figure: identifier circle with nodes 0, 1, 3]
• 37. Chord • In reality, nodes maintain a finger table with more routing information – For a node n, the ith entry in its finger table is the first node that succeeds n by at least 2^(i-1) • Size of finger table? [Figure: finger tables for nodes 0, 1, and 3]
• 38. Chord • In reality, nodes maintain a finger table with more routing information – For a node n, the ith entry in its finger table is the first node that succeeds n by at least 2^(i-1) • Size of finger table? • O(log N) [Figure: finger tables for nodes 0, 1, and 3]
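The finger tables for the example ring can be computed directly from the rule above: the ith finger of n is successor((n + 2^(i-1)) mod 2^m). A short sketch for the slides' ring (m = 3, nodes {0, 1, 3}):

```python
# Build Chord finger tables for the example ring (m = 3, nodes {0, 1, 3}).
M = 3
NODES = sorted([0, 1, 3])

def successor(i):
    for n in NODES:
        if n >= i:
            return n
    return NODES[0]          # wrap around the ring

def finger_table(n):
    # i-th finger = first node that succeeds n by at least 2^(i-1)
    return [successor((n + 2**(i - 1)) % 2**M) for i in range(1, M + 1)]

print(finger_table(0))  # [1, 3, 0]
print(finger_table(1))  # [3, 3, 0]
print(finger_table(3))  # [0, 0, 0]
```

With m entries per node and N ≤ 2^m nodes, the table holds O(log N) distinct fingers, matching the answer on the slide.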
• 39. Chord query:
  hash the key to get its id
  if id == this node's id: data found
  else if id is covered by an entry in the finger table: data found
  else:
    p = find_predecessor(id)
    n = find_successor(p)  // n stores the data

  find_predecessor(id):
    choose n in the finger table closest to id
    if n < id <= find_successor(n): return n
    else: ask n for its finger entry closest to id and recurse
[Figure: finger tables for nodes 0, 1, and 3]
• 40. Chord • Running time of the query algorithm? – The problem size is halved at each iteration [Figure: finger tables for nodes 0, 1, and 3]
• 41. Chord • Running time of the query algorithm? – O(log N) [Figure: finger tables for nodes 0, 1, and 3]
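A runnable sketch of the lookup over the example ring (nodes {0, 1, 3}, m = 3; the finger tables are hard-coded from the figure): from the current node, jump via the closest finger that precedes the key, so each hop roughly halves the remaining ring distance, giving O(log N) hops.

```python
# Chord lookup sketch over hard-coded finger tables (simplified: no RPC,
# and the starting node itself is resolved via its successor).
FINGERS = {0: [1, 3, 0], 1: [3, 3, 0], 3: [0, 0, 0]}
SUCC = {0: 1, 1: 3, 3: 0}

def in_open(x, a, b):        # x in (a, b) going clockwise round the ring
    return (a < x < b) if a < b else (x > a or x < b)

def in_half_open(x, a, b):   # x in (a, b] going clockwise round the ring
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start, key_id):
    n, hops = start, 0
    while not in_half_open(key_id, n, SUCC[n]):
        # closest preceding finger: highest finger strictly inside (n, key_id)
        nxt = next((f for f in reversed(FINGERS[n])
                    if in_open(f, n, key_id)), None)
        if nxt is None:
            break            # no closer finger; our successor owns the key
        n, hops = nxt, hops + 1
    return SUCC[n], hops     # (owner of key_id, hops taken)

print(lookup(0, 6))          # (0, 1): one finger jump to 3, whose succ is 0
```

Each jump lands at least halfway to the key, which is where the "problem size halves each iteration" argument, and hence the O(log N) bound, comes from.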
• 42. Chord • Join – initialize the new node's predecessor and fingers – update fingers and predecessors of existing nodes – transfer data [Figure: node 6 joins the ring; affected finger entries of nodes 0, 1, 3 are updated to point at 6]
• 43. Chord • Initialize the predecessor and fingers of the new node n* – n* contacts an existing node n in the network – n looks up the predecessor of n* – for each finger-table entry, look up the successor • Running time – O(m log N) • Optimization – initialize n* with the finger table of its successor – with high probability, this reduces the running time to O(log N) [Figure: node 6 joining the ring]
• 44. Chord • Update existing nodes – n* becomes the ith finger of a node p if • p precedes n* by at least 2^(i-1) • the current ith finger of p succeeds n* – start at the predecessor of n* and walk backwards – for i = 1 to m: • find the predecessor of n* − 2^(i-1) • update its table and recurse • Running time O(log² N) [Figure: node 6 joining the ring]
• 45. Chord • Stabilization – Goal: handle concurrent joins – Periodically, ask your successor for its predecessor – If your successor's predecessor isn't you, update – Periodically, refresh finger tables • Failures – keep a list of r successors – if the successor fails, replace it with the next node in the list – finger tables will be corrected by the stabilization algorithm [Figure: identifier circle with nodes 0, 1, 3, and 6]
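The stabilization protocol can be sketched with in-memory objects standing in for RPC (a toy, not a full Chord node): each node periodically asks its successor for that node's predecessor and adopts it if a newcomer slipped in between, while notify() lets successors learn their new predecessors.

```python
# Toy sketch of Chord stabilize()/notify() repairing the ring after a join.
def between(x, a, b):               # x strictly between a and b on the ring
    return (a < x < b) if a < b else (x > a or x < b)

class Node:
    def __init__(self, nid):
        self.id, self.succ, self.pred = nid, self, None

    def stabilize(self):
        x = self.succ.pred
        if x is not None and between(x.id, self.id, self.succ.id):
            self.succ = x           # someone joined between us and successor
        self.succ.notify(self)

    def notify(self, n):            # n thinks it might be our predecessor
        if self.pred is None or between(n.id, self.pred.id, self.id):
            self.pred = n

n0, n3 = Node(0), Node(3)
n0.succ, n3.succ, n0.pred, n3.pred = n3, n0, n3, n0   # ring 0 -> 3 -> 0
n1 = Node(1); n1.succ = n3          # n1 joins knowing only its successor
n1.stabilize()                      # n3 learns n1 is its predecessor
n0.stabilize()                      # n0 learns its new successor is n1
print(n0.succ.id, n1.pred.id, n3.pred.id)  # 1 0 1
```

No node needed global knowledge: two rounds of the periodic protocol were enough to splice n1 into the ring, which is why concurrent joins eventually converge.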
• 46. DHTs – Tapestry/Pastry • Global mesh • Suffix-based routing • Uses underlying network distance in constructing the mesh [Figure: routing mesh of nodes with hex ids such as 13FE, ABFE, 239E, 9990]
• 47. Comparing Guarantees

  System    Model              Search        State
  Chord     Uni-dimensional    log N         log N
  CAN       Multi-dimensional  d·N^(1/d)     2d
  Tapestry  Global mesh        log_b N       b·log_b N
  Pastry    Neighbor map       log_b N       b·log_b N + b
  • 48. Remaining Problems? • Hard to handle highly dynamic environments • Usable services • Methods don’t consider peer characteristics
• 49. Measurement Studies • “Free Riding on Gnutella” • Most studies focus on Gnutella • Goal: determine how users behave • Make recommendations for the best way to design systems
• 50. Free Riding Results • Who is sharing what? • August 2000

  The top            Share      As percent of whole
  333 hosts (1%)     1,142,645  37%
  1,667 hosts (5%)   2,182,087  70%
  3,334 hosts (10%)  2,692,082  87%
  5,000 hosts (15%)  2,928,905  94%
  6,667 hosts (20%)  3,037,232  98%
  8,333 hosts (25%)  3,082,572  99%
• 51. Saroiu et al Study • How many peers are server-like… client-like? – Bandwidth, latency • Connectivity • Who is sharing what?
  • 52. Saroiu et al Study • May 2001 • Napster crawl – query index server and keep track of results – query about returned peers – don’t capture users sharing unpopular content • Gnutella crawl – send out ping messages with large TTL
  • 53. Results Overview • Lots of heterogeneity between peers – Systems should consider peer capabilities • Peers lie – Systems must be able to verify reported peer capabilities or measure true capabilities
  • 60. Points of Discussion • Is it all hype? • Should P2P be a research area? • Do P2P applications/systems have common research questions? • What are the “killer apps” for P2P systems?
  • 61. Conclusion • P2P is an interesting and useful model • There are lots of technical challenges to be solved