SlideShare a Scribd company logo
1 of 52
Peer-to-Peer Networking
CS4482 High Performance Networking
Dilum Bandara
Dilum.Bandara@uom.lk
Slides by Dilum Bandara and Anura Jayasumana
Outline
 P2P systems
 Unstructured vs. structured overlays
 P2P streaming
2
Peer-to-Peer (P2P) Systems
 Distributed systems without any central control
 Autonomous peers
 Equivalent in functionality/privileges; Both a client & a server
 Protocol features
 Protocol constructed at application layer
 Overlaid on top of Internet
 Typically a peer has a unique identifier
 Supports some type of message routing capability
 Fairness & Performance
 Self-scaling
 Free-rider problem
 Peer churn
3
Internet
P2P Applications
 Many application
domains
 File sharing – BitTorrent,
KaZaA, Napster,
BearShare
 IPTV – PPLive,
CoolStreaming, SopCast
 VoIP – Skype
 CPU cycle sharing – SETI,
World Community Grid
 Distributed data fusion –
CASA
 Impact of P2P traffic
 In 2008 – 50% 2009- 39% of
total Internet traffic (2014-
17%)
 Today – Volume still growing
 3.5 Exabytes/month (4 in
2014)
 Globally, P2P TV is now over
280 petabytes per month
 P2P traffic 20 percent of all
mobile data traffic globally
4
[Hyperconnectivity and the
Approaching Zettabyte Era,
Cisco 2010]
P2P Characteristics
 Tremendous scalability
 Millions of peers
 Globally distributed
 Bandwidth intensive
 Upload/download
 Many concurrent connections
 Aggressive/unfair bandwidth utilization
 Aggregated downloads to overcome asymmetric upload
bandwidth
 Heterogeneous
 Superpeers
 Critical for performance/functionality 5
Internet
Terminology
 Application
 Tier 2 – Services provided to end
users
 Tier 1 – Middleware services
 Overlay
 How peers are connected
 Application layer network
consists of peers
 e.g., dial-up on top of telephone
network, BGP, PlanetLab, CDNs
 Underlay
 Internet, Bluetooth
 Peers implement top 3 layers
6
Application – Tier 2
File sharing, streaming, VoIP, P2P clouds
Application – Tier 1
Indexing/DHT, Caching, replication, access
control, reputation, trust
Overlay
Unstructured, structured, & hybrid
Gnutella, Chord, Kademlia, CAN
Underlay
Internet, Bluetooth
Overlay Connectivity
7
P2P Overlay
Unstructured
Deterministic
Napster
BitTorrent
JXTA
Nondeterministic
Gnutella
KaZaA
Structured
Sub-linear state
Chord
Kademlia
CAN
Pastry
Tapestry
Constant state
Viceroy
Cycloid
Hybrid
Structella
Kelip
Local minima
search
Bootstrapping
 How is an initial P2P overlay is formed from a
set of nodes?
 Use some known information
 Use a well-known server to register initial set of peers
 Some peer addresses are well known
 A well known domain name
 Use a local broadcast to collect nearby peers, &
merge such sets to larger sets
8
Bootstrapping (Cont.)
 Each peer maintains a random subset of peers
 e.g., peers in Skype maintain a cache of superpeers
 An incoming peer talks to 1+ known peers
 A known peer accepting an incoming peer
 Keeps track of incoming peer
 May redirect incoming peer to another peer
 Give a random set of peers to contact
 Discover more peers by random walk, gossiping,
or deterministic walk within overlay
9
Resource Discovery Overview
10
Centralized
O(1)
Fast lookup
Single point of failure
Unstructured
O(hopsmax)
Easy network maintenance
Not guaranteed to find resources
Distribute Hash Table (DHT)
O(log N)
Guaranteed performance
Not for dynamic systems
Superpeer
O(hopsmax)
Better scalability
Not guaranteed to find resources
Centralized – Napster
 Centralized database for
lookup
 Guaranteed content
discovery
 Low overhead
 Single point of failure
 Easy to track
 Legal issues
 File transfer directly
between peers
 Killer P2P application
 June 1999 – July 2001
 26.4 million users (peak)
11
Unstructured – Gnutella
 Fully distributed
 Random connections
 Initial entry point is
known
 Peers maintain dynamic
list of neighbors
 Connections to multiple
peers
 Highly resilient to node
failures
12
Unstructured P2P (cont.)
 Flooding-based lookup
 Guaranteed content discovery
 Implosion  High overhead
 Expanding ring flooding
 TTL-based random walk
 Content discovery not
guaranteed
 Better performance by biasing
random walk toward nodes with
higher degree
 If response follow same path
 Anonymity
 Used in KaZaA, BearShare,
LimeWire, McAfee 13
D
S
D
s
Flooding
Random walk
Superpeers
 Resource rich peers 
Superpeers
 Bandwidth, reliability, trust,
memory, CPU, etc.
 Flooding or random walk
 Only superpeers are
involved
 Lower overhead
 More scalable
 Content discovery not
guaranteed
 Better performance when
superpeers share list of
file names
 Examples: Gnutella V0.6,
FastTrack, Freenet KaZaA,
Skype
14
s D
BitTorrent
 Most popular P2P file sharing
system to date
 Features
 Centralized search
 Multiple downloads
 Enforce fairness
 Rarest-first dissemination
 Incentives
 Better contribution  Better
download speeds (not always)
 Enable content delivery
networks
 Revenue through ads on search
engines 15
User
Trackers
Web-based
search engine
Content
owner
Keyword search
.torrent file
server
Download
.torrent file
Get list of
peers
Download/
upload
chunks
BitTorrent Protocol
 Content owner creates a
.torrent file
 File name, length, hash,
list of trackers
 Place .torrent file on a
server
 Publish URL of .torrent
file to a web site
 Torrent search engine
 .torrent file points to a
tracker(s)
 Registry of leaches &
seeds for a given file
16
User
Trackers
Web-based
search engine
Content
owner
Keyword search
.torrent file
server
Download
.torrent file
Get list of
peers
Download/
upload
chunks
1
2
3
4
1
2
3
4
BitTorrent Protocol (cont.)
 Tracker
 Provide a random subset
of peers sharing same file
 Peer contacts subset of
peers parallely
 Files are shared based
on chunk IDs
 Chunk – segment of file
 Periodically ask tracker
for new set of IPs
 Every 15 min
 Pick peers with highest
upload rate 17
User
Trackers
Web-based
search engine
Content
owner
Keyword search
.torrent file
server
Download
.torrent file
Get list of
peers
Download/
upload
chunks
1
2
3
4
1
2
3
4
BitTorrent Terminology
 Swarm
 Set of peers accessing
(upload/download) same
file
 Seeds
 Peers with entire file
 Leeches
 Peers with part of file or
no file (want to download)
18
www.kat.ph
BitTorrent Site Stat
19
User ranking
of file quality
Seedpeer.com
www.kat.ph/stats/
Files in search
engine
User verified to
be valid
Across all files
Search Cloud
BitTorrent Content Popularity
 Few highly popular content
 Moderately popular
content follow Zipf’s-like
distribution
 Typical Zipf’s parameter
0.5-1.0
20
Number of occurances/queries (x) (log)
1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8
No
of
search
terms
with
>
x
occurances
(log)
-2.5
-2.0
-1.5
-1.0
-0.5
0.0
0.5
fenopy.org
y = 1.96 - 1.024x, r2 = 0.98
Number of occurances/queries (x) (og)
2.4 2.6 2.8 3.0 3.2 3.4 3.6
No
of
search
terms
with
>
x
occurances
(log)
-2.5
-2.0
-1.5
-1.0
-0.5
0.0
0.5
youbitTorrent.com
y = 4.71 - 1.88x, r2 = 0.98
Number of occurances/queries (x) (log)
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5
No
of
search
terms
with
>
x
occurances
(log)
-7
-6
-5
-4
-3
-2
-1
0
Dataset1 - kickasstorrents.com
y = 0.96 - 1.51x, r2 = 0.98
Toy Story 3
DVD release date
June 18, 2010
BitTorrent Characteristics
 Flash crowd effect
 Asymmetric bandwidth
 Most peers leave after
downloading
 Diurnal & seasonal
patterns 21
Flash crowd
Download
speed
Session length
[Zhang, 2009]
BitTorrent Evolution
22
BitTorrent
Global community
Islands of communities
Connected islands of
communities
BitTorrent ver. 4.2
BitTorrent Fairness/Incentives
 Tit-for-tat
 Bandwidth policy
 Upload to 4 peers that give me the highest download bandwidth
 1 random peer
 Chunk policy
 Rarest first
 Download least popular chunk
 Initial seed try not to send same chunk twice
 Most peers leave immediately after downloading
 Modified nodes increase free riding
 Modified policies
23
Summary – Unstructured P2P
 Separate content discovery & delivery
 Content discovery is mostly outside of P2P overlay
 Centralized solutions
 Not scalable
 Affect content delivery when failed
 Distributed solutions
 High overhead
 May not locate the content
 No predictable performance
 Delay or message bounds
 Lack of QoS or QoE
24
Structured P2P
 Deterministic approach to locate contents & peers
 Locate peer(s) responsible for a given key
 Contents
 Unique key
 Hash of file name, metadata, or actual content
 160-bit or higher
 Peers also have a key
 Random bit string or IP address
 Index keys on a Distributed Hash Table (DHT)
 Distributed address space [0, 2m – 1]
 Deterministic overlay to publish & locate content
 Bounded performance under standard conditions
25
Terminology
 Hash function
 Converts a large amount of data into
a small datum
 Hash table
 Data structure that uses hashing to
index content
 Distributed Hash Table (DHT)
 A hash table that is distributed
 Types of hashing
 Consistent or random
 Locality preserving
26
f()
f()
f() g()
g()
g()
Structured P2P – Example
 2 operations
 store(key, value)
 locate(key)
27
Ring – 16 addresses
Song.mp3
Cars.mpeg
f()
f()
Find cars.mpeg
n + 2i – 1, 1  i  m
Successor
11 Song.mp3
6 Cars.mpeg
O(log N) hops
Chord [Stoica, 2001]
 Key space arranged as a ring
 Peers responsible for segment of
the ring
 Called successor of a key
 1st peer in clockwise direction
 Routing table
 Keep a pointer (finger) to m peers
 Keep a finger to (2i – 1)-th peer, 1 ≤ i ≤ m
 Key resolution
 Go to peer with the closest key
 Recursively continue until key is find
 Can be located within O(log n) hops
28
m =3-bit key ring
Chord (cont.)
 New peer entering overlay
 Takes keys from the successor
 Peer leaving overlay
 Give keys to the successor
 Fingers are updated as peers join & leave
 Peer failure or churn makes finger table entries stale 29
New peer with key 6 joins the overlay Peer with key 1 leave the overlay
Chord Performance
 Path length
 Worst case O(log N)
 Average ½log2N
 Updates O(log2 N)
 Fingers O(log N)
 Alternative paths (log N)!
 Balanced distribution of
keys
 Under uniform distribution
 N(log N) virtual nodes
provides best load
distribution
30
Kademlia [Maymounkov, 2002]
 Used in BitTorrent, eMule, aMule, & AZUREUS
 160-bit keys
 Nodes are assigned random keys
 Distance between 2 keys is determined by XOR
 Routing in the ring is bidirectional
 dist(a  b) = dist(b  a)
 Enable nodes to learn about new nodes from received messages
 Keys are stored in nodes with the shortest XOR distance
31
Structured P2P – Alternate Designs
32
d-Torus
Content-Addressable Network (CAN)
[Ratnasamy, 2001]
(0, 0)
(1, 0)
(0, 0)
(0, 1)
Zone
controller
(0.1,0.9)
(0.3,0.4)
(0.4,0.8)
(0.75,0.2)
(0.35,0.1)
(0.65,0.7)
(0.8,0.4)
(0.8,0.8)
(0-0.5, 0-0.5)
(0.5-1, 0-0.5)
(0-0.5, 0.5-1)
(0.5-1, 0.5-0.75)
Cube connected cycle
Cycloid [Shen, 2006]
Summary – Structured P2P
 Content discovery is within the P2P overlay
 Deterministic performance
 Chord
 Unidirectional routing
 Recursive routing
 Peer churn & failure is an issue
 Kademlia
 Bidirectional routing
 Parallel iterative routing
 Work better under peer failure & churn
 MySong.mp3 is not same as mysong.mp3
 Unbalanced distribution of keys & load
33
Summary (cont.)
34
Scheme Architecture Routing
mechanism
Lookup
overhead*
Routing
table size*
Join/leav
e cost
Resilience
Chord Circular key
space
Successor &
long distant links
O(log N) O(log N) O(log2 N) High
CAN d-torus Greedy routing
through
neighbors
O(dN1/d) 2d 2d Moderate
Pastry Hypercube Correct one digit
in key at time
O(logB N) O(B logB N) O(logB N) Moderate
Tapestry Hypercube Correct one digit
in key at time
O(logB N) O(logB N) O(logB N) Moderate
Viceroy Butterfly network Predecessor &
successor links
O(log N) O(1) O(log N) Low
Kademlia Binary tree, XOR
distance metric
Iteratively find
nodes close to
key
O(log N) O(log N) O(log N) High
Cycloid Cube connected
cycles
Links to cyclic &
cubical
neighbors
O(d) O(1) O(d) Moderate
* N – number of nodes in overlay, d – number of dimensions B – base of a key identifier
Structured vs. Unstructured
35
Unstructured P2P Structured P2P
Overlay
construction
High flexibility Low flexibility
Resources Indexed locally Indexed remotely on a distributed
hash table
Query messages Broadcast or random walk Unicast
Content location Best effort Guaranteed
Performance Unpredictable Predictable bounds
Overhead High Relatively low
Object types Mutable, with many complex
attributes
Immutable, with few simple
attributes
Peer churn &
failure
Supports high failure rates Supports moderate failure rates
Applicable
environments
Small-scale or highly dynamic, e.g.,
mobile P2P
Large-scale & relatively stable,
e.g., desktop file sharing
Examples Gnutella, LimeWire, KaZaA,
BitTorrent
Chord, CAN, Pastry, eMule,
BitTorrent
Login Server
Superpeer
overlay
Skype
 Proprietary
 Encrypted control & data messages
 Many platforms
 Voice/video calls, instant
messaging, file transfer,
video/audio conferencing
 Superpeer overlay
 Related to KaZaA
 Based on bandwidth, not behind
firewall/NAT, & processing power
 Enables NAT & firewall traversal for
ordinary peers
36
Skype (Cont.)
 30% superpeers
 Relatively stable
 Diurnal behavior
 Longer session length than typical
P2P - Heavy tailed 37
[Guha, 2006]
P2P Streaming
 Emergence of IPTV
 Content Delivery Networks (CDNs) can’t handle bandwidth
requirements
 No multicast support at network layer
 P2P
 Easy to implement
 No global topology maintenance
 Tremendous scalability
 Greater demand  Better service
 Cost effective
 Robustness
 No single point of failure
 Adaptive
 Application layer 38
P2P Streaming – Components
 Chunk
 Segment of the video stream
 e.g., one second worth of video
 Partners
 Subset of known peers that a peer may
actually talk to 39
(Hie, 2008)
(Zhang, 2005)
(Liu, 2008)
Tree-Push Approach
 Construct overlay tree starting from video source
 Parent peer selection is based on
 Bandwidth, latency, number of peers, etc.
 Data/chunks are pushed down the tree
 Multi-tree-based approach
 Better content distribution
 Enhanced reliability
40
Tree-Push Approach – Issues
 Connectivity is affected when peers at the top of the tree
leave/fail
 Time to reconstruct the tree
 Unbalanced tree
 Majority of peers are leaves
 Unable to utilize their bandwidth
 High delay
41
(Zhang, 2005)
Mesh-Pull Approach
 A peer connects to multiple peers
forming a mesh
 Pros
 More robust to failure
 Better bandwidth utilization
 Cons
 No specific chunk forwarding path
 Need to pull chunks from partners/peers
 Need to know which partner has what
 Used in most commercial products
42
(Zhang, 2005)
Chunk Sharing
 Each peer
 Caches a set of chunks within a sliding window
 Shares its chunk information with its partners
 Buffer maps are used to inform chunk availability
 Chunks may be in one or more partners
 What chunks to get from whom?
43
(Hie, 2008)
Chunk Scheduling
 Some chunks are highly available while others are scare
 Some chinks needs to be played soon
 New chunks need to be pulled from video source
 Chunk scheduling consider how a peer can get chunks while
 Minimizing latency
 Preventing skipping
 Maximizing throughput
 Chunk scheduling
 Random, rarest first, earliest deadline first, earliest deadline & rarest
first
 Determines user QoE
 Most commercial products use TCP for chunk transmission
 Control message overhead ~1-2% 44
Random Scheduling
 One of the earliest approach – used in Chainsaw
 Peers periodically share buffer maps
 Select a random chunk & request it from one of the partners
having the chunk
 Some peers may experience significant playback delay
 1-2 minutes
 Skipping is possible
45
[Pai, 2005]
Rarest-First Scheduling
 Used in CoolStraming
 Chunk = 1 sec video, 120 chunk in sliding window
 A peer gets the rarest chunk so that chunk can be spread to
its partners
 Steps
1. Gather buffer maps periodically
2. Calculate number of suppliers (i.e., partners with chunk) for each
chunk
3. Request chunks with the lowest number of suppliers
4. For chunks with multiple suppliers, request from the supplier with
highest bandwidth & free time
 Gather application-level bandwidth data for each partner
 Request are made through a bitmap
46
Rarest First (cont.)
 It is sufficient to maintain 4 partners
 Discover more peers overtime – use gossiping
 Keep only the partners that have sufficient bandwidth &
more chunks 47
(Zhang, 2005)
Rarest First (Cont.)
 More robust than tree-push
approach
 Larger user community  Better
service quality
 Most users experience < 1 min
delay 48
(Zhang, 2005)
Earliest-Deadline First
 Objectives – Minimum playback delay & continuity
 Rule 1
 Chunk with the lowest sequence number has the highest priority
 So request chunk with the lowest sequence number
 Try to meet earliest deadline
 Rule 2
 Peer with the lowest, largest sequence number in buffer map has the
highest priority
 Falling behind, so let it speed up
49
[Chen, 2008]
Earliest-Deadline First (Cont.)
 DPC – Distributed Priority based Chunk scheduling
 L – Number of partners
 Lower playback delay
 Lower skipping
50
[Chen, 2008]
Bibliography
1. R. Bland, D. Caulfield, E. Clarke, A. Hanley, and E. Kelleher, “P2P routing,” Available:
http://ntrg.cs.tcd.ie/undergrad/4ba2.02-03/p9.html
2. S. Guha, N. Daswani, and R. Jain, An Experimental Study of the Skype Peer-to-Peer VoIP
System, 5th Int. Workshop on Peer-to-Peer Systems (IPTPS ‘06), Feb. 2006.
3. Y. Guo, C. Liang, and Y. Liu, Adaptive queue-based chunk scheduling for P2P live
streaming, IFIP Networking, May 2008.
4. X. Hei, Y. Liu, and K. W. Ross, IPTV over P2P streaming networks: the mesh-pull
approach, IEEE Communications Magazine, vol. 46, no. 2, Feb. 2008, pp. 86-92.
5. R Morselli, B. Bhattacharjee, A. Srinivasan, and M. A. Marsh, Efficient lookup on
unstructured topologies, 24th ACM Symposium on Principles of Distributed Computing
(PODC '05), July 2005.
6. Z. Chen, K. Xue, and P. Hong, A study on reducing chunk scheduling delay for mesh-
based P2P live streaming, 7th Int. Conf. on Grid and Cooperative Computing, 2008, pp.
356-361.
7. Cisco Systems, Inc., Cisco Visual Networking Index: Forecast and Methodology, 2008–
2013, June 2009.
8. Cisco Systems, Inc., Approaching the Zettabyte Era, June 2008.
9. J. Liu, S. G. Rao, B. Li, and H. Zhang, “Opportunities and challenges of peer-to-peer
internet video broadcast,” In Proc. of IEEE, vol. 96, no. 1, Jan. 2008, pp. 11-24.
51
Bibliography
10. P. Maymounkov and D. Mazières, Kademlia: A peer-to-peer information system based on
the XOR metric, 1st Int. Workshop on Peer-to-peer Systems (IPTPS ‘02), Feb. 2002, pp.
53-65.
11. S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, A scalable content-
addressable network, ACM Special Interest Group on Data Communication (SIGCOMM
‘01), Aug. 2001.
12. I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, Chord: a scalable
peer-to-peer lookup service for internet applications, ACM SIGCOMM ‘01, 2001, pp. 149-
160.
13. D. Stutzbach, R. Rejaie, and S. Sen, Characterizing unstructured overlay topologies in
modern P2P file-sharing systems, IEEE/ACM Transactions on Networking, vol. 16, no. 2,
April 2008.
14. Y. Tang, L. Sun, K. Zhang, S. Yang, and Y. Zhong, Longer, better: On extending user
online duration to improve quality of streaming service in P2P networks, IEEE Int. Conf. on
Multimedia and Expo, July 2007.
15. M. Zhang, Y. Xiong, Q. Zhang, and S. Yang, On the optimal scheduling for media
streaming in data-driven overlay networks, GLOBECOM ‘06, Nov.-Dec. 2006.
16. X. Zhang, J. Liu, B. Li, and T. P. Yum, CoolStreaming/DONet: a data-driven overlay
network for efficient live media streaming, INFOCOM 2005, Mar. 2005.
52

More Related Content

Similar to Peer-to-Peer Networking Systems and Streaming

Lecture: Software Agents and P2P
Lecture: Software Agents and P2PLecture: Software Agents and P2P
Lecture: Software Agents and P2PJames Salter
 
BitTorrent Protocol
BitTorrent ProtocolBitTorrent Protocol
BitTorrent ProtocolSridharBR
 
(130316) #fitalk bit torrent protocol
(130316) #fitalk   bit torrent protocol(130316) #fitalk   bit torrent protocol
(130316) #fitalk bit torrent protocolINSIGHT FORENSIC
 
Big data & hadoop framework
Big data & hadoop frameworkBig data & hadoop framework
Big data & hadoop frameworkTu Pham
 
Bittorrent_project_Srikanth_Vanama
Bittorrent_project_Srikanth_VanamaBittorrent_project_Srikanth_Vanama
Bittorrent_project_Srikanth_VanamaSrikanth Vanama
 
Research Data Management Implementations
Research Data Management Implementations Research Data Management Implementations
Research Data Management Implementations Globus
 
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...OpenDNS
 
Splunk Stream - Einblicke in Netzwerk Traffic
Splunk Stream - Einblicke in Netzwerk TrafficSplunk Stream - Einblicke in Netzwerk Traffic
Splunk Stream - Einblicke in Netzwerk TrafficSplunk
 
Bit Torrent Protocol
Bit Torrent ProtocolBit Torrent Protocol
Bit Torrent ProtocolAli Habeeb
 
Computer network (10)
Computer network (10)Computer network (10)
Computer network (10)NYversity
 
Networking presentation 9 march 2009
Networking presentation   9 march 2009Networking presentation   9 march 2009
Networking presentation 9 march 2009Kinshook Chaturvedi
 
Peer-to-peer Internet telephony
Peer-to-peer Internet telephonyPeer-to-peer Internet telephony
Peer-to-peer Internet telephonyKundan Singh
 
Bit torrent protocol
Bit torrent protocolBit torrent protocol
Bit torrent protocolD bipul lomga
 
The Internet and World Wide Web
The Internet and World Wide WebThe Internet and World Wide Web
The Internet and World Wide Webwebhostingguy
 
Realtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLibRealtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLibRyan Bosshart
 

Similar to Peer-to-Peer Networking Systems and Streaming (20)

Lecture: Software Agents and P2P
Lecture: Software Agents and P2PLecture: Software Agents and P2P
Lecture: Software Agents and P2P
 
BitTorrent Protocol
BitTorrent ProtocolBitTorrent Protocol
BitTorrent Protocol
 
(130316) #fitalk bit torrent protocol
(130316) #fitalk   bit torrent protocol(130316) #fitalk   bit torrent protocol
(130316) #fitalk bit torrent protocol
 
Ods chapter7
Ods chapter7Ods chapter7
Ods chapter7
 
Big data & hadoop framework
Big data & hadoop frameworkBig data & hadoop framework
Big data & hadoop framework
 
Bittorrent_project_Srikanth_Vanama
Bittorrent_project_Srikanth_VanamaBittorrent_project_Srikanth_Vanama
Bittorrent_project_Srikanth_Vanama
 
Research Data Management Implementations
Research Data Management Implementations Research Data Management Implementations
Research Data Management Implementations
 
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...
 
Splunk Stream - Einblicke in Netzwerk Traffic
Splunk Stream - Einblicke in Netzwerk TrafficSplunk Stream - Einblicke in Netzwerk Traffic
Splunk Stream - Einblicke in Netzwerk Traffic
 
Bit Torrent Protocol
Bit Torrent ProtocolBit Torrent Protocol
Bit Torrent Protocol
 
Computer network (10)
Computer network (10)Computer network (10)
Computer network (10)
 
Networking presentation 9 march 2009
Networking presentation   9 march 2009Networking presentation   9 march 2009
Networking presentation 9 march 2009
 
Peer-to-peer Internet telephony
Peer-to-peer Internet telephonyPeer-to-peer Internet telephony
Peer-to-peer Internet telephony
 
Tor
TorTor
Tor
 
Bit torrent protocol
Bit torrent protocolBit torrent protocol
Bit torrent protocol
 
Penetration Testing Boot CAMP
Penetration Testing Boot CAMPPenetration Testing Boot CAMP
Penetration Testing Boot CAMP
 
Bittorrent in a P2P social network
Bittorrent in a P2P social networkBittorrent in a P2P social network
Bittorrent in a P2P social network
 
The Internet and World Wide Web
The Internet and World Wide WebThe Internet and World Wide Web
The Internet and World Wide Web
 
6 networking
6 networking6 networking
6 networking
 
Realtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLibRealtime Detection of DDOS attacks using Apache Spark and MLLib
Realtime Detection of DDOS attacks using Apache Spark and MLLib
 

More from Dilum Bandara

Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningDilum Bandara
 
Time Series Analysis and Forecasting in Practice
Time Series Analysis and Forecasting in PracticeTime Series Analysis and Forecasting in Practice
Time Series Analysis and Forecasting in PracticeDilum Bandara
 
Introduction to Dimension Reduction with PCA
Introduction to Dimension Reduction with PCAIntroduction to Dimension Reduction with PCA
Introduction to Dimension Reduction with PCADilum Bandara
 
Introduction to Descriptive & Predictive Analytics
Introduction to Descriptive & Predictive AnalyticsIntroduction to Descriptive & Predictive Analytics
Introduction to Descriptive & Predictive AnalyticsDilum Bandara
 
Introduction to Concurrent Data Structures
Introduction to Concurrent Data StructuresIntroduction to Concurrent Data Structures
Introduction to Concurrent Data StructuresDilum Bandara
 
Hard to Paralelize Problems: Matrix-Vector and Matrix-Matrix
Hard to Paralelize Problems: Matrix-Vector and Matrix-MatrixHard to Paralelize Problems: Matrix-Vector and Matrix-Matrix
Hard to Paralelize Problems: Matrix-Vector and Matrix-MatrixDilum Bandara
 
Introduction to Map-Reduce Programming with Hadoop
Introduction to Map-Reduce Programming with HadoopIntroduction to Map-Reduce Programming with Hadoop
Introduction to Map-Reduce Programming with HadoopDilum Bandara
 
Embarrassingly/Delightfully Parallel Problems
Embarrassingly/Delightfully Parallel ProblemsEmbarrassingly/Delightfully Parallel Problems
Embarrassingly/Delightfully Parallel ProblemsDilum Bandara
 
Introduction to Warehouse-Scale Computers
Introduction to Warehouse-Scale ComputersIntroduction to Warehouse-Scale Computers
Introduction to Warehouse-Scale ComputersDilum Bandara
 
Introduction to Thread Level Parallelism
Introduction to Thread Level ParallelismIntroduction to Thread Level Parallelism
Introduction to Thread Level ParallelismDilum Bandara
 
CPU Memory Hierarchy and Caching Techniques
CPU Memory Hierarchy and Caching TechniquesCPU Memory Hierarchy and Caching Techniques
CPU Memory Hierarchy and Caching TechniquesDilum Bandara
 
Data-Level Parallelism in Microprocessors
Data-Level Parallelism in MicroprocessorsData-Level Parallelism in Microprocessors
Data-Level Parallelism in MicroprocessorsDilum Bandara
 
Instruction Level Parallelism – Hardware Techniques
Instruction Level Parallelism – Hardware TechniquesInstruction Level Parallelism – Hardware Techniques
Instruction Level Parallelism – Hardware TechniquesDilum Bandara
 
Instruction Level Parallelism – Compiler Techniques
Instruction Level Parallelism – Compiler TechniquesInstruction Level Parallelism – Compiler Techniques
Instruction Level Parallelism – Compiler TechniquesDilum Bandara
 
CPU Pipelining and Hazards - An Introduction
CPU Pipelining and Hazards - An IntroductionCPU Pipelining and Hazards - An Introduction
CPU Pipelining and Hazards - An IntroductionDilum Bandara
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
High Performance Networking with Advanced TCP
High Performance Networking with Advanced TCPHigh Performance Networking with Advanced TCP
High Performance Networking with Advanced TCPDilum Bandara
 
Introduction to Content Delivery Networks
Introduction to Content Delivery NetworksIntroduction to Content Delivery Networks
Introduction to Content Delivery NetworksDilum Bandara
 
Wired Broadband Communication
Wired Broadband CommunicationWired Broadband Communication
Wired Broadband CommunicationDilum Bandara
 

More from Dilum Bandara (20)

Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Time Series Analysis and Forecasting in Practice
Time Series Analysis and Forecasting in PracticeTime Series Analysis and Forecasting in Practice
Time Series Analysis and Forecasting in Practice
 
Introduction to Dimension Reduction with PCA
Introduction to Dimension Reduction with PCAIntroduction to Dimension Reduction with PCA
Introduction to Dimension Reduction with PCA
 
Introduction to Descriptive & Predictive Analytics
Introduction to Descriptive & Predictive AnalyticsIntroduction to Descriptive & Predictive Analytics
Introduction to Descriptive & Predictive Analytics
 
Introduction to Concurrent Data Structures
Introduction to Concurrent Data StructuresIntroduction to Concurrent Data Structures
Introduction to Concurrent Data Structures
 
Hard to Paralelize Problems: Matrix-Vector and Matrix-Matrix
Hard to Paralelize Problems: Matrix-Vector and Matrix-MatrixHard to Paralelize Problems: Matrix-Vector and Matrix-Matrix
Hard to Paralelize Problems: Matrix-Vector and Matrix-Matrix
 
Introduction to Map-Reduce Programming with Hadoop
Introduction to Map-Reduce Programming with HadoopIntroduction to Map-Reduce Programming with Hadoop
Introduction to Map-Reduce Programming with Hadoop
 
Embarrassingly/Delightfully Parallel Problems
Embarrassingly/Delightfully Parallel ProblemsEmbarrassingly/Delightfully Parallel Problems
Embarrassingly/Delightfully Parallel Problems
 
Introduction to Warehouse-Scale Computers
Introduction to Warehouse-Scale ComputersIntroduction to Warehouse-Scale Computers
Introduction to Warehouse-Scale Computers
 
Introduction to Thread Level Parallelism
Introduction to Thread Level ParallelismIntroduction to Thread Level Parallelism
Introduction to Thread Level Parallelism
 
CPU Memory Hierarchy and Caching Techniques
CPU Memory Hierarchy and Caching TechniquesCPU Memory Hierarchy and Caching Techniques
CPU Memory Hierarchy and Caching Techniques
 
Data-Level Parallelism in Microprocessors
Data-Level Parallelism in MicroprocessorsData-Level Parallelism in Microprocessors
Data-Level Parallelism in Microprocessors
 
Instruction Level Parallelism – Hardware Techniques
Instruction Level Parallelism – Hardware TechniquesInstruction Level Parallelism – Hardware Techniques
Instruction Level Parallelism – Hardware Techniques
 
Instruction Level Parallelism – Compiler Techniques
Instruction Level Parallelism – Compiler TechniquesInstruction Level Parallelism – Compiler Techniques
Instruction Level Parallelism – Compiler Techniques
 
CPU Pipelining and Hazards - An Introduction
CPU Pipelining and Hazards - An IntroductionCPU Pipelining and Hazards - An Introduction
CPU Pipelining and Hazards - An Introduction
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
High Performance Networking with Advanced TCP
High Performance Networking with Advanced TCPHigh Performance Networking with Advanced TCP
High Performance Networking with Advanced TCP
 
Introduction to Content Delivery Networks
Introduction to Content Delivery NetworksIntroduction to Content Delivery Networks
Introduction to Content Delivery Networks
 
Mobile Services
Mobile ServicesMobile Services
Mobile Services
 
Wired Broadband Communication
Wired Broadband CommunicationWired Broadband Communication
Wired Broadband Communication
 

Recently uploaded

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 

Peer-to-Peer Networking Systems and Streaming

  • 1. Peer-to-Peer Networking CS4482 High Performance Networking Dilum Bandara Dilum.Bandara@uom.lk Slides by Dilum Bandara and Anura Jayasumana
  • 2. Outline  P2P systems  Unstructured vs. structured overlays  P2P streaming 2
  • 3. Peer-to-Peer (P2P) Systems  Distributed systems without any central control  Autonomous peers  Equivalent in functionality/privileges; Both a client & a server  Protocol features  Protocol constructed at application layer  Overlaid on top of Internet  Typically a peer has a unique identifier  Supports some type of message routing capability  Fairness & Performance  Self-scaling  Free-rider problem  Peer churn 3 Internet
  • 4. P2P Applications  Many application domains  File sharing – BitTorrent, KaZaA, Napster, BearShare  IPTV – PPLive, CoolStreaming, SopCast  VoIP – Skype  CPU cycle sharing – SETI, World Community Grid  Distributed data fusion – CASA  Impact of P2P traffic  In 2008 – 50% 2009- 39% of total Internet traffic (2014- 17%)  Today – Volume still growing  3.5 Exabytes/month (4 in 2014)  Globally, P2P TV is now over 280 petabytes per month  P2P traffic 20 percent of all mobile data traffic globally 4 [Hyperconnectivity and the Approaching Zettabyte Era, Cisco 2010]
  • 5. P2P Characteristics  Tremendous scalability  Millions of peers  Globally distributed  Bandwidth intensive  Upload/download  Many concurrent connections  Aggressive/unfair bandwidth utilization  Aggregated downloads to overcome asymmetric upload bandwidth  Heterogeneous  Superpeers  Critical for performance/functionality 5 Internet
  • 6. Terminology  Application  Tier 2 – Services provided to end users  Tier 1 – Middleware services  Overlay  How peers are connected  Application layer network consists of peers  e.g., dial-up on top of telephone network, BGP, PlanetLab, CDNs  Underlay  Internet, Bluetooth  Peers implement top 3 layers 6 Application – Tier 2 File sharing, streaming, VoIP, P2P clouds Application – Tier 1 Indexing/DHT, Caching, replication, access control, reputation, trust Overlay Unstructured, structured, & hybrid Gnutella, Chord, Kademlia, CAN Underlay Internet, Bluetooth
  • 7. Overlay Connectivity 7 P2P Overlay Unstructured Deterministic Napster BitTorrent JXTA Nondeterministic Gnutella KaZaA Structured Sub-linear state Chord Kademlia CAN Pastry Tapestry Constant state Viceroy Cycloid Hybrid Structella Kelip Local minima search
  • 8. Bootstrapping  How is an initial P2P overlay is formed from a set of nodes?  Use some known information  Use a well-known server to register initial set of peers  Some peer addresses are well known  A well known domain name  Use a local broadcast to collect nearby peers, & merge such sets to larger sets 8
  • 9. Bootstrapping (Cont.)  Each peer maintains a random subset of peers  e.g., peers in Skype maintain a cache of superpeers  An incoming peer talks to 1+ known peers  A known peer accepting an incoming peer  Keeps track of incoming peer  May redirect incoming peer to another peer  Give a random set of peers to contact  Discover more peers by random walk, gossiping, or deterministic walk within overlay 9
  • 10. Resource Discovery Overview 10 Centralized O(1) Fast lookup Single point of failure Unstructured O(hopsmax) Easy network maintenance Not guaranteed to find resources Distribute Hash Table (DHT) O(log N) Guaranteed performance Not for dynamic systems Superpeer O(hopsmax) Better scalability Not guaranteed to find resources
  • 11. Centralized – Napster  Centralized database for lookup  Guaranteed content discovery  Low overhead  Single point of failure  Easy to track  Legal issues  File transfer directly between peers  Killer P2P application  June 1999 – July 2001  26.4 million users (peak) 11
  • 12. Unstructured – Gnutella  Fully distributed  Random connections  Initial entry point is known  Peers maintain dynamic list of neighbors  Connections to multiple peers  Highly resilient to node failures 12
  • 13. Unstructured P2P (cont.)  Flooding-based lookup  Guaranteed content discovery  Implosion  High overhead  Expanding ring flooding  TTL-based random walk  Content discovery not guaranteed  Better performance by biasing random walk toward nodes with higher degree  If response follow same path  Anonymity  Used in KaZaA, BearShare, LimeWire, McAfee 13 D S D s Flooding Random walk
  • 14. Superpeers  Resource rich peers  Superpeers  Bandwidth, reliability, trust, memory, CPU, etc.  Flooding or random walk  Only superpeers are involved  Lower overhead  More scalable  Content discovery not guaranteed  Better performance when superpeers share list of file names  Examples: Gnutella V0.6, FastTrack, Freenet KaZaA, Skype 14 s D
  • 15. BitTorrent  Most popular P2P file sharing system to date  Features  Centralized search  Multiple downloads  Enforce fairness  Rarest-first dissemination  Incentives  Better contribution  Better download speeds (not always)  Enable content delivery networks  Revenue through ads on search engines 15 User Trackers Web-based search engine Content owner Keyword search .torrent file server Download .torrent file Get list of peers Download/ upload chunks
  • 16. BitTorrent Protocol  Content owner creates a .torrent file  File name, length, hash, list of trackers  Place .torrent file on a server  Publish URL of .torrent file to a web site  Torrent search engine  .torrent file points to a tracker(s)  Registry of leaches & seeds for a given file 16 User Trackers Web-based search engine Content owner Keyword search .torrent file server Download .torrent file Get list of peers Download/ upload chunks 1 2 3 4 1 2 3 4
  • 17. BitTorrent Protocol (cont.)  Tracker  Provide a random subset of peers sharing same file  Peer contacts subset of peers parallely  Files are shared based on chunk IDs  Chunk – segment of file  Periodically ask tracker for new set of IPs  Every 15 min  Pick peers with highest upload rate 17 User Trackers Web-based search engine Content owner Keyword search .torrent file server Download .torrent file Get list of peers Download/ upload chunks 1 2 3 4 1 2 3 4
  • 18. BitTorrent Terminology  Swarm  Set of peers accessing (upload/download) same file  Seeds  Peers with entire file  Leeches  Peers with part of file or no file (want to download) 18 www.kat.ph
  • 19. BitTorrent Site Stat 19 User ranking of file quality Seedpeer.com www.kat.ph/stats/ Files in search engine User verified to be valid Across all files Search Cloud
  • 20. BitTorrent Content Popularity  Few highly popular content  Moderately popular content follow Zipf’s-like distribution  Typical Zipf’s parameter 0.5-1.0 20 Number of occurances/queries (x) (log) 1.8 2.0 2.2 2.4 2.6 2.8 3.0 3.2 3.4 3.6 3.8 No of search terms with > x occurances (log) -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 fenopy.org y = 1.96 - 1.024x, r2 = 0.98 Number of occurances/queries (x) (og) 2.4 2.6 2.8 3.0 3.2 3.4 3.6 No of search terms with > x occurances (log) -2.5 -2.0 -1.5 -1.0 -0.5 0.0 0.5 youbitTorrent.com y = 4.71 - 1.88x, r2 = 0.98 Number of occurances/queries (x) (log) 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 No of search terms with > x occurances (log) -7 -6 -5 -4 -3 -2 -1 0 Dataset1 - kickasstorrents.com y = 0.96 - 1.51x, r2 = 0.98 Toy Story 3 DVD release date June 18, 2010
  • 21. BitTorrent Characteristics  Flash crowd effect  Asymmetric bandwidth  Most peers leave after downloading  Diurnal & seasonal patterns 21 Flash crowd Download speed Session length [Zhang, 2009]
  • 22. BitTorrent Evolution 22 BitTorrent Global community Islands of communities Connected islands of communities BitTorrent ver. 4.2
  • 23. BitTorrent Fairness/Incentives  Tit-for-tat  Bandwidth policy  Upload to 4 peers that give me the highest download bandwidth  1 random peer  Chunk policy  Rarest first  Download least popular chunk  Initial seed try not to send same chunk twice  Most peers leave immediately after downloading  Modified nodes increase free riding  Modified policies 23
  • 24. Summary – Unstructured P2P  Separate content discovery & delivery  Content discovery is mostly outside of P2P overlay  Centralized solutions  Not scalable  Affect content delivery when failed  Distributed solutions  High overhead  May not locate the content  No predictable performance  Delay or message bounds  Lack of QoS or QoE 24
  • 25. Structured P2P  Deterministic approach to locate contents & peers  Locate peer(s) responsible for a given key  Contents  Unique key  Hash of file name, metadata, or actual content  160-bit or higher  Peers also have a key  Random bit string or IP address  Index keys on a Distributed Hash Table (DHT)  Distributed address space [0, 2m – 1]  Deterministic overlay to publish & locate content  Bounded performance under standard conditions 25
  • 26. Terminology  Hash function  Converts a large amount of data into a small datum  Hash table  Data structure that uses hashing to index content  Distributed Hash Table (DHT)  A hash table that is distributed  Types of hashing  Consistent or random  Locality preserving 26 f() f() f() g() g() g()
  • 27. Structured P2P – Example  2 operations  store(key, value)  locate(key) 27 Ring – 16 addresses Song.mp3 Cars.mpeg f() f() Find cars.mpeg n + 2i – 1, 1  i  m Successor 11 Song.mp3 6 Cars.mpeg O(log N) hops
  • 28. Chord [Stoica, 2001]  Key space arranged as a ring  Peers responsible for segment of the ring  Called successor of a key  1st peer in clockwise direction  Routing table  Keep a pointer (finger) to m peers  Keep a finger to (2i – 1)-th peer, 1 ≤ i ≤ m  Key resolution  Go to peer with the closest key  Recursively continue until key is find  Can be located within O(log n) hops 28 m =3-bit key ring
  • 29. Chord (cont.)  New peer entering overlay  Takes keys from the successor  Peer leaving overlay  Give keys to the successor  Fingers are updated as peers join & leave  Peer failure or churn makes finger table entries stale 29 New peer with key 6 joins the overlay Peer with key 1 leave the overlay
  • 30. Chord Performance  Path length  Worst case O(log N)  Average ½log2N  Updates O(log2 N)  Fingers O(log N)  Alternative paths (log N)!  Balanced distribution of keys  Under uniform distribution  N(log N) virtual nodes provides best load distribution 30
  • 31. Kademlia [Maymounkov, 2002]  Used in BitTorrent, eMule, aMule, & AZUREUS  160-bit keys  Nodes are assigned random keys  Distance between 2 keys is determined by XOR  Routing in the ring is bidirectional  dist(a  b) = dist(b  a)  Enable nodes to learn about new nodes from received messages  Keys are stored in nodes with the shortest XOR distance 31
  • 32. Structured P2P – Alternate Designs 32 d-Torus Content-Addressable Network (CAN) [Ratnasamy, 2001] (0, 0) (1, 0) (0, 0) (0, 1) Zone controller (0.1,0.9) (0.3,0.4) (0.4,0.8) (0.75,0.2) (0.35,0.1) (0.65,0.7) (0.8,0.4) (0.8,0.8) (0-0.5, 0-0.5) (0.5-1, 0-0.5) (0-0.5, 0.5-1) (0.5-1, 0.5-0.75) Cube connected cycle Cycloid [Shen, 2006]
  • 33. Summary – Structured P2P  Content discovery is within the P2P overlay  Deterministic performance  Chord  Unidirectional routing  Recursive routing  Peer churn & failure is an issue  Kademlia  Bidirectional routing  Parallel iterative routing  Work better under peer failure & churn  MySong.mp3 is not same as mysong.mp3  Unbalanced distribution of keys & load 33
  • 34. Summary (cont.) 34 Scheme Architecture Routing mechanism Lookup overhead* Routing table size* Join/leav e cost Resilience Chord Circular key space Successor & long distant links O(log N) O(log N) O(log2 N) High CAN d-torus Greedy routing through neighbors O(dN1/d) 2d 2d Moderate Pastry Hypercube Correct one digit in key at time O(logB N) O(B logB N) O(logB N) Moderate Tapestry Hypercube Correct one digit in key at time O(logB N) O(logB N) O(logB N) Moderate Viceroy Butterfly network Predecessor & successor links O(log N) O(1) O(log N) Low Kademlia Binary tree, XOR distance metric Iteratively find nodes close to key O(log N) O(log N) O(log N) High Cycloid Cube connected cycles Links to cyclic & cubical neighbors O(d) O(1) O(d) Moderate * N – number of nodes in overlay, d – number of dimensions B – base of a key identifier
  • 35. Structured vs. Unstructured 35 Unstructured P2P Structured P2P Overlay construction High flexibility Low flexibility Resources Indexed locally Indexed remotely on a distributed hash table Query messages Broadcast or random walk Unicast Content location Best effort Guaranteed Performance Unpredictable Predictable bounds Overhead High Relatively low Object types Mutable, with many complex attributes Immutable, with few simple attributes Peer churn & failure Supports high failure rates Supports moderate failure rates Applicable environments Small-scale or highly dynamic, e.g., mobile P2P Large-scale & relatively stable, e.g., desktop file sharing Examples Gnutella, LimeWire, KaZaA, BitTorrent Chord, CAN, Pastry, eMule, BitTorrent
  • 36. Login Server Superpeer overlay Skype  Proprietary  Encrypted control & data messages  Many platforms  Voice/video calls, instant messaging, file transfer, video/audio conferencing  Superpeer overlay  Related to KaZaA  Based on bandwidth, not behind firewall/NAT, & processing power  Enables NAT & firewall traversal for ordinary peers 36
  • 37. Skype (Cont.)  30% superpeers  Relatively stable  Diurnal behavior  Longer session length than typical P2P - Heavy tailed 37 [Guha, 2006]
  • 38. P2P Streaming  Emergence of IPTV  Content Delivery Networks (CDNs) can’t handle bandwidth requirements  No multicast support at network layer  P2P  Easy to implement  No global topology maintenance  Tremendous scalability  Greater demand  Better service  Cost effective  Robustness  No single point of failure  Adaptive  Application layer 38
  • 39. P2P Streaming – Components  Chunk  Segment of the video stream  e.g., one second worth of video  Partners  Subset of known peers that a peer may actually talk to 39 (Hie, 2008) (Zhang, 2005)
  • 40. (Liu, 2008) Tree-Push Approach  Construct overlay tree starting from video source  Parent peer selection is based on  Bandwidth, latency, number of peers, etc.  Data/chunks are pushed down the tree  Multi-tree-based approach  Better content distribution  Enhanced reliability 40
  • 41. Tree-Push Approach – Issues  Connectivity is affected when peers at the top of the tree leave/fail  Time to reconstruct the tree  Unbalanced tree  Majority of peers are leaves  Unable to utilize their bandwidth  High delay 41 (Zhang, 2005)
  • 42. Mesh-Pull Approach  A peer connects to multiple peers forming a mesh  Pros  More robust to failure  Better bandwidth utilization  Cons  No specific chunk forwarding path  Need to pull chunks from partners/peers  Need to know which partner has what  Used in most commercial products 42 (Zhang, 2005)
  • 43. Chunk Sharing  Each peer  Caches a set of chunks within a sliding window  Shares its chunk information with its partners  Buffer maps are used to inform chunk availability  Chunks may be in one or more partners  What chunks to get from whom? 43 (Hie, 2008)
  • 44. Chunk Scheduling  Some chunks are highly available while others are scare  Some chinks needs to be played soon  New chunks need to be pulled from video source  Chunk scheduling consider how a peer can get chunks while  Minimizing latency  Preventing skipping  Maximizing throughput  Chunk scheduling  Random, rarest first, earliest deadline first, earliest deadline & rarest first  Determines user QoE  Most commercial products use TCP for chunk transmission  Control message overhead ~1-2% 44
  • 45. Random Scheduling  One of the earliest approach – used in Chainsaw  Peers periodically share buffer maps  Select a random chunk & request it from one of the partners having the chunk  Some peers may experience significant playback delay  1-2 minutes  Skipping is possible 45 [Pai, 2005]
  • 46. Rarest-First Scheduling  Used in CoolStraming  Chunk = 1 sec video, 120 chunk in sliding window  A peer gets the rarest chunk so that chunk can be spread to its partners  Steps 1. Gather buffer maps periodically 2. Calculate number of suppliers (i.e., partners with chunk) for each chunk 3. Request chunks with the lowest number of suppliers 4. For chunks with multiple suppliers, request from the supplier with highest bandwidth & free time  Gather application-level bandwidth data for each partner  Request are made through a bitmap 46
  • 47. Rarest First (cont.)  It is sufficient to maintain 4 partners  Discover more peers overtime – use gossiping  Keep only the partners that have sufficient bandwidth & more chunks 47 (Zhang, 2005)
  • 48. Rarest First (Cont.)  More robust than tree-push approach  Larger user community  Better service quality  Most users experience < 1 min delay 48 (Zhang, 2005)
  • 49. Earliest-Deadline First  Objectives – Minimum playback delay & continuity  Rule 1  Chunk with the lowest sequence number has the highest priority  So request chunk with the lowest sequence number  Try to meet earliest deadline  Rule 2  Peer with the lowest, largest sequence number in buffer map has the highest priority  Falling behind, so let it speed up 49 [Chen, 2008]
  • 50. Earliest-Deadline First (Cont.)  DPC – Distributed Priority based Chunk scheduling  L – Number of partners  Lower playback delay  Lower skipping 50 [Chen, 2008]
  • 51. Bibliography 1. R. Bland, D. Caulfield, E. Clarke, A. Hanley, and E. Kelleher, “P2P routing,” Available: http://ntrg.cs.tcd.ie/undergrad/4ba2.02-03/p9.html 2. S. Guha, N. Daswani, and R. Jain, An Experimental Study of the Skype Peer-to-Peer VoIP System, 5th Int. Workshop on Peer-to-Peer Systems (IPTPS ‘06), Feb. 2006. 3. Y. Guo, C. Liang, and Y. Liu, Adaptive queue-based chunk scheduling for P2P live streaming, IFIP Networking, May 2008. 4. X. Hei, Y. Liu, and K. W. Ross, IPTV over P2P streaming networks: the mesh-pull approach, IEEE Communications Magazine, vol. 46, no. 2, Feb. 2008, pp. 86-92. 5. R Morselli, B. Bhattacharjee, A. Srinivasan, and M. A. Marsh, Efficient lookup on unstructured topologies, 24th ACM Symposium on Principles of Distributed Computing (PODC '05), July 2005. 6. Z. Chen, K. Xue, and P. Hong, A study on reducing chunk scheduling delay for mesh- based P2P live streaming, 7th Int. Conf. on Grid and Cooperative Computing, 2008, pp. 356-361. 7. Cisco Systems, Inc., Cisco Visual Networking Index: Forecast and Methodology, 2008– 2013, June 2009. 8. Cisco Systems, Inc., Approaching the Zettabyte Era, June 2008. 9. J. Liu, S. G. Rao, B. Li, and H. Zhang, “Opportunities and challenges of peer-to-peer internet video broadcast,” In Proc. of IEEE, vol. 96, no. 1, Jan. 2008, pp. 11-24. 51
  • 52. Bibliography 10. P. Maymounkov and D. Mazières, Kademlia: A peer-to-peer information system based on the XOR metric, 1st Int. Workshop on Peer-to-peer Systems (IPTPS ‘02), Feb. 2002, pp. 53-65. 11. S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, A scalable content- addressable network, ACM Special Interest Group on Data Communication (SIGCOMM ‘01), Aug. 2001. 12. I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, Chord: a scalable peer-to-peer lookup service for internet applications, ACM SIGCOMM ‘01, 2001, pp. 149- 160. 13. D. Stutzbach, R. Rejaie, and S. Sen, Characterizing unstructured overlay topologies in modern P2P file-sharing systems, IEEE/ACM Transactions on Networking, vol. 16, no. 2, April 2008. 14. Y. Tang, L. Sun, K. Zhang, S. Yang, and Y. Zhong, Longer, better: On extending user online duration to improve quality of streaming service in P2P networks, IEEE Int. Conf. on Multimedia and Expo, July 2007. 15. M. Zhang, Y. Xiong, Q. Zhang, and S. Yang, On the optimal scheduling for media streaming in data-driven overlay networks, GLOBECOM ‘06, Nov.-Dec. 2006. 16. X. Zhang, J. Liu, B. Li, and T. P. Yum, CoolStreaming/DONet: a data-driven overlay network for efficient live media streaming, INFOCOM 2005, Mar. 2005. 52

Editor's Notes

  1. Some peers have special roles, higher connectivity, etc. Implemented as an application layer network - referred to as overlay network [ Overlay concept is older than P2P – more general, e.g., news delivery networks, multicast routing (as most IP routers do not support it), resilient overlay protocol (to provide a better path in physical network), secure delivery, trust establishment between endpoints, censorship resistant delivery, etc. Allows for more rapid deployment of new appns as it does not depend on changes to the underlying network. ] Many of the message types defined for various P2P systems are similar Self-scaling – more users => more capacity; but due to asymetries, this may not be achievable in certain cases. Peer Churn – when a peer goes off line, it takes time for others to be alerted and adapt to that. (Churn rate affects performance) Free rider – peer that uses service without contributing to it (peers follow their self interest). Peers are autonomous, some may even be malicious. Validating trustworthyness etc. without central control is difficult. - strategies such as Tit-for-tat
  2. % decline is due to services such as YouTube & 1 click file sharing systems such as DropBox, Box, RapidShare, etc. Source - Cisco Visual Networking Index 2008 – 2013 , 2009 ISP pesrpective: Relatively small no of users can overwhelm the network Broadband access wasn’t designed for this type of traffic (P2P symmetric, traditional DSL, Cable modem (4X or higher) downstream compared to upstream)
  3. Aggregated downloads to overcome asymmetric bandwidth Download from multiple peers at the same time (each peer may have a limited upload bandwidth) Many concurrent connections – TCP manage fairness per connection. If multiple connections are initiated can gain more bandwidth Superpeers are used in file sharing, Skype, & resource discovery
  4. Application layer is divided to 2 sub-layers (tier 1 may also be referred to as middleware layer) Example overlays – dialup, BGP, PlanetLab, CDNs It’s incorrect to assume DHT as the overlay – it’s only an index implemented on top of the overlay. E.g., Chord finger table form overlay & what’s indexed at a node form DHT
  5. Hybrid – structured topology but unstructured communication. E.g., for chord overlay & then use it to broadcast to all the node efficiently. Lee’s best peer selection. Radars are connected as structured P2P because data fusion group is known in advance. Where as best peers for current data fusion is found using broadcast Local minima search – each node has an ID. Resources are indexed in the local node with the closest ID (local minima). When routing first do a random walk then do a deterministic walk looking for the local minima We’ll not talk about hybrid designs in detail
  6. How is the initial P2P network formed from nodes ?
  7. Before specifics – this slide gives an overview Colored circles can be interpreted as different files Bound give the lookup/query cost
  8. (Lookup – process of finding where the content is) Napster is not the first P2P system, but demonstrated the potential (1999) Before that, many organizations (including Intel) used some kind of P2P application(s) to aggregate idle computing power in there machines Inspired many modern P2P systems, very popular, later a test case for many legal issues Uses central server for storing and searching the directory of files (hence not a full P2P system as many subsequent systems were) Step 1 - Peers report their list of files to centralized database Step 2 - users query central database Sep 3 – file is directly download from a peer that have it (no multiple/parallel downloads)
  9. First full P2P filesharing system. Earliest versions (through V0.4) used unstructured overlay with flooding for queries Due to need for scalability (V0.6 and higher) adopted a superpeer architecture. High-capacity peers are super peers, and all queries are routed using a flooding mechanism among superpeers.
  10. FLOODING & Random Walk: Flooding: Implosion – same node getting multiple messages for same query Both: scalability, RW – may not find obj Enhancements: TTL – time to live Expanding ring flooding – first flood to k-hops, if no result flood k+1 hops, if no response then try k+ 2 hops, similarly continue Random walk: – Query failure determined by a timeout or explicit failure message from last node. Several random walk queries may be issued in parallel as well. Additional techniques in UP2P: Overlay topology a) how do decide on peers (number, who to connect to or retain, etc.); base decision on capacity of peers, type of content, connectivity or peer, etc.; b) clustering – ex. Clusters formed when probl of two nodes being connected is higher if they have a common neighbor; - Prefer connectivity to high-degree nodes, with shared contents, peers with objects closer in key space, etc. Object placement – selection of nodes where an object is placed. E.g., base it on popularity, routing mechanism, etc. - distribute replica’s of popular objects (explicit push, caching) Caching - can inform random walk on what objects are nearby (cache summary information about contents of neighbors etc.) Query forwarding criteria Misc: McAfee use P2P to update virus definition within a local network
  11. Superpeers are selected based on Bandwidth, reliability, trust, memory, CPU Gnutella V0.6 and higher, FastTrack a proprietary system: FastTrack: Another P2P system around same time as Gnutella. Used by no of clients such as KaZaA, Grokster and Imesh. Proprietary system using an encrypted protocol high-capacity nodes are supernodes (SN), and low-capacity nodes are ordinary nodes (ON); Each SN maintains connections to 40-50 other SNs (in a network of ~3M nodes and ~30K SNs - practical numbers) and 50-80 ONs . Each ON connects to one SN; SN provides ON with a list of other SNs which ON caches; after ON issues a query and a SN responds, ON disconnects from current SN and attaches to a new SN from list. ON now receieves a new SN list which it merges with its list. Average SN-SN connection ~35 mins, SN-ON connection ~10 mins (~30% <30 secs) These changes (also in connectivity) help balance load in the network, improving locality, and connection shuffling that increases long range coverage. It also makes tracking peer transfers difficult. Freenet: (Open Source) Proposed in 1999 – P2P file-sharing system - contains security, anonymity and deniability features Objects and peers have identifiers – aka routing keys (created using a hash function). Each peer – a fixed sized routing table (containing keys of peers); mesh; Requests forwarded to peers with closest matching routing key. If request fails, it tries again with peer with next closest routing key. (Algorithm- steepest accent hill climbing with back tracking until TTL expires) -Also caches objects along the return path to reduce failure. FastFreenet: improves the hit-rate by: Peers share a fuzzy description of files it has with neighbors, which allows nodes to forward query to peers likely to have the object. Fuzzy description- an N bit number where each bit corresponds to 1/N segment of the key space.
  12. Users – Better contribution  better download speeds (not always) Content providers – Enable content delivery networks 3rd parties - Revenue through ads on search engines Guaranteed to find content because of centralized search
  13. Trackers can be contacted using TCP, UDP, or HTTP
  14. Curves are based on Pareto distribution
  15. Fig 1 : Comparison of swarm dynamics resulting from swarm-level measurement and peer-level measurement TR – 4 different communities Peer arrival is not poison In the past, average session length was 30 min
  16. Everything in 1 community
  17. Rarest first – based on of chunks in currently connected list of peers
  18. Unstructured P2P – easy to implement, inefficient routing, inability to locate rare objects. Gradual changes – e.g. clustering, near/rar links, semantic links etc. to improve efficiency. SP2P takes this one step further. QoE – quality of experience
  19. Structured overlay - designs overlays with routing mechanisms that are deterministic, and allowing for location of any objects (in bounded time). P2P supports key based routing - object identifiers are mapped to peer identifier address space, and object requested routed to the nearest peer in P2P address space Goal – Distributed object location and routing (DOLR). A specific scheme is DHT (distributed hash table)
  20. M – key length in bits Original paper use iterative routing (s go to x, x inform y to s, s go to y, y inform z, s go to z …) to implement recursive routing
  21. Fig 2 – 10,000 nodes 1,000,000 keys Virtual node – 1 physical node acting as multiple nodes distributed across the ring Ideally 1 physical node should represent log N virtual nodes
  22. Conceptually chord is recursive – but actual implementation in paper uses iterative routing (no difference in performance just increase hop count)
  23. Skype was developed by the same group that developed KaZaa – which is the 1st P2P system to use superpeers Superpeers are selected based on bandwidth, not behind firewall or NAT, and processing power Skype protocol behavior is found using reverse engineering Has over 600 million users today Peers not behind firewall/NAT communicate directly One behind NAT/firewall other not behind  Peer behind initiate connection to other If both are behind  through superpeer(s) Works on Windows, Linux, Mac, iPhone, Android
  24. Red dots – all nodes including superpeers
  25. Peers that have a complete copy of the file are called seeds
  26. ON/OFF period T – Average of the exponential distribution Originally, chunks were transported using TFRC TFRC - TCP Friendly Rate Control (RFC 3448) Currently seems to be using TCP/UDP