An Introduction to Peer-to-Peer Networks
Transcript of "An Introduction to Peer-to-Peer Networks"

  1. An Introduction to Peer-to-Peer Networks
     Presentation for MIE456 - Information Systems Infrastructure II
     Vinod Muthusamy
     October 30, 2003
  2. Agenda
     • Overview of P2P
       - Characteristics
       - Benefits
     • Unstructured P2P systems
       - Napster (Centralized)
       - Gnutella (Distributed)
       - Kazaa/Fasttrack (Super-peers)
     • Structured P2P systems (DHTs)
       - Chord
       - Pastry
       - CAN
     • Conclusions
  3. Client/Server Architecture
     • Well-known, powerful, reliable server is a data source
     • Clients request data from the server
     • Very successful model
       - WWW (HTTP), FTP, Web services, etc.
     [Figure: one server and four clients connected through the Internet]
     * Figure from http://project-iris.net/talks/dht-toronto-03.ppt
  4. Client/Server Limitations
     • Scalability is hard to achieve
     • Presents a single point of failure
     • Requires administration
     • Unused resources at the network edge
     • P2P systems try to address these limitations
  5. P2P Computing*
     • P2P computing is the sharing of computer resources and services by direct exchange between systems.
     • These resources and services include the exchange of information, processing cycles, cache storage, and disk storage for files.
     • P2P computing takes advantage of existing computing power, computer storage and networking connectivity, allowing users to leverage their collective power to the ‘benefit’ of all.
     * From http://www-sop.inria.fr/mistral/personnel/Robin.Groenevelt/Publications/Peer-to-Peer_Introduction_Feb.ppt
  6. P2P Architecture
     • All nodes are both clients and servers
       - Provide and consume data
       - Any node can initiate a connection
     • No centralized data source
     • “The ultimate form of democracy on the Internet”
     • “The ultimate threat to copyright protection on the Internet”
     [Figure: five nodes connected to one another through the Internet]
     * Content from http://project-iris.net/talks/dht-toronto-03.ppt
  7. P2P Network Characteristics
     • Clients are also servers and routers
       - Nodes contribute content, storage, memory, CPU
     • Nodes are autonomous (no administrative authority)
     • Network is dynamic: nodes enter and leave the network “frequently”
     • Nodes collaborate directly with each other (not through well-known servers)
     • Nodes have widely varying capabilities
  8. P2P Benefits
     • Efficient use of resources
       - Unused bandwidth, storage, processing power at the edge of the network
     • Scalability
       - Consumers of resources also donate resources
       - Aggregate resources grow naturally with utilization
     • Reliability
       - Replicas
       - Geographic distribution
       - No single point of failure
     • Ease of administration
       - Nodes self-organize
       - No need to deploy servers to satisfy demand (cf. scalability)
       - Built-in fault tolerance, replication, and load balancing
  9. P2P Applications
     • Are these P2P systems?
       - File sharing (Napster, Gnutella, Kazaa)
       - Multiplayer games (Unreal Tournament, DOOM)
       - Collaborative applications (ICQ, shared whiteboard)
       - Distributed computation (SETI@home)
       - Ad-hoc networks
  10. Popular P2P Systems
     • Napster, Gnutella, Kazaa, Freenet
     • Large-scale sharing of files
       - User A makes files (music, video, etc.) on their computer available to others
       - User B connects to the network, searches for files and downloads files directly from user A
     • Issues of copyright infringement
  11. Napster
     • A way to share music files with others
     • Users upload their list of files to the Napster server
     • You send queries to the Napster server for files of interest
       - Keyword search (artist, song, album, bitrate, etc.)
     • Napster server replies with the IP addresses of users with matching files
     • You connect directly to user A to download the file
     * Figure from http://computer.howstuffworks.com/file-sharing.htm
  12. Napster
     • Central Napster server
       - Can ensure correct results
       - Bottleneck for scalability
       - Single point of failure
       - Susceptible to denial of service
         - Malicious users
         - Lawsuits, legislation
     • Search is centralized
     • File transfer is direct (peer-to-peer)
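The centralized search described on the two Napster slides can be sketched as a small in-memory index. This is only an illustration under stated assumptions: the class name `CentralIndex`, its methods, and the whitespace-based keyword matching are hypothetical and are not taken from the Napster protocol. The server only answers "who has it"; the download itself would still happen directly between peers.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for a Napster-style central index server."""

    def __init__(self):
        # keyword -> set of (peer_address, filename) pairs
        self._index = defaultdict(set)

    def register(self, peer, filenames):
        """A peer uploads its list of shared files."""
        for name in filenames:
            for word in name.lower().split():
                self._index[word].add((peer, name))

    def search(self, query):
        """Return (peer, filename) pairs whose filename has every query word."""
        words = query.lower().split()
        if not words:
            return set()
        results = set(self._index.get(words[0], set()))
        for word in words[1:]:
            results &= self._index.get(word, set())
        return results


index = CentralIndex()
index.register("10.0.0.1", ["blue moon.mp3", "red sun.mp3"])
index.register("10.0.0.2", ["blue sky.mp3"])
matches = index.search("blue")  # both peers hold a file matching "blue"
```

Because every query goes through this one object, it can return complete results, and for the same reason it is the bottleneck and single point of failure the slide lists.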
  13. Gnutella
     • Share any type of file (not just music)
     • Decentralized search, unlike Napster
     • You ask your neighbours for files of interest
     • Neighbours ask their neighbours, and so on
       - TTL field quenches messages after a number of hops
     • Users with matching files reply to you
     * Figure from http://computer.howstuffworks.com/file-sharing.htm
  14. Gnutella
     • Decentralized
       - No single point of failure
       - Not as susceptible to denial of service
       - Cannot ensure correct results
     • Flooding queries
       - Search is now distributed but still not scalable
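The TTL-limited flooding described on the two Gnutella slides can be sketched as a breadth-first traversal of the overlay. This is a simplified model, not the Gnutella wire protocol: the function name `flood_query`, the dictionary-based graph, and the message counting are assumptions made for illustration, and a peer here forwards to all of its neighbours, including the one the query came from.

```python
def flood_query(graph, files, origin, wanted, ttl):
    """Flood a query from origin; return (peers holding the file, messages sent)."""
    seen = {origin}       # peers that have already handled this query
    frontier = [origin]   # peers that forward the query in this round
    hits = set()
    messages = 0
    while frontier and ttl > 0:
        next_frontier = []
        for peer in frontier:
            for neighbour in graph[peer]:
                messages += 1              # every forward costs one message
                if neighbour in seen:
                    continue               # duplicate query is dropped
                seen.add(neighbour)
                if wanted in files.get(neighbour, ()):
                    hits.add(neighbour)
                next_frontier.append(neighbour)
        frontier = next_frontier
        ttl -= 1                           # one hop of the TTL is used up
    return hits, messages


graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"D": {"song.mp3"}, "E": {"song.mp3"}}
near_hits, near_cost = flood_query(graph, files, "A", "song.mp3", ttl=2)
all_hits, all_cost = flood_query(graph, files, "A", "song.mp3", ttl=3)
```

With `ttl=2` the query reaches D but not E, which is the sense in which flooding "cannot ensure correct results"; raising the TTL finds E but costs more messages.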
  15. Kazaa (Fasttrack network)
     • Hybrid of centralized Napster and decentralized Gnutella
     • Super-peers act as local search hubs
       - Each super-peer is similar to a Napster server for a small portion of the network
       - Super-peers are automatically chosen by the system based on their capacities (storage, bandwidth, etc.) and availability (connection time)
     • Users upload their list of files to a super-peer
     • Super-peers periodically exchange file lists
     • You send queries to a super-peer for files of interest
  16. Free riding*
     • File sharing networks rely on users sharing data
     • Two types of free riding
       - Downloading but not sharing any data
       - Not sharing any interesting data
     • On Gnutella
       - 15% of users contribute 94% of content
       - 63% of users never responded to a query
         - Didn’t have “interesting” data
     * Data from E. Adar and B.A. Huberman (2000), “Free Riding on Gnutella”
  17. Anonymity
     • Napster, Gnutella, Kazaa don’t provide anonymity
       - Users know who they are downloading from
       - Others know who sent a query
     • Freenet
       - Designed to provide anonymity, among other features
  18. Freenet
     • Data flows in reverse path of query
       - Impossible to know if a user is initiating or forwarding a query
       - Impossible to know if a user is consuming or forwarding data
     • “Smart” queries
       - Requests get routed to the correct peer by incremental discovery
  19. Structured P2P
     • Second-generation P2P overlay networks
     • Self-organizing
     • Load balanced
     • Fault-tolerant
     • Scalable guarantees on the number of hops to answer a query
       - Major difference from unstructured P2P systems
     • Based on a distributed hash table interface
  20. Distributed Hash Tables (DHT)
     • Distributed version of a hash table data structure
     • Stores (key, value) pairs
       - The key is like a filename
       - The value can be file contents
     • Goal: Efficiently insert/lookup/delete (key, value) pairs
     • Each peer stores a subset of (key, value) pairs in the system
     • Core operation: Find node responsible for a key
       - Map key to node
       - Efficiently route insert/lookup/delete request to this node
  21. DHT Generic Interface
     • Node id: m-bit identifier (similar to an IP address)
     • Key: sequence of bytes
     • Value: sequence of bytes
     • put(key, value)
       - Store (key, value) at the node responsible for the key
     • value = get(key)
       - Retrieve value associated with key (from the appropriate node)
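The put/get interface above can be sketched with all "nodes" living in one process. Everything beyond `put` and `get` is an assumption for illustration: the class name `ToyDHT`, the choice of SHA-1, the 16-bit identifier size, and the rule that a key belongs to the first node id at or after its hash (borrowed from Chord) are not prescribed by the generic interface. There is no networking or routing here, only the key-to-node mapping.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier size in bits (assumed for this sketch)


def hash_id(data: bytes) -> int:
    """Map a byte string to an m-bit identifier."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % (2 ** M)


class ToyDHT:
    """In-process model of a DHT: one dict per node, no real network."""

    def __init__(self, node_names):
        self.node_ids = sorted({hash_id(name.encode()) for name in node_names})
        self.stores = {node_id: {} for node_id in self.node_ids}

    def responsible_node(self, key: bytes) -> int:
        """First node id at or after the key's id, wrapping around."""
        i = bisect_left(self.node_ids, hash_id(key))
        return self.node_ids[i % len(self.node_ids)]

    def put(self, key: bytes, value: bytes) -> None:
        self.stores[self.responsible_node(key)][key] = value

    def get(self, key: bytes):
        return self.stores[self.responsible_node(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put(b"song.mp3", b"...file contents...")
value = dht.get(b"song.mp3")
```

Each pair is stored on exactly one node, and any caller that knows the node ids can compute which one without asking a central server.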
  22. DHT Applications
     • Many services can be built on top of a DHT interface
       - File sharing
       - Archival storage
       - Databases
       - Naming, service discovery
       - Chat service
       - Rendezvous-based communication
       - Publish/Subscribe
  23. DHT Desirable Properties
     • Keys mapped evenly to all nodes in the network
     • Each node maintains information about only a few other nodes
     • Messages can be routed to a node efficiently
     • Node arrivals/departures only affect a few nodes
  24. DHT Routing Protocols
     • DHT is a generic interface
     • There are several implementations of this interface
       - Chord [MIT]
       - Pastry [Microsoft Research UK, Rice University]
       - Tapestry [UC Berkeley]
       - Content Addressable Network (CAN) [UC Berkeley]
       - SkipNet [Microsoft Research US, Univ. of Washington]
       - Kademlia [New York University]
       - Viceroy [Israel, UC Berkeley]
       - P-Grid [EPFL Switzerland]
       - Freenet [Ian Clarke]
     • These systems are often referred to as P2P routing substrates or P2P overlay networks
  25. Chord API
     • Node id: unique m-bit identifier (hash of IP address or other unique ID)
     • Key: m-bit identifier (hash of a sequence of bytes)
     • Value: sequence of bytes
     • API
       - insert(key, value) → store key/value at r nodes
       - lookup(key)
       - update(key, newval)
       - join(n)
       - leave()
  26. Chord Identifier Circle
     • Nodes organized in an identifier circle based on node identifiers
     • Keys assigned to their successor node in the identifier circle
     • Hash function ensures even distribution of nodes and keys on the circle
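The successor rule above can be sketched in a few lines. The node and key identifiers below are illustrative values on a 6-bit circle rather than real hashes, and the helper names `successor` and `assign` are made up for this example. The second half checks what happens when a node joins: only keys between the new node and its predecessor change owner.

```python
from bisect import bisect_left


def successor(ring, ident):
    """First node id at or after ident on the sorted ring, wrapping to the start."""
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]


def assign(node_ids, key_ids):
    """Map every key id to the node that is its successor on the circle."""
    ring = sorted(node_ids)
    return {k: successor(ring, k) for k in key_ids}


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # example ids, m = 6
keys = [10, 24, 30, 38, 54, 60]

before = assign(nodes, keys)          # key 60 wraps around to node 1
after = assign(nodes + [26], keys)    # node 26 joins the circle
moved = sorted(k for k in keys if before[k] != after[k])
```

In this example only key 24 moves (from node 32 to the new node 26), which is the "arrivals/departures only affect a few nodes" property.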
  27. Chord Finger Table
     • O(log N) table size
     • The i-th finger of node n points to the first node that succeeds n by at least 2^(i-1)
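The finger rule above, finger i = successor((n + 2^(i-1)) mod 2^m) for i = 1..m, can be computed directly when the whole ring is known. That global view is an assumption of this sketch (a real node learns its fingers through lookups), and the ring ids are illustrative values with m = 6.

```python
from bisect import bisect_left


def successor(ring, ident):
    """First node id at or after ident on the sorted ring, wrapping to the start."""
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]


def finger_table(ring, n, m):
    """Fingers 1..m of node n: successor of n + 2^(i-1) on a 2^m circle."""
    size = 2 ** m
    return [successor(ring, (n + 2 ** (i - 1)) % size) for i in range(1, m + 1)]


ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
fingers_of_8 = finger_table(ring, 8, 6)   # targets 9, 10, 12, 16, 24, 40
```

The table has m entries regardless of how many nodes exist, and nearby targets often resolve to the same node, so the number of distinct fingers stays small.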
  28. Chord Key Location
     • Look up in the finger table the furthest node that precedes the key
     • Query homes in on the target in O(log N) hops
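The lookup rule above can be sketched as a loop that repeatedly jumps to the furthest finger preceding the key. This is a single-process simulation under assumptions: finger tables are recomputed from a global node list instead of being stored per node, the ring ids are illustrative (m = 6), and the helper names are invented for the example.

```python
from bisect import bisect_left

M = 6
RING = 2 ** M
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # sorted example ids


def successor(ident):
    """First node id at or after ident, wrapping around the circle."""
    i = bisect_left(NODES, ident % RING)
    return NODES[i % len(NODES)]


def fingers(n):
    """Finger i of node n is successor(n + 2^(i-1)), for i = 1..M."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]


def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise."""
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(start, key):
    """Return (node responsible for key, nodes that forwarded the query)."""
    n, path = start, [start]
    while True:
        succ = fingers(n)[0]
        if key == succ or in_open(key, n, succ):
            return succ, path          # key falls in (n, succ]
        nxt = succ                     # fallback: plain successor step
        for f in reversed(fingers(n)):
            if in_open(f, n, key):     # furthest finger that precedes key
                nxt = f
                break
        n = nxt
        path.append(n)


owner, path = lookup(8, 54)
```

Starting at node 8, the query for key 54 is forwarded through nodes 42 and 51 and resolves to node 56, with each hop covering a large part of the remaining distance.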
  29. Chord Properties
     • In a system with N nodes and K keys, with high probability…
       - each node is responsible for at most (1+ε)K/N keys
       - each node maintains info. about O(log N) other nodes
       - lookups are resolved in O(log N) hops
     • No delivery guarantees
     • No consistency among replicas
     • Hops have poor network locality
  30. Network locality
     • Nodes close on the ring can be far apart in the network.
     [Figure: ring nodes N20, N40, N41 and N80 placed on geographically dispersed hosts; labels: To vu.nl, Lulea.se, OR-DSL, CMU, MIT, MA-Cable, Cisco, CA-T1, Cornell, CCI, NYU, Aros, Utah]
     * Figure from http://project-iris.net/talks/dht-toronto-03.ppt
  31. Pastry
     • Similar interface to Chord
     • Considers network locality to minimize the number of hops messages travel
     • New node needs to know a nearby node to achieve locality
     • Each routing hop matches the destination identifier by one more digit
     • Many choices in each hop (locality possible)
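The digit-by-digit routing above can be sketched as follows. This is a heavily simplified model and not Pastry's actual routing table or leaf set: every node is assumed to know all other nodes, identifiers are short digit strings, and the `latency` function and positions are hypothetical stand-ins for network proximity. At each hop the sketch picks, among the nodes that match at least one more leading digit of the key, the one that is closest in the network.

```python
def shared_prefix_len(a, b):
    """Number of leading digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(nodes, latency, start, key):
    """Return the list of nodes visited while routing key from start."""
    path = [start]
    current = start
    while True:
        have = shared_prefix_len(current, key)
        # Every candidate extends the matched prefix by at least one digit.
        candidates = [n for n in nodes if shared_prefix_len(n, key) > have]
        if not candidates:
            return path
        # Many candidates may qualify; prefer the one nearest in the network.
        current = min(candidates, key=lambda n: latency(current, n))
        path.append(current)


nodes = ["1023", "2012", "2301", "2330", "2331"]
position = {"1023": 0, "2012": 5, "2301": 2, "2330": 9, "2331": 4}


def latency(a, b):
    """Hypothetical network distance between two nodes."""
    return abs(position[a] - position[b])


path = route(nodes, latency, "1023", "2331")
```

Here the first hop goes to 2301 rather than to the exact match 2331 because 2301 is nearer in the network while still making progress on the prefix, which is the freedom the last bullet refers to.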
  32. CAN
     • Based on a “d-dimensional Cartesian coordinate space on a d-torus”
     • Each node owns a distinct zone in the space
     • Each key hashes to a point in the space
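The two CAN rules above (keys hash to points, nodes own zones) can be sketched for d = 2. The details are assumptions for illustration: coordinates are taken from slices of a SHA-1 digest and scaled to the unit square, zones are axis-aligned half-open boxes, and the three-zone layout and the names `key_to_point` and `owner` are made up. Routing between zones is not modelled.

```python
import hashlib


def key_to_point(key: bytes, d: int = 2):
    """Hash a key to a point in [0, 1)^d (d <= 5 with a 20-byte SHA-1 digest)."""
    digest = hashlib.sha1(key).digest()
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2 ** 32
        for i in range(d)
    )


def owner(zones, point):
    """Return the node whose zone contains the point."""
    for name, (low, high) in zones.items():
        if all(lo <= x < hi for lo, x, hi in zip(low, point, high)):
            return name
    raise ValueError("zones do not cover the point")


# Three nodes tiling the unit square: A owns the left half,
# B and C split the right half.
zones = {
    "A": ((0.0, 0.0), (0.5, 1.0)),
    "B": ((0.5, 0.0), (1.0, 0.5)),
    "C": ((0.5, 0.5), (1.0, 1.0)),
}

point = key_to_point(b"song.mp3")
node = owner(zones, point)   # the node that would store this key
```

Because the zones tile the space without overlap, every key maps to exactly one owning node.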
  33. CAN Routing and Node Arrival
  34. P2P Review
     • Two key functions of P2P systems
       - Sharing content
       - Finding content
     • Sharing content
       - Direct transfer between peers
         - All systems do this
       - Structured vs. unstructured placement of data
       - Automatic replication of data
     • Finding content
       - Centralized (Napster)
       - Decentralized (Gnutella)
       - Probabilistic guarantees (DHTs)
  35. Conclusions
     • P2P connects devices at the edge of the Internet
     • Popular in “industry”
       - Napster, Kazaa, etc. allow users to share data
       - Legal issues still to be resolved
     • Exciting research in academia
       - DHTs (Chord, Pastry, etc.)
       - Improve properties/performance of overlays
     • Applications other than file sharing are being developed