SWORD2 and Bittorrent
Presentation giving a summary of what we got up to at the MRD hack day 2012. http://devcsi.ukoln.ac.uk/past-events/mrd-hack-days/


Usage Rights

© All Rights Reserved

SWORD2 and Bittorrent Presentation Transcript

  • 1. SWORD2 & BITTORRENT: A Network Admin’s Worst Nightmare. Tim Brody, Damian Steer, Sander van der Waal, Steve Welburn
  • 2. WHAT IS SWORD2? SWORD2 is a protocol for depositing stuff and its metadata with a repository. It's implemented as a profile of the Atom Publishing Protocol, which is roughly: (1) Client GETs the service document from the server. (2) Client POSTs the stuff for deposit, plus metadata, to a URL mentioned in the service document. (3) Server responds with "created this at URL".
  • 3. After that, the client can GET the URL, edit the content at the URL with PUT, and DELETE the URL. Atom originated in blogs, and SWORD2 essentially just expands the metadata used.
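
As a rough illustration of the first step of the flow above, the sketch below pulls deposit URLs out of a service document. The namespaces are the standard AtomPub and Atom ones; the document content, the repository URL and the `collection_urls` helper are made up for illustration and are not taken from the talk.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for AtomPub service documents and Atom elements.
NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}

# Hypothetical service document, as a repository might return it for the GET.
SERVICE_DOC = """<?xml version="1.0"?>
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Example repository</atom:title>
    <collection href="http://repo.example.org/sword2/collection/data">
      <atom:title>Research data</atom:title>
    </collection>
  </workspace>
</service>
"""


def collection_urls(service_xml: str) -> dict:
    """Map each collection title to the URL that deposits are POSTed to."""
    root = ET.fromstring(service_xml)
    found = {}
    for coll in root.findall("app:workspace/app:collection", NS):
        title = coll.findtext("atom:title", default="", namespaces=NS)
        found[title] = coll.get("href")
    return found


print(collection_urls(SERVICE_DOC))
```

The URL found this way is the one the client POSTs its deposit to; the later GET, PUT and DELETE calls go to the URL the server returns for the created item.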
  • 4. THE PROBLEM WITH BIG DEPOSITS: Big deposits take ages to transfer, which makes them susceptible to interruptions due to error, or simply boredom (oops, I closed my laptop...). In itself that ought to be recoverable, since HTTP supports partial transfers using range headers. However, if you look at steps 2 and 3 above you may see a problem: the client POSTs the stuff for deposit and metadata, and only then does the server respond with "created this at URL". Until the upload has finished, there is no URL to resume it against.
  • 5. THE IDEA: Send a reference to the content via SWORD, rather than the content itself. We could then use any number of schemes, such as FTP, rsync or HTTP. (HTTP will work fine this way around, because the content has an identity and a download could be resumed.) Aside: it's also interesting that a repository could choose not to download at all, for example where the data is already stored in a national subject repository.
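
To make the "could be resumed" point concrete, here is a minimal sketch of how a repository might ask for only the missing part of a referenced file over HTTP. The helper name `resume_request`, the URL and the `.part` file are assumptions for illustration; nothing is sent over the network.

```python
import os
import tempfile
import urllib.request


def resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a GET that asks only for the bytes not yet saved locally."""
    have = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    req = urllib.request.Request(url, method="GET")
    if have:
        # "bytes=N-" means: from offset N to the end of the resource.
        req.add_header("Range", f"bytes={have}-")
    return req


# Demo: pretend 10 bytes of the file arrived before the transfer was cut off.
with tempfile.TemporaryDirectory() as tmp:
    part = os.path.join(tmp, "dataset.zip.part")
    with open(part, "wb") as fh:
        fh.write(b"x" * 10)
    demo = resume_request("http://data.example.org/dataset.zip", part)

print(demo.get_header("Range"))  # bytes=10-
```

A server that honours the header replies with 206 Partial Content; one that ignores it sends the whole file again with 200, so a real client would need to check the status before appending.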
  • 6. OR BITTORRENT: Unlike rsync, FTP or HTTP, there are many server implementations, with nice GUIs, for a variety of platforms and in a number of languages. (The "server" and "client" labels aren't especially helpful with Bittorrent.) It handles partial downloading with ease. No packaging is required: moving directories is as easy as moving individual files.
  • 7. WHAT DO YOU NEED? A Bittorrent client at the depositor's end: this is where the files start. A Bittorrent client at the repository end: this is where the files will appear. A Bittorrent tracker.
  • 8. WHAT WE NOW KNOW: Bittorrent is a peer-to-peer network. The clients are peers; it just happens that some have all the data (seeders), and some are seeking data (leechers). Data is identified (very roughly) using a hash of the content. Clients need to find each other, and to do this they use servers called trackers, the URLs of which are included in torrent files. Trackers are pretty simple: you can contact them to say "I am interested in X", and find other clients interested in X.
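
The "hash of the content" and the "I am interested in X" request can be sketched as follows. In the standard Bittorrent protocol the identifier is the SHA-1 of the bencoded `info` dictionary from the torrent file, and it is sent to the tracker as the `info_hash` query parameter. The tracker URL, peer ID and file details below are invented for illustration.

```python
import hashlib
from urllib.parse import urlencode


def bencode(value) -> bytes:
    """Encode ints, strings/bytes, lists and dicts in Bittorrent's bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must be written in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> bytes:
    """The 20-byte identifier for a torrent: SHA-1 of its bencoded info dict."""
    return hashlib.sha1(bencode(info)).digest()


def announce_url(tracker: str, info: dict, peer_id: bytes, port: int, left: int) -> str:
    """Build the GET URL a peer uses to tell a tracker what it is interested in."""
    query = urlencode({
        "info_hash": info_hash(info),
        "peer_id": peer_id,
        "port": port,
        "uploaded": 0,
        "downloaded": 0,
        "left": left,  # bytes still needed; 0 for a seeder
    })
    return f"{tracker}?{query}"


# Hypothetical single-file torrent small enough to fit in one piece.
data = b"hello world"
INFO = {
    "name": "hello.txt",
    "length": len(data),
    "piece length": 16384,
    "pieces": hashlib.sha1(data).digest(),
}

url = announce_url("http://repo.example.org/announce", INFO,
                   peer_id=b"-XX0001-123456789012", port=6881, left=len(data))
print(bencode({"b": 1, "a": "x"}))  # b'd1:a1:x1:bi1ee'
print(url)
```

Because the hash covers the file name and piece size as well as the piece hashes, two torrents of the same bytes can still have different identifiers, which is one reading of "very roughly".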
  • 9. USING SWORD2 AND BITTORRENT: The uploader opens a Bittorrent client and creates a torrent file for a file or directory. The tracker used may be the repository itself. The SWORD deposit is made as usual, but the content is a torrent file. The repository then fetches the actual content over Bittorrent, and the content is deposited.
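
A deposit whose content is a torrent file might look like the request below. This is a sketch under assumptions: the headers follow the SWORD2 binary deposit conventions as I understand them, `application/x-bittorrent` is the usual MIME type for torrent files, and the collection URL, file name and helper name are hypothetical. How the repository actually recognises a torrent is not stated in the slides. The request is only built, not sent.

```python
import urllib.request


def torrent_deposit_request(collection_url: str, torrent_bytes: bytes,
                            filename: str) -> urllib.request.Request:
    """Build a SWORD2-style POST whose body is a .torrent file."""
    headers = {
        "Content-Type": "application/x-bittorrent",
        "Content-Disposition": f"attachment; filename={filename}",
        # Assumed packaging identifier for an opaque binary deposit.
        "Packaging": "http://purl.org/net/sword/package/Binary",
        "In-Progress": "false",
    }
    return urllib.request.Request(collection_url, data=torrent_bytes,
                                  headers=headers, method="POST")


# Placeholder bytes standing in for a torrent file made by the uploader's client.
torrent = b"d8:announce32:http://repo.example.org/announcee"
req = torrent_deposit_request(
    "http://repo.example.org/sword2/collection/data", torrent, "dataset.torrent")
print(req.get_method(), req.full_url)
```

The body is tiny whatever the size of the dataset, so the deposit itself is unlikely to be interrupted; the bulk transfer happens afterwards between the two Bittorrent clients.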
  • 10. IMPLEMENTATION: Tim has been making EPrints into a Bittorrent tracker. It will spot torrents uploaded via SWORD, and uses transmission-cli to download them. Steve is making a deposit client, which makes a torrent file, opens it in a torrent client, and uploads it via SWORD.
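
The slides do not say how EPrints invokes transmission-cli, so the following is only a guess at the shape of the call. It relies on the `-w` option, which sets the download directory; the paths and the helper name are hypothetical, and the command is printed rather than run.

```python
import shlex


def transmission_command(torrent_path: str, download_dir: str) -> list:
    """Argument list for fetching a deposited torrent with transmission-cli."""
    # -w <dir>: directory where the downloaded files should be saved.
    return ["transmission-cli", "-w", download_dir, torrent_path]


cmd = transmission_command("/var/spool/deposits/dataset.torrent",
                           "/var/spool/deposits/incoming")
print(shlex.join(cmd))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems with file names chosen by depositors.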
  • 11. INTERESTING STUFF: This really helps with the other issue of large datasets: downloading. I hope people will typically want individual files, but this would allow full downloads without killing the server.
  • 12. MORE INTERESTING STUFF: It's robust, and actually quite secure. You can't download without the torrent file. Torrents can be limited to a particular tracker. The tracker also provides basic usage information.
  • 13. THE DISTRIBUTED CONTENT STORE: The picture shifts from repositories holding data to data being spread across the network.