Building a scalable online backup system in Python
 


An overview of the design and architecture of the PutPlace online backup system.


Usage Rights

CC Attribution License


    Presentation Transcript

    • Building a Scalable Online Backup System in Python
      Joe Drumgoole
      http://twitter.com/jdrumgoole
    • Scaling
      You probably shouldn’t care
      Throughput vs response time
      Scaling is a fractal problem
      The database is what will get ya!
      Amazing what a well tuned DB will support
    • PutPlace Architecture
    • Online Backup
      Not really a Web 2.0 play
      More like client server
      Larger vision of PutPlace
      Map of your Digital World
    • Online Backup : Client
      Installation and support of Windows 20**
      Mac Support
      Open file/locked file handling
      Bandwidth throttling
      CPU Throttling
      Upload restarts
      Feedback
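One way the bandwidth throttling listed above could be done is a token bucket wrapped around the upload loop. This is a minimal sketch, not the PutPlace client's actual code: the `TokenBucket` class and its parameters are hypothetical, and the clock and sleep functions are injectable only so the limiter can be exercised without real delays.

```python
import time


class TokenBucket:
    """Token-bucket limiter: at most `rate` bytes per second on average.

    Hypothetical illustration; the slides do not say how throttling was done.
    """

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)          # bytes added to the bucket per second
        self.capacity = float(capacity)  # maximum burst size in bytes
        self.tokens = float(capacity)    # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last refill.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def consume(self, nbytes):
        """Block until `nbytes` tokens are available, then spend them."""
        if nbytes > self.capacity:
            raise ValueError("request larger than bucket capacity")
        self._refill()
        while self.tokens < nbytes:
            # Sleep just long enough for the missing tokens to accumulate.
            self.sleep((nbytes - self.tokens) / self.rate)
            self._refill()
        self.tokens -= nbytes
```

A client would call `bucket.consume(len(block))` before each network write, so the average send rate stays at or below `rate` bytes per second.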
    • Online Backup : Server
      Don’t lose any files
      De-duplication
      Thumbnail generation for images
      Flickr Backup
      Client Feedback
      Bulk download
      File relationships
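The slides do not describe how de-duplication was implemented. A common approach, shown here only as an assumption, is to key stored bytes by a content hash so that identical files from different users are stored once. `DedupStore` and its in-memory dictionaries are hypothetical stand-ins for a real database and blob store.

```python
import hashlib


class DedupStore:
    """In-memory stand-in for a content-addressed file store (hypothetical)."""

    def __init__(self):
        self.blobs = {}  # sha256 hex digest -> file bytes, stored once
        self.files = {}  # (user, path) -> sha256 hex digest

    def put(self, user, path, data):
        """Record a file; return True if its bytes were newly stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        self.files[(user, path)] = digest
        return is_new

    def get(self, user, path):
        """Return the bytes recorded for one user's file."""
        return self.blobs[self.files[(user, path)]]
```

With this layout a second upload of the same content only adds a small metadata entry, which is where the storage saving comes from.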
    • Online Backup - Secrets
      People don’t back up
      Compute dominates
      Restores represent 0.01% of bandwidth and load
      Writing web clones of Windows Explorer is hard
      The browser sucks as a client side app container (for now)
    • Scaling
      For online backup the challenge is to receive shed loads of data from lots of clients
      Clients upload in 1MB chunks
      Chunks must be stored, coalesced and pushed to stable backup (S3)
      Clients must get acknowledgement
      Web page must update
      Quota management
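The chunked upload flow above can be sketched roughly as follows. Only the 1 MB chunk size comes from the slide. The `UploadSession` class, the shape of the acknowledgement and the use of SHA-256 are assumptions made for illustration, and the push to S3 is left out.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MB, as stated on the slide


def split_chunks(data, chunk_size=CHUNK_SIZE):
    """Yield (index, chunk) pairs covering `data` in order."""
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        yield index, data[offset:offset + chunk_size]


class UploadSession:
    """Server-side view of one file upload (hypothetical)."""

    def __init__(self, total_chunks):
        self.total_chunks = total_chunks
        self.chunks = {}  # chunk index -> bytes

    def receive(self, index, chunk):
        """Store a chunk and return an acknowledgement for the client."""
        self.chunks[index] = chunk  # a re-sent chunk simply overwrites
        return {"ack": index, "sha256": hashlib.sha256(chunk).hexdigest()}

    def missing(self):
        """Indexes the client still has to send, e.g. after a restart."""
        return [i for i in range(self.total_chunks) if i not in self.chunks]

    def coalesce(self):
        """Join all chunks into the original file once none are missing."""
        if self.missing():
            raise ValueError("upload incomplete")
        return b"".join(self.chunks[i] for i in range(self.total_chunks))
```

Because every chunk is acknowledged individually, a client that restarts can ask which indexes are missing and resend only those.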
    • Load Balancer
      Load Balancer : Perlbal
      http://www.danga.com/perlbal/
      Can handle hundreds of millions of requests per day
      Event based
      (sshhh : Don’t tell anyone, but it’s Perl!)
      It does fall over occasionally
      Otherwise works perfectly
    • App Server
      Our app servers:
      Handle login
      Deliver web pages
      Handle uploads from clients
      Hand off heavy duty processing to task servers
      Thumbnail generation
      File coalescing
      Checksum generation
      Hand off is via a database queue
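Handing work off through a database queue can be as simple as inserting a row that a task server picks up later. The sketch below uses SQLite from the standard library as a stand-in for the Postgres database named in the talk. The `task_queue` table, its columns and the `enqueue` helper are hypothetical.

```python
import json
import sqlite3

# Hypothetical queue table; the real PutPlace schema is not shown in the slides.
SCHEMA = """
CREATE TABLE IF NOT EXISTS task_queue (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    kind    TEXT NOT NULL,                    -- e.g. 'thumbnail', 'coalesce'
    payload TEXT NOT NULL,                    -- JSON-encoded arguments
    status  TEXT NOT NULL DEFAULT 'pending'
)
"""


def enqueue(conn, kind, **payload):
    """Insert one pending task and return its id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO task_queue (kind, payload) VALUES (?, ?)",
            (kind, json.dumps(payload)),
        )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first = enqueue(conn, "coalesce", upload_id=42)
second = enqueue(conn, "thumbnail", file_id=7)
```

The app server returns to the client as soon as the row is committed, which keeps request handling short while the heavy work happens elsewhere.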
    • App Server
      Just Django Instances
      Templates deliver web pages
      Views handle chunks/login etc.
      Models update the database
      Task Servers do the heavy lifting
    • Task Server
      Run off a database queue (table)
      Four main task servers:
      Assemble completed file uploads
      Create thumbnails
      Remove deleted files
      Generate user statistics
      Servers are multi-threaded
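A multi-threaded task server running off a queue table might look like the sketch below. It again uses SQLite as a stand-in for Postgres, and the table, the status values and every function name are hypothetical. A process-local lock serialises access to the one shared connection. Several server processes sharing a real database would need row locking in the database itself instead.

```python
import sqlite3
import threading

# One shared in-memory database; check_same_thread=False lets worker
# threads use the connection, and db_lock serialises that use.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute(
    "CREATE TABLE task_queue ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " kind TEXT NOT NULL,"
    " status TEXT NOT NULL DEFAULT 'pending')"
)
db_lock = threading.Lock()


def claim_task(kind):
    """Mark one pending task of `kind` as running; return its id or None."""
    with db_lock, conn:
        row = conn.execute(
            "SELECT id FROM task_queue WHERE kind = ? AND status = 'pending'"
            " ORDER BY id LIMIT 1",
            (kind,),
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE task_queue SET status = 'running' WHERE id = ?", row)
        return row[0]


def finish_task(task_id):
    """Mark a claimed task as done."""
    with db_lock, conn:
        conn.execute("UPDATE task_queue SET status = 'done' WHERE id = ?", (task_id,))


def worker(kind, handler):
    """Process tasks of one kind until the queue is empty, then exit."""
    while True:
        task_id = claim_task(kind)
        if task_id is None:
            return
        handler(task_id)
        finish_task(task_id)


def run_workers(kind, handler, threads=4):
    """Start `threads` workers for one task kind and wait for them to finish."""
    pool = [threading.Thread(target=worker, args=(kind, handler)) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
```

A long-running server would poll or sleep when the queue is empty rather than exit. The workers here stop on an empty queue only to keep the example short.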
    • Refactoring
      Originally N Blacknight servers writing to NFS
      Then N Blacknight servers writing to S3
      Then N EC2 servers writing to S3
      Then N EC2 servers writing to MogileFS/S3
      Lots of uploading optimisations along the way
    • Results
      System has successfully uploaded over 100k files in a single day
      Regularly does 50k files a day
      Have about 2k registered users
      Continues to get registrations
      Runs in lights out mode (no daily/weekly/monthly housekeeping)
    • What worked
      Python proved extremely flexible
      Standard library saved us lots of work
      Django provided a lot of glue
      Easy to migrate from dedicated host on NFS to Cloud Hosting and S3 storage
      Nagios/Monitis monitoring
    • What Didn’t Work
      Would use MySQL rather than Postgres
      Easier to cluster, more knowledge available
      Native Windows Client
      Unnecessary, Python client was good enough
      Would use an off the shelf queueing system
      RabbitMQ, ActiveMQ, SQS
      Kludgey client side API
      Threading The Client
    • Tool Chain
      Wush.net : Subversion and Trac
      DynDNS: Dynamic DNS
      Python/Django: Dev Stack
      Postgres: Database
      Hudson : Build Server
      Perlbal: Load Balancing
      MogileFS : Distributed File System
      Memcached : Caching
      Nagios, Monitis: Monitoring
      Hamachi : VPN through Firewall
      Google Apps : Email, Calendar, Docs, Wiki
      AuthSMTP : Validated SMTP
      Zendesk: Support Desk
      Amazon : Storage, Compute, Bandwidth
      Paypal : Billing
    • Costs
      Capital Expenditure
      One server 5k euro
      One laptop per developer 2.5k (7 devs)
      One Linksys WIFI/Firewall (won at Raffle)
      Two 24 port switches 1.6k
      Total: ~24k
      Running Costs for Grid and Storage
      ~1800 euro a month (8 instances)
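The totals above can be checked with a few lines of arithmetic. The sketch assumes that the 1.6k figure is the price of both switches together, which is the reading that matches the stated total of about 24k, and that the raffle-won Linksys cost nothing.

```python
# Capital expenditure in thousands of euro, taken from the slide.
server = 5.0
laptops = 7 * 2.5  # one laptop per developer, 7 developers
switches = 1.6     # assumption: price of both switches together
capex = round(server + laptops + switches, 1)  # about 24.1, i.e. ~24k

# Running costs: ~1800 euro a month across 8 instances.
monthly_running = 1800
per_instance = monthly_running / 8  # euro per instance per month
```

On these figures each instance costs roughly 225 euro a month, and a year of running costs (about 21.6k euro) comes to slightly less than the one-off capital spend.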
    • If I Were Doing it Again
      Stick with native Python client
      Look at eventing à la Node.js for server
      Use MySQL
      Use Google App Engine as Front End/Load Balancer
      Use a commercial queueing package
    • Thanks
      Q&A
      http://twitter.com/jdrumgoole