Rubygems - behind the gems.

My talk about the current state of the rubygems infrastructure, its problems, and possible solutions.

The intention behind this talk is to make people care about a problem and join forces to fix it. It's not about blaming anyone who spends their time doing open source work!

Published in: Technology
  • Nice slide deck! We definitely need to figure this out. We've discussed using mirrorbrain among other things. Hop in #rubygems at some point so we can discuss this, or shoot me an email.

  • Rubygems - behind the gems.

    1. RUBYGEMS ...behind the gems. Roland Moriz ~ Moriz GmbH. Ruby User Group München, 26.10.2010. http://moriz.de/
    2. http://moriz.de/ Rubygems behind the gems. Hello blaaa bla Moriz GmbH bla bla Software Development Services bla bla bla bla Consulting bla blaaaa bla blaaaaaa bla Infrastructure Services bla bla Roland bla bla bla bla professional software development since 1999 bla bla Amazon Marketplace Deutschland bla bla bla Tiscali Games bla bla FIFA WM 2006 bla Yahoo.de bla bla bla bla two billion pageviews bla bla blala Allianz24.de/ Allsecur.de bla bla bla Ruby User Group München bla bla blabla http://moriz.de/ bla blaaaba http://rails.io bla http://boot.io blablabla recently hetzner-api gem bla bla bla and the slides will be available @ http://moriz.de/talks/ rubygems. ;-)
    3. RUBYGEMS MOVING PARTS: rubygems / cli; gemcutter; gem (code / library, app, data, meta)
    4. RUBYGEMS MOVING PARTS: rubygems / cli; gemcutter. $ gem; require "rubygems"; http://rubygems.org/ (and extensions to the rubygems client); distribution; creation, download, setup, usage; (index building, server)
    5. RUBYGEMS FACTS • used by nearly every Ruby project • the core of the Ruby ecosystem • part of the standard library (with MRI 1.9.x) • 17,000+ gem projects • 81,000+ gem files • 23 GB+
    6. RUBYGEMS FACTS started at RubyConf 2003 by: • Rich Kilmer • Chad Fowler • David Black • Paul Brannan • Jim Weirich > http://rubyforge.org/projects/rubygems/
    7. GEM FACTS • described by a .gemspec • gem build my.gemspec • easier ways: bundler, jeweler, newgem(?), ...
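To make the ".gemspec" bullet concrete, here is a minimal sketch of what such a file might contain. The gem name, version, author and dependency are placeholders chosen for illustration, not taken from the talk; only `Gem::Specification` and its accessors come from rubygems itself.

```ruby
require "rubygems"

# Hypothetical gem: every value below is a placeholder for illustration.
spec = Gem::Specification.new do |s|
  s.name    = "my"
  s.version = "0.1.0"
  s.summary = "Example gem used to illustrate a .gemspec"
  s.authors = ["Jane Doe"]
  s.files   = [] # a real gem would list its library files here
  s.add_runtime_dependency "rake", ">= 0" # placeholder dependency
end

# `gem build my.gemspec` would write a file named after the spec.
puts spec.full_name
puts spec.file_name
```

Saved as my.gemspec, a block like this is what `gem build my.gemspec` evaluates; the tools named on the slide mostly generate and maintain it for you.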
    8. GEM FACTS contents:
        $ tar xvf rails-3.0.1.gem
        x data.tar.gz
        x metadata.gz
    9. GEM FACTS metadata.gz > gzipped YAML; data.tar.gz > payload
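The layout from slides 8 and 9 can be reproduced in memory. The following is only a sketch: it builds a stand-in for a .gem file (a tar archive holding data.tar.gz and metadata.gz) and unpacks it again. The helper names `tar_bytes` and `tar_entries` and the sample contents are hypothetical, and the metadata here is a plain hash rather than the full serialized specification a real gem carries.

```ruby
require "rubygems/package"
require "stringio"
require "yaml"
require "zlib"

# Hypothetical helper: pack a { name => content } hash into tar bytes.
def tar_bytes(files)
  io = StringIO.new("".b)
  Gem::Package::TarWriter.new(io) do |tar|
    files.each do |name, content|
      tar.add_file_simple(name, 0o644, content.bytesize) { |f| f.write(content) }
    end
  end
  io.string
end

# Hypothetical helper: read tar bytes back into a { name => content } hash.
def tar_entries(bytes)
  entries = {}
  Gem::Package::TarReader.new(StringIO.new(bytes)) do |tar|
    tar.each { |entry| entries[entry.full_name] = entry.read }
  end
  entries
end

# Simplified stand-in metadata; a real gem stores its whole specification as YAML.
metadata = { "name" => "example", "version" => "1.0.0" }.to_yaml
payload  = Zlib.gzip(tar_bytes({ "lib/example.rb" => "puts 'hi'\n" }))
gem_file = tar_bytes({ "data.tar.gz" => payload, "metadata.gz" => Zlib.gzip(metadata) })

outer = tar_entries(gem_file)
meta  = YAML.load(Zlib.gunzip(outer["metadata.gz"]))
files = tar_entries(Zlib.gunzip(outer["data.tar.gz"]))

puts outer.keys.inspect
puts meta.inspect
puts files.keys.inspect
```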
    10. GEMCUTTER FACTS • started in April 2009 • is now rubygems.org (rubygems 1.3.6+) • replaced rubyforge • manages uploads & downloads • Rails app using PostgreSQL + Rack middleware with Sinatra • by Nick Quaranto (@qrush) of Thoughtbot > http://github.com/rubygems/gemcutter
    11. BIG PICTURE: UPLOAD RELEASE (gem release)
        $ gem release hetzner-api.gemspec
        Successfully built RubyGem
        Name: hetzner-api
        Version: 1.0.0
        File: hetzner-api-1.0.0.gem
        Pushing gem to RubyGems.org...
    12. BIG PICTURE: UPLOAD RELEASE cli > rubygems.org (gemcutter)
    13. BIG PICTURE: UPLOAD RELEASE cli > rubygems.org > AWS S3 (gem file)
    14. BIG PICTURE: UPLOAD RELEASE cli > rubygems.org > AWS S3; update specs > database > spec files to S3
    15. BIG PICTURE: UPLOAD RELEASE cli > rubygems.org > AWS S3; update specs; webhooks, rss, ... http://rubygems.org/pages/api_docs
    16. BIG PICTURE: DOWNLOAD cli, rubygems.org, AWS S3
    17. SPECS AKA „THE INDEX“
        $ sudo gem install rails -V
        GET http://gems.rubyforge.org/latest_specs.4.8.gz 302 Found
        GET http://production.s3.rubygems.org/latest_specs.4.8.gz 200 OK
        GET http://gems.rubyforge.org/quick/Marshal.4.8/rails-3.0.1.gemspec.rz 302 Found
        GET http://production.s3.rubygems.org/quick/Marshal.4.8/rails-3.0.1.gemspec.rz 200 OK
        GET http://gems.rubyforge.org/specs.4.8.gz 302 Found
        GET http://production.s3.rubygems.org/specs.4.8.gz 200 OK
        GET http://gems.rubyforge.org/quick/Marshal.4.8/activesupport-3.0.1.gemspec.rz 302 Found
        GET http://production.s3.rubygems.org/quick/Marshal.4.8/activesupport-3.0.1.gemspec.rz
        ...
    18. SPECS AKA „THE INDEX“ (request log as on slide 17) no specific version => latest
    19. SPECS AKA „THE INDEX“ (request log as on slide 17) the "4.8" in the file names is Gem.marshal_version => Marshal::MAJOR_VERSION.Marshal::MINOR_VERSION
    20. SPECS AKA „THE INDEX“
        irb(main):001:0> x = {}
        => {}
        irb(main):002:0> x['farbe'] = 'ananasblau'
        => "ananasblau"
        irb(main):003:0> Marshal.dump x
        => "\004\b{\006\"\nfarbe\"\017ananasblau"
        etc. > http://ruby-doc.org/core/classes/Marshal.html
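The link between the Marshal format version and the index file names can be checked with a short script. This is a sketch rather than anything the rubygems client runs in this form; note that a current Ruby also records string encodings in the dump, so the bytes will not match the irb output on the slide exactly.

```ruby
require "rubygems"

# The version suffix in names such as specs.4.8.gz is the Marshal format version.
format_version = "#{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}"
puts format_version
puts Gem.marshal_version

x = { "farbe" => "ananasblau" }
dumped = Marshal.dump(x)

# Every dump starts with two bytes: major and minor format version.
puts dumped.bytes.first(2).inspect

# Loading the dump gives back an equal hash.
puts Marshal.load(dumped) == x
```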
    21. SPECS AKA „THE INDEX“ (request log as on slide 17)
    22. SPECS AKA „THE INDEX“
        latest_specs: lists the latest release number of all gems (~150 KB / 570 KB)
        specs: list of all gem releases (380 KB / 2.2 MB)
        latest_specs = Marshal.load open 'latest_specs.4.8'; latest_specs.size => 17501
        specs = Marshal.load open 'specs.4.8'; specs.size => 83490
        (there's also a pre-release spec (remember "gem install rails --pre") and others: see rubygems source lib/rubygems/commands/generate_index_command.rb)
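As a rough illustration of how the two index files relate, the sketch below builds a tiny stand-in for specs.4.8.gz in memory and derives a latest_specs-style list from it. The four entries are sample data made up for the example; the real index holds one `[name, version, platform]` entry per release, and the real files are generated by the rubygems indexer rather than by code like this.

```ruby
require "rubygems"
require "zlib"

# Sample data standing in for specs.4.8: one [name, version, platform] entry per release.
all_specs = [
  ["rails",         Gem::Version.new("3.0.0"), "ruby"],
  ["rails",         Gem::Version.new("3.0.1"), "ruby"],
  ["activesupport", Gem::Version.new("3.0.1"), "ruby"],
  ["hetzner-api",   Gem::Version.new("1.0.0"), "ruby"],
]

# The published file is the marshaled array, gzipped.
index_gz = Zlib.gzip(Marshal.dump(all_specs))

# A client reverses both steps.
specs = Marshal.load(Zlib.gunzip(index_gz))

# latest_specs keeps only the highest version per gem and platform.
latest = specs.group_by { |name, _version, platform| [name, platform] }
              .map { |_key, releases| releases.max_by { |_name, version, _platform| version } }

puts specs.size
puts latest.map { |name, version, _platform| "#{name} #{version}" }
```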
    23. SPECS AKA „THE INDEX“ (request log as on slide 17) load + parse spec dependencies
    24. SPECS AKA „THE INDEX“
        Gem::Specification.new do |s|
          s.authors = ["David Heinemeier Hansson"]
          s.date = Time.utc(2010, 10, 14)
          s.dependencies = [
            Gem::Dependency.new("activesupport", Gem::Requirement.new(["= 3.0.1"]), :runtime),
            Gem::Dependency.new("actionpack", Gem::Requirement.new(["= 3.0.1"]), :runtime),
            Gem::Dependency.new("activerecord", Gem::Requirement.new(["= 3.0.1"]), :runtime),
            Gem::Dependency.new("activeresource", Gem::Requirement.new(["= 3.0.1"]), :runtime),
            Gem::Dependency.new("actionmailer", Gem::Requirement.new(["= 3.0.1"]), :runtime),
            Gem::Dependency.new("railties", Gem::Requirement.new(["= 3.0.1"]), :runtime),
            Gem::Dependency.new("bundler", Gem::Requirement.new(["~> 1.0.0"]), :runtime)]
          s.description = "Ruby on Rails is a full-stack web framework optimized
        Marshal.load Gem.inflate File.read 'rails-3.0.1.gemspec.rz'
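The .gemspec.rz round trip from the slide can be tried without touching the network. In this sketch the spec is a made-up placeholder, and Zlib is called directly on the assumption that the `Gem.inflate` helper shown on the slide is a thin wrapper around `Zlib::Inflate.inflate`.

```ruby
require "rubygems"
require "zlib"

# Placeholder spec; the name, version, and author are illustrative only.
spec = Gem::Specification.new do |s|
  s.name    = "example"
  s.version = "1.0.0"
  s.summary = "Placeholder spec for the round trip"
  s.authors = ["Jane Doe"]
  s.add_runtime_dependency "activesupport", "= 3.0.1"
end

# What an index builder would store as quick/Marshal.4.8/example-1.0.0.gemspec.rz:
rz = Zlib::Deflate.deflate(Marshal.dump(spec))

# What the client does after downloading it (assumed equivalent of Gem.inflate):
loaded = Marshal.load(Zlib::Inflate.inflate(rz))

loaded.dependencies.each do |dep|
  puts "#{dep.name} #{dep.requirement} (#{dep.type})"
end
```

The dependency list recovered this way is what tells the client which further specs and gems it has to fetch.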
    25. SPECS AKA „THE INDEX“ (request log as on slide 17) deps with explicit version requirement => require full spec list
    26. SPECS AKA „THE INDEX“ (request log as on slide 17) for each dependency; then download and install the .gem files
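The URL pattern visible in the request log can be written down as a small helper. `index_urls` is a hypothetical name, not a rubygems API, and the gems/ download path is an assumption, since the log on the slides only shows the index requests.

```ruby
# Hypothetical helper, not part of the rubygems client.
INDEX_FORMAT = "#{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}"

# Builds the URLs a client would request for one gem release on a given source.
def index_urls(source, name, version)
  base = source.chomp("/")
  {
    latest_specs: "#{base}/latest_specs.#{INDEX_FORMAT}.gz",
    specs:        "#{base}/specs.#{INDEX_FORMAT}.gz",
    gemspec:      "#{base}/quick/Marshal.#{INDEX_FORMAT}/#{name}-#{version}.gemspec.rz",
    gem:          "#{base}/gems/#{name}-#{version}.gem", # assumed path, not shown in the log
  }
end

urls = index_urls("http://production.s3.rubygems.org/", "rails", "3.0.1")
urls.each { |kind, url| puts "#{kind}: #{url}" }
```

Because every file lives at a predictable path under one base URL, a mirror only has to reproduce this directory layout to be usable as a gem source.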
    27. PROBLEMS: WHAT IF? cli, rubygems.org, AWS S3
    28. PROBLEMS: WHAT IF? cli, rubygems.org, AWS S3. Temporary Outage: no new gem releases, no gem downloads (index missing)! New app deployments? New server deployments?
    29. PROBLEMS: WHAT IF? cli, rubygems.org, AWS S3. Fatal Outage, reasons: • Hardware • Software (attack, fs corruption) • Amazon • account „deactivation“ • account deletion • S3 data loss • S3 bucket account theft/crack • Sunny day kills all the clouds. • Jeff Bezos' new bicy^Segw^Rocket.
    30. PROBLEMS: WHAT IF? cli, rubygems.org, AWS S3. Fatal Outage: ALL GEMS LOST
    31. PROBLEMS: WHAT IF? cli, rubygems.org, AWS S3. Fatal Outage: ALL GEMS LOST try again. ^_^
    32. PROBLEMS: MIRRORING Infrastructure independence to save your business from a rubygems disaster: > Start your own mirror. Fallback for a rubygems.org disaster? > Use a public mirror > Start your own mirror
    33. PROBLEMS: PUBLIC MIRRORS Comprehensive Perl Archive Network (2010-10-25): online since 1995-10-26, 7770 MB, 228 mirrors, 8463 authors, 18582 modules. 228 independent public and free mirrors!
    34. PROBLEMS: PUBLIC MIRRORS Debian mirror sites: 445 http://www.debian.org/mirror/list
    35. PROBLEMS: PUBLIC MIRRORS „The Python Package Index is a repository of software for the Python programming language. There are currently 11801 packages here“
    36. PROBLEMS: PUBLIC MIRRORS
    37. PROBLEMS: PUBLIC MIRRORS 0 active, public, free mirrors. Lost in migration (rubyforge > gemcutter)
    38. PROBLEMS: MIRRORING Mirroring in rubygems is currently broken: • „gem mirror“ misses some gems and downloads slowly (one gem at a time) • index building is broken (see #362) • reliability (#362, too) http://help.rubygems.org/discussions/problems/362-cant-mirror-rubygems-repo-incorrect-header-check Gemcutter already lost gems: http://help.rubygems.org/discussions/problems/212-some-gems-and-specs-missing-that-are-in-the-index
    39. PROBLEMS: MIRRORING There is also no easy way to mirror an S3 bucket: • no ftp • no rsync • no file list to use with e.g. wget = you cannot even run a reliable private mirror :-(
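One conceivable workaround for the missing file list is to derive it from the spec index. The sketch below is best-effort only: `gem_file_names` is a hypothetical helper, the entries are sample data, and since slide 38 reports that the index and the stored gems do not always agree, a list built this way may name files that are missing.

```ruby
require "rubygems"

# Hypothetical helper: turn index entries into the .gem file names a mirror would need.
def gem_file_names(specs)
  specs.map do |name, version, platform|
    # Pure-Ruby gems carry no platform suffix in their file name.
    suffix = platform.to_s == "ruby" ? "" : "-#{platform}"
    "#{name}-#{version}#{suffix}.gem"
  end
end

# Sample entries in the [name, version, platform] shape of the spec index.
sample = [
  ["rails",          Gem::Version.new("3.0.1"), "ruby"],
  ["hetzner-api",    Gem::Version.new("1.0.0"), "ruby"],
  ["example-native", Gem::Version.new("0.1.0"), "x86-mingw32"], # made-up native gem
]

puts gem_file_names(sample)
```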
    40. SOLUTION Provide rsync on master for sync-ability. On EC2, Rackspace, does not matter if it's fast... > NO custom mirroring software! > most FOSS mirror sites use rsync > use rsync, ask mirrors, problem solved. > AWS CloudFront is NOT a solution > not mirrorable, same vendor SPOFs.
    41. SOLUTION Provide rsync on master for sync-ability. On EC2, Rackspace, does not matter if it's fast... Provide a DNS-based distribution (GeoDNS) > a reliable base for (private) mirroring > speed & latency improvements > NO custom mirroring software needed! > saves money (AWS and Rackspace fees) > make use of the new mirrors!
    42. SOLUTION Why not? > no „instant deploy“ (real-time mirroring) > no download stats
    43. SOLUTION Why not? > no „instant deploy“ (real-time mirroring) > no download stats. Rubygems CLI could fall back to the rubygems.org master if a gem version is not on the mirror in use. It already does if you configure it. (current downside: it downloads the spec lists from master every time, looks fixable to me)
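The fallback idea from slide 43 amounts to trying sources in order. The following sketch shows only that logic; `fetch_with_fallback`, the mirror URL, and the in-memory `holdings` table are hypothetical stand-ins for real HTTP requests, not how the rubygems client is implemented.

```ruby
# Hypothetical sketch; `fetch` is any callable that returns the gem bytes or nil.
def fetch_with_fallback(file_name, sources, fetch)
  sources.each do |source|
    data = fetch.call(source, file_name)
    return [source, data] if data
  end
  nil # no source had the file
end

# Stand-in for real HTTP: what each source currently holds.
holdings = {
  "http://mirror.example.org" => { "rails-3.0.0.gem" => "old release" },
  "http://rubygems.org"       => { "rails-3.0.0.gem" => "old release",
                                   "rails-3.0.1.gem" => "new release" },
}
fetch = ->(source, file_name) { holdings.fetch(source, {})[file_name] }

# Mirror first, master last.
sources = ["http://mirror.example.org", "http://rubygems.org"]

hit = fetch_with_fallback("rails-3.0.1.gem", sources, fetch)
puts hit.first
```

With the mirror listed first, a release that has not been synced yet is still served from the master, which softens the "no instant deploy" objection.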
    44. THINGS WILL FAIL... just make sure you've a working plan B. AND: KISS & YAGNI. Keep it simple. Fewer moving parts > fewer things that will break. Don't over-engineer.
    45. HELP Open source projects need your support. Gemcutter/Rubygems, too. Go contribute if you care about your Ruby business. The Gemcutter source is really awesome, a good read for every developer.