Migrating legacy data
A way to migrate data into your Rails app
About me
Patrick Hüsler
Freelance developer @ http://www.huesler-informatik.ch
Apple enthusiast
Surfing and Kung Fu
Legacy...
Us...
Considerations
 RDBMS

 Keys

 Philosophy

 Structure

 Naming
Tasks
 Connect?

 Retrieve?

 Integrate?

 Map?

 Validate?
Connect
database adapter (hopefully)
add it to database.yml
set_table_name
set_primary_key
set_sequence_name
define associations
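A rough sketch of the database.yml step. Only the entry name "legacydb" comes from the slides (it matches the establish_connection("legacydb") calls); the adapter, host, database name and credentials are placeholders. The Ruby around it just parses the YAML to show the shape of the entry.

```ruby
require "yaml"

# Hypothetical database.yml content: only the entry name "legacydb" comes from
# the slides; adapter, host, database and credentials are placeholders.
DATABASE_YML = <<~YAML
  production:
    adapter: postgresql
    database: myapp_production

  legacydb:
    adapter: mysql
    host: legacy-db.example.com
    database: legacy_schema
    username: readonly_user
    password: secret
YAML

config = YAML.safe_load(DATABASE_YML)

# establish_connection("legacydb") looks up the entry with this name.
legacy = config.fetch("legacydb")
puts "legacy adapter: #{legacy["adapter"]}"
```

Keeping the legacy entry next to the regular environments means the legacy models need nothing more than the entry name.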
Retrieve
 Create corresponding AR
 models

 Tell AR to connect to the
 legacy database

 Use AR’s features to adjust
 things like table name,
 primary key etc

 Set up associations
class Legacy::User < ActiveRecord::Base

  establish_connection("legacydb")
  set_table_name :users

  belongs_to :login, :class_name => "Legacy::Login"
end
class Legacy::Base < ActiveRecord::Base
  self.abstract_class = true
  establish_connection("legacydb")
end

class Legacy::User < Legacy::Base
  belongs_to :login, :class_name => "Legacy::Login"
end
Integrate
 Put models in their own
 namespace to avoid collisions

 Put them in their own folder
class Legacy::Base < ActiveRecord::Base
  self.abstract_class = true
  establish_connection("legacydb")
end

class Legacy::User < Legacy::Base
  belongs_to :login, :class_name => "Legacy::Login"
end
Map
Let the models handle the mapping
class Legacy::Country < Legacy::Base

  has_many :regions, :class_name => "Legacy::Region"

  def code
    abbrev
  end
end
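The same idea in plain Ruby, as a sketch that runs without Rails. A Struct stands in for the ActiveRecord model. Only abbrev and code come from the slide; the other column names and the "Y"/"N" flag are made up for illustration.

```ruby
# Plain-Ruby stand-in for a legacy model such as Legacy::Country.
# Column names other than `abbrev` are hypothetical.
LegacyCountry = Struct.new(:abbrev, :country_nm, :active_flag) do
  # Expose the legacy column under the name the new schema uses.
  def code
    abbrev
  end

  # Clean up a padded legacy value while renaming it.
  def name
    country_nm.to_s.strip
  end

  # Translate a legacy "Y"/"N" flag into a boolean.
  def active
    active_flag == "Y"
  end
end

country = LegacyCountry.new("CH", " Switzerland ", "Y")
puts [country.code, country.name, country.active].inspect
```

Because the legacy object answers to the new attribute names, the code that copies data does not need to know about the old schema.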
Validate
 Just use Active Record

 Create finders and scopes
 to only migrate the data
 you want

 Use memoization to
 speed up retrieval (e.g.
 fetch only valid records)
class Legacy::User < Legacy::Base
  belongs_to :login, :class_name => "Legacy::Login"

  named_scope :active, :conditions => ["status = 1"]

  validates_uniqueness_of :email, :scope => :status
  validates_presence_of :first_name
  validates_associated :login
end
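A possible shape for the memoization bullet, sketched in plain Ruby so it runs without Rails. The valid finder is an assumption: the slides call Legacy::User.valid in the importer but never show its definition. The stand-in class, its sample data and the load counter are hypothetical.

```ruby
# Plain-Ruby stand-in for Legacy::User; `all` and `valid?` mimic the
# ActiveRecord methods of the same name. The `valid` finder is an assumption
# about what the importer's Legacy::User.valid call could look like.
class LegacyUser
  attr_reader :email, :first_name

  def initialize(email, first_name)
    @email = email
    @first_name = first_name
  end

  # Simplified validation: both fields must be present.
  def valid?
    !email.to_s.empty? && !first_name.to_s.empty?
  end

  class << self
    attr_reader :load_count

    # Pretend to hit the legacy database; counts how often it is called.
    def all
      @load_count = load_count.to_i + 1
      [new("a@example.com", "Anna"), new("", "Bob"), new("c@example.com", "")]
    end

    # Memoized finder: validate every record once, then reuse the result.
    def valid
      @valid ||= all.select(&:valid?)
    end
  end
end

2.times { LegacyUser.valid }
puts "valid records: #{LegacyUser.valid.size}, loads: #{LegacyUser.load_count}"
```

The memoized result lives for the whole process, which suits a one-off import run but would be stale in a long-running app.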
Migrate
 Use migrators/importers
# General Importer
# Code removed for the sake of this example
class Migration::Importer
  def initialize(_entity_to_migrate, _migrated_entity)
    @entity_to_migrate = _entity_to_migrate
    @migrated_entity = _migrated_entity
  end

  def migrate
    @migrated_entity.attributes.symbolize_keys.keys.each do |attr|
      if @entity_to_migrate.respond_to?(attr)
        @migrated_entity.send("#{attr}=", @entity_to_migrate.send(attr))
      end
    end
    @migrated_entity
  end
end
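To see what migrate does, here is a sketch with plain Ruby objects instead of ActiveRecord models. SourceCountry, TargetCountry and their attributes are hypothetical stand-ins; the loop follows the slide, with keys.map(&:to_sym) in place of ActiveSupport's symbolize_keys.

```ruby
# Stand-ins for the two sides of a migration (hypothetical names).
SourceCountry = Struct.new(:abbrev, :name, :internal_note) do
  # Mapping method, as on the "Map" slide.
  def code
    abbrev
  end
end

class TargetCountry
  attr_accessor :code, :name, :created_at

  # Mimics ActiveRecord's #attributes: a hash keyed by attribute name.
  def attributes
    { "code" => code, "name" => name, "created_at" => created_at }
  end
end

# Same idea as Migration::Importer#migrate, without ActiveSupport.
class Importer
  def initialize(entity_to_migrate, migrated_entity)
    @entity_to_migrate = entity_to_migrate
    @migrated_entity = migrated_entity
  end

  def migrate
    @migrated_entity.attributes.keys.map(&:to_sym).each do |attr|
      # Copy only attributes the legacy object knows how to answer.
      next unless @entity_to_migrate.respond_to?(attr)
      @migrated_entity.public_send("#{attr}=", @entity_to_migrate.public_send(attr))
    end
    @migrated_entity
  end
end

source = SourceCountry.new("CH", "Switzerland", "n/a")
result = Importer.new(source, TargetCountry.new).migrate
puts result.attributes.inspect
```

The target drives the loop: attributes the legacy object does not respond to (created_at here) are left alone, and legacy-only fields (internal_note) are never copied.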
# General Importer
# Lines removed for the sake of this example
class Migration::Importer
  def self.migrate_all(source_class, entity_name = nil)
    migrate_all_from_collection(source_class.send(:all), entity_name)
  end

  def self.migrate_all_from_collection(collection, entity_name = nil)
    entity_name ||= collection.first.class.to_s.pluralize
    collection.each do |entity|
      result = self.new(entity).migrate
      result.save!
    end
  end
end
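Since save! raises on the first invalid record, one bad row aborts the whole run. A variation, not from the talk, is to collect failures and report them at the end. The sketch below uses a hypothetical FakeRecord in place of a migrated ActiveRecord model.

```ruby
# Hypothetical stand-in for a migrated record whose save! can fail.
class FakeRecord
  class Invalid < StandardError; end

  attr_reader :name

  def initialize(name)
    @name = name
  end

  def save!
    raise Invalid, "name can't be blank" if name.to_s.empty?
    true
  end
end

# Variation on migrate_all_from_collection: keep going when a record fails
# and report the failures at the end instead of aborting on the first one.
# The block plays the role of `self.new(entity).migrate`.
def migrate_collection(collection)
  failures = []
  migrated = 0
  collection.each do |entity|
    begin
      yield(entity).save!
      migrated += 1
    rescue FakeRecord::Invalid => e
      failures << [entity, e.message]
    end
  end
  { migrated: migrated, failures: failures }
end

report = migrate_collection(["Anna", "", "Carl"]) { |name| FakeRecord.new(name) }
puts "migrated #{report[:migrated]}, failed #{report[:failures].size}"
```

Whether to abort or continue depends on the data: aborting is safer when later records depend on earlier ones.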
class Migration::UserImporter < Migration::Importer
  def initialize(user)
    super(user, User.new)
  end

  def self.migrate_all
    migrate_all_from_collection(Legacy::User.valid)
  end
end
Test
 Integrates seamlessly with
 your Rails test suite

 A separate folder helps to
 organize it
require File.dirname(__FILE__) + '/../../spec_helper'

describe Migration::UserImporter do

  before(:all) do
    # setup your test here
  end

  it "should map email address" do
    # == compares values; equal would test object identity
    @migrated_user.email.should == @user_to_migrate.email
  end

  it "should map login to old login" do
    @migrated_user.old_login.should == @user_to_migrate.login.login
  end

  it "should not have an empty salt" do
    @migrated_user.salt.should_not be_blank
  end

end
Automate
Rake to the rescue

Let it deal with
dependencies
namespace :legacy do
  namespace :data do
    desc "import data from legacy schema"
    task :import => :environment do
      Rake::Task['legacy:data:import:users'].invoke
    end
    namespace :import do
      task :fields_of_study do
        puts "Migrating Fields of study"
        Migration::FieldOfStudyImporter.migrate_all
        puts "Done migrating fields of study"
      end

      task :universities do
        puts "Migrating universities"
        Migration::UniversityImporter.migrate_all
        puts "Done migrating universities"
      end

      task :users => [:fields_of_study, :universities] do
        puts "Migrating users"
        Migration::UserImporter.migrate_all
        puts "Done migrating users"
      end
    end
  end
end
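How Rake resolves the dependencies can be tried outside a Rails app. This sketch assumes the rake gem is installed; it uses a made-up legacy_demo namespace and only records the order in which the task bodies run.

```ruby
require "rake"

# Records the order in which task bodies run.
RUN_ORDER = []

module LegacyImportTasks
  extend Rake::DSL

  # Hypothetical namespace, named differently from the real tasks on purpose.
  namespace :legacy_demo do
    task :fields_of_study do
      RUN_ORDER << :fields_of_study
    end

    task :universities do
      RUN_ORDER << :universities
    end

    # Prerequisites run first, in the order listed.
    task :users => [:fields_of_study, :universities] do
      RUN_ORDER << :users
    end
  end
end

Rake::Task["legacy_demo:users"].invoke
# A second invoke is a no-op: Rake runs each task at most once per process.
Rake::Task["legacy_demo:users"].invoke

puts RUN_ORDER.inspect
```

Declaring the order as prerequisites keeps each importer task runnable on its own while still guaranteeing that users are migrated last.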
Deploy
Just put it in deploy.rb and let
Capistrano handle the rest
namespace :deploy do
  desc "Migrate data"
  task :migrate_legacy_data, :roles => :app do
    # Actual one liner that would not fit
    # on a presentation screen
    command = "cd #{deploy_to}/current;"
    command << "RAILS_ENV='production' "
    command << "/usr/bin/rake legacy:data:import"
    run command
  end
end
Questions?
Contact info
 Available for hire
 http://www.huesler-informatik.ch
 patrick.huesler@gmail.com
 https://www.xing.com/profile/Patrick_Huesler
 http://twitter.com/phuesler
 http://github.com/phuesler
