Andrey Koleshko
Back-end developer @ Toptal
Github/twitter: @ka8725
Email: ka8725@gmail.com
Rails data migrations
● Code is never set in stone
● DB structure mutates
○ Columns/tables get renamed or dropped
○ Relationships change type (e.g. from “belongs to” to “has and belongs to many”, from “has many” to “has one”, etc.)
● Zero-downtime policy (production experience)
● Tons of data to migrate
● Public API exposed for other services
● NoSQL
The problem definition
● No production yet
● Production without zero-downtime policy
● Production with zero-downtime policy
Different situations
● Production with zero-downtime policy
Different situations: the hardest case
Schema migrations != Data migrations
Tell things apart
# Rails 5+ requires a version suffix, e.g. ActiveRecord::Migration[5.2]
class AddStatusToUser < ActiveRecord::Migration
  def up
    add_column :users, :status, :string
  end

  def down
    remove_column :users, :status
  end
end
Tell things apart: schema migrations
class AddStatusToUser < ActiveRecord::Migration
  def up
    add_column :users, :status, :string
    # Data migration: backfill the new column on every existing row
    User.find_each do |user|
      user.status = 'active'
      user.save!
    end
  end
  ...
Tell things apart: data migrations
● Write data migrations inside schema migrations (1)
● Write data migrations separately from schema migrations (2)
Different solutions
● Write any Rails code carelessly (a)
● Redefine models and use them in place (b)
● Call data migration code written elsewhere (seeds, services, etc.) (c)
● Raw SQL (d)
● Rake tasks (e)
Different solutions
|{1, 2} x {a, b, c, d, e}| = 10
Different solutions
● Do you need the migrations functioning forever?
● Is the developer environment more important than production?
Pick a solution based on balance
● Do you need the migrations functioning forever?
○ No, clean them up from time to time
○ Don’t run all migrations on a fresh setup
○ Local/staging environments load a dump together with the final schema
○ Obfuscate the dump if needed
● Is the developer environment more important than production?
○ Obviously not, see the points above
My choice
class AddStatusToUser < ActiveRecord::Migration
  def up
    add_column :users, :status, :string
    # Relies on the application's User model as it exists today
    User.find_each do |user|
      user.status = 'active'
      user.save!
    end
  end
  ...
Solution #1: Ruby code inside schema migration
● Error-prone - What if someone renames User model later?
● Not recommended
Solution #1: Ruby code inside schema migration
class AddStatusToUser < ActiveRecord::Migration
  # Local stand-in for the model, independent of app/models/user.rb
  class User < ActiveRecord::Base; end

  def up
    add_column :users, :status, :string
    User.find_each { |user| user.update!(status: 'active') }
  end
  ...
Solution #2: Redefine models inside migrations
class AddStatusToUser < ActiveRecord::Migration
  class User < ActiveRecord::Base
    belongs_to :role, polymorphic: true
  end

  class Role < ActiveRecord::Base
    has_many :users, as: :role
  end
  ----------------------------------------------------------
  role = Role.create!(name: 'admin')
  User.create!(nick: '@ka8725', role: role)
Solution #2: Redefine models inside migrations. Bug
> user = User.find_by(nick: '@ka8725')
> user.role # => nil
> user.role_type # => "AddStatusToUser::Role"
Expected:
> user.role_type == "Role" # => true
Solution #2: Redefine models inside migrations. Bug
● Much better than the previous one
● Error-prone - How to deal with tricky associations?
● Interesting bug with polymorphic associations
● Not recommended
Solution #2: Redefine models inside migrations
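The polymorphic bug comes down to how Ruby names nested classes: a model redefined inside a migration carries the migration's namespace in its name, and that name is what ends up in `role_type`. A minimal plain-Ruby sketch, using empty stand-in classes instead of real ActiveRecord models (the stand-ins are an assumption so that the snippet runs without Rails):

```ruby
# Top-level stand-in for the application's model (app/models/role.rb).
class Role; end

# Stand-in for the migration, with the model redefined inside it.
class AddStatusToUser
  class Role; end
end

app_name = Role.name
migration_name = AddStatusToUser::Role.name

puts app_name        # the name the application expects in role_type
puts migration_name  # the name the redefined model reports instead
```

Any record written through the redefined model therefore points at a class that the application itself never defines.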
Common approach in Rails community?
● Has all previous problems
● Not a better choice
● Not recommended
Solution #3: Call data migration code written elsewhere from schema migrations
● Fast execution
● No previous problems
Solution #4: Raw SQL
● Requires SQL knowledge
● Takes more time to code
Solution #4: Raw SQL
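As a rough illustration of the raw SQL approach, the backfill from the `AddStatusToUser` example could be a single `UPDATE` passed to `execute` inside a migration. The migration name `BackfillUserStatus`, the `[6.0]` version tag, and the `WHERE status IS NULL` guard are assumptions made for this sketch, not part of the talk. The class is only defined when ActiveRecord is loaded, so the snippet also runs as plain Ruby.

```ruby
# One set-based statement: no models are loaded and no rows are instantiated.
BACKFILL_STATUS_SQL = <<~SQL
  UPDATE users
  SET status = 'active'
  WHERE status IS NULL
SQL

if defined?(ActiveRecord::Migration)
  # Hypothetical migration; adjust the version tag to the app's Rails version.
  class BackfillUserStatus < ActiveRecord::Migration[6.0]
    def up
      execute(BACKFILL_STATUS_SQL)
    end

    def down
      # Nothing to restore: this migration only fills in missing values.
    end
  end
end

puts BACKFILL_STATUS_SQL
```

Because the statement never touches the `User` class, renaming or reshaping the model later does not break it.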
Solution #5: Rake tasks
● Define custom Rake tasks
● Run when needed
rake db_migration:fix_data
Solution #5: Rake tasks
● Not a bad choice
● Requires some manual work
● Can be automated
● Can be developed into a solution similar to schema migrations in Rails
Solution #5: Rake tasks
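A custom task such as `db_migration:fix_data` might be defined along these lines. This is only a sketch: the `DataFixes` module and the in-memory `USERS` rows are hypothetical stand-ins so that the file runs without a database, and the task body would normally go through a model or raw SQL instead.

```ruby
# Rake loads itself when started via the `rake` command; the require is only
# needed to run this file directly with `ruby`.
require "rake"

# Hypothetical stand-in for the real data fix.
module DataFixes
  def self.fill_missing_statuses(users)
    users.each { |user| user[:status] ||= "active" }
  end
end

# In-memory sample rows so the sketch runs without a database.
USERS = [
  { nick: "@ka8725", status: nil },
  { nick: "@someone", status: "blocked" }
]

namespace :db_migration do
  desc "Fill in missing user statuses"
  task :fix_data do
    DataFixes.fill_missing_statuses(USERS)
  end
end
```

In a Rails application the definition would typically live under `lib/tasks/` in a `.rake` file and depend on the `:environment` task so that models are available.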
Not a bad solution for a start
● Define data migrations inside schema migrations
● But write tests for data migrations
● https://railsguides.net/change-data-in-migrations-like-a-boss/
● https://github.com/ka8725/migration_data
● Similar solution for schema migrations with versioning
○ https://github.com/ilyakatz/data-migrate
● Write SQL
● Schema migrations are made in several steps
○ https://blog.codeship.com/rails-migrations-zero-downtime/
● Heavy migrations (lasting for hours) are split into several background jobs scheduled at intervals
The best choice that suits production with zero downtime
Sort and run combined:
for local env only!
● Schema migrations should be fast (<1s)
● Avoid data migrations inside schema migrations
● Data migrations run after deployment
● Follow-up actions are done in subsequent deploys, once the data migration has run successfully
Production zero-downtime: deployment caveats
Zero downtime
[Diagram: deploy timeline for production code and DB, with schema migrations and symlink marked]
Data migrations
Split into smaller jobs
Process(1-1000) Process(1001-2000) Process(2001-3000)
[Diagram: jobs j#1 to j#8]
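The split into smaller jobs can be sketched in plain Ruby: cut the id range into fixed-size batches and give each batch a growing delay. The helper names, the batch size of 1000, and the 60-second interval are illustrative assumptions. The actual enqueueing depends on the background job library in use, so it is left out here.

```ruby
# Cut first_id..last_id into consecutive ranges of at most batch_size ids.
def batch_ranges(first_id, last_id, batch_size)
  (first_id..last_id).step(batch_size).map do |start|
    start..[start + batch_size - 1, last_id].min
  end
end

# Pair every batch with a delay so the jobs do not all hit the DB at once.
def schedule(ranges, interval_seconds)
  ranges.each_with_index.map do |range, index|
    { ids: range, run_in: index * interval_seconds }
  end
end

plan = schedule(batch_ranges(1, 3000, 1000), 60)
plan.each do |job|
  puts "Process(#{job[:ids].first}-#{job[:ids].last}) in #{job[:run_in]}s"
end
```

Each entry in `plan` would then be handed to a job that processes only its own id range, which keeps every unit of work short and easy to retry.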
@ka8725
Andrey Koleshko
Remotely working veteran
Questions?

Rails data migrations

  • 1. Andrey Koleshko Back-end developer @ Toptal Github/twitter: @ka8725 Email: ka8725@gmail.com Rails data migrations
  • 2. ● Code is never set in stone ● DB structure mutates ○ Columns/tables rename/drop ○ Move one type of relationship to other (e.g. from “belongs to” to “has and belongs to many”, from “has many” to “has one”, etc.) ● Zero-downtime policy (production experience) ● Ton of data to migrate ● Public API exposed for other services ● NoSQL The problem definition
  • 3. ● Code is never set in stone ● DB structure mutates ○ Columns/tables rename/drop ○ Move one type of relationship to other (e.g. from “belongs to” to “has and belongs to many”, from “has many” to “has one”, etc.) ● Zero-downtime policy (production experience) ● Ton of data to migrate ● Public API exposed for other services ● NoSQL The problem definition
  • 4. ● No production yet ● Production without zero-downtime policy ● Production with zero-downtime policy Different situations
  • 5. ● No production yet ● Production without zero-downtime policy ● Production with zero-downtime policy Different situations: the hardest case
  • 6. Schema migrations != Data migrations Tell things apart
  • 7. Tell things apart: schema migrations
        class AddStatusToUser < AR::Migration
          def up
            add_column :users, :status, :string
          end
          def down
            remove_column :users, :status
          end
        end
  • 8. Tell things apart: data migrations
        class AddStatusToUser < AR::Migration
          def up
            add_column :users, :status, :string
            User.find_each do |user|
              user.status = 'active'
              user.save!
            end
          end
          ...
  • 9. ● Write data migrations inside schema migrations (1) ● Write data migrations separately from schema migrations (2) Different solutions
  • 10. ● Write any Rails code carelessly (a) ● Redefine models and use them in place (b) ● Call migration data code written outside (seeds, services, etc.) (c) ● Raw SQL (d) ● Rake tasks (e) Different solutions
  • 11. |{1, 2} x {a, b, c, d, e}| = 10 Different solutions
  • 12. ● Do you need the migrations functioning forever? ● Is a developer environment more important than production? Pick a solution based on balance
  • 13. ● Do you need the migrations functioning forever? ○ No, clean them up from time to time ○ Don’t run all migrations on a fresh start ○ Local/staging loads a dump and the final schema at once ○ Obfuscate the dump if needed ● Is a developer environment more important than production? ○ Obviously not, see the points above My choice
  • 14. Solution #1: Ruby code inside schema migration
        class AddStatusToUser < AR::Migration
          def up
            add_column :users, :status, :string
            User.find_each do |user|
              user.status = 'active'
              user.save!
            end
          end
          ...
  • 15. ● Error-prone: what if someone renames the User model later? ● Not recommended Solution #1: Ruby code inside schema migration
  • 16. Solution #2: Redefine models inside migrations
        class AddStatusToUser < AR::Migration
          class User < ActiveRecord::Base; end
          def up
            add_column :users, :status, :string
            User.find_each { |user| user.update!(status: 'active') }
          end
          ...
  • 17. Solution #2: Redefine models inside migrations. Bug
        class AddStatusToUser < AR::Migration
          class User < AR::Base; belongs_to :role, polymorphic: true; end
          class Role < AR::Base; has_many :users, as: :role; end
        ----------------------------------------------------------
        role = Role.create!(name: 'admin')
        User.create!(nick: '@ka8725', role: role)
  • 18. Solution #2: Redefine models inside migrations. Bug
        > user = User.find_by(nick: '@ka8725')
        > user.role # => nil
  • 19. Solution #2: Redefine models inside migrations. Bug
        > user = User.find_by(nick: '@ka8725')
        > user.role # => nil
        > user.role_type # => AddStatusToUser::Role
        Expected:
        > user.role_type == Role # => true
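One plausible reading of the bug on slides 17 to 19: a class defined inside the migration has a namespaced name, and a polymorphic association stores a class name as a string in the `role_type` column. The plain-Ruby sketch below shows only the naming part, with no Rails involved. The `stored_type` helper is hypothetical and merely mimics writing the class name into the type column.

```ruby
# Top-level application model (a stand-in; no ActiveRecord here).
class Role; end

# Stand-in for the migration class that redefines the model inside itself.
class AddStatusToUser
  class Role; end
end

# Hypothetical helper: mimics a polymorphic association writing the
# associated record's class name into the role_type column.
def stored_type(record)
  record.class.name
end

app_type       = stored_type(Role.new)
migration_type = stored_type(AddStatusToUser::Role.new)

puts app_type        # Role
puts migration_type  # AddStatusToUser::Role
```

Rows written with the namespaced string would not match the application's own `Role` model, which is consistent with the `nil` shown on the slide.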
  • 20. ● Much better than the previous one ● Error-prone - How to deal with tricky associations? ● Interesting bug with polymorphic associations ● Not recommended Solution #2: Redefine models inside migrations
  • 21. Common approach in Rails community?
  • 22. ● Has all previous problems ● Not a better choice ● Not recommended Solution #3: Call migration data code written outside from schema migrations
  • 23. Solution #4: Raw SQL. Pros: ● Fast execution ● None of the previous problems. Cons: ● Requires SQL knowledge ● More time to code
  • 25. Solution #5: Rake tasks ● Define custom Rake tasks ● Run when needed: rake db_migration:fix_data
  • 26. Solution #5: Rake tasks ● Not a bad choice ● Requires some manual work ● Can be automated ● Can be developed into a solution similar to schema migrations in Rails
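A minimal sketch of what a `db_migration:fix_data` task could look like. The task name comes from slide 25. Everything else is an assumption for illustration: the `users` array is a hypothetical in-memory stand-in for the users table, so the example runs without a database or a Rails app.

```ruby
require 'rake'

# Hypothetical in-memory stand-in for the users table.
users = [
  { nick: '@ka8725',  status: nil },
  { nick: '@someone', status: 'blocked' }
]

namespace :db_migration do
  desc 'Backfill a default status for users that have none'
  # In a Rails app the task would usually depend on :environment
  # so that models and the DB connection are loaded first.
  task :fix_data do
    users.each do |user|
      # Idempotent: rows that already have a status are left alone,
      # so running the task twice does no harm.
      user[:status] ||= 'active'
    end
  end
end

# Equivalent to running `rake db_migration:fix_data` from the shell.
Rake::Task['db_migration:fix_data'].invoke
```

Keeping the task idempotent makes the "requires some manual work" point less risky, since a repeated run leaves already-fixed rows untouched.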
  • 28. Not a bad solution for a start ● Define data migrations inside schema migrations ● But write tests for data migrations ● https://railsguides.net/change-data-in-migrations-like-a-boss/ ● https://github.com/ka8725/migration_data
  • 29. ● Similar solution for schema migrations with versioning ○ https://github.com/ilyakatz/data-migrate ● Write SQL ● Schema migrations are made in several steps ○ https://blog.codeship.com/rails-migrations-zero-downtime/ ● Heavy migrations (lasting for hours) are split into several background jobs scheduled with some interval The best choice suits production zero-downtime
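The versioning idea in the first bullet of slide 29 can be sketched in plain Ruby. This is not how the data-migrate gem is implemented; `DataMigration` and `DataMigrationRunner` are hypothetical names, and a `Set` stands in for the table that would record applied versions.

```ruby
require 'set'

# Hypothetical data-migration record: a timestamp-like version,
# a name, and the code to run.
DataMigration = Struct.new(:version, :name, :action)

# Hypothetical runner. In a real app the applied versions would live
# in a DB table (similar to schema_migrations); a Set stands in here.
class DataMigrationRunner
  attr_reader :applied

  def initialize(applied = Set.new)
    @applied = applied
  end

  # Runs every migration not yet recorded, in ascending version order,
  # and records each version only after its action has finished.
  # Returns the names of the migrations that ran.
  def run(migrations)
    pending = migrations.reject { |m| @applied.include?(m.version) }
    pending.sort_by(&:version).map do |m|
      m.action.call
      @applied << m.version
      m.name
    end
  end
end

log = []
migrations = [
  DataMigration.new(20200102, 'backfill_roles',  -> { log << :roles }),
  DataMigration.new(20200101, 'backfill_status', -> { log << :status })
]

runner = DataMigrationRunner.new
first_run  = runner.run(migrations)  # runs both, oldest version first
second_run = runner.run(migrations)  # nothing is pending any more
```

Recording the version only after the action succeeds means a failed data migration stays pending and is picked up again on the next run.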
  • 30. The best choice suits production zero-downtime
  • 31. The best choice suits production zero-downtime Sort and run combined: for local env only!
  • 32. ● Schema migrations should be fast (<1s) ● Avoid data migrations inside schema migrations ● Data migrations run after deployment ● Follow-up actions are made in subsequent deploys once the data migration has run successfully Production zero-downtime: deployment caveats
  • 36. Zero downtime [Diagram: deploy timeline for production code and DB, with three stages: schema migrations, symlink, data migrations]
  • 37. Split into smaller jobs [Diagram: Process(1-1000), Process(1001-2000), Process(2001-3000); jobs j#1 to j#8]
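The splitting on slide 37 can be sketched as follows. Both helpers are hypothetical: `id_batches` cuts an ID range into consecutive batches, and `schedule` only builds job descriptions, where a real app would enqueue them with whatever background job framework it uses.

```ruby
# Splits an inclusive ID range into consecutive, non-overlapping batches.
def id_batches(first_id, last_id, batch_size)
  (first_id..last_id).step(batch_size).map do |start|
    [start, [start + batch_size - 1, last_id].min]
  end
end

# Hypothetical stand-in for enqueueing background jobs: each batch gets
# a job label and a delay, so jobs are spread out by a fixed interval.
def schedule(batches, interval_seconds)
  batches.each_with_index.map do |(from, to), index|
    {
      job: format('j#%d', index + 1),
      from: from,
      to: to,
      run_in: index * interval_seconds
    }
  end
end

batches = id_batches(1, 3000, 1000)
jobs = schedule(batches, 60)

jobs.each do |j|
  puts "#{j[:job]}: Process(#{j[:from]}-#{j[:to]}) in #{j[:run_in]}s"
end
```

Spacing the jobs by an interval is one way to keep a migration that lasts for hours from loading the database all at once.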