
The 30-Month Migration



This talk from RailsConf 2019 describes how our team made deep changes to the data model of our production system over a period of 2.5 years.

Changing your data model is hard. Taking care of existing data requires caution. Exploring and testing possible solutions can be slow. Your new data model may require data completeness or correctness that hasn't been enforced for the existing data.

To manage the risk and minimize disruption to the product roadmap, we broke the effort into four stages, each with its own distinct challenges. I'll describe our rationale, process ... and the lessons we learned along the way.

Published in: Software

The 30-Month Migration

  1. 1. The 30-Month Migration Glenn Vanderburg VP of Engineering, First.io @glv
  2. 2. Changing Your Data Model
 Is Hard!
  3. 3. Living With a Poor Data Model
 Is Also Hard!
  4. 4. Four Stages, 2½ Years [Timeline figure, 2016 to 2019: Stage 1, Stage 2, Stage 3, Stage 4, with date markers 15 Nov, 9 Feb, 20 Jun, 18 Nov, 15 Jan, 26 Jan, 20 Mar, 7 Dec, and Today]
  5. 5. 30 months! [Same timeline figure, 2016 to 2019]
  7. 7. A Technical Talk,
 with Mostly Non-Technical Lessons
  8. 8. Three Principles •Validation •Reversibility •Transparency
  9. 9. Introduction:
 Our Big Mistakes* * So far.
  10. 10. Realtor Abby Isaac Jane Kathy Lee Mike Nancy Realtor Bill Oscar Pat Quentin Robert Sally Tina
  11. 11. Realtor Abby Isaac Jane Kathy Lee Mike Nancy Realtor Bill Oscar Pat Quentin Robert Sally Tina
  12. 12. Realtor Abby Isaac Jane Kathy Lee Mike Nancy Realtor Bill Oscar Pat Quentin Robert Sally Tina
  13. 13. Realtor Abby Isaac Jane Kathy Lee Mike Nancy Realtor Bill Oscar Pat Quentin Robert Sally Tina
  14. 14. Realtor Abby Isaac Jane Kathy Lee Mike Nancy Realtor Bill Oscar Pat Quentin Robert Sally Tina
  15. 15. –Gerald Weinberg Things are the way they are because they got that way.
  16. 16. Realtor Abby Postgres Neo4j Realtor Bill Realtor′ Abby Realtor′ Bill Isaac Jane Kathy Lee Mike Nancy Oscar Pat Quentin Robert Sally Tina CR CR CR CR CR CR CR CR CR CR CR CR CR CR CR CR
  17. 17. Isaac Jane Kathy Lee Mike Nancy Oscar Pat Quentin Robert Sally Tina Abby Bill Abby Bill realtors Realtors′ ContactRelationships Contacts Other tables here: subscriptions, payments, notes, appointments, etc. Many other relationships between contacts, and between contacts and their attributes. Postgres Neo4j
  18. 18. Postgres Isaac Jane Kathy Lee Mike Nancy Quentin Sally … Kathy Nancy Oscar Pat Quentin Robert Sally Tina Abby Bill realtors contacts
  19. 19. Stage 1:
 From Neo4j to Postgres
  20. 20. What Drove the Change? • Neo4j/Cypher not as familiar to developers as Postgres/SQL • Neo4j ActiveModel gem less mature and feature-rich than ActiveRecord • Neo4j drivers less mature, less well optimized • Some features required cross-database joins (slow, memory intensive)
  21. 21. Making a Plan • Realtor-by-realtor migration • An importer job that would import a realtor’s Neo4j data into Postgres • The importer needed to avoid duplicating shared data that had already been imported for another realtor • We would use a feature flag to indicate whether a realtor had been migrated or not
  22. 22. Schema Definition • We knew our data in Neo4j was messy. • Neo4j’s referential integrity features weaker than Postgres’ • We weren’t skilled at using the features Neo4j did have • We got very serious about data integrity in the schema: • foreign keys, ON DELETE CASCADE, check constraints, exclusion constraints • This was enormously helpful!
  23. 23. Switching Models • The feature flag needed to be readily available everywhere, so we set a thread-local variable in middleware. • A lot of queries start off by calling class methods on a model class:
          Person.find(35)
          # or
          Property.where(zip5: "75238")
      • We needed that model class to be the ActiveRecord model if the current realtor’s feature flag was set, and the Neo4j model otherwise
  24. 24. Switching Models • Exploiting Ruby’s dynamic nature, we were able to build models that could be Neo4j or ActiveRecord models, depending on the feature flag.
          class Contact
            extend SwitchingModel
            switch_between(::ContactV1, ::ContactV2)
          end

          class ContactV1
            include Neo4j::ActiveNode
            self.mapped_label_name = "Contact"
            # ... Neo4j::ActiveNode model code
          end

          class ContactV2 < ApplicationRecord
            self.table_name = :contacts
            # ... ActiveRecord model code
          end
  25. 25. Switching Models
          module SwitchingModel
            def switch_between(v1_model, v2_model)
              @_v1_model = v1_model
              @_v2_model = v2_model
            end

            private

            def _v2_mode?
              Thread.current.thread_variable_get(:moved_to_postgres) ||
                ENV['FORCE_V2_FEATURE_FLAG'] == '1'
            end

            def _switch
              return @_v2_model if _v2_mode?
              @_v1_model
            end
          end
  26. 26. Switching Models
          module SwitchingModel
            def method_missing(meth, *args, &blk)
              _switch.send(meth, *args, &blk)
            end

            def const_missing(name)
              _switch.const_get(name)
            end

            def new(*args)
              _switch.new(*args)
            end

            private

            # ...
          end
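The switching approach in slides 23 to 26 can be tried without Neo4j or Rails. The following is a minimal sketch, not the team's actual code: the two model classes are plain-Ruby stand-ins, `RealtorFlagMiddleware` and the `"realtor_id"` env key are hypothetical names, and `respond_to_missing?` is an addition that does not appear on the slides.

```ruby
# Illustrative stand-ins for the two model classes; the real ones would be a
# Neo4j::ActiveNode model and an ActiveRecord model.
class ContactV1
  def self.backend
    :neo4j
  end
end

class ContactV2
  def self.backend
    :postgres
  end
end

module SwitchingModel
  def switch_between(v1_model, v2_model)
    @_v1_model = v1_model
    @_v2_model = v2_model
  end

  # Instantiate whichever model is active for the current thread.
  def new(*args, &blk)
    _switch.new(*args, &blk)
  end

  # Forward any other class-level call (find, where, scopes, ...) to the
  # active model.
  def method_missing(meth, *args, &blk)
    target = _switch
    return super unless target.respond_to?(meth)

    target.public_send(meth, *args, &blk)
  end

  # Not on the slides: keeps respond_to? honest about forwarded methods.
  def respond_to_missing?(meth, include_private = false)
    _switch.respond_to?(meth, include_private) || super
  end

  private

  def _switch
    Thread.current.thread_variable_get(:moved_to_postgres) ? @_v2_model : @_v1_model
  end
end

class Contact
  extend SwitchingModel
  switch_between(ContactV1, ContactV2)
end

# Hypothetical Rack-style middleware that sets the thread-local flag for the
# duration of one request.
class RealtorFlagMiddleware
  def initialize(app, migrated_realtor_ids)
    @app = app
    @migrated_realtor_ids = migrated_realtor_ids
  end

  def call(env)
    migrated = @migrated_realtor_ids.include?(env["realtor_id"])
    Thread.current.thread_variable_set(:moved_to_postgres, migrated)
    @app.call(env)
  ensure
    # Clear the flag so a reused server thread never leaks it into the next request.
    Thread.current.thread_variable_set(:moved_to_postgres, nil)
  end
end

# Only realtor 2 has been migrated in this example.
app = RealtorFlagMiddleware.new(->(_env) { Contact.backend }, [2])
puts app.call({ "realtor_id" => 1 })
puts app.call({ "realtor_id" => 2 })
```

Because `extend` places the module ahead of `Class#new` in the lookup chain of `Contact`'s singleton class, even `Contact.new` is routed to the active model, which is why controllers can keep calling `Contact` unchanged.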
  27. 27. Scopes and More Scopes • A lot of queries contained Cypher fragments • Converting those to scopes allowed controllers to use the same queries, whether the feature flag was set or not • Built a rich vocabulary of scopes that has served us well ever since
  28. 28. Testing • Environment variable override of feature flag • Rake tasks for running two sets of specs • Separate sets of factories • CI running both sets • Lots of comparison testing by developers • Whole company QA swarm in staging
  29. 29. Tracking Progress • Excellent advice from Jess Martin, our CTO • Added an RSpec custom formatter to output total number of v2 specs vs. number of passing v2 specs. • Those went into a spreadsheet with a chart:
  30. 30. Executing • Select employees first (those not doing sales and demos) • Rest of employees • Friendly customers (who would inform us of issues) • Rest of active customers • The whole process took about three weeks
  31. 31. Finishing the Job • After the initial round of employee and select customer migrations, we kicked off the first full batch of customers. • All of a sudden, I had nothing to do! • “I may as well start on the PR to rip out all the V1 and transitional code …” • 10 hours later:
  32. 32. Isaac Jane Kathy Lee Mike Nancy Oscar Pat Quentin Robert Sally Tina Abby Bill realtors contact_relationships contacts Postgres
  33. 33. Stage 2:
 Change Primary Keys to Integers
  34. 34. What Drove the Change? • Postgres UUID primary keys work just fine. • But they are harder to remember, diff, and type • Didn’t become an issue until we needed to start tracking source info for a different table that had an integer primary key. • We track sources using a polymorphic join table (sourcings).
  35. 35. A Spike [contact_names table; ⭐ marks the key column. Columns: ⭐ id ⭐, first, … Rows: (abc, Joe), (def, Susan), (ghi, Rachel), (jkl, Todd), (mno, Melanie)]
  36. 36. A Spike [contact_names with an added integer_id column. Columns: ⭐ id ⭐, first, …, integer_id. Rows: (abc, Joe, 1), (def, Susan, 2), (ghi, Rachel, 3), (jkl, Todd, 4), (mno, Melanie, 5)]
  37. 37. A Spike [contact_names with id renamed to uuid and no key column marked. Columns: uuid, first, …, integer_id. Rows unchanged]
  38. 38. A Spike [contact_names with integer_id renamed to id and marked as the key. Columns: uuid, first, …, ⭐ id ⭐. Rows unchanged]
  39. 39. Problem: Foreign Key References [properties (⭐ id ⭐, …): abc, def, ghi, jkl, mno. property_notes (⭐ property_id ⭐, …): jkl, def, abc, mno, ghi]
  40. 40. Problem: Foreign Key References [properties gains integer_id: abc 1, def 2, ghi 3, jkl 4, mno 5. property_notes unchanged]
  41. 41. Problem: Foreign Key References [property_notes gains int_property_id: jkl 4, def 2, abc 1, mno 5, ghi 3. ⭐ still on properties.id and property_notes.property_id]
  42. 42. Problem: Foreign Key References [Same tables, with no ⭐ on either column]
  43. 43. Problem: Foreign Key References [properties.id renamed to uuid: (abc, 1), (def, 2), (ghi, 3), (jkl, 4), (mno, 5)]
  44. 44. Problem: Foreign Key References [property_notes.property_id removed; int_property_id remains: 4, 2, 1, 5, 3]
  45. 45. Problem: Foreign Key References [properties.integer_id renamed to id]
  46. 46. Problem: Foreign Key References [property_notes.int_property_id renamed to property_id: 4, 2, 1, 5, 3]
  47. 47. Problem: Foreign Key References [⭐ restored on properties.id and property_notes.property_id]
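The sequence in slides 39 to 47 amounts to building a UUID-to-integer lookup on the parent table and rewriting each child's foreign key through it. Here is a minimal in-memory sketch of that idea, assuming rows are plain hashes; `convert_uuid_keys` is a hypothetical name, and the real work was done in SQL by the migration helpers.

```ruby
# Convert a parent table's UUID primary key to sequential integers and
# rewrite the child table's foreign key to match. The old UUID is kept in a
# :uuid column, as on the slides.
def convert_uuid_keys(parents, children, fk:)
  int_id_for = {}

  # Step 1: assign integer ids in row order, remembering uuid => integer.
  new_parents = parents.each_with_index.map do |row, index|
    int_id_for[row[:id]] = index + 1
    row.merge(uuid: row[:id], id: index + 1)
  end

  # Step 2: translate each child's foreign key through the lookup.
  # fetch raises KeyError on a dangling reference instead of writing nil.
  new_children = children.map do |row|
    row.merge(fk => int_id_for.fetch(row[fk]))
  end

  [new_parents, new_children]
end

properties = %w[abc def ghi jkl mno].map { |uuid| { id: uuid } }
property_notes = %w[jkl def abc mno ghi].map { |uuid| { property_id: uuid } }

converted_properties, converted_notes =
  convert_uuid_keys(properties, property_notes, fk: :property_id)

p converted_notes.map { |note| note[:property_id] }
```

With the sample data from the slides, the rewritten foreign keys come out as 4, 2, 1, 5, 3, matching slide 44.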
  48. 48. Problem: Polymorphic Tables • Remember, this started because of a polymorphic join table, sourcings • Required converting all tables referenced by the polymorphic table at once • Ended up with 5 separate clusters of tables. • Wrote migration helpers to manage the details and make things reversible.
  49. 49. Discovering Constraints • You can query anything about the schema from a set of internal tables and views • Example: finding all foreign key references to the contacts table:
          SELECT *
          FROM information_schema.constraint_column_usage
          WHERE table_name = 'contacts'
            AND column_name = 'id'
            AND constraint_name <> 'contacts_pkey'
      • You can do similar things for indexes and other kinds of constraints
  50. 50. Plan and Wait • Output of the spike: • 3 complex migration helpers • 5 migrations • Ended up waiting 5 months before the pain outweighed the risk
  51. 51. Five Big Migrations • Simple case easy: • Harder cases not so easy: • Worst case: 3 primary keys, 28 foreign keys, 4 polymorphic tables … all in one migration.
          fix_uuid_primary_key :contact_names
          fix_uuid_primary_key :avatars
          fix_uuid_primary_key :properties
          fix_uuid_foreign_key :properties, :property_notes, on_delete: :cascade
          fix_uuid_polymorphic_association :sourcings, :sourceable, targets: [:avatars, :properties]
  52. 52. From Spike to Solution • Careful review of the migrations and helpers • Ran the migrations many, many times on clone of production DB • Run, fix error, repeat. (Very thankful for Postgres transactional DDL!) • Fixing error usually meant figuring out how to reflect on some new kind of dependency in Postgres and update the helper to deal with it. • Sometimes meant just coding a workaround for an odd case.
  53. 53. Being Careful • Ran the migrations in staging for timings • We had the luxury of downtime! • But we wanted to understand how long each maintenance window would be. • Made them reversible! • We planned for never having to reverse them, including careful testing and random spot-checks in the migrations. • But we also made sure they could be reversed (including round-trip testing of both schema and table contents).
  54. 54. Being Careful • Build correctness checking into the migration helpers • Remember: we kept the uuid column • At start of change: store random sample of records • After change: find those records and ensure they still refer to same UUID • Finally deployed on five consecutive weekends (simplest first)
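The sample-and-verify check on slide 54 can be sketched in a few lines. This is an illustration under assumptions, not the helpers' real code: rows are in-memory hashes, each child row carries a stable `:uuid` of its own, and `take_sample` and `verify_sample!` are hypothetical names.

```ruby
# Before the change: remember which parent UUID a random sample of child
# rows points at. At this stage the foreign key value IS the parent's UUID.
def take_sample(children, fk:, size:, random: Random.new)
  children.sample(size, random: random).to_h { |row| [row[:uuid], row[fk]] }
end

# After the change: follow the new integer foreign key to the parent and
# compare the parent's retained uuid column with what was recorded.
def verify_sample!(sample, parents, children, fk:)
  parent_by_id = parents.to_h { |row| [row[:id], row] }
  child_by_uuid = children.to_h { |row| [row[:uuid], row] }

  sample.each do |child_uuid, expected_parent_uuid|
    parent = parent_by_id[child_by_uuid.fetch(child_uuid)[fk]]
    actual = parent && parent[:uuid]
    next if actual == expected_parent_uuid

    raise "row #{child_uuid} now points at #{actual.inspect}, " \
          "expected #{expected_parent_uuid.inspect}"
  end
  true
end

notes_before = [
  { uuid: "n1", property_id: "jkl" },
  { uuid: "n2", property_id: "def" },
  { uuid: "n3", property_id: "abc" }
]
sample = take_sample(notes_before, fk: :property_id, size: 2)

properties_after = [
  { id: 1, uuid: "abc" }, { id: 2, uuid: "def" }, { id: 4, uuid: "jkl" }
]
notes_after = [
  { uuid: "n1", property_id: 4 },
  { uuid: "n2", property_id: 2 },
  { uuid: "n3", property_id: 1 }
]
verify_sample!(sample, properties_after, notes_after, fk: :property_id)
```

Raising inside the migration matters here: with transactional DDL, a failed check aborts the whole change instead of leaving a half-converted schema.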
  55. 55. Isaac Jane Kathy Lee Mike Nancy Oscar Pat Quentin Robert Sally Tina Abby Bill realtors contact_relationships contacts Postgres
  56. 56. Stage 3:
 Private, Per-Customer Contacts
  57. 57. Swanand Pagnis
  58. 58. What Drove the Change? • Nearly every query had to be filtered based on source • Extra complexity • Joining through polymorphic table was costly • Sooner or later we would miss it and violate data privacy
  59. 59. A Spike • Doing this in Ruby was fairly straightforward … but very slow (about a day per realtor) • Doing it in SQL required fairly advanced skills … but took about ten minutes per realtor • As with stage 1, decided on a user-by-user approach
  60. 60. The Strategy id: 1
 Realtor Alice Realtor Bill Realtor Carl id: 1
 id: 3
 id: 2
 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 • Simple example: one contact shared by three realtors.
  61. 61. The Strategy id: 1
 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 • First, add old_contact_id column to contact_relationships • Populate it with current value of contact_id
  62. 62. The Strategy id: 1
 contact_relationship_id: NULL Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • Next, add contact_relationship_id column to contacts • Populate it with NULL • Add uniqueness constraint for that column
  63. 63. id: 1
 old_contact_id: 1 The Strategy id: 1
 contact_relationship_id: NULL Realtor Alice Realtor Bill Realtor Carl id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id
  64. 64. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • Update contact_relationship_id IF it’s NULL
          UPDATE contacts
          SET contact_relationship_id = contact_relationships.id
          FROM contact_relationships
          WHERE contacts.id = contact_relationships.contact_id
            AND contact_relationships.realtor_id = 1
            AND contacts.contact_relationship_id IS NULL
  65. 65. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • Now INSERT into contacts for each of Alice’s contact_relationships • ON CONFLICT just set updated_at on the existing one • and then UPDATE contact_relationships to point to the new contact records ?
  66. 66. INSERT with ON CONFLICT
          WITH new_contacts AS (
            INSERT INTO contacts (contact_relationship_id, created_at, updated_at) (
              SELECT cr.id AS cr_id, cr.created_at, cr.updated_at
              FROM contacts
              INNER JOIN contact_relationships cr ON contacts.id = cr.contact_id
              WHERE cr.realtor_id = 1
                AND contacts.contact_relationship_id IS NULL
              ORDER BY cr_id ASC
            )
            ON CONFLICT ON CONSTRAINT contact_relationship_id_uniqueness
            DO UPDATE SET updated_at = EXCLUDED.updated_at
            RETURNING *
          )
          UPDATE contact_relationships
          SET contact_id = new_contacts.id
          FROM new_contacts
          WHERE contact_relationships.id = new_contacts.contact_relationship_id;
  72. 72. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • INSERT into contacts for each of Alice’s contact_relationships • ON CONFLICT just set updated_at on the existing one • and then UPDATE contact_relationships to point to the new contact records id: 1001
 contact_relationship_id: 1 X
  73. 73. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • That time, nothing happened, because Alice was the first realtor for contact 1.
  74. 74. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • Now let’s try Bill.
  75. 75. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • We try to claim the contact for Bill by updating
 contacts.contact_relationship_id • But it isn’t NULL, so we don’t update it to 2 X
  76. 76. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • But the INSERT works because it doesn’t create a uniqueness violation id: 1001
 contact_relationship_id: 2
  77. 77. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • And then the UPDATE fixes up the contact_relationships record • But what about the attached attributes? id: 1001
 contact_relationship_id: 2
  78. 78. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • For each of Bill’s contacts where
 contact_relationships.old_contact_id != contacts.id,
 go copy all of the attached attributes from old_contact_id id: 1001
 contact_relationship_id: 2
  79. 79. The Strategy id: 1
 contact_relationship_id: 1 Realtor Alice Realtor Bill Realtor Carl id: 1
 old_contact_id: 1 id: 3
 old_contact_id: 1 id: 2
 old_contact_id: 1 realtors contact_relationships contacts Name: Nancy
 contact_id: 1 Email: nancy@example.com
 contact_id: 1 uniqueness constraint
 on contact_relationship_id • For each of Bill’s contacts where
 contact_relationships.old_contact_id != contacts.id,
 go copy all of the attached attributes from old_contact_id • A lot of queries, but basically straightforward • Then move on to Carl id: 1001
 contact_relationship_id: 2 Email: nancy@example.com
 contact_id: 2 Name: Nancy
 contact_id: 2
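The claim-or-copy strategy walked through in slides 60 to 79 can be modelled in memory to see why it is safe to run one realtor at a time. The sketch below is a Ruby analogy, not the SQL that was actually run: tables are arrays of hashes, `privatize_contacts!` is a hypothetical name, and new ids are simply the current maximum plus one.

```ruby
# For each of a realtor's contact_relationships: claim the shared contact if
# nobody has claimed it yet, otherwise create a private copy and copy the
# attached attributes over from old_contact_id.
def privatize_contacts!(db, realtor_id)
  own = db[:contact_relationships].select { |cr| cr[:realtor_id] == realtor_id }

  own.each do |cr|
    contact = db[:contacts].find { |c| c[:id] == cr[:contact_id] }

    # Claim the contact only if contact_relationship_id is still NULL.
    contact[:contact_relationship_id] ||= cr[:id]
    next if contact[:contact_relationship_id] == cr[:id]

    # Someone else owns it: insert a private contact for this relationship.
    copy = {
      id: db[:contacts].map { |c| c[:id] }.max + 1,
      contact_relationship_id: cr[:id]
    }
    db[:contacts] << copy
    cr[:contact_id] = copy[:id]

    # Copy names, emails, etc. from the originally shared contact.
    db[:attributes]
      .select { |attr| attr[:contact_id] == cr[:old_contact_id] }
      .each { |attr| db[:attributes] << attr.merge(contact_id: copy[:id]) }
  end
end

# One contact (Nancy) shared by three realtors, as in the slides.
db = {
  contacts: [{ id: 1, contact_relationship_id: nil }],
  contact_relationships: [
    { id: 1, realtor_id: 1, contact_id: 1, old_contact_id: 1 },
    { id: 2, realtor_id: 2, contact_id: 1, old_contact_id: 1 },
    { id: 3, realtor_id: 3, contact_id: 1, old_contact_id: 1 }
  ],
  attributes: [
    { contact_id: 1, kind: :name, value: "Nancy" },
    { contact_id: 1, kind: :email, value: "nancy@example.com" }
  ]
}

[1, 2, 3].each { |realtor_id| privatize_contacts!(db, realtor_id) }
p db[:contact_relationships].map { |cr| cr[:contact_id] }
```

Running the function again for a realtor who has already been processed changes nothing, because each relationship then points at a contact it already owns. That property is what makes a realtor-by-realtor rollout restartable.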
  80. 80. Being Careful • Again: ran these transformations against a clone of production • Run for a realtor, compare against that realtor’s production data • Complete run-through of all realtors in staging before moving on to production • During run-through, I plotted changes to table counts as a sanity check
  81. 81. An OUTER JOIN should’ve been an INNER JOIN
  82. 82. Abby Bill realtors contact_relationships contacts Postgres Isaac Jane Kathy Lee Mike Nancy Quentin Sally … Kathy Nancy Oscar Pat Quentin Robert Sally Tina
  83. 83. Stage 4:
 From Join Table to belongs_to
  84. 84. What Drove the Change? • Everything’s just a little more complex with the join table • Requires constraints and integrity checks that wouldn’t be necessary without it • Another team member challenged me to get rid of it! • It really wasn’t causing us enough trouble to justify a big push • But I realized we could set this up to do opportunistically
  85. 85. The Idea • Go ahead and add the direct contacts.realtor_id foreign key • Populate it to match the existing contact_relationships. • Then just make sure they stay consistent!
  86. 86. Triggers • Rails developers are wary of stored procedures and triggers (for good reason) • But sometimes they’re exactly what you need. This is one of those times. • I had a lot of ignorance to overcome. • So I worked on a spike, curling up with the Postgres manual and experimenting …
  87. 87. ContactRelationships Contacts Realtors insert set realtor_id
  89. 89. ContactRelationships Contacts Realtors insert set realtor_id insert X
  90. 90. ContactRelationships Contacts Realtors insert set realtor_id X set
  91. 91. Triggers Are Difficult
 (for me, anyway) • For efficiency, control the conditions under which invoked • For correctness, decide before/after • Carefully write updates/inserts to only make changes if things are inconsistent
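The last point on slide 91, only making changes when things are inconsistent, is what stops two mutually dependent triggers from calling each other forever. The sketch below models that guard in plain Ruby. It is an analogy rather than Postgres trigger code: `SyncedStore` and its methods are hypothetical, and the action names are borrowed from the audit-log slide on the assumption that they correspond to the two triggers.

```ruby
# Two writes that must stay in sync: inserting a contact_relationship sets
# contacts.realtor_id, and setting contacts.realtor_id inserts a relationship.
class SyncedStore
  attr_reader :contacts, :relationships, :audit_log

  def initialize
    @contacts = {}       # contact_id => realtor_id
    @relationships = []  # rows of { contact_id:, realtor_id: }
    @audit_log = []      # one entry per "trigger" invocation
  end

  def insert_relationship(contact_id:, realtor_id:)
    @relationships << { contact_id: contact_id, realtor_id: realtor_id }
    after_relationship_insert(contact_id, realtor_id)
  end

  def set_contact_realtor(contact_id:, realtor_id:)
    @contacts[contact_id] = realtor_id
    after_contact_set_realtor(contact_id, realtor_id)
  end

  private

  # "Trigger" on contact_relationships: update the contact only if it differs.
  def after_relationship_insert(contact_id, realtor_id)
    changed = @contacts[contact_id] != realtor_id
    set_contact_realtor(contact_id: contact_id, realtor_id: realtor_id) if changed
    @audit_log << { action: :cr_insert, contact_id: contact_id, performed_update: changed }
  end

  # "Trigger" on contacts: insert the relationship only if it is missing.
  def after_contact_set_realtor(contact_id, realtor_id)
    missing = @relationships.none? do |row|
      row[:contact_id] == contact_id && row[:realtor_id] == realtor_id
    end
    insert_relationship(contact_id: contact_id, realtor_id: realtor_id) if missing
    @audit_log << { action: :c_setrealtor, contact_id: contact_id, performed_update: missing }
  end
end

store = SyncedStore.new
store.insert_relationship(contact_id: 1, realtor_id: 7)
store.audit_log.each { |entry| p entry }
```

Each write fires the other side at most once: the nested call finds the data already consistent, records that it performed no update, and returns.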
  92. 92. The Plan: A 12-Step Epic • Step 1: build a way to track progress • Step 2: build a way to audit the activity of the triggers • Step 3: add contacts.realtor_id and triggers • Steps 4–6: move fields from contact_relationships to contacts • Steps 7–8: retargeting polymorphic associations • Steps 9–11: retargeting associations, scopes, and query fragments • Step 12: DROP TABLE contact_relationships
  93. 93. Tracking Progress
          rg --count --ignore-file .rg_crprogress_ignore '[Cc]ontact_?[Rr]elationship' \
            | cut -d : -f 2 \
            | sed '2,$s/$/+/; $s/$/p/' \
            | dc
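The pipeline on slide 93 counts matches per file, keeps only the counts, and has `sed` turn them into a postfix expression that `dc` adds up. For anyone less familiar with `dc`, the same total can be computed in Ruby. This is a minimal sketch, assuming the usual `path:count` lines that `rg --count` prints; `total_matches` is a hypothetical helper name.

```ruby
# Sum the per-file match counts from `rg --count` output. Each line looks
# like "path/to/file.rb:12"; a bare count with no path is accepted as well.
def total_matches(rg_count_output)
  rg_count_output.each_line.sum do |line|
    count = line.strip.rpartition(":").last
    count.empty? ? 0 : Integer(count)
  end
end

sample_output = <<~OUTPUT
  app/models/contact.rb:12
  app/models/realtor.rb:3
  spec/models/contact_spec.rb:27
OUTPUT

puts total_matches(sample_output)
```

Tracked over time, the total should fall toward zero as references to the join table are removed.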
  94. 94. Auditing Trigger Activity • Updated the triggers to log behavior to new contact_relationship_trigger_actions table. • Utility script to audit this table for consistency occasionally.
          id       action        contact_relationship_id  contact_id  performed_update  time
          4995810  c_setrealtor                           5622768     FALSE             2019-02-18 13:31:49.395671
          4995811  cr_insert     10607228                 5622768     TRUE              2019-02-18 13:31:49.395671
          4995812  c_setrealtor                           5622769     FALSE             2019-02-18 13:31:50.181528
          4995813  cr_insert     10607230                 5622769     TRUE              2019-02-18 13:31:50.181528
          4995814  c_setrealtor                           5622770     FALSE             2019-02-18 13:31:50.474147
  95. 95. Executing • One step a week • Each took 8–10 hours, on average • Most deployments on weekends, even when no downtime required
  96. 96. Postgres Isaac Jane Kathy Lee Mike Nancy Quentin Sally … Kathy Nancy Oscar Pat Quentin Robert Sally Tina Abby Bill realtors contacts
  97. 97. Lessons and Recommendations
  98. 98. Slow and Steady • Incremental, “worst pain first” strategy • Contained risk • Enabled feature development • Produced enormous technical improvement over time
  99. 99. Keep Looking Ahead • We were always looking for ways to improve the system • An “inventory of pain” helps you to identify which pain is the worst right now
  100. 100. Each Stage Was Different! • Entirely different, creative solutions required at each step • Ruby magic • Migrations and database reflections • Fancy Postgres UPSERT (i.e., INSERT … ON CONFLICT) queries and CTEs • Triggers • Entirely different testing strategies, too. • There is no recipe. Find what works.
  101. 101. Leverage Your Database • We Rails developers love ActiveRecord and Arel for queries. • But for all its problems, SQL is powerful. • Data and referential integrity protections can save you. • Without Postgres’ transactional DDL, the risk and effort would have been enormously greater. (I’d guess roughly tenfold.) • Stored procedures and triggers have their place.
  102. 102. The Luxury of Downtime • We have the luxury of being able to schedule maintenance time. • If you can, do that. • If not, you have to explore other techniques. (It’s worth bringing in an experienced database consultant if you need to explore these.)
  103. 103. Focus: The Two-Edged Sword • These kinds of tasks really benefit from intense focus. • But that kind of focus can keep you from seeing danger. • Make sure you come up for air and have someone looking over your shoulder.
  104. 104. What Would We Do Differently? • If we had clearly understood our end goal, we could have done all of this in stage 1. • But we still thought we were building a social graph. • You can never be sure you understand the future of your business.
  105. 105. What Would We Do Differently? • There is one mistake we could have avoided based on technical principles. • We should never have used UUID primary keys. • They are useful only if you need to distribute primary key creation. • Probably where contention on the primary key sequence is a bottleneck. • Maybe also when you need to provide a key with less latency than a DB round trip. • THAT’S IT.
  106. 106. Three Principles •Validation •Reversibility •Transparency
