
How to lose 50M Records in 5 minutes

On the importance of backups and maintaining a good status quo.

Published in: Technology

  1. 1. Jon Engineering Director W+R Studios
  2. 2. how to lose 50M records in 5 min
  3. 3. setting the table
  4. 4. Real Estate • MLS: Organization that collects and distributes data • Agent: A licensed Real Estate Agent • Listing: A property for sale • Data Feed: An API offered by an MLS to provide Listings
  5. 5. extract
  6. 6. speed
  7. 7. lack of speed
  8. 8. 150 records / min!
  9. 9. YIKES
  10. 10. extraction • hundreds of API configurations • permission and licensing • extremely slow • new standards
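A rough back-of-the-envelope sketch of why "extremely slow" matters. It assumes, purely for illustration, that all 50M records from the title had to be re-pulled sequentially at the 150 records/min rate quoted above; real feeds differ in speed and can presumably be pulled in parallel.

```python
RECORDS_PER_MINUTE = 150      # rate quoted on the "lack of speed" slides
TOTAL_RECORDS = 50_000_000    # figure from the talk title


def full_pull_duration_days(total_records: int, records_per_minute: int) -> float:
    """Days needed to pull every record once at a constant rate."""
    minutes = total_records / records_per_minute
    return minutes / (60 * 24)


# Assumption: one sequential pull at the slowest quoted rate.
days = full_pull_duration_days(TOTAL_RECORDS, RECORDS_PER_MINUTE)
print(f"{days:.0f} days")  # prints "231 days"
```

Even if the real number is far smaller, re-extraction is clearly not a quick recovery path.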
  11. 11. transform
  12. 12. weird data
  13. 13. "AccessibilityFeatures": "", "AdNumber": "", "AdditionalParcelsDescription": "", "AdditionalParcelsYN": "", "Appliances": "Refrigerator,Freezer,Dishwasher", "AppliancesYN": "1", "ArchitecturalStyle": "Mediterranean", "Assessments": "None", "AssessmentsYN": "0", "AssociationAmenities": "Dues Paid Monthly", "AssociationFee": "450.00", "AssociationFee2": "", "AssociationFee2Frequency": "", "AssociationFeeFrequency": "", "AssociationManagementName": "", "AssociationManagementName2": "", "AssociationName": "114 N Sycamore", "AssociationName2": "", "AssociationPhone": "", "AssociationPhone2": "", "AssociationPhone2Ext": "", "AssociationPhoneExt": "", "AssociationYN": "1", "AttachedGarageYN": "0", "AutoSoldDate": "", "AutoSoldYN": "0", "BackOnMarketDate": "", "Basement": "", "BathroomsFull": "2", "FEAT20110510163155967817000000": "3.00", "GF20070914134516788720000000": "", "LIST_36": "", "ROOM_KI_room_length": "", "FEAT20070914185851819154000000": "", "LIST_110": "Casa Grande Union High School", "LIST_37": "Drive", "LIST_38": "", "LIST_39": "Casa Grande", "ROOM_BR4_room_length": "", "LIST_32": "", "LIST_33": "E", "LIST_34": "Elaine", "LIST_35": "", "ROOM_BR5_room_length": "", "ROOM_BR2_room_length": "", "ROOM_BR3_room_length": "", "LIST_119": "2.50", "LIST_117": "", "LIST_118": "", "LIST_40": "AZ", "ROOM_LV_room_width": "", "LIST_22": "215000.00", "LIST_23": "", "LIST_41": "Pinal", "LIST_116": "", "LIST_42": "USA", "LIST_114": "1,601 - 1,800", "LIST_29": "X41",
  14. 14. Bathrooms: 3.1 BathroomsFull: 3 BathroomsHalf: 1
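The slide appears to show a feed where the digit after the decimal point is a count of half baths rather than a fraction. A minimal sketch of decoding that convention, assuming that reading is correct; `split_bathrooms` is a hypothetical helper, not part of the original system.

```python
def split_bathrooms(value: str) -> tuple[int, int]:
    """Split a 'full.half' bathroom string into (full, half) counts.

    Assumes the 'count.count' convention suggested by the slide,
    where "3.1" means 3 full baths and 1 half bath.
    """
    full, _, half = value.strip().partition(".")
    return int(full or 0), int(half or 0)


print(split_bathrooms("3.1"))  # (3, 1)
print(split_bathrooms("2"))    # (2, 0)
```

The same string in a feed that uses true fractions (where "2.5" means two and a half baths) would be decoded wrongly by this rule, so the choice of convention would have to be configured per feed.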
  15. 15. transformation • wildly different field names • conventions that defy all reason and logic • extremely opinionated customers • difficult to get feedback • regular changes to data feeds • etc.
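One common way to cope with wildly different field names is a per-feed lookup table that maps raw names onto a shared schema. The sketch below is illustrative only: the feed names, the target field names, and the guessed meanings of `LIST_22`, `LIST_39`, and `LIST_40` (inferred from the sample values shown earlier) are all assumptions.

```python
# Hypothetical per-feed mappings: raw field name -> normalized field name.
FIELD_MAPS = {
    "feed_a": {"BathroomsFull": "baths_full", "AssociationFee": "hoa_fee"},
    # Meanings guessed from sample values such as "215000.00" and "AZ".
    "feed_b": {"LIST_22": "price_list", "LIST_39": "city", "LIST_40": "state"},
}


def map_record(feed: str, raw: dict) -> dict:
    """Rename known fields for one feed, dropping empty or missing values."""
    mapping = FIELD_MAPS[feed]
    return {
        target: raw[source]
        for source, target in mapping.items()
        if raw.get(source) not in (None, "")
    }


raw = {"LIST_22": "215000.00", "LIST_39": "Casa Grande", "LIST_40": "AZ", "LIST_38": ""}
print(map_record("feed_b", raw))
# {'price_list': '215000.00', 'city': 'Casa Grande', 'state': 'AZ'}
```

Keeping the mapping as data rather than code is roughly the difference the later slides draw between static mapping classes and a dynamic mapping system.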
  16. 16. load
  17. 17. automatic index creation
  18. 18. dynamic schema
  19. 19. load • elasticsearch is awesome • elasticsearch is huge • has some things to watch out for
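The slides do not spell out which pitfalls they mean, but automatic index creation and dynamic schema are both features that can be turned off. A hedged sketch of the two request bodies involved, built as plain Python dicts so nothing is sent to a cluster. The field names are hypothetical, and the typeless mapping layout matches Elasticsearch 7 and later; older versions nest mappings under a type name.

```python
import json

# Body for creating an index with an explicit mapping.
# "dynamic": "strict" rejects documents containing unmapped fields
# instead of silently adding them to the schema.
index_body = {
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "price_list": {"type": "double"},   # hypothetical field
            "status": {"type": "keyword"},      # hypothetical field
        },
    }
}

# Body for the cluster settings API: stop indices from being created
# implicitly the first time a document is written to an unknown name.
cluster_settings_body = {
    "persistent": {"action.auto_create_index": "false"}
}

print(json.dumps(index_body, indent=2))
print(json.dumps(cluster_settings_body, indent=2))
```

With both in place, a misconfigured loader fails loudly rather than quietly creating a new, wrongly shaped index.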
  20. 20. extract transform load
  21. 21. 1.0 • YAML API config • YAML elasticsearch config • Static Ruby data mapping classes • No local raw, unmapped copy
  22. 22. 1.0 • YAML API config • YAML elasticsearch config • Static Ruby data mapping classes • No local raw, unmapped copy
  23. 23. 1.0 • YAML API config • YAML elasticsearch config • Static Ruby data mapping classes • No local raw, unmapped copy 1.5 • Dynamic API config • Dynamic elasticsearch config • Dynamic data mapping system • No local raw, unmapped copy
  24. 24. path to 1.5 • API wrappers worked for either 1.0 or 1.5 • data structure not backwards compatible • no automatic migration • never fully migrated
  25. 25. status quo • both versions in use • test suite was never updated • ~ 5 YAML configs remained • legacy system ignored and forgotten
  26. 26. “hey, when i remove this YAML file, all the tests fail. what should I do?” – Responsible Employee
  27. 27. “i guess just leave it in there.” – guess who
  28. 28. inevitable disaster
  29. 29. path to ruin • fundamental change • ignore test suite • poor status quo • quick fixes • catastrophic failure • last ditch effort • created 1.5 • reused tests • never fully migrated • manually deleted ES indices • upgrade ES • 50M Listings Lost
  30. 30. “Hey, it’s saying there’s only 4M listings.” – Super Responsible Developer
  31. 31. “I’m sure it will be fine in the morning. I’m going to get some ice cream.” – guess who
  32. 32. was it better in the morning?
  33. 33. path to ruin • fundamental change • ignore test suite • poor status quo • quick fixes • catastrophic failure • last ditch effort • created 1.5 • reused tests • never fully migrated • upgrade ES • 50M Listings Lost • manually deleted ES indices
  34. 34. { "2019-10-22 15:25:12": { "price_list": [ null, "669000.00" ], "status": [ null, "Active" ] }, "2019-10-30 07:18:00": { "price_list": [ "669000.00", "659000.00" ] } }
  35. 35. { "2019-10-22 15:25:12": { "price_list": [ null, "669000.00" ], "status": [ null, "Active" ] }, "2019-10-30 07:18:00": { "price_list": [ "669000.00", "659000.00" ] } }
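The structure above looks like a per-listing change history: each timestamp maps changed fields to an `[old, new]` pair. A minimal sketch of how such a history could be produced and replayed, assuming that reading; `diff_snapshot` and `replay` are hypothetical names, not the original implementation.

```python
def diff_snapshot(old: dict, new: dict) -> dict:
    """Return {field: [before, after]} for every field that changed."""
    changes = {}
    for field in old.keys() | new.keys():
        before, after = old.get(field), new.get(field)
        if before != after:
            changes[field] = [before, after]
    return changes


def replay(history: dict) -> dict:
    """Rebuild the latest state by applying changes in timestamp order."""
    state = {}
    # "YYYY-MM-DD HH:MM:SS" strings sort chronologically as plain text.
    for timestamp in sorted(history):
        for field, (_, after) in history[timestamp].items():
            state[field] = after
    return state


history = {
    "2019-10-22 15:25:12": {"price_list": [None, "669000.00"], "status": [None, "Active"]},
    "2019-10-30 07:18:00": {"price_list": ["669000.00", "659000.00"]},
}
print(replay(history))  # {'price_list': '659000.00', 'status': 'Active'}
```

The latest state can be rebuilt from the history, but not the other way around: a fresh pull of current data yields only the final values, so a lost history of this kind stays lost unless a backup exists.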
  36. 36. path to ruin • fundamental change • ignore test suite • poor status quo • quick fixes • catastrophic failure • last ditch effort • created 1.5 • reused tests • never fully migrated • upgrade ES • 50M Listings Lost • manually deleted ES indices
  37. 37. path to ruin • fundamental change • ignore test suite • poor status quo • quick fixes • catastrophic failure • last ditch effort • created 1.5 • reused tests • never fully migrated • upgrade ES • 50M Listings Lost • manually deleted ES indices
  38. 38. catastrophe
  39. 39. what can we learn from this?
  40. 40. how to handle catastrophe • stay calm • work the problem • HAVE BACKUPS
  41. 41. how to avoid catastrophe • take Responsibility • follow Best Practices • investigate Root Causes
  42. 42. Present Day
  43. 43. 2.0 • Whole New Project • Stores local raw, unmapped copy • Proactive Alarming • Even more fault tolerant
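"Proactive alarming" is not described further, but the earlier "only 4M listings" moment suggests one obvious check: alert when the record count drops sharply between runs. A minimal sketch under that assumption; the function name, the 5% threshold, and the 54M starting figure (4M remaining plus 50M lost) are illustrative, not taken from the 2.0 system.

```python
def count_drop_alarm(previous: int, current: int, max_drop_ratio: float = 0.05) -> bool:
    """Return True when the count fell by more than max_drop_ratio."""
    if previous <= 0:
        return False  # no baseline yet, nothing to compare against
    drop = (previous - current) / previous
    return drop > max_drop_ratio


# Assumed figures: roughly 54M listings before the incident, 4M after.
print(count_drop_alarm(54_000_000, 4_000_000))   # True
print(count_drop_alarm(54_000_000, 53_900_000))  # False
```

A check like this turns "it will be fine in the morning" into a page that someone has to acknowledge.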
  44. 44. it happened again!!
  45. 45. nobody noticed.
  46. 46. final thoughts • be proactive about issues, they won’t go away • have a backup plan for unavoidable issues • have a process for everything else
  47. 47. Jon Engineering Director W+R Studios
