Fixing Web Data
in Production
Best practices for bad situations
Aaron Knight, Full Stack Engineer at Voxy
I am
Aaron Knight (@iamaaronknight)
Full Stack Engineer
Voxy.com
Voxy is
A Django web app
8 years old
12 engineers
10+ data stores
Oops.
There’s a problem with the data!
(It was probably my fault)
> SELECT * FROM feature_toggles LIMIT 3;
+-------+----------+--------------------+
| id    | user_id  | orientation_videos |
|-------+----------+--------------------|
| 1234  | 8923123  | f                  |
| 1235  | 9213483  | f                  |
| 1236  | 2136935  | f                  |
+-------+----------+--------------------+
> UPDATE feature_toggles SET
orientation_videos = 't' WHERE...
Hold up!
What could go wrong?
● You bring down the site.
● You make the problem worse.
● You forget what you did.
Never change data in prod
● Never introduce any bugs.
● Make all the right architecture decisions the first time.
</sarcasm>
Data fixes are code.
● Check them in to source control.
● Test them.
● Code review them.
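One way to make a data fix testable is to separate the decision (which rows are wrong, and what they should become) from the write itself. The following is a minimal sketch under that assumption. `plan_toggle_fix` and `should_be_enabled` are hypothetical names, and plain dicts stand in for `feature_toggles` rows so the test runs without a database.

```python
import unittest


def plan_toggle_fix(rows, should_be_enabled):
    """Return {row_id: new_value} for every row whose flag is wrong.

    `rows` is a list of dicts standing in for feature_toggles records.
    `should_be_enabled` is a hypothetical set of user_ids recovered from
    a trusted source.
    """
    changes = {}
    for row in rows:
        correct = row['user_id'] in should_be_enabled
        if row['orientation_videos'] != correct:
            changes[row['id']] = correct
    return changes


class PlanToggleFixTest(unittest.TestCase):
    def test_only_wrong_rows_are_changed(self):
        rows = [
            {'id': 1234, 'user_id': 8923123, 'orientation_videos': False},
            {'id': 1235, 'user_id': 9213483, 'orientation_videos': False},
            {'id': 1236, 'user_id': 2136935, 'orientation_videos': True},
        ]
        changes = plan_toggle_fix(rows, should_be_enabled={8923123, 2136935})
        # Only row 1234 is wrong: its user should have the feature enabled.
        self.assertEqual(changes, {1234: True})


if __name__ == '__main__':
    unittest.main()
```

Because the planning step is a pure function, it can be checked in, reviewed, and tested like any other code before anything touches production.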
Track execution.
● Log when a script is executed.
● Log everything that changed.
● Log what did not change.
import logging

logger = logging.getLogger(__name__)


def fix_feature_toggles():
    logger.info('Starting fix_feature_toggles script')
    for toggle in FeatureToggle.objects.all():
        if toggle.orientation_videos:
            # Log the rows we leave alone, too.
            logger.info('FeatureToggle {} orientation_videos already exists; skipping'.format(
                toggle.id))
        else:
            toggle.orientation_videos = get_correct_value(toggle)
            toggle.save()
            logger.info('FeatureToggle {} orientation_videos updated to {}'.format(
                toggle.id, toggle.orientation_videos))
    logger.info('Finished fix_feature_toggles script')
Track execution.
● Log when a script is executed.
● Log everything that changed.
● Log what did not change.
● Centralize your logging.
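import datetime
import json
from collections import OrderedDict

import boto3
import pytz
from django.conf import settings

firehose = boto3.client('firehose')


def log_to_kinesis(message):
    data = OrderedDict([
        ('script_name', get_filename_of_caller()),
        ('environment', settings.ENVIRONMENT),
        # Timezone-aware UTC timestamp, regardless of the server's local time.
        ('ts', str(datetime.datetime.now(pytz.utc))),
        ('message', message),
    ])
    # End each record with a newline so records can be split apart downstream.
    firehose.put_record(
        DeliveryStreamName='backfill-logs',
        Record={'Data': json.dumps(data, sort_keys=False) + '\n'},
    )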
Track execution.
● Log when a script is executed.
● Log everything that changed.
● Log what did not change.
● Centralize your logging.
● Track the script’s progress.
from tqdm import tqdm


def backfill_toggles():
    count = FeatureToggle.objects.count()
    # Pass the total explicitly so tqdm does not call len() on the queryset.
    for toggle in tqdm(FeatureToggle.objects.all(), total=count):
        ...
Be fault-tolerant.
● Think of possible exceptions.
for user_id in list_of_user_ids:
    try:
        toggle = FeatureToggle.objects.get(user_id=user_id)
    except FeatureToggle.DoesNotExist:
        # A missing row is logged and skipped instead of crashing the script.
        logger.info('FeatureToggle does not exist for User {}'.format(user_id))
        continue
    toggle.orientation_videos = True
    toggle.save()
Be fault-tolerant.
● Think of possible exceptions.
● Make your scripts idempotent, if possible.
for user_id in list_of_user_ids:
    try:
        toggle = FeatureToggle.objects.get(user_id=user_id,
                                           backfilled=False)
    except FeatureToggle.DoesNotExist:
        continue
    toggle.orientation_videos = True
    toggle.backfilled = True
    toggle.save()
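The property that matters for the `backfilled` marker is that a second run changes nothing. The sketch below illustrates that with an in-memory stand-in for the table. `backfill` and the dict layout are hypothetical and exist only to make the behavior easy to check.

```python
def backfill(toggles, user_ids):
    """Enable orientation_videos for the given users, skipping finished rows.

    `toggles` maps user_id -> {'orientation_videos': bool, 'backfilled': bool},
    an in-memory stand-in for the FeatureToggle table. Returns the list of
    user_ids that were actually changed on this run.
    """
    changed = []
    for user_id in user_ids:
        toggle = toggles.get(user_id)
        if toggle is None or toggle['backfilled']:
            continue  # missing row, or already handled by an earlier run
        toggle['orientation_videos'] = True
        toggle['backfilled'] = True
        changed.append(user_id)
    return changed


if __name__ == '__main__':
    table = {
        8923123: {'orientation_videos': False, 'backfilled': False},
        9213483: {'orientation_videos': False, 'backfilled': False},
    }
    print(backfill(table, [8923123, 9213483, 42]))  # first run changes both rows
    print(backfill(table, [8923123, 9213483, 42]))  # second run changes nothing
```

If the script dies halfway through, it can simply be started again: rows that were already updated are skipped.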
Be fault-tolerant.
● Think of possible exceptions.
● Make your scripts idempotent, if possible.
● Make your changes reversible, if possible.
{"environment": "production", "ts": "2017-10-02 18:33:08.805645+00:00",
"message": "unit_id 117 resource_id: None > 597ca7531ce6856f34607de9"}
{"environment": "production", "ts": "2017-10-02 18:33:08.878832+00:00",
"message": "unit_id 28 resource_id: None > 54ca763ca8615a76184dd4a9"}
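Log records like the ones above carry both the old and the new value, which is what makes a change reversible. As a rough sketch, assuming every change message follows the `unit_id N resource_id: OLD > NEW` shape shown here, the old values can be recovered with a small parser. `parse_reversals` and `MESSAGE_PATTERN` are hypothetical names, not part of the original scripts.

```python
import json
import re

# Matches messages such as:
#   "unit_id 117 resource_id: None > 597ca7531ce6856f34607de9"
MESSAGE_PATTERN = re.compile(
    r'^unit_id (?P<unit_id>\d+) resource_id: (?P<old>\S+) > (?P<new>\S+)$')


def parse_reversals(log_lines):
    """Yield (unit_id, value_to_restore, value_written) per change record."""
    for line in log_lines:
        record = json.loads(line)
        match = MESSAGE_PATTERN.match(record['message'])
        if match is None:
            continue  # not a change record (for example a start/finish message)
        old = match.group('old')
        yield (
            int(match.group('unit_id')),
            None if old == 'None' else old,
            match.group('new'),
        )


if __name__ == '__main__':
    lines = [
        '{"environment": "production", "ts": "2017-10-02 18:33:08.805645+00:00", '
        '"message": "unit_id 117 resource_id: None > 597ca7531ce6856f34607de9"}',
    ]
    for unit_id, restore_to, written in parse_reversals(lines):
        print(unit_id, restore_to, written)
```

A revert script could then set each unit's `resource_id` back to `value_to_restore`, ideally only where the current value still equals `value_written`.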
Know your bottlenecks.
● CPU?
● Memory?
● Database?
feature_toggles = []
for user_id in user_ids_to_backfill:
    feature_toggles.append(FeatureToggle(
        user_id=user_id,
        orientation_videos=True
    ))
# One bulk INSERT instead of one query per row.
FeatureToggle.objects.bulk_create(feature_toggles)
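When the list of objects is very large, building it all at once trades a database bottleneck for a memory one. Django's `bulk_create` accepts a `batch_size` argument that splits the inserts. A more general option is to feed the work through in fixed-size chunks, as in this sketch. `chunked` is a hypothetical helper, not something from the original scripts.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError('size must be at least 1')
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


if __name__ == '__main__':
    # In a real backfill, each batch would be turned into model instances
    # and passed to bulk_create; here we only print the batches.
    for batch in chunked(range(7), 3):
        print(batch)
```

Only one batch is held in memory at a time, and each batch is a natural point to log progress.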
import psycopg2
import psycopg2.extras
from tqdm import tqdm


def backfill_activity_progresses():
    conn = psycopg2.connect("some_credentials")
    cursor = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor)
    data_to_replicate = []
    for index in tqdm(batch_range):
        cursor.execute(
            "SELECT user_id, type, correct_answers, total_answers, is_complete "
            "FROM legacy_activity ORDER BY id;")
        # fetchall() loads the entire result set into memory.
        data_to_replicate.append(cursor.fetchall())
    conn.close()
    add_to_new_activity_table(data_to_replicate)
Know your bottlenecks.
● CPU?
● Memory?
● Database?
● Developer time?
● Cognitive overhead?
Execute at the right level of abstraction.
● Use existing functions and the ORM when you can afford to.
● Use SQL when execution time becomes significant.
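To make the trade-off concrete, here is a small sketch that uses the standard library's `sqlite3` as a stand-in for the production database (an assumption made only so the example is self-contained). `fix_row_by_row` mirrors what a per-object loop does, while `fix_set_based` pushes the whole change into one statement. All three function names are hypothetical.

```python
import sqlite3


def make_db(user_ids):
    """Create an in-memory feature_toggles table with the flag off for everyone."""
    conn = sqlite3.connect(':memory:')
    conn.execute(
        'CREATE TABLE feature_toggles ('
        'id INTEGER PRIMARY KEY, user_id INTEGER, orientation_videos INTEGER)')
    conn.executemany(
        'INSERT INTO feature_toggles (user_id, orientation_videos) VALUES (?, 0)',
        [(user_id,) for user_id in user_ids])
    conn.commit()
    return conn


def fix_row_by_row(conn, user_ids):
    """One statement per user: easy to log per row, slow for many rows."""
    for user_id in user_ids:
        conn.execute(
            'UPDATE feature_toggles SET orientation_videos = 1 WHERE user_id = ?',
            (user_id,))
    conn.commit()


def fix_set_based(conn, user_ids):
    """One statement for the whole batch; returns the number of rows changed."""
    user_ids = list(user_ids)
    if not user_ids:
        return 0
    # SQLite limits the number of bound parameters, so very large id lists
    # would need to be split into several statements.
    placeholders = ', '.join('?' for _ in user_ids)
    cursor = conn.execute(
        'UPDATE feature_toggles SET orientation_videos = 1 '
        'WHERE orientation_videos = 0 AND user_id IN ({})'.format(placeholders),
        user_ids)
    conn.commit()
    return cursor.rowcount
```

The set-based version gives up per-row logging in exchange for speed, so the returned row count is worth logging at minimum.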
Use database snapshots.
● Test your script on a backup from production.
● Take snapshots before you make changes.
● Automate your backups.
Aaron Knight (@iamaaronknight)
Full Stack Engineer
Voxy.com
Thanks!

Editor's Notes

  1. 0
  2. 3. This is a talk about how to deal with problems that arise with your data in production.
  3. Imagine that you’re working on a web application. It has an admin interface where admins can toggle different features on and off for different users.
  4. Somehow, those feature toggles got messed up. Now the orientation_videos flag is set to False for a large number of users.
  5. Fortunately, you have a way to recover which users are supposed to have that feature enabled so you can go into your database and fix the problem.
  6. If that is your response to updating production data in your web application, I’m going to suggest that you stop and think about the variety of ways in which such an operation can go wrong.
  7. The right approach is to never, ever change production data.
  8. But of course that’s not how things work in the real world.
  9. So instead, let’s talk about some realistic ways to fix your data safely. The first bit of advice should be obvious: don’t just start executing SQL queries or shell commands off the cuff. Treat these changes as code. That means checking them in...
  10. Testing them. This might seem like a waste of time since you’re probably going to throw this code away after you run it. But better to waste a little time writing a test than a lot of time trying to reverse a catastrophic mistake in your data.
  11. Whatever process your team has for code review, do that.
  12. Secondly, when you execute one of these scripts, the last thing that you want is code that runs silently for an indeterminate period of time, and may or may not have had the desired effect. So be very generous with your logging.
  15. Here’s an example script. Note that we’re logging at the beginning of the function, at the end, and for every code path in between.
  16. Another consideration is that you should centralize your logging. These logs need to be accessible to anyone on your team who may need to look back and see what happened.
  17. Any tool that works for you is fine, but what we use is Amazon Kinesis Firehose. We use it to write logs from our scripts to Amazon S3. What’s great about this service is the simplicity of using it.
  18. You set up a firehose in AWS, and then writing a log to an S3 bucket is just as simple as this. Note that I have a cute little function that gets the filename of the script that’s calling this function.
  19. Since I imagine that you’ll still be running these scripts manually, it’s pretty important to know how far along you are.
  20. For that, we use tqdm.
  21. You pass an iterable to the “tqdm” function, along with a total if the iterable’s length is expensive to compute, and it gives you a little progress bar that shows time estimates.
  22. Speaking of time, your script might take a long time to run. And won’t it be annoying if it runs for 3 hours and then fails halfway through?
  23. It’s always easy to forget to handle error conditions.
  24. So it’s a good idea to practice defensive programming in this case. Think about possible exceptions and catch them, making sure to log.
  25. Another thing to think about is, when your script does break 2 hours in, can you safely run it again and get the desired results?
  26. This might not always be possible or necessary, but in some extreme cases we have resorted to adding a new field to a model to track which items have been backfilled.
  27. Another nice feature is reversibility. If you screwed something up, is it possible to figure out what the previous state of the data was?
  28. This is where really detailed logging comes into play. If your logs contain all of the necessary information, you could conceivably parse them to get the original state of your data and reverse the damage.
  29. If you’re dealing with a lot of data that needs to be fixed, you’re going to start needing to do some actual engineering. You’re going to need to think about what bottlenecks you might encounter. Maybe you’re doing something computationally intensive, in which case you might need to think about how to parallelize the job.
  30. Or maybe the naive version of your script is going to load several GB of data into memory. In that case, you might need to rewrite to use a generator or something.
  31. More likely, the database is going to be your big issue.
  32. In that case, it’s time to explore some of the features of your ORM. For example, here’s a construct in the Django ORM that I’ve used to bulk create objects instead of creating them one by one. That can be a huge time saver.
  33. If your script is still too slow, you might want to drop down directly to the SQL level and skip object instantiation and so on. You can get huge performance gains this way.
  34. But please, please don’t do these things if you don’t need to. I’ve gotten into PR debates with coworkers who wanted to over-optimize a script that takes 15 minutes to run. Don’t get fancy.
  35. So my general rule of thumb is to use the ORM when you can, and use SQL or the equivalent when you need to.
  36. Let’s talk about another issue that should be obvious but needs to be said: backups are your friend. First of all, if it’s at all feasible, set up a snapshot of your data and run your script against that before running it against the real thing.
  37. If you’re going to be doing something risky, please back up your data before you execute your script. Right before.
  38. If you’re going to be doing significant changes rather often, consider automating your database backups so that you snapshot right before your scripts run.
  39. There are lots of other considerations, but those are some of the big ones.
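
The notes on defensive programming, idempotency, and reversibility can be combined in one small script. What follows is a sketch only, using the standard-library `sqlite3` module as a stand-in for the production database; the table contents, the user ids, and the counts dictionary are hypothetical, and the `old > new` log format is an assumption modeled on the sample log lines in the slides.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("fix_feature_toggles")


def fix_feature_toggles(conn, user_ids_to_enable):
    """Enable orientation_videos for the given users; safe to run repeatedly."""
    logger.info("Starting fix_feature_toggles script")
    counts = {"changed": 0, "skipped": 0, "failed": 0}
    for user_id in user_ids_to_enable:
        try:
            with conn:  # one transaction per row: commit on success, else roll back
                row = conn.execute(
                    "SELECT orientation_videos FROM feature_toggles "
                    "WHERE user_id = ?",
                    (user_id,),
                ).fetchone()
                if row is None:
                    logger.warning("user_id %s has no toggle row; skipping", user_id)
                    counts["skipped"] += 1
                elif row[0] == 1:
                    # Idempotency: rows that are already correct are left alone.
                    logger.info("user_id %s already enabled; skipping", user_id)
                    counts["skipped"] += 1
                else:
                    conn.execute(
                        "UPDATE feature_toggles SET orientation_videos = 1 "
                        "WHERE user_id = ?",
                        (user_id,),
                    )
                    # Reversibility: record the old value next to the new one.
                    logger.info(
                        "user_id %s orientation_videos: %s > 1", user_id, row[0]
                    )
                    counts["changed"] += 1
        except sqlite3.Error:
            # Fault tolerance: log the failure and keep going with the next row.
            logger.exception("user_id %s failed; continuing", user_id)
            counts["failed"] += 1
    logger.info("Finished fix_feature_toggles script: %s", counts)
    return counts


# Hypothetical data: two users need the fix, one is already correct.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feature_toggles ("
    "id INTEGER PRIMARY KEY, user_id INTEGER UNIQUE, "
    "orientation_videos INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO feature_toggles (user_id, orientation_videos) VALUES (?, ?)",
    [(101, 0), (102, 0), (103, 1)],
)
conn.commit()

ids = [101, 102, 103, 999]  # 999 has no row at all
first_run = fix_feature_toggles(conn, ids)
second_run = fix_feature_toggles(conn, ids)  # a rerun changes nothing
```

Committing row by row keeps a failure from undoing earlier work, at the cost of speed; for large jobs, batching several rows per transaction may be a better trade-off.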