FROM NASA
TO STARTUPS
TO BIG COMMERCE
Building Maintainable,
Scalable Projects
DANIEL GREENFELD
SUCCESS!!!
SUCCESS!!!
SUCCESS!!!
SUCCESS!!!
FAILURE.
Let’s
get
started
I ❤aerospace
(you’ve been warned)
BIG PLANS
http://en.wikipedia.org/wiki/Soyuz_(spacecraft)
The Reality
http://en.wikipedia.org/wiki/Conestoga_(rocket)
Conestoga
First
commercial rocket!
1996!
https://www.flickr.com/photos/archer10/2215343914/
https://www.flickr.com/photos/ramcguire/19953122
Early Design
Mistakes
Early Design Mistakes
• Not understanding or knowing the requirements
• Over or under architecting
• Choosing the wrong tools
• Premature Optimization
• Management fiat made with ignorance or bad data
Technical
Debt
Loan Against the Future
• Get it done fast!
• Grouping data poorly
• Hardcoding
• Upcoming technology will fix it
• Leave testing and documentation for later
Technical Debt
Hard to
Maintain
Enhance
Scale
Projects
2006
http://science.nasa.gov
More NASA work
Get It Done!!!
https://www.djangopackages.com/
Bad Practices
@ PyCon 2011
https://www.flickr.com/photos/pydanny/5670716870/
Had mistakes at start
We insisted on rebuild
Rejected
Rebuilt and works great
Electronic Giftcards
Epic Volume
Our Challenge
Refactor
SUCCESS!!!
SUCCESS!!!
SUCCESS!!!
Challenges
Great Engineering Team
Scaling?
But even
no-name
projects
can go
viral
So Many Things
to Get Wrong
Failure
is a
constant
But that’s okay
How Do We
Build for the
Future?
Starting a
Project
Greenfield Projects
http://commons.wikimedia.org/wiki/File:A_Green_field_in_the_countryside_-_geograph.org.uk_-_193455.jpg
Greenfield Projects
Or Greenfeld Projects?
Greenfield Projects
http://commons.wikimedia.org/wiki/File:Green_fields_in_Gramado.jpg
Starting Basics
Kelly Johnson
Renowned and prolific aircraft design engineer
Invented Seriously Awesome Planes!!!
P-38 Lightning
http://en.wikipedia.org/wiki/Lockheed_P-38_Lightning#mediaviewer/File:Lockheed_P-38J_Lightning_-_1.jpg
U-2
http://commons.wikimedia.org/wiki/File:Usaf.u2.750pix.jpg
SR-71
http://commons.wikimedia.org/wiki/File:Lockheed_SR-71_Blackbird.jpg
Kelly Johnson
Renowned and prolific aircraft design engineer
AREA 51
Invented Seriously Awesome Planes!!!
Really Good Manager
Keep It Simple, Stupid
Kelly Johnson Said:
KISS
If Kelly Johnson kept
it simple,
why can’t we?
Simple != Simplistic
Needless
Complexity
is
Bad
Bad Complexity
Swap two position values in a list
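A minimal Python sketch of the contrast, assuming a plain list and two valid indexes. The function names are made up for illustration and are not from the talk.

```python
def swap_complex(items, i, j):
    """Needlessly complex: rebuilds the whole list to swap two positions."""
    result = []
    for index, value in enumerate(items):
        if index == i:
            result.append(items[j])
        elif index == j:
            result.append(items[i])
        else:
            result.append(value)
    return result


def swap_simple(items, i, j):
    """Simple: tuple unpacking swaps the two positions in place."""
    items[i], items[j] = items[j], items[i]
    return items


planets = ["Mercury", "Earth", "Venus"]
print(swap_complex(planets, 1, 2))  # ['Mercury', 'Venus', 'Earth']
print(swap_simple(planets, 1, 2))   # ['Mercury', 'Venus', 'Earth']
```

Both give the same answer. Only one of them needs a second read.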
Simplicity is a Virtue
Foundations: Simple
But not Simplistic
Feature Construction: Simple
Bug Fixes: Simple
Be careful with databases that
make simplistic approaches easy.
JSON => SQL
Pro-Tip:
Keep
management
of data relations
out of the code!
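One way to read this tip, sketched with Python's built-in sqlite3 module as a stand-in for any relational database. The table and column names are hypothetical. The point is that the database, not application code, rejects a broken relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE gift_card (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount_cents INTEGER NOT NULL
    )
    """
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO gift_card (customer_id, amount_cents) VALUES (1, 2500)")


def add_orphan_card(connection):
    """Try to attach a gift card to a customer that does not exist."""
    try:
        connection.execute(
            "INSERT INTO gift_card (customer_id, amount_cents) VALUES (99, 500)"
        )
    except sqlite3.IntegrityError:
        return False  # the database rejected the bad relation for us
    return True


print(add_orphan_card(conn))  # False
```

No hand-written "does this customer exist?" check is needed in the application.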
http://en.wikipedia.org/wiki/File:Map_of_USA_with_state_names_2.svg
http://upload.wikimedia.org/wikipedia/commons/thumb/b/b2/CourtGavel.JPG/600px-CourtGavel.JPG
http://commons.wikimedia.org/wiki/File:Benz-velo.jpg
Reasons to consider
Non-Relational Databases
• Ephemeral Data (caching)
• Hierarchical Data
• Other data structures that don’t easily map to relations
• Seating charts (Eventbrite!)
• Organization charts in dysfunctional companies
Pro-Tip
Would it work in a spreadsheet?
YES! relational database
No! non-relational database
Simplicity isn’t always
the answer
Sometimes you do
need complexity
Right at the Beginning
Valid Reason for Complexity:
Database
Transactions
Transactions aren’t complex, but
engineers often think they are.
Note:
http://commons.wikimedia.org/wiki/File:WinonaSavingsBankVault.JPG
http://commons.wikimedia.org/wiki/File:Okayama_Red_Cross_Hospital.jpg
Database transactions protect
finances
and
lives
Valid Cause of Complexity
transactions
If one operation fails,
revert
revert
revert
Don’t
code solutions to handle
transactions
Use databases
with transactions
PostgreSQL MySQL
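A minimal sketch of "if one operation fails, revert", using Python's built-in sqlite3 module as a stand-in for PostgreSQL or MySQL. The account table, the CHECK constraint, and the transfer function are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(connection, sender, receiver, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with connection:  # commits on success, rolls back on any exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
    except sqlite3.IntegrityError:
        return False  # the overdraft broke the CHECK, so the credit was reverted too
    return True


def balances(connection):
    """Return the current balances as a dict."""
    return dict(connection.execute("SELECT name, balance FROM account ORDER BY name"))


print(transfer(conn, "alice", "bob", 30), balances(conn))   # True {'alice': 70, 'bob': 30}
print(transfer(conn, "alice", "bob", 500), balances(conn))  # False {'alice': 70, 'bob': 30}
```

The second transfer credits bob before the debit fails, and the database still puts everything back. No special workaround code was written to do that.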
Avoiding Database
Transactions
is Simplistic
You have to build special
workarounds to make things reliable
Is a simplistic approach
to avoid complexity…
A) Bad Design Decision?
B) Technical Debt?
Answer
• Plan to fix later? Technical Debt
• Otherwise: Bad Design Decision
More
Thoughts
on
Complexity
Complex Foundations:
Harder to maintain
Complex Foundations + Complex Business = Complexity²
Complexity² = Hard to Debug + Hard to Enhance
Complex Code Causes
Edge Cases
• Are you testing all the BRANCH (if/switch) statements?
• How do you test that ‘brilliant’ class hierarchy?
Your business logic is going
to be enough complexity
Building finance project?
Spend your time making it make money?
Fighting the project’s design?
Don’t Get Fancy
• Build a solid foundation first.
• “prototype” projects often aren’t.
• Beware virality
• No money
• Fancy stuff can come later.
Choosing Tools
Choosing Tools
is
Hard
Hard
Hard
Do it the Right Way
Management Fiat
Don’t Use
Management Fiat
with ignorance
or bad data
Mistake: Management Fiat
Choosing Enterprise Software
because it’s expensive
Your engineer’s recommendations
vs
Expensive white papers written by well-paid,
mostly non-technical analysts that either state
the obvious, the expensive, or the nonsensical
Mistake: Management Fiat
HIPAA
Health Insurance Portability and
Accountability Act
Mistake: Management Fiat
Wrong advice for
Your engineer’s recommendations
vs
Anything in a
business management
magazine.
Mistake: Management Fiat
Your engineer’s recommendations
vs
Salesperson
That person you just met who might
have a vested financial interest in
what they are recommending.
Mistake: Management Fiat
Your engineer’s recommendations when they
constantly tell you how smart they are.
vs
Burning your
money in a fire.
Same result, but faster
Mistake: Management Fiat
http://commons.wikimedia.org/wiki/File:Burning-money-and-yuanbao-at-the-cemetery-3249.JPG
Your engineer’s recommendations when they want to
continue using a 10+ year old legacy platform
vs
The Bus Factor
Mistake: Management Fiat
http://en.wikipedia.org/wiki/File:GMBus.jpg
http://en.wikipedia.org/wiki/Bus_factor
Bus Factor!
http://commons.wikimedia.org/wiki/File:Arriva_T6_nearside.JPG
Is documented?
Does the job?
Has traction?
Choosing tools
• Ignore corporate marketing hype
• Even for corporate open source projects
• Unless…
Choosing tools
It’s software for popular proprietary hardware
Toolkit
Ecosystems
Is there an Ecosystem?
• Python, Django, Flask, et al
• Python Package Index
• Django Packages
• Language + Framework means lots of optimized
packages already exist to do much of the work.
Tools with traction have extendable components
Case Study:
5000+ for Django
46000+ for Python
Parallels
78K+
83K+
Don’t Reinvent the Wheel
http://commons.wikimedia.org/wiki/File:London-Eye-2009.JPG
Example: CMS
Don’t build your own!
Unless your project is
about reinventing wheels
http://commons.wikimedia.org/wiki/File:Triple_Rotacaster_commercial_industrial_omni_wheel.jpg
Pro-Tip #1
Toolkit Ecosystems and Engineers
If engineers don’t know that tool ecosystems exist,
consider them junior.
http://commons.wikimedia.org/wiki/File:Fawnpuppy.jpg
Educate them!
Pro-Tip #2
Toolkit Ecosystems and Engineers
If engineers refuse to work with an ecosystem,
don’t waste your time with them.
They
purposefully
lower the
Bus Factor
http://commons.wikimedia.org/wiki/File:Keiseibus-twinbus-20071013.jpg
Best Practices
Insist on
Best Practices
Cleaner
Consistent
Stronger
Elegant
CODE
Easier to
Debug
Easier to
Add Features
New Staff Ramp Up
Faster
Faster
Faster
How to Best Practice
Research and find the common consensus
PEP-8
PEP-20
Example: Python
Find References
• MySQL
• PostgreSQL
• MongoDB
• Redis
• CouchDB
• Cassandra
• BigTable
Find References
Management Pro-Tip
Part 1
If your engineers consistently
refuse to follow defined best practices…
They
purposefully
lower the
Bus Factor
http://commons.wikimedia.org/wiki/File:B43OleBillatIWMLondon.jpg
Management Pro-Tip
Part 2
Get new ones
Tests
Always have them!
Even some coverage is
better than no coverage.
Always have a
working test harness
And a few tests that check really obvious things.
Tests on existing projects
without test harnesses
seleniumhq.org
Add a test for
every bug fix.
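A small sketch of what a working harness plus a bug-fix test can look like, using Python's built-in unittest. The average_rating function and the bug it once had are hypothetical, invented for this example.

```python
import unittest


def average_rating(ratings):
    """Hypothetical function under test: mean rating, 0.0 when there are none."""
    if not ratings:  # the bug fix: an empty list used to raise ZeroDivisionError
        return 0.0
    return sum(ratings) / len(ratings)


class AverageRatingTest(unittest.TestCase):
    def test_obvious_case(self):
        # A "really obvious thing": two ratings average to their midpoint.
        self.assertEqual(average_rating([4, 5]), 4.5)

    def test_no_ratings_regression(self):
        # Added with the bug fix so the crash cannot quietly come back.
        self.assertEqual(average_rating([]), 0.0)


def run_tests():
    """Run the suite and return the result without exiting the interpreter."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageRatingTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print(result.wasSuccessful())  # True
```

Two tests is not much coverage, but the harness now exists and the next test is cheap to add.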
Every Programming Tool
has Test Harnesses
Management Pro-Tip
Engineers who consistently
refuse to write tests lower
their bus factor to 1.
Prepare to
keep them forever
http://commons.wikimedia.org/wiki/File:Ride_On_5312_at_Glenmont.jpg
Version Control
Works on every operating system
Tutorials and books everywhere
Just start with it or add it
We have no excuses not to use it!
Version Control
Pro-Tips
Pro-Tip #1
Dropbox is NOT
software-project-acceptable
version control!
Pro-Tip #2
An FTP server for backing up
files isn’t either
Pro-Tip #3
Most good engineers won’t
work on a project that doesn’t
have version control
Pro-Tip #4
Most good engineers won’t
work at companies
without version control
Scaling
Scaling usually
isn’t a problem
97% Chance
You Ain’t Gonna Need it
A 3% Case
“Register here”
http://commons.wikimedia.org/wiki/File:President_Barack_Obama.jpg
Other 3% Cases
A major celebrity tweets about your project
You are launching the official site
for a blockbuster movie
You just go viral
What if your
foundations are bad?
Bad
Foundations
are Fixable
W. Lloyd MacKenzie, via Flickr @ http://www.flickr.com/photos/saffron_blaze/
With enough
engineers, money, and
headaches, you can fix
anything
Two Quick
Scaling Fixes
Based on the fact that the early scaling problems
are almost always database-related.
It’s almost always
about the data
First Fix: Caching
The first healthcare.gov patch
implemented by their tiger team
added caching.
Second Fix: Indexing
Almost every datastore supports indexes.
Apply them to places with
the most data reads.
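A rough sketch of the indexing fix, using Python's built-in sqlite3 module so it runs anywhere. The orders table and index name are assumptions. The exact wording of the query plan varies between SQLite versions, so the code only stores it rather than relying on a specific string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)",
    [(n % 100, n * 10) for n in range(10_000)],
)


def query_plan(connection):
    """Ask SQLite how it would run the frequently used read query."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT total_cents FROM orders WHERE customer_id = ?",
        (42,),
    ).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the description


plan_before = query_plan(conn)  # without an index: a scan of the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn)   # with the index: a search that uses it

print(plan_before)
print(plan_after)
```

One statement, no application code changed, and the hot read stops scanning every row.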
The Big
Scaling Mistake
to Avoid
Switching Tools Too Early
• Changing to the unusual datastore du jour
• Trying a new programming language
• Switching out template languages.
• Changing Application Frameworks
before attempting to improve the database
Careful of the hype
Replacing Tools
LinkedIn mobile site
Went from Rails to Node.js.
Giant Improvements!
(A coincidence, perhaps?)
Node.js was just part of the story
They also modernized servers
and added more.
Replacing Frameworks
But recognize excitement
Clearly LinkedIn engineers were
excited to work with Node.js
Engineer interest in tools is important.
It inspires us to do great things.
Management Pro-Tip
Uninspired Engineers don’t do the best work
and are much more prone to leave.
Don’t Optimize
Prematurely
Instead…
Follow Best
Practices
By following best practices, you
are laying a foundation that
makes scaling a project easier.
Long Term
Scaling Fixes
Add More Servers
DevOps!
Ansible
Puppet
Chef
Docker
Salt Stack
Catch: DevOps != Cheap
Pro-Tip
Undocumented DevOps
lowers the bus factor of
operations staff.
Prepare to
keep them forever
http://commons.wikimedia.org/wiki/File:E85bus.jpg
Software as a Service
Heroku
Engine Yard
Google App Engine
Firebase
Various AWS Products
Gets
Expensive
Vendor
Lock-In
How to replace
tools safely
Don’t switch all at once!
Version 4
Language Switches
Identify bottlenecks in the code
Try applying new language there.
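Before porting anything, measure. A minimal sketch of finding a bottleneck with Python's built-in cProfile and pstats modules. The two functions are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow pure-Python loop standing in for a hot spot."""
    total = 0
    for value in range(n):
        total += value * value
    return total


def handle_request():
    """Hypothetical entry point that mixes cheap and expensive work."""
    greeting = "hello".upper()
    return greeting, slow_sum_of_squares(200_000)


profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)  # show the five most expensive entries
print(report.getvalue())
```

In the report, slow_sum_of_squares accounts for nearly all of the time, so it is the one piece worth trying in a faster language or library. The rest can stay where it is.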
Example Switches
Move slow big data/scientific calculations to:
• NumPy
• SciPy
• pandas
Aim for giant speed boosts
More Example Switches
Port components to compiled languages such as:
• PyPy
• Go
• C
• Java
Aim for giant speed boosts
Maintenance
Over Time
Case Study: Every Project
• Your project has been around for a while
• Problems
• Fixing bugs is hard
• Adding features is hard
Why is it harder?
• Project is no longer bright and shiny
• Adding features adds to complexity
• Bugs caused by unforeseen edge cases
• Not enough tests make catching developer-introduced
bugs harder
• Mistakes at the beginning are really starting to show
Original Engineer(s) was an Idiot
Original Engineer(s)
are always idiots
I’ve yet to join a project
where I didn’t feel like
ranting all the time
Rant Time!
• Why use a relational database?
• Why not use a non-relational database?
• Why this programming language?
• Why not use OO or functional programming techniques?
• Why use OO or functional programming techniques?
• What the heck is this programming pattern anyway?
Caveat: The Constant
of the Worst Code
Ever
Hindsight is 20/20
• No one predicts with 100% accuracy
• Not on software projects
• It’s easy with hindsight for us to complain about the
decisions made.
Reality Check
• Making accurate predictions is hard
• Projects grow organically
• At least you are getting paid to work on this, right?
Be Understanding
• Don’t be a jerk!
• Try to understand why things evolved the way they did.
• Forgive your predecessor
• They can provide useful information!
• Circumstances can and will be reversed
HOW
DO WE FIX THIS?
The Basics
Make engineer setup and system config
as fast/easy as possible
Vagrant
Docker
Chef
Puppet
Ansible
Salt Stack
More Basics: Tests
No more code changes without tests.
New code releases must maintain test coverage or increase it.
More Basics: Document!
Document everything as you come across it
Markdown, reStructuredText, Google Docs, even wikis
Don’t use Word or Pages!
Refactor
Mercilessly
Once the basics are in place…
Reward Successful
Refactoring
Give them a gnome!
Courtesy Daniel Roy Greenfeld
Code Reduction Gnome
Courtesy Daniel Roy Greenfeld
New code must follow
best practices
Basic: Follow Best Practices
Bug-fixed code might be
moved to best practices
Embrace
Code
Reuse
3+ Identical Lines of Code
in the same place?
Stick it in a
function or method!
Write a test for that
function or method!
Any place this function
or method is used…
Write a test!
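A short sketch of that advice in Python, assuming the repeated lines were price formatting and that amounts are non-negative cents. All names are invented for illustration.

```python
def format_price(cents):
    """Shared helper that replaces the copy-pasted formatting lines."""
    dollars, remainder = divmod(cents, 100)
    return f"${dollars:,}.{remainder:02d}"


def receipt_line(item, cents):
    """One caller of the shared helper."""
    return f"{item}: {format_price(cents)}"


def balance_message(cents):
    """Another caller of the shared helper."""
    return f"Remaining balance: {format_price(cents)}"


def test_format_price():
    # A test for the function itself.
    assert format_price(0) == "$0.00"
    assert format_price(1999) == "$19.99"
    assert format_price(123456789) == "$1,234,567.89"


def test_callers():
    # And a test for every place the function is used.
    assert receipt_line("Gift card", 2500) == "Gift card: $25.00"
    assert balance_message(5) == "Remaining balance: $0.05"


test_format_price()
test_callers()
```

A formatting bug now gets fixed in one place, and the tests say which callers it touched.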
Release Early and Often
• Rather than do massive updates periodically
• Do incremental releases constantly
• Users respond more favorably
• Bug management is easier
Use Feature Flags
• Turn new features off and on as needed
• Libraries can help you
• Use it for A/B testing
• In use at Eventbrite and others
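A bare-bones sketch of a feature flag in Python. It assumes the flags live in a plain dictionary; real projects usually keep them in a database or settings service and use a library. The flag and function names are hypothetical, and this is not how Eventbrite implements it.

```python
# Hypothetical flag store. Everything starts switched off.
FLAGS = {"new_checkout": False}


def is_enabled(flag_name, flags=FLAGS):
    """Unknown flags default to off so half-finished features stay hidden."""
    return flags.get(flag_name, False)


def checkout(cart_total_cents):
    """Route to the new or old code path depending on the flag."""
    if is_enabled("new_checkout"):
        return f"new checkout flow: {cart_total_cents}"
    return f"old checkout flow: {cart_total_cents}"


print(checkout(2500))         # old checkout flow: 2500
FLAGS["new_checkout"] = True  # flip the flag without shipping new code
print(checkout(2500))         # new checkout flow: 2500
```

The new code ships dark, gets switched on when ready, and can be switched off again if it misbehaves.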
http://en.wikipedia.org/wiki/Falcon_9
Summary
pydanny.com
Engineer at
Co-Author of
Architecture Team

From NASA to Startups to Big Commerce