scaling compiled applications
Joe Damato
@joedamato
about me

• Joe Damato
• San Francisco, CA
• i like:
  • systems programming
  • debugging
  • performance analysis

@joedamato
timetobleed.com
sassanosystems.com
http://jenkins-ci.org/
Repeatable builds

• the most important thing for compiled software
• set up a build server
  • with jenkins (or another CI tool)
• back up your built objects
  • copying them to s3 is not a horrible idea
• regardless of source control system or branching strategy, ensure you can always rebuild any version of your software
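A minimal sketch of the s3 backup step; the bucket, project, and artifact names below are made up, and the upload command is only echoed rather than run:

```shell
# back up a built artifact to s3 under a versioned key.
# bucket/project/artifact names are hypothetical; adjust to taste.
BUCKET="my-build-artifacts"
PROJECT="myapp"
VERSION="1.2.3"
ARTIFACT="${PROJECT}-${VERSION}.x86_64.rpm"
S3_URI="s3://${BUCKET}/${PROJECT}/${VERSION}/${ARTIFACT}"

# with the aws cli installed and credentials set up, this would be:
#   aws s3 cp "build/${ARTIFACT}" "${S3_URI}"
echo "would upload build/${ARTIFACT} to ${S3_URI}"
```

Keying the layout on version means any historical build can be fetched back when you need to rebuild or debug an old release.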
Jenkins problems

• git plugin didn’t work on windows (maybe fixed?)
• branch building is painful, but the jenkins API can help
• getting a windows build agent working is painful
why?
tools for...

• RPMs
• DEBs
• Everything else
  • windows installers (MSIs)
  • other linux/unix-like OS’s
  • etc
chroot ??

chroot: “an operation that changes the apparent root directory for the current running process [...]. A program that is run in such a modified environment cannot name (and therefore normally not access) files outside the designated directory tree.”

(from wikipedia)
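What this looks like by hand, sketched in shell; the commands need root, so they are only echoed here, and the chroot path is hypothetical. mock and pbuilder (next slides) automate exactly this:

```shell
# populating and entering a chroot by hand; needs root, so only echoed.
CHROOT_DIR="/srv/chroots/lucid"
echo "debootstrap lucid ${CHROOT_DIR}"   # fill the tree with a minimal system
echo "chroot ${CHROOT_DIR} /bin/bash"    # processes inside cannot see files outside it
```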
RPM
mock
https://fedorahosted.org/mock/
DEB
pbuilder
https://wiki.ubuntu.com/PbuilderHowto
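A hedged sketch of what invoking each tool looks like; the chroot config and package file names are examples, not from the talk, and the commands are echoed rather than run:

```shell
# cleanroom package builds without hand-rolling chroots.
# RPM: mock unpacks a pristine chroot, installs build deps, and builds:
MOCK_CMD="mock -r epel-6-x86_64 --rebuild myapp-1.2.3-1.src.rpm"

# DEB: pbuilder keeps a base chroot tarball and builds in a copy of it:
PBUILDER_CREATE="sudo pbuilder create --distribution lucid"   # one-time base image
PBUILDER_BUILD="sudo pbuilder build myapp_1.2.3-1.dsc"

echo "$MOCK_CMD"
echo "$PBUILDER_CREATE"
echo "$PBUILDER_BUILD"
```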
Everything else

• KVM
• Amazon EC2
• other virtualization
KVM
• Create a base image on disk
• Clone base image
• Boot the cloned image
• Do the build and copy the built object out.
• Delete cloned image when done
• Base image is still pristine and can be reused.
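The cycle above, sketched with qemu-img backed clones; image paths and the build commands are hypothetical, so everything is echoed rather than run:

```shell
# clone-build-delete with qemu-img backed clones.
BASE="/var/lib/images/base-centos6.qcow2"
CLONE="/var/lib/images/build-$$.qcow2"

# a qcow2 clone backed by the base is copy-on-write, so the base stays pristine
echo "qemu-img create -f qcow2 -b ${BASE} ${CLONE}"
echo "kvm -hda ${CLONE} ..."            # boot the clone, run the build inside
echo "scp buildvm:/build/myapp.rpm ."   # copy the built object out
echo "rm ${CLONE}"                      # delete the clone when done
```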
Create builds in cleanroom

• Avoid contaminating builds with artifacts from previous builds.
• chroots help
  • use mock or pbuilder for RPMs and DEBs
  • KVM, EC2, or equivalent for everything else
• Always create pristine base images and do builds in a copy.
• Use SSDs
Tool problems

• git-buildpackage can’t set the changelog distribution field
• signing key setup is really painful (especially for RPMs)
• deb naming scheme for packages is quite painful
• all tools are undocumented and very hard to actually use
• recent versions of KVM provided by ubuntu fail to boot VMs sometimes
two types of linking...

• dynamic linking
• static linking
static linking

• calls to functions in the library are resolved at compile time.
• code from the library is copied into the resulting binary.
dynamic linking

• calls to functions are resolved at runtime.
• code for a library lives in its own object:
  • libblah.so.4
  • libblah.dll
http://www.akkadia.org/drepper/no_static_linking.html
why?
static linking

• figure out which libraries your app needs
• pick a supported major release, if possible
• build and package this library
• link it statically against your binary during build
• you now have fewer stones to turn over when debugging
static linking

• you will need to track your upstream deps
• you will probably want to package your upstream deps
• you can then point your chroots and build envs at private deb, rpm, etc repos in your internal infrastructure
• merging upstream changes in becomes its own project
why?
Use files

• src/
  • redhat5/
    • some_internal_thing.c
  • ubuntu1204/
    • some_internal_thing.c
  • debian6/
    • some_internal_thing.c
Use the build system

Determine which file to build at compile time.
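One way to sketch this in shell; a Makefile conditional or autoconf check works just as well. The platform names mirror the src/ layout above:

```shell
# map a platform name to its platform-specific source file,
# mirroring the src/<platform>/ layout from the previous slide.
pick_source() {
  case "$1" in
    redhat5)    echo "src/redhat5/some_internal_thing.c" ;;
    ubuntu1204) echo "src/ubuntu1204/some_internal_thing.c" ;;
    debian6)    echo "src/debian6/some_internal_thing.c" ;;
    *) echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

pick_source ubuntu1204   # prints src/ubuntu1204/some_internal_thing.c
```

The build system then compiles only the file this selection yields for the target platform, instead of wrapping one file in ifdefs.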
Use modules

• break up ifdef soup into separate files
• use the build system to compile the right file at build time
• this seems obvious but many C libraries and programs are full of crazy ifdef soup.
Use modules

• very easy to fall down a rabbit hole breaking things apart
• can make the build process more complicated and harder to debug
why?
Capture debug symbols

• DEB and RPM can both output debug packages that contain debug symbols.
• output these packages.
• store these and make backups.
• (or just don’t strip your binary)
google-coredumper

• you can use google-coredumper to catch segfaults, bus errors, and other bad things.
• you can output a coredump when this happens.
• you can use this coredump and your debug symbols to figure out what is going on.
Plan for failure

• Have a backup plan
• Capture debug symbols during your automated build process.
• Store them somewhere safe (and make backups).
• Capture coredumps (if possible).
• Use coredumps and debug symbols to figure out what happened.
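A hedged sketch of the last step: pointing gdb at a captured coredump plus the backed-up symbols. All paths are hypothetical, so the command is only echoed:

```shell
# post-mortem: coredump + saved debug symbols = a usable backtrace.
SYMBOL_DIR="/backups/symbols"
GDB_CMD="gdb -ex 'set debug-file-directory ${SYMBOL_DIR}' -ex bt ./myapp core.1234"
echo "$GDB_CMD"
```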
Plan for failure

• can significantly increase complexity
• google coredumper can’t help if your kernel is buggy
• some linux distributions don’t allow ptrace
• google coredumper only supports linux
http://jenkins-ci.org/
Check things like...

• Is the binary actually statically linked?
• Does it get copied to the right path?
• Are the right config files autogenerated?
• Does the version string the program outputs match the package version?
• ....
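A couple of these checks sketched as shell functions; the names and versions are made up:

```shell
# post-build sanity checks.
check_version() {
  # the version string the binary reports should match the package version
  [ "$1" = "$2" ]
}
check_static() {
  # `file` reports "statically linked" for a fully static binary
  file "$1" | grep -q "statically linked"
}

PKG_VERSION="1.2.3"
BIN_VERSION="1.2.3"   # in real life: "$(./myapp --version)"
check_version "$BIN_VERSION" "$PKG_VERSION" && echo "version ok"
```

Wiring checks like these into the build server turns the list above from a manual checklist into a gate every build must pass.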
You also need... correctness testing.

RSpec is useful for this.
Automated Testing

• It will be impossible to build and test every change on every supported platform.
• Use your build server to do this for you.
• Test things like:
  • installing/uninstalling the object
  • object is actually statically linked
  • correctness testing (RSpec can be useful)
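A sketch of the install/removal smoke test; the package name is invented, and the privileged commands are only echoed here:

```shell
# exercising package install/removal on a build agent.
PKG="myapp_1.2.3-1_amd64.deb"
echo "dpkg -i ${PKG}"           # does it install?
echo "test -x /usr/bin/myapp"   # did the object land in the right path?
echo "dpkg -r myapp"            # does it remove cleanly?
```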
Automated testing

• Fine line between not enough and too much
• Windows can be painful, but cygwin can help with scripting
• Easy to forget about testing the actual package install/removal
• Can be difficult to get working with branch builds
To summarize...

sassanosystems.com
@joedamato

thank you!

Joe Damato