Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

CentOS at Facebook

232 views

Published on

CentOS at Facebook talk from the CentOS Dojo 2019

Published in: Technology
  • Be the first to comment

  • Be the first to like this

CentOS at Facebook

  1. 1. CentOS at Facebook Davide Cavalca & Phil Dibowitz Production Engineering CentOS Dojo 2019
  2. 2. • Infrastructure • Packaging • RPM and DNF • OS Upgrades • CentOS TheNumberAfter7 Agenda
  3. 3. Infrastructure
  4. 4. • OS Team manages the bare metal experience of the fleet • OS as a platform • Individual teams are responsible for their own hosts Infrastructure How does it work?
  5. 5. • Working with the community on long-term direction • We move fast; open source often moves faster • We don’t need to write everything ourselves • Sharing our code means sharing the maintenance and having others extend it • DevConf.CZ 2017 talk: tinyurl.com/y7gx6nro Infrastructure Upstream first
  6. 6. • Stable releases • Binary compatibility • Security updates • Mature and well understood tooling • EPEL • Shared ecosystem with Fedora Infrastructure Why CentOS?
  7. 7. Infrastructure FTL – Fast Thin Layer
  8. 8. • Backports from Fedora Rawhide for stuff we care about • Mostly plumbing and low-level packages • %facebook macro to gate internal stuff • GitHub: facebookincubator/rpm-backports • CentOS + FTL = stable distro, moving fast Infrastructure FTL – Fast Thin Layer
  9. 9. • Rawhide backport of latest stable • GitHub: facebookincubator/systemd-compat-libs • Tracking upstream development in master • New features: cgroup2, BPF, portable services, etc. • fb_systemd and fb_timers cookbooks • All Systems Go 2018 talk: tinyurl.com/ycecofb9 Infrastructure Plumbing – systemd
  10. 10. • dbus is used for local IPC in systemd • We run a backport of dbus-daemon • Started testing dbus-broker, so far so good • Still no way to upgrade/restart without rebooting • pystemd: thin Cython wrapper on top of sd-bus • Exposes sytemd dbus object model • GitHub: facebookincubator/pystemd Infrastructure Plumbing – dbus
  11. 11. • CentOS 7 version + backported network-scripts from git master • IPv6 improvements, per-device routes, ifdown, additional configuration options • GitHub: fedora-sysv/initscripts Infrastructure Plumbing – initscripts / network-scripts
  12. 12. • Upstream kernel (or as close as possible) • Development in master • Internal stable and development branches • Testing and rollout automation • SCALE 15x talk: tinyurl.com/ybqgwn47 Infrastructure Kernel
  13. 13. Packaging
  14. 14. • Public mirror: mirror.facebook.net/centos/ • Upstream repos straight from CentOS • Internal repos: site-packages and fb-runtime Packaging Server side – layout
  15. 15. • Stored in GlusterFS • Served per-region via Apache2 • Cached by many varnish boxes • Local CLIs + publisher to control input Packaging Server side – implementation
  16. 16. Packaging Server side – implementation
  17. 17. • yummy: mock wrapper with SCM integration • Store specfiles + patches in source control • Store blobs in GlusterFS-backed NFS • Automation to sync to GitHub • Can submit to central build system – required for signing Packaging Tooling – system packages
  18. 18. • packman: rpmbuild wrapper for simple RPMs • Integrates with buck and our SCM tooling • Build software, generate specfile, package it up • Simple YAML configuration • Can submit to central build system – required for signing Packaging Tooling – fb-runtime packages
  19. 19. • RPM: backported from Rawhide • YUM: CentOS + backported PRs • DNF: built from yum4 repos – soon to be Rawhide • RPM, yum and DNF managed by cookbooks • fb_rpm is open source Packaging Client side
  20. 20. RPM and DNF (at scale)
  21. 21. • I/O thrash fsync() on close() (PR#187 on rpm)→ • rpmdb corruption issues → dcrpm • Duplicate packages → package-cleanup wrapper • Scaling Yum → pkgtorrent (merged upstream) RPM and DNF at scale RPM and DNF/YUM at scale
  22. 22. • Automate detection and remediation of common issues • rpmdb corruption, stuck processes, etc. • Works on Linux and OSX (!) • Runs before every Chef run • GitHub: facebookincubator/dcrpm RPM and DNF at scale dcrpm
  23. 23. RPM and DNF at scale dcrpm – before and after • RPM-related failures over time • Baseline of errors still WIP
  24. 24. • Merge package-cleanup wrapper into dcrpm • Test new database formats • Test DNF in production RPM and DNF at scale Work in progress
  25. 25. OS Updates
  26. 26. • CentOS 5 6 and 6 7→ → • No in-place updates: reprovision the host from scratch • Easier to ensure a good state • Side effect: clean slate offers opportunity to deprecate unwanted features or tools OS updates Major releases
  27. 27. • About a year from initial PoC to first production machine • About two years to migrate 100% of the fleet • Documentation and automation • Stateless vs stateful services • Prerequisites and hidden dependencies OS updates CentOS 7 migration
  28. 28. • Rolling OS updates • Every two weeks we sync down the latest updates… • …and roll them out over two weeks • Easy stop button and opt out for individual packages OS updates Minor releases and rolling updates
  29. 29. • /yum/centos/7.x/rollingupdates/YYYYMMDD • Tool to automate syncing and validation • ‘yum upgrade’* kicked off via fb_yum • Manual per-machine canary opt-in • Early canary 0.1% 1% 2% …→ → → → OS updates Rolling OS updates – implementation
  30. 30. CentOS TheNumberAfter7
  31. 31. • RHEL as a preview of CentOS • Handful of non-prod testing hosts • No major blockers (yet) • Get a head start playing with modules, dnf, etc. Early Testing RHEL 8 Beta
  32. 32. • Will also be shipped as Yum 4 • Fully supports modularity • Faster (as of last recent versions) • Chef fully supports it • Has better solutions for cleaning duplicates • We still need to port pkgtorrent DNF The new hotness
  33. 33. • See https://docs.fedoraproject.org/en-US/modularity/ • Pick a “stream” from different modules • Where is my $foo package?! Oh... • Really exciting stuff! But... • Some pieces still being worked out Modularity The new hotness
  34. 34. CentOS is our friend!
  35. 35. • CentOS has a great community • It is part of the Fedora/Red Hat family • Modularity will enable much of our existing practices • Allows us to support ourselves and move faster/slower as necessary CentOS Facebook loves CentOS!
  36. 36. Questions?
  37. 37. • github.com/facebookincubator/rpm-backports • github.com/facebookincubator/systemd-compat-libs • github.com/facebookincubator/pystemd • github.com/facebookincubator/dcrpm • github.com/facebook/chef-cookbooks • github.com/fedora-sysv/initscripts • SCALE 15x (kernel): tinyurl.com/ybqgwn47 • DevConf.CZ 2017 (upstream): tinyurl.com/y7gx6nro • ASG 2018 (systemd): tinyurl.com/ycecofb9 Questions? Here’s all the links in the meantime!

×