Atmosphere 2014: Really large scale systems configuration - Phil Dibowitz
- 120 views
For many years, Facebook managed its systems with cfengine2. With many individual clusters over 10k nodes in size, a slew of different constantly-changing system configurations, and small teams, this ...
For many years, Facebook managed its systems with cfengine2. With many individual clusters over 10k nodes in size, a slew of different constantly-changing system configurations, and small teams, this system was showing its age and the complexity was steadily increasing, limiting its effectiveness and usability. It was difficult to integrate with internal systems, testing was often impractical, and it provided no isolation of configurations, among many other problems. After an extensive evaluation of the tools and paradigms in modern systems configuration management – open source, proprietary, and a potential home-grown solution – we built a system based on the open-source project Chef. The evaluation process involved understanding the direction we wanted to take in managing the next many iterations of systems, clusters, and teams. More importantly, we evaluated the various paradigms behind effective configuration management and the different kinds of scale they provide. What we ended up with is an extremely flexible system that allows a tiny team to manage an incredibly large number of systems with a variety of unique configuration needs. In this talk we will look at the paradigms behind the system we built, the software we chose and why, and the system we built using that software. Further, we will look at how the philosophies we followed can apply to anyone wanting to scale their systems infrastructure.
Phil Dibowitz - Phil Dibowitz has been working in systems engineering for 12 years and is currently a production engineer at Facebook. Initially, he worked on the traffic infrastructure team, automating load balancer configuration management, as well as designing and building the production IPv6 infrastructure. He now leads the team responsible for rebuilding the configuration management system from the ground up. Prior to Facebook, he worked at Google, where he managed the large Gmail environment, and at Ticketmaster, where he co-authored and open sourced a configuration management tool called Spine. He also contributes to, and maintains, various open source projects and has spoken at conferences and LUG’s on a variety of topics from Path MTU Discovery to X509.
- Total Views
- Views on SlideShare
- Embed Views