Automation at Brainly
… or how to enter the world of automation in a “different way”.
About Brainly
World's largest homework help social network, connecting over 40 million users monthly
OPS stack:
● ~80 servers, heavy usage of LXC containers (~1000)
● 99.9% Debian, 1 Ubuntu host :)
● Nginx / Apache2, 2k reqs per sec
● 200 million page views monthly
● 700 Mbps peak traffic
● Python is dominant
DEV stack:
● PHP
- Symfony 2
- SOA projects
- 200 reqs per sec on the Russian version
● Erlang
- 55k concurrent users
- 22k events per sec
● Native Apps
- iOS
- Android
Starting point
● Puppet was not feasible for us
- *lots* of dependencies, which make containers bigger/heavier
- problems with Puppet's declarative language
- seemed incoherent, lacking integration of orchestration
- steep learning curve
- YMMV
● "packaging as automation" as an intermediate solution
- dependency hell, installing one package could result in uninstalling others
- inflexible, lots of code duplication in debian/rules files
- LOTS of custom bash and PHP scripts, usually very hard to reuse and not standardized
- this was a dead end :(
● Ansible
- initially used only for orchestration
- maintaining it required keeping an up-to-date inventory, which later simplified and helped with lots of things
First steps with Ansible
● we decided to move forward with Ansible and use it for setting up machines as well
● first project was the Nagios monitoring plugins setup
● turned out to be ideal for containers and our needs in general
- very few dependencies to begin with (python2, python-apt), and a small footprint - "configured" Python modules are transferred directly to the machine, no need for local repositories
- very light, no compilation on the destination host is needed
- easy to understand. Tasks/playbooks map directly to actions an ops/devops would have taken when doing it by hand
- compatible with "automation by packages". We were able to migrate from the old system in small steps.
Avoiding regressions
● all policies, rules, and good practices are written down in the main directory of the automation repo
● helps with introducing new people to the team or to the devops approach
- newbies are able to start committing to the repo quickly
- what's in GUIDELINES.md is law, and changing it requires wider consensus
- gives examples of how to deal with certain problems in a standardized way
● a few examples:
- limit the number of tags, each of them should be self-contained with no cross-dependencies
- do not include roles/tasks inside other roles, this creates hard-to-follow dependencies
- NEVER subset the list of hosts inside the role, do it in site.yml. Otherwise debugging roles/hosts will become difficult
- think twice before adding a new role and esp. groups. As infrastructure grows, it becomes hard to manage and/or creates "dead" code/roles
Ugly-hacks reusability
● one of the policies introduced was storing one-off scripts in a separate directory in our automation repo
● most of them are Ansible playbooks used just for one particular task (e.g. Squeeze->Wheezy migration)
● version-control everything!
● turned out to be very useful, some of the scripts were useful enough to be rewritten into a proper role or a tool
Apache2 automation
● available on GitHub and Ansible Galaxy:
https://galaxy.ansible.com/list#/roles/940
https://galaxy.ansible.com/list#/roles/941
● "base" role:
- is reused across 8 different production roles we have ATM
- contains basic monitoring, log rotation, package installation, etc.
- includes PHP setup in modphp/prefork configuration
- PHP disabled functions control
- basic security setup
- does not include any site-specific stuff
● "site" role:
- contains all site-specific stuff and dependencies (vhosts, additional packages, etc.)
- usually very simple
- more than one site role possible, only one base role though
● It is an example of how we make our roles reusable
Icinga
● automatically sets up monitoring based on inventory and host groups
● implements the devops approach - if a dev has root on a machine, they also have access to all monitoring stuff related to this system
● automatic host dependencies based on host groups
● provisioning new hosts is no longer so painful ("auto-discovery")
● all service configuration is stored as YAML files and used in templates
● role uses DNS data directly from inventory in order to make monitoring independent of DNS failures
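The inventory-driven monitoring setup on this slide could look roughly like the sketch below. It is an illustration only, not the actual role: the `INVENTORY` dictionary, the host names and addresses, and the `render_host` / `render_all` helpers are hypothetical, and the classic Nagios/Icinga 1 object syntax is assumed for the output. The point it shows is that the `address` comes from the inventory rather than from a DNS lookup.

```python
# Hypothetical inventory data; in the talk this comes from the Ansible inventory.
INVENTORY = {
    "web1": {"ansible_ssh_host": "10.0.0.11", "groups": ["apache2", "production"]},
    "db1": {"ansible_ssh_host": "10.0.0.21", "groups": ["mysql", "production"]},
}


def render_host(name, data):
    """Render one host object, taking the IP from the inventory, not from DNS."""
    lines = [
        "define host {",
        "    use generic-host",
        f"    host_name {name}",
        f"    address {data['ansible_ssh_host']}",
        f"    hostgroups {','.join(sorted(data['groups']))}",
        "}",
    ]
    return "\n".join(lines)


def render_all(inventory):
    """Render host objects for every inventory host, in a stable order."""
    return "\n\n".join(render_host(n, inventory[n]) for n in sorted(inventory))


if __name__ == "__main__":
    print(render_all(INVENTORY))
```

Adding a host to the inventory data is then enough for it to show up in the generated monitoring configuration, which is the "auto-discovery" effect mentioned on the slide.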
DNS migration
● at the beginning:
- dozens of authoritative name servers, each of them with a customized configuration, running ~100 zones, all created by hand
- the main reason for that was using DNS for switching between primary/secondary servers/services
● three phases:
- slurping configuration into Ansible
- normalizing the configuration
- improving the setup
● Python script which uses the Ansible API to fetch normalized zone configuration from each server
- results available in a neat hash, with per-host, per-zone keys!
- normalization using the named-checkconf tool
● use slurped configuration to re-generate all configs, this time using only the data available to Ansible
● "push-button" migration, after all recipes were ready :)
DNS automation
● secure: all zone transfers are signed with individual keys, ACLs are tight
● playbooks use DNS data directly from inventory
● changing/migrating slaves/masters is easy, NS records are auto-generated
● updates to zones automatically bump the serial, while still preserving the YYYYMMDDxx format
● CRM records are auto-generated as well
* see next slide about CRM automation
● DNS entries are always up to date thanks to some custom action modules
- ansible_ssh_host variables are harvested and processed into zones
- only custom entries and zone primary/secondary server names are now stored in YAML
- new hosts are automatically added to zones, decommissioned ones are removed
- auto-generation of reverse zones
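Bumping a zone serial while keeping the YYYYMMDDxx format can be sketched as below. This is a minimal illustration, not the code of the custom action modules from the talk; the `bump_serial` function and its behaviour when the two-digit counter runs out are assumptions.

```python
from datetime import date


def bump_serial(current, today):
    """Return the next zone serial in YYYYMMDDxx form.

    If the current serial is from an earlier date, start today at counter 00.
    Otherwise increment the two-digit counter, so the serial never goes
    backwards even if the stored date is ahead of today's.
    """
    today_prefix = int(today.strftime("%Y%m%d"))
    prefix, counter = divmod(current, 100)
    if prefix < today_prefix:
        return today_prefix * 100
    if counter >= 99:
        # Assumption: fail loudly instead of silently breaking the format.
        raise ValueError("serial counter exhausted for %d" % prefix)
    return prefix * 100 + counter + 1


if __name__ == "__main__":
    print(bump_serial(2015030105, date(2015, 3, 2)))
```

Passing `today` explicitly keeps the function deterministic, which makes it easy to check in isolation.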
Corosync & Pacemaker
● we have ~130 CRM clusters
● setting them up by hand would be "difficult" at best, impossible at worst
● available on Ansible Galaxy:
- https://galaxy.ansible.com/list#/roles/956
- https://galaxy.ansible.com/list#/roles/979
● follows the pattern from apache2_base
- "base" role suitable for manually set up clusters
- "cluster" role provides a service on top of base, with a few reusable snippets and a possibility for more complex configurations
● automatic membership based on Ansible inventory (no multicasts!)
● the most difficult part was providing synchronous handlers
● a few simple configurations are provided, like single service-single VIP
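Inventory-based membership without multicast might be generated along the lines of the sketch below. It assumes the Corosync 2.x `nodelist` syntax with the `udpu` (unicast UDP) transport, which may differ from what the published roles actually template; the `render_corosync` helper, the cluster name, and the addresses are hypothetical.

```python
def render_corosync(cluster_name, members):
    """Render a totem + nodelist fragment for unicast (udpu) membership.

    `members` maps host names to IP addresses taken from the inventory.
    """
    out = [
        "totem {",
        "    version: 2",
        f"    cluster_name: {cluster_name}",
        "    transport: udpu",
        "}",
        "nodelist {",
    ]
    # Sort by host name so the output is deterministic for a given member list.
    for nodeid, host in enumerate(sorted(members), start=1):
        out += [
            "    node {",
            f"        ring0_addr: {members[host]}",
            f"        nodeid: {nodeid}",
            "    }",
        ]
    out.append("}")
    return "\n".join(out)


if __name__ == "__main__":
    print(render_corosync("db", {"db1": "10.0.0.21", "db2": "10.0.0.22"}))
```

Because every node is listed explicitly, cluster members can find each other over unicast, which matters when hosts share no Layer 2 segment.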
User management automation
● initially we had neither time nor resources to set up full-fledged LDAP
● we needed:
- users should be able to log in even during a network outage
- removing/adding users, ssh keys, custom settings, etc. all had to be supported
- it had to be reusable/accessible in other roles (e.g. Icinga/monitoring)
- different privileges for dev, production and other environments
- UID/GID unification
● turned out to be simpler than we thought - users are managed using a few simple tasks and group_vars data. The rest is handled via variable precedence.
● migration/standardization required some effort though
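The effect of layering user data per environment can be modelled with the simplified sketch below. It is not Ansible's actual variable resolution (by default Ansible replaces a whole variable from a higher-precedence source rather than merging dictionaries), and it is not the role from the talk; the `resolve_users` helper, the layer names, and the sample users are all hypothetical.

```python
# Hypothetical data, standing in for group_vars of different scopes.
ALL = {
    "alice": {"uid": 2001, "groups": ["ops"], "sudo": True},
    "bob": {"uid": 2002, "groups": ["dev"], "sudo": False},
}
DEV = {"bob": {"sudo": True}}  # devs get more privileges on dev machines
PRODUCTION = {"bob": None}     # None means: no account in this environment


def resolve_users(layers):
    """Merge per-user settings from the lowest to the highest precedence layer.

    Later layers override earlier ones; a user set to None is removed.
    """
    users = {}
    for layer in layers:
        for name, settings in layer.items():
            if settings is None:
                users.pop(name, None)
            else:
                users[name] = {**users.get(name, {}), **settings}
    return users


if __name__ == "__main__":
    print(resolve_users([ALL, DEV]))
    print(resolve_users([ALL, PRODUCTION]))
```

Keeping UIDs only in the lowest layer is one way to get the UID/GID unification mentioned above, since environments override privileges but not identities.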
Inventory management
● standard Ansible inventory management becomes a bit cumbersome with hundreds of hosts:
- each host has to have ansible_ssh_host defined
- adding/removing a large number of hosts/groups required editing lots of files and/or one-off scripts
- IP address management using Google Docs does not scale ;)
● Ansible has a well-defined dynamic inventory API, with scripts available for AWS, Cobbler, Rackspace, Docker, and many others.
● we wrote our own, which is based on a YAML file, version-controlled by git:
- Python API that makes it easy to manipulate the inventory
- logic and syntax checking of the inventory
● available as open source: https://github.com/brainly/inventory_tool
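A dynamic inventory script is an executable that prints JSON when Ansible calls it with `--list`. The sketch below shows the general shape of that output; it is not the code of inventory_tool. The `HOSTS` dictionary stands in for the YAML file and, like the `build_inventory` helper and all host data, is hypothetical.

```python
import json

# Hypothetical inventory data; the real tool keeps this in a YAML file under git.
HOSTS = {
    "web1": {"ip": "10.0.0.11", "groups": ["apache2"]},
    "web2": {"ip": "10.0.0.12", "groups": ["apache2"]},
    "db1": {"ip": "10.0.0.21", "groups": ["mysql"]},
}


def build_inventory(hosts):
    """Build the JSON structure Ansible expects in response to `--list`."""
    inventory = {"_meta": {"hostvars": {}}}
    for name in sorted(hosts):
        data = hosts[name]
        # Per-host variables go under _meta.hostvars, so Ansible does not
        # need to call the script once per host.
        inventory["_meta"]["hostvars"][name] = {"ansible_ssh_host": data["ip"]}
        for group in data["groups"]:
            inventory.setdefault(group, {"hosts": []})["hosts"].append(name)
    return inventory


if __name__ == "__main__":
    print(json.dumps(build_inventory(HOSTS), indent=2, sort_keys=True))
```

With the IP addresses living in one data source, `ansible_ssh_host` no longer has to be maintained by hand for each host.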
Networking
● we are leasing our servers from Hetzner, no direct Layer 2 connectivity
● all tunnel setups are done using Ansible, a new server is automatically added to our network
● firewalls are set up by Ansible as well:
- OPS contribute the base firewall, DEVs can open the ports of interest for their application
- ferm at its base, for easy rule making and keeping the in-kernel firewall in sync with on-disk rules
- rules are auto-generated based on inventory, adding/removing hosts automatically reconfigures the FW
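Generating firewall rules from inventory data could be sketched as follows. This is an assumption-heavy illustration rather than the actual role: the `render_ferm` helper, the split into base and application ports, and the peer addresses are hypothetical, and the emitted rules follow ferm's configuration syntax as the author of this sketch understands it, so they should be checked against the ferm manual before use.

```python
def render_ferm(base_ports, app_ports, peers):
    """Render a ferm INPUT chain from ops base ports, dev app ports and peer IPs."""
    rules = [
        "table filter {",
        "    chain INPUT {",
        "        policy DROP;",
        "        mod state state (ESTABLISHED RELATED) ACCEPT;",
        "        interface lo ACCEPT;",
    ]
    # Ports contributed by ops (base) and by developers (application).
    for port in sorted(set(base_ports) | set(app_ports)):
        rules.append(f"        proto tcp dport {port} ACCEPT;")
    # Hosts from the inventory that are allowed to talk to this machine.
    for ip in sorted(peers):
        rules.append(f"        saddr {ip} ACCEPT;")
    rules += ["    }", "}"]
    return "\n".join(rules)


if __name__ == "__main__":
    print(render_ferm([22], [80, 443], ["10.0.0.11", "10.0.0.21"]))
```

Regenerating this file whenever the inventory changes is what makes adding or removing a host reconfigure the firewall without manual edits.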
Backups
● based on Bareos, an open-source Bacula fork
● new hosts are automatically set up for backup, extending storage space is no longer a problem
● authentication using certificates, a PITA without Ansible
Deployments
● deployment done by a Python script calling the Ansible API
● simple tasks implemented using Ansible playbooks
● complex logic implemented in Python
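The split between Python logic and playbook steps can be illustrated with the sketch below. Note the substitution: the talk's script calls the Ansible Python API, whose interface differs between Ansible versions, so this sketch shells out to the `ansible-playbook` command line instead. The playbook names, the `inventory.py` path, the `app_version` variable, and both helper functions are hypothetical.

```python
import subprocess


def build_command(playbook, inventory, limit=None, tags=(), extra_vars=None):
    """Build an ansible-playbook argument list for one deployment step."""
    cmd = ["ansible-playbook", "-i", inventory, playbook]
    if limit:
        cmd += ["--limit", limit]
    if tags:
        cmd += ["--tags", ",".join(tags)]
    for key, value in sorted((extra_vars or {}).items()):
        cmd += ["-e", f"{key}={value}"]
    return cmd


def deploy(version, hosts, run=subprocess.check_call):
    """Complex logic lives in Python; each simple step is a playbook run."""
    steps = ["deploy_fetch.yml", "deploy_switch.yml"]  # hypothetical playbooks
    for playbook in steps:
        # check_call raises on a non-zero exit code, so a failed step
        # stops the deployment before the next playbook runs.
        run(build_command(playbook, "inventory.py", limit=hosts,
                          extra_vars={"app_version": version}))
```

The `run` parameter is injectable so the ordering logic can be exercised without touching any real hosts, for example by passing `calls.append` and inspecting the recorded commands.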
Not everything is perfect
● Jinja2 template error messages are "difficult" to interpret
● templates sometimes grow to huge complexity
● Jinja2 is designed for speed, but with tradeoffs - some Python operators are missing and creating custom plugins/filters poses some problems
● multi-inheritance, problems with 2-headed trees
● speed, improved with "pipelining=True", containerization in the long run
● some useful functionality requires a paid subscription (Ansible Tower)
- RESTful API, useful if you want to push a new application version to production via e.g. Jenkins
- schedules - currently we need to push the changes ourselves
Dev, DevOps, Ops
● developers by default have RO access to the repo, RW on a case-by-case basis
● changes to systems owned by developers are done by developers, OPS only provide the platform and tools
● all non-trivial changes require a Pull Request and a review from Ops
● encrypt mission-critical data with Ansible Vault and push it directly to the repo
- *strong* encryption
- available to Ansible without the need for decryption (password still required though)
- all security-sensitive stuff can be skipped by developers with the "--skip-tags" option to ansible-playbook
Opensource! Opensource! Opensource!
● some of the things we mentioned can be found on our GitHub account
● we are working on open-sourcing more stuff
https://github.com/brainly
Conclusions
● time needed to deploy new markets dropped considerably
● increased productivity
● better cooperation with developers
● more manpower, Devs are no longer blocked so much, we can push tasks to them
● infrastructure as code
● versioning
● code reuse, less copy-pasting
We are hiring!
http://brainly.co/jobs/
Questions?
Thank you!

PLNOG14: Automation at Brainly - Paweł Rozlach

Ansible intro
Ansible introAnsible intro
#OktoCampus - Workshop : An introduction to Ansible
#OktoCampus - Workshop : An introduction to Ansible#OktoCampus - Workshop : An introduction to Ansible
#OktoCampus - Workshop : An introduction to Ansible
Cédric Delgehier
 
[HKOSCON][20180616][Containerized High Availability Virtual Hosting Deploymen...
[HKOSCON][20180616][Containerized High Availability Virtual Hosting Deploymen...[HKOSCON][20180616][Containerized High Availability Virtual Hosting Deploymen...
[HKOSCON][20180616][Containerized High Availability Virtual Hosting Deploymen...
Wong Hoi Sing Edison
 
Enabling ceph-mgr to control Ceph services via Kubernetes
Enabling ceph-mgr to control Ceph services via KubernetesEnabling ceph-mgr to control Ceph services via Kubernetes
Enabling ceph-mgr to control Ceph services via Kubernetes
mountpoint.io
 
Introduction to Ansible
Introduction to AnsibleIntroduction to Ansible
Introduction to Ansible
CoreStack
 
Ansible.pdf
Ansible.pdfAnsible.pdf
Ansible.pdf
shaikshazil1
 
Kubernetes extensibility: CRDs & Operators
Kubernetes extensibility: CRDs & OperatorsKubernetes extensibility: CRDs & Operators
Kubernetes extensibility: CRDs & Operators
SIGHUP
 
Kubernetes extensibility: crd & operators
Kubernetes extensibility: crd & operators Kubernetes extensibility: crd & operators
Kubernetes extensibility: crd & operators
Giacomo Tirabassi
 
SCM Puppet: from an intro to the scaling
SCM Puppet: from an intro to the scalingSCM Puppet: from an intro to the scaling
SCM Puppet: from an intro to the scaling
Stanislav Osipov
 
Ansible & Salt - Vincent Boon
Ansible & Salt - Vincent BoonAnsible & Salt - Vincent Boon
Ansible & Salt - Vincent Boon
MyNOG
 
Ansible Automation to Rule Them All
PLNOG14: Automation at Brainly - Paweł Rozlach

  • 1. Automation at Brainly … or how to enter the world of automation in a “different way”.
  • 2. About Brainly
    World’s largest homework help social network, connecting over 40 million users monthly
    OPS stack:
    ● ~80 servers, heavy usage of LXC containers (~1000)
    ● 99.9% Debian, 1 Ubuntu host :)
    ● Nginx / Apache2, 2k reqs per sec
    ● 200 million page views monthly
    ● 700 Mbps peak traffic
    ● Python is dominant
    DEV stack:
    ● PHP
      - Symfony 2
      - SOA projects
      - 200 reqs per sec on the Russian version
    ● Erlang
      - 55k concurrent users
      - 22k events per sec
    ● Native apps
      - iOS
      - Android
  • 3. Starting point
    ● Puppet was not feasible for us
      - *lots* of dependencies which make containers bigger/heavier
      - problems with Puppet's declarative language
      - seemed incoherent, lacking integration of orchestration
      - steep learning curve
      - YMMV
    ● "packaging as automation" as an intermediate solution
      - dependency hell, installing one package could result in uninstalling others
      - inflexible, lots of code duplication in the debian/rules file
      - LOTS of custom bash and PHP scripts, usually very hard to reuse and not standardized
      - this was a dead end :(
    ● Ansible
      - initially used only for orchestration
      - maintaining it required keeping an up-to-date inventory, which later simplified and helped with lots of things
  • 4. First steps with Ansible
    ● we decided to move forward with Ansible and use it for setting up machines as well
    ● first project was the Nagios monitoring plugins setup
    ● turned out to be ideal for containers and our needs in general
      - very few dependencies to begin with (python2, python-apt), and a small footprint
      - "configured" Python modules are transferred directly to the machine, no need for local repositories
      - very light, no compilation on the destination host is needed
      - easy to understand: tasks/playbooks map directly to the actions an ops/devops person would perform by hand
      - compatible with "automation by packages": we were able to migrate from the old system in small steps
  • 5. Avoiding regressions
    ● all policies, rules, and good practices are written down in the automation repo's main directory
    ● helps with introducing new people into the team or with the devops approach
      - newbies are able to start committing to the repo quickly
      - what's in GUIDELINES.md is law, and changing it requires wider consensus
      - gives examples of how to deal with certain problems in a standardized way
    ● a few examples:
      - limit the number of tags; each of them should be self-contained, with no cross-dependencies
      - do not include roles/tasks inside other roles, this creates hard-to-follow dependencies
      - NEVER subset the list of hosts inside a role, do it in site.yml; otherwise debugging roles/hosts becomes difficult
      - think twice before adding a new role and especially new groups; as the infrastructure grows, it becomes hard to manage and/or accumulates "dead" code/roles
  • 6. Ugly-hacks reusability
    ● one of the policies we introduced was storing one-off scripts in a separate directory in our automation repo
    ● most of them are Ansible playbooks used for just one particular task (e.g. the Squeeze->Wheezy migration)
    ● version-control everything!
    ● this turned out to be very useful; some of the scripts proved useful enough to be rewritten into a proper role or tool
  • 7.
  • 8. Apache2 automation
    ● available on GitHub and Ansible Galaxy:
      https://galaxy.ansible.com/list#/roles/940
      https://galaxy.ansible.com/list#/roles/941
    ● "base" role:
      - is reused across 8 different production roles we have ATM
      - contains basic monitoring, log rotation, package installation, etc.
      - includes PHP setup in modphp/prefork configuration
      - PHP disabled functions control
      - basic security setup
      - does not include any site-specific stuff
    ● "site" role:
      - contains all site-specific stuff and dependencies (vhosts, additional packages, etc.)
      - usually very simple
      - more than one site role is possible, but only one base role
    ● it is an example of how we make our roles reusable
  • 9. Icinga
    ● automatically sets up monitoring based on the inventory and host groups
    ● implements the devops approach: if a dev has root on a machine, they also have access to all monitoring related to that system
    ● automatic host dependencies based on host groups
    ● provisioning new hosts is no longer so painful ("auto-discovery")
    ● all service configuration is stored as YAML files and used in templates
    ● the role uses DNS data directly from the inventory in order to make monitoring independent of DNS failures
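To illustrate how monitoring configuration can be derived from inventory data, here is a minimal Python sketch. The inventory layout, the host names, and the Nagios-style `define host` output (the object format read by Icinga 1) are assumptions made for illustration; the slides do not show the actual role or its templates.

```python
# Hypothetical inventory data; the real role reads it from Ansible's inventory.
INVENTORY = {
    "web1": {"ansible_ssh_host": "192.0.2.10", "groups": ["apache", "production"]},
    "db1": {"ansible_ssh_host": "192.0.2.20", "groups": ["mysql", "production"]},
}


def render_host(name, host):
    """Render one Nagios-style host object (assumed Icinga 1 syntax).

    A real definition would also inherit check settings from a template.
    """
    lines = [
        "define host {",
        f"    host_name   {name}",
        # Use the IP from the inventory rather than a DNS name, so that
        # monitoring keeps working during DNS failures.
        f"    address     {host['ansible_ssh_host']}",
        f"    hostgroups  {','.join(sorted(host['groups']))}",
        "}",
    ]
    return "\n".join(lines)


def render_all(inventory):
    """Render every host in a stable (sorted) order."""
    return "\n\n".join(render_host(n, h) for n, h in sorted(inventory.items()))
```

Because every host object is generated from the same data, adding a host to the inventory is enough to have it monitored on the next run.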
  • 10. DNS migration
    ● at the beginning:
      - dozens of authoritative name servers, each with a customized configuration, running ~100 zones, all created by hand
      - the main reason for that was using DNS for switching between primary/secondary servers/services
    ● three phases:
      - slurping the configuration into Ansible
      - normalizing the configuration
      - improving the setup
    ● a Python script uses the Ansible API to fetch the normalized zone configuration from each server
      - results are available in a neat hash, with per-host, per-zone keys!
      - normalization is done with the named-checkconf tool
    ● the slurped configuration is used to re-generate all configs, this time using only the data available to Ansible
    ● "push-button" migration, after all recipes were ready :)
  • 11. DNS automation
    ● secure: all zone transfers are signed with individual keys, ACLs are tight
    ● playbooks use DNS data directly from the inventory
    ● changing/migrating slaves/masters is easy, NS records are auto-generated
    ● updates to zones automatically bump the serial, while still preserving the YYYYMMDDxx format
    ● CRM records are auto-generated as well (see the slide about CRM automation)
    ● DNS entries are always up to date thanks to some custom action modules
      - ansible_ssh_host variables are harvested and processed into zones
      - only custom entries and zone primary/secondary server names are now stored in YAML
      - new hosts are automatically added to zones, decommissioned ones are removed
      - auto-generation of reverse zones
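Two of the ideas above, bumping a serial while keeping the YYYYMMDDxx format and generating forward and reverse records from harvested addresses, can be sketched in a few lines of Python. This is an illustration under assumptions, not the custom action modules mentioned in the slides: the function names, the `example.com` domain, and the record layout are hypothetical.

```python
import ipaddress
from datetime import date


def bump_serial(current, today):
    """Return the next zone serial, keeping the YYYYMMDDxx format.

    The first change of a day yields YYYYMMDD00; later changes on the
    same day just increment. The result is always greater than `current`,
    which is what secondaries rely on to notice a change.
    """
    base = int(today.strftime("%Y%m%d")) * 100
    return max(current + 1, base)


def records_for(hosts, domain):
    """Build A and PTR record lines from a name -> IP mapping."""
    forward, reverse = [], []
    for name, ip in sorted(hosts.items()):
        fqdn = f"{name}.{domain}."
        forward.append(f"{fqdn} IN A {ip}")
        # reverse_pointer gives e.g. "10.2.0.192.in-addr.arpa" for 192.0.2.10
        ptr = ipaddress.ip_address(ip).reverse_pointer
        reverse.append(f"{ptr}. IN PTR {fqdn}")
    return forward, reverse
```

For example, `bump_serial(2015030100, date(2015, 3, 2))` moves the serial to the new day's first value, while a second change on the same day only increments the two-digit counter.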
  • 12. Corosync & Pacemaker
    ● we have ~130 CRM clusters
    ● setting them up by hand would be "difficult" at best, impossible at worst
    ● available on Ansible Galaxy:
      - https://galaxy.ansible.com/list#/roles/956
      - https://galaxy.ansible.com/list#/roles/979
    ● follows the pattern from apache2_base
      - "base" role suitable for manually set up clusters
      - "cluster" role provides a service on top of base, with a few reusable snippets and the possibility of more complex configurations
    ● automatic membership based on the Ansible inventory (no multicasts!)
    ● the most difficult part was providing synchronous handlers
    ● a few simple configurations are provided, like single service, single VIP
  • 13. User management automation
    ● initially we had neither the time nor the resources to set up a full-fledged LDAP
    ● we needed:
      - users should be able to log in even during a network outage
      - removing/adding users, ssh keys, custom settings, etc. all had to be supported
      - it had to be reusable/accessible in other roles (e.g. Icinga/monitoring)
      - different privileges for dev, production and other environments
      - UID/GID unification
    ● turned out to be simpler than we thought
      - users are managed using a few simple tasks and group_vars data; the rest is handled via variable precedence
    ● migration/standardization required some effort though
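The idea of keeping one user list and letting more specific variable layers override it can be sketched in plain Python. This only mimics the concept of precedence; it is not Ansible's actual variable resolution (which by default replaces whole values rather than merging them), and the layer names, users, and attributes are hypothetical.

```python
# Hypothetical layers, ordered from least to most specific.
ALL = {
    "alice": {"uid": 2001, "sudo": False},
    "bob": {"uid": 2002, "sudo": False},
}
DEV = {"alice": {"sudo": True}}            # extra privileges on dev only
PRODUCTION = {"bob": {"state": "absent"}}  # bob has no production access


def effective_users(*layers):
    """Merge user definitions; later layers override earlier ones per user."""
    merged = {}
    for layer in layers:
        for name, attrs in layer.items():
            merged[name] = {**merged.get(name, {}), **attrs}
    # Users explicitly marked absent are dropped from the result.
    return {
        name: attrs
        for name, attrs in merged.items()
        if attrs.get("state", "present") == "present"
    }
```

Since UIDs live only in the base layer, every environment ends up with the same UID for the same user, while privileges differ per environment.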
  • 14. Inventory management
    ● standard Ansible inventory management becomes a bit cumbersome with hundreds of hosts:
      - each host has to have ansible_ssh_host defined
      - adding/removing a large number of hosts/groups required editing lots of files and/or one-off scripts
      - IP address management using Google Docs does not scale ;)
    ● Ansible has a well-defined dynamic inventory API, with scripts available for AWS, Cobbler, Rackspace, Docker, and many others
    ● we wrote our own, based on a YAML file version-controlled by git:
      - a Python API for manipulating the inventory easily
      - logic and syntax checking of the inventory
    ● available as open source: https://github.com/brainly/inventory_tool
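For readers unfamiliar with dynamic inventories: Ansible runs the inventory script with `--list` and expects JSON describing groups and, optionally, per-host variables under `_meta.hostvars`; `--host <name>` returns the variables of a single host. The sketch below shows that contract in Python. The `INVENTORY` dict stands in for a parsed YAML file and its layout is an assumption; it is not the format used by inventory_tool.

```python
import argparse
import json

# Hypothetical data, standing in for the contents of a YAML inventory file.
INVENTORY = {
    "hosts": {
        "web1": {"ip": "192.0.2.10", "groups": ["apache", "production"]},
        "db1": {"ip": "192.0.2.20", "groups": ["mysql", "production"]},
    }
}


def build_inventory(data):
    """Convert the data into the JSON structure Ansible expects for --list."""
    result = {"_meta": {"hostvars": {}}}
    for name, host in sorted(data["hosts"].items()):
        # Every host gets ansible_ssh_host, so nobody defines it by hand.
        result["_meta"]["hostvars"][name] = {"ansible_ssh_host": host["ip"]}
        for group in host["groups"]:
            result.setdefault(group, {"hosts": []})["hosts"].append(name)
    return result


def main(argv):
    """Handle the two calls Ansible makes: --list and --host <name>."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--list", action="store_true")
    parser.add_argument("--host")
    args = parser.parse_args(argv)
    inventory = build_inventory(INVENTORY)
    if args.host:
        print(json.dumps(inventory["_meta"]["hostvars"].get(args.host, {})))
    else:
        print(json.dumps(inventory, indent=2))
```

Calling `main(sys.argv[1:])` from a `__main__` guard and making the file executable turns this into a script that can be passed to Ansible with `-i`.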
  • 15. Networking
    ● we are leasing our servers from Hetzner, no direct Layer 2 connectivity
    ● all tunnel setups are done using Ansible, a new server is automatically added to our network
    ● firewalls are set up by Ansible as well:
      - OPS contribute the base firewall, DEVs can open the ports of interest for their application
      - ferm at its base, for easy rule making and keeping the in-kernel firewall in sync with on-disk rules
      - rules are auto-generated based on the inventory; adding/removing hosts automatically reconfigures the firewall
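As a rough illustration of generating firewall rules from inventory data, the Python sketch below renders a ferm-style INPUT chain from a fixed base rule set, a list of peer addresses, and the ports a developer wants opened. The base rules, function name, and inputs are assumptions; the rule text follows the style of ferm's documented examples but should be checked against the ferm manual before use.

```python
# Assumed base policy contributed by OPS; not taken from the slides.
BASE_RULES = [
    "policy DROP;",
    "mod state state (ESTABLISHED RELATED) ACCEPT;",
    "interface lo ACCEPT;",
    "proto tcp dport ssh ACCEPT;",
]


def render_input_chain(peers, dev_ports):
    """Render a ferm-style filter/INPUT chain as text.

    peers     -- IP addresses of other hosts taken from the inventory
    dev_ports -- TCP ports opened by developers for their application
    """
    rules = list(BASE_RULES)
    if peers:
        # One rule for all known hosts; regenerated whenever the inventory changes.
        rules.append(f"saddr ({' '.join(sorted(peers))}) ACCEPT;")
    for port in sorted(dev_ports):
        rules.append(f"proto tcp dport {port} ACCEPT;")
    lines = ["table filter {", "    chain INPUT {"]
    lines += [f"        {rule}" for rule in rules]
    lines += ["    }", "}"]
    return "\n".join(lines)
```

Because the peer list is an input rather than hand-written text, adding or removing a host only changes the data, and the next run rewrites the rule file.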
  • 16. Backups
    ● based on Bareos, an open-source Bacula fork
    ● new hosts are automatically set up for backup, extending storage space is no longer a problem
    ● authentication using certificates, a PITA without Ansible
  • 17. Deployments
    ● deployment is done by a Python script calling the Ansible API
    ● simple tasks are implemented using Ansible playbooks
    ● complex logic is implemented in Python
  • 18. Not everything is perfect
    ● Jinja2 template error messages are "difficult" to interpret
    ● templates sometimes grow to huge complexity
    ● Jinja2 is designed for speed, but with tradeoffs: some Python operators are missing, and creating custom plugins/filters poses some problems
    ● multi-inheritance, problems with 2-headed trees
    ● speed: improved with "pipelining=True", containerization in the long run
    ● some useful functionality requires a paid subscription (Ansible Tower)
      - RESTful API, useful if you want to push a new application version to production via e.g. Jenkins
      - schedules: currently we need to push the changes ourselves
  • 19. Dev, DevOps, Ops
    ● developers by default have RO access to the repo, RW on a case-by-case basis
    ● changes to systems owned by developers are done by developers, OPS only provide the platform and tools
    ● all non-trivial changes require a Pull Request and a review from Ops
    ● encrypt mission-critical data with Ansible Vault and push it directly to the repo
      - *strong* encryption
      - available to Ansible without the need for manual decryption (a password is still required though)
      - all security-sensitive stuff can be skipped by developers with the "--skip-tags" option to ansible-playbook
  • 20.
  • 21. Opensource! Opensource! Opensource!
    ● some of the things we mentioned can be found on our GitHub account: https://github.com/brainly
    ● we are working on open-sourcing more stuff
  • 22. Conclusions
    ● time needed to deploy new markets dropped considerably
    ● increased productivity
    ● better cooperation with developers
    ● more manpower: devs are blocked less often, and we can hand tasks over to them
    ● infrastructure as code
    ● versioning
    ● code reuse, less copy-pasting