Best practices
What is Ansible?
● A configuration management system
● Agentless design: the ‘controller’ (the admin’s localhost) supervises everything
● No mandatory data server to work with
● Uses ssh as the primary transport, but many other transports are available
An example
● Install
● Configure reverse-proxy for an application
Name of things
● Task + task + task => tasklist
● Tasks + vars + defaults => role
● Tasklist + hosts => play
● Play + play + … = playbook
● Playbooks + inventories = ansible repo (unofficial)
● Each module configures one specific thing on the host
● Examples:
○ template
○ apt
○ systemd
○ stat
○ postgresql_user
○ object_storage
○ cron
○ crm_resource
○ …
○ ~ 2200 modules in ansible 2.4
variables & templates
Ansible allows variables to be used to pass arguments to modules.
- Each variable is processed with the jinja2 template engine
- Tasks can register variables; there is also a set_fact module
- Each task, play and role may have its own locally scoped variables
- Nested definition is OK
- Recursion is prohibited
- Variables are expanded at the moment of use (in modules and conditions)
- Dedicated templates for configs are processed the same way as variables
● Are called if affected task was changed
● Are called once per play
● Can be flushed (called) earlier with meta: flush_handlers
● Have a play visibility
● Roles can notify each other’s handlers:
○ It’s complicated. Try to avoid this.
● Can listen to other handler’s notification
● Are called in order of declaration, not in order of notifications
● Error handling/retry policy: at most once
○ This is bad
handlers and includes
include_role import_role
outer action inner + outer
inner action outer action inner + outer action
INNER + OUTER handler outer only outer only outer only inner only inner only inner only
INNER handler only inner NOT FOUND NOT FOUND inner inner inner
OUTER handler only outer outer outer outer outer outer
● evaluated at the moment of execution
● Evaluated on every iteration for loops
● Separately for each entry in ‘block’
● Have a special hack for ‘is defined’
- All of them are slow and clumsy.
- Ansible 2.5: with_items → loop.
- Complicated branching is bad.
- Complexity is bad.
loop_var: user
label: ‘{{user.short_name}} at {{user_department}}’
● Each task ends in one of four states: failed, changed, ‘success (no change)’, or skipped
● A task should report ‘changed’ only if it actually changed something.
● A second run of the same task should yield ‘no change’
Important for:
- Testing
- Stability and audit
- Handler’s calls
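A minimal sketch of what "report a change only when something changed" looks like from the module side. The function name `ensure_line` and the scratch file are hypothetical, and real Ansible modules do much more (check mode, diffs, locking), so treat this as an illustration of the idea only.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True ("changed") only when the file had to be modified,
    and False ("ok, no change") when the line was already there.
    """
    existing = []
    if os.path.exists(path):
        with open(path) as handle:
            existing = handle.read().splitlines()
    if line in existing:
        return False  # nothing to do: a second run lands here
    with open(path, "a") as handle:
        handle.write(line + "\n")
    return True


def demo():
    """Run the same 'task' twice against a scratch file."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "foo.conf")
        first = ensure_line(target, "listen_port = 8080")
        second = ensure_line(target, "listen_port = 8080")
        return first, second
```

The first call reports a change and the second does not, which is the property that tests, audits and handler notifications rely on.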
Ansible is not
a programming language.
ansible developer
What does ‘big’ mean for an ansible project?
● 911 files
● 49132 lines
● 1196 files
● 52504 lines
● 1668 files
● 175745 lines
● Estimated yaml multiplicator for line count: ~x3
Not-a-code consequences
● Global variables everywhere
● foo: ‘{{foo + 1}}’ is officially broken. Forever.
● A practical call stack depth: 3-5
● It’s hard to change values in dictionaries and lists
● Data queries are crazy and complicated (json_query filter in Jinja2):
Sources of pain
● Dependencies
● Slow execution over ssh
● Memory hogging on includes (partially fixed in 2.4.3 and 2.5)
● Data query
● Rudimentary modularity
● Name conflicts
● Non-typed interfaces between roles
● Horrible error reporting for jinja2 templates/filters
● Unpredictable visibility for global variables
● Variable precedence is complicated and is broken in include_role.
Ansible is a muscle, not a skeleton
● Everything is permitted
● Most errors are detected at runtime
○ Or even silently succeeded with incorrect behavior
● No universally accepted style guide (* try ansible-lint)
● No well-known design patterns
● Best practices are at level of elementary school
Why do we still use Ansible?
Because it’s the best we have so far.
Some bones to build a skeleton
1. Execution flow: tasks and roles are assigned to hosts
2. Hosts are the first class objects to work with
3. Groups and groups inheritance to keep relations between hosts
4. Group variables
5. A simple iteration over lists
6. Transparent access to hosts ‘by ansible magic’
... I wish this list were longer...
Best practices
(High level)
No overengineering
It’s not java or python. Every act of overengineering bites you badly.
● Play is better than role
● Role is better than play, repeated twice in two different playbooks
● Tasklist in a role is better than a second role
● If you can join two roles through a play, use the play
○ If you can’t - use a wrapper role
● Play for host is better than delegate_to in task
● Delegate_to is better than poking into hostvars of other host
● Every time you iterate over hosts in a group, God kills a cat
Project layout: partitioning
● Common basics: users, basic packages (vim/iptables), hostname, ssh keys
● Project-specific simple configuration (standard software && simple configs)
● Non-trivial configuration for standard software: e.g. databases, pacemaker
● Non-standard software (custom apps, git deploy, venv, etc)
● Ad-hoc scripts, cron jobs, etc
● Monitoring
● Bootstrap code (run-once tasks, initialization, etc)
● Upgrade procedure(s)
● Recovery procedures
Project layout
Included in site.yaml
● Users and basic software
● Software installation and configuration
● Database creation
● Monitoring
Used separately:
● Bootstrap
● Update procedure
● Recovery procedure
● Helper scripts for staging
○ Copy data from production
○ Tests for recovered system
○ Creation/teardown for staging
● Inventory update/generation
Scope reduction
Each piece of code should work within its own domain:
If we configure application foo we shouldn’t touch random bits outside of foo:
❌ NO
● add nginx configuration for foo
● use this magic query to find database IP
● transform list of users from global userlist to foo format
✅ YES
● Use wrapper role to configure nginx (include_role, import_role)
● Use role to search database IP
● Pass userlist explicitly from playbook or another wrapper role
There is no sane way to describe dependencies.
- Old style (dependencies in meta) does not work and is being deprecated.
- New style include_role/import_role ignores meta dependencies.
The only way to create a dependency is to do it manually.
- import_role when role_foo_called is not defined
- set_fact: role_foo_called inside a role
Or just call it twice if it’s fast.
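The manual guard above is ordinary call caching. Here is a rough sketch of the same idea in plain Python, where the `called` set plays the part of the per-host `role_foo_called` fact. The names `make_role_runner` and `install_nginx` are made up for illustration.

```python
def make_role_runner(role):
    """Wrap `role` so that repeated calls for the same host are skipped."""
    called = set()  # stands in for the per-host 'role_foo_called' fact

    def run_once_per_host(host):
        if host in called:
            return "skipped"
        role(host)
        called.add(host)  # the equivalent of set_fact inside the role
        return "executed"

    return run_once_per_host


def demo():
    """Call the same 'role' twice for web1 and once for web2."""
    log = []
    install_nginx = make_role_runner(lambda host: log.append(host))
    results = [install_nginx("web1"), install_nginx("web1"), install_nginx("web2")]
    return results, log
```

The flag is set only after the role body has run, so a failed run is retried next time instead of being remembered as done.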
Explicit dependencies
Name it! Name it right!
● Everything should have a hyperonym (common name for few things)
○ F.e. ‘configuration playbooks’ VS ‘script playbooks’
○ Configuration playbooks should be linted to the perfection
○ Script playbooks may have unconditional ‘command/shell’ with ‘changed always’ status
● Different types of groups
○ F.e. ‘Execution groups’ VS ‘groups for variables’
○ Groups for variables should never have assigned tasks (f.e. hosts: database_settings)
● Name your components!
○ F.e. ‘bgp-push’ VS ‘bgp-pull’, ‘agents’, ‘central’, ‘external_access’, etc.
“Naming things” is the 2nd hard computing problem
Best practices
(low-level details)
Simple tricks
● ansible -i staging --list-hosts all
● ansible-playbook -i staging site.yaml --list-tags
○ Tags should have meaning!
● ansible-playbook -i staging site.yaml --check --diff
Ansible-lint !!!!!!!!!!111 one one one
● Points to subtle errors in playbook code
● Best practices (handlers vs “when: foo|changed” filter)
● Clarity. If the linter understands it, people will understand it too.
● Forces more semantics onto shell/command
How much time does it take?
● ~ 30 lint warnings per hour.
● I cleared my project within 4 hours. There were 3 real-life bugs and 10 minor improvements, all found by ansible-lint
Shell and command modules
● Main source of chaos if used carelessly
● Rules:
○ If they gather information: changed_when: False
○ If they are idempotent: find a way to report changes.
○ If they are not idempotent: use only after query:
■ when: ‘foo’ in previous_query.stdout
■ when: previous_query.rc == 2
● You can refactor if those modules are idempotent
● You can not refactor if those modules are not idempotent
shell drama
And if I can’t detect changes or failure?
You are doing it wrong.
Find a way.
shell example
ip link set up command always returns 0, and never gives output.
❌ NO
- name: Link up
  shell: |
    ip link set up dev {{dev}}
✅ YES
- name: Check link status
  command: ip link show {{dev}}
  register: link_status
  changed_when: False
- name: Link up
  command: ip link set up dev {{dev}}
  when: "'UP' not in link_status.stdout"
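The query-then-act pattern can be tried without a real network interface. This is a sketch only: `link_up` and the `FakeIp` stand-in are hypothetical, and the fake output only loosely imitates what `ip link show` prints.

```python
def link_up(dev, run):
    """Bring `dev` up only if the status query says it is down.

    `run` is any callable that takes an argv list and returns the
    command's stdout as a string. Returns True when a change was made.
    """
    status = run(["ip", "link", "show", dev])  # read-only query, never 'changed'
    if "UP" in status:
        return False
    run(["ip", "link", "set", "up", "dev", dev])
    return True


class FakeIp:
    """Stand-in for the real `ip` binary so the sketch runs anywhere."""

    def __init__(self):
        self.up = False
        self.calls = []

    def __call__(self, argv):
        self.calls.append(argv)
        if argv[2] == "set":
            self.up = True
            return ""  # like the real command: exit code 0, no output
        flags = "<BROADCAST,MULTICAST,UP>" if self.up else "<BROADCAST,MULTICAST>"
        return "2: %s: %s mtu 1500" % (argv[3], flags)


def demo():
    """Run the 'task' twice and count how many commands were issued."""
    fake = FakeIp()
    return link_up("eth0", fake), link_up("eth0", fake), len(fake.calls)
```

The first run issues a query and a change, the second only the query, so the silent `ip link set up` never has to report anything itself.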
shell example #2
foobar does not report failures at all.
We want to run foobar add, and we can run foobar list.
❌ NO
- name: Add to foobar
  shell: |
    foobar add {{obj}}
✅ YES
- name: Check foobar status
  command: foobar list
  register: old_foobar_output
  changed_when: False
- name: Add to foobar
  shell: |
    foobar add {{obj}} && foobar list
  register: new_foobar
  when: obj not in old_foobar_output.stdout
  failed_when: obj not in new_foobar.stdout
Apt: update_cache
Theoretical question: is it updated or not?
For practical purposes the answer is: no change
Option 1: integrate into install
- name: Install foo
  become: yes
  apt:
    name: foo
    state: '{{foo_install_state}}'
    update_cache: '{{apt_update_cache}}'
    cache_valid_time: '{{apt_cache_valid_time}}'
Option 2: use without changes
- name: Update apt cache
  become: yes
  apt:
    update_cache: yes
    cache_valid_time: '{{cache_time}}'
  changed_when: False
Best practices
● Finds your bugs before production
● Helps to refactor
● Forces you to think of modularity
Development environment
Primary staging:
● Virtual machines or real servers. Imitate production as closely as possible
Development environment(s):
● Almost like staging, but faster and with omissions
● LXC (or docker) on localhost speeds up runs by ~30-50%
● Deploy containers with Ansible, drop them with Ansible
● Automate rebuild
● Delegate all Ansible tasks to CI/CD server (Jenkins?)
● One job for production, one for staging
● Software updates and other workflow tasks - separate jobs
● Production should be updated only through CI/CD server
○ Keep logs
○ Keep last deployed commit* in those logs
● *Do you use git for your playbooks? You should.
● Run production ‘full ansible run’ often.
○ Make it safe. Second full run = zero changes. Mandatory to have.
● Run staging ‘full ansible run’ before production for all changes.
○ It guards production and saves your face.
New and reinstalled servers
● Forget old ssh keys
● Remember new ones
● Install python and ssh keys, create users
● Install all upgrades, restart server
Per role tests
+ Ansible way to test roles
+ Easier to debug
- Time consuming
- No inter-role integration
- Often meaningless without a context
Variables & environments
Places to hide a variable
● Inventory (host, group_name:vars)
● inventory/host_vars
● inventory/group_vars
● host_vars
● group_vars [all.yaml, group_name.yaml]
● roles/default
● roles/vars
● ‘vars:’ in any task or role
● register in any task
● include_vars
● defaults/vars of imported role
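With this many hiding places, it helps to think of variable lookup as a stack of layers where the first layer that knows the name wins. The sketch below is heavily simplified: it models only five layers, while real Ansible precedence has many more levels and special cases. The function `resolve` is hypothetical, and the variable values are invented.

```python
from collections import ChainMap


def resolve(extra_vars, task_vars, host_vars, group_vars, role_defaults):
    """Return the effective variables; earlier arguments win."""
    return ChainMap(extra_vars, task_vars, host_vars, group_vars, role_defaults)


def demo():
    """Resolve three variables defined at different layers."""
    effective = resolve(
        extra_vars={},
        task_vars={"retry_timeout": 5},
        host_vars={"foo_listen_port": 8081},
        group_vars={"foo_listen_port": 8080, "domain_prefix": "staging"},
        role_defaults={"retry_timeout": 30, "foo_listen_port": 80},
    )
    return dict(effective)
```

Here `foo_listen_port` is defined in three places and only the host-level value survives, which is exactly the kind of surprise an unsupervised variable layout produces.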
Ansible variables without supervision
Rules to keep sanity
● host_vars are banned anywhere except an inventory
● Roles/vars should be avoided
● Roles should avoid exposing variables to other roles in the same play(book)
○ Reduce global state, OK?
○ If they do - this is called an ‘interface’. Document it.
■ Example: search-for-database-ip can set a variable db_ip.
● Environment-specific variables are kept in the inventory
● Project-specific variables are kept in group_vars
● Roles should use defaults for rarely changed variables
● Use local ‘vars:’ statement for task-local calculations
Variables and environments
● production/
● staging/
● lab1/
● user_list -> group_vars/all.yaml
● domain_prefix -> inventory/group_vars/all.yaml
● foo_listen_port -> group_vars/foo.yaml
● db_password -> inventory/group_vars/dbaccess.yaml
● retry_timeout -> roles/foo/defaults/main.yaml
Rule of thumb
You must be able to add another environment by creating a new inventory (file/directory) with no changes outside that
How long to think before adding a variable
Location                                                             Time to think   Documentation
roles/foo/tasks/*.yaml (vars section for task)                       5 seconds       no docs
roles/foo/defaults/main.yaml                                         30 seconds      role docs
roles/foo/tasks/*.yaml (register)                                    1 minute        no docs
roles/foo/tasks/*.yaml (set_fact, role-internal)                     1 minute        no docs
group_var                                                            10 minutes      role or project docs
Inventory                                                            30 minutes      role or project docs
roles/foo/tasks/*.yaml (set_fact, external use outside of the role)  60+ minutes     role and project docs
For use in a command line (ansible-playbook -e)                      60+ minutes     role and project docs
Assertions and validations
fail module with ‘when’ (from ceph-ansible):
- name: validating variables
  fail:
    msg: "please choose scenario"
  when:
    - osd_group_name is defined
    - osd_group_name in group_names
    - not containerized_deployment
    - osd_scenario == 'dummy'
assert module:
- name: Check ansible version
  run_once: True
  assert:
    that: "ansible_version.full|version_compare('2.4','>=')"
    msg: >
      "You must update Ansible to at least 2.4"
  delegate_to: localhost
  tags:
    - always
Tags proliferation
- name: Configure foo
  become: yes
  template: src=foo.conf.j2 dest=/etc/foo.conf
  notify: restart foo
  tags:
    - foo
    - configure
    - restart
    - become
    - ip
    - dont_do_like_this
Concise tags
Including tags:
● One tag - one scenario
● --tags your_tag should either:
○ Finish successfully for a new installation
○ Finish successfully for an existing
● If you have some tag for few plays in a playbook, maybe it’s better to split it to separate playbook and use
Excluding tags:
● Should be used with --skip-tags
● For long or complicated operations
● Each ‘always’ tag should have
additional tag for skip:
- debug: var=foo
  tags:
    - always
    - debug_foo
tag examples
- apt (all operations with apt, in all roles)
- registrations (all operations with registration in a project API, in all roles)
- foo_upgrade (all apt operations to install components of foo project)
- git (all operations related to git pull/clone)
- ip (all operations related to adding/removing IP addresses on server)
- discovery ( all ‘search-for-*-ip’ roles)
- services (tasks to configure shinken services, ~80 of them, shinken only)
- drop (specific for copy-database.yaml, tasks to drop database)
--limit
To limit or not to limit?
Line in a template:
allow_ip = {% for h in groups.all %} {{(hostvars[h]).ansible_default_ipv4.address}} {% endfor %}
ansible-playbook -i inventory test.yaml ✅
ansible-playbook -i inventory test.yaml --limit host1 ❌
fatal: [host2]: FAILED! => {"changed": false, "msg": "dict has no element ansible_default_ipv4"}
We need information about all hosts, but we have used --limit
1. Forbid the use of limits in the project 😟
2. Write a partial content 😓
3. Lineinfile on per-host basis 😦
4. Gather facts for all hosts forcefully 😥
5. Use fact cache 😕
6. Use external database 😖
7. Skip task if not a full run 🤔
Partial content
{% for h in groups.all %}
{% if (hostvars[h]).ansible_default_ipv4 is defined %}
{{(hostvars[h]).ansible_default_ipv4.address}}
{% endif %}
{% endfor %}
Good: none
Bad:
- incomplete config
- ‘changed’ on every run with a different --limit ❌
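To see why partial content yields a spurious ‘changed’, here is a small sketch that renders the same line with and without facts for every host. The function `render_allow_ip`, the host names and the addresses are all invented. The real rendering is done by jinja2, and this only imitates the skip-if-undefined logic.

```python
def render_allow_ip(all_hosts, hostvars):
    """Render the allow_ip line, skipping hosts without gathered facts."""
    addresses = [
        hostvars[h]["ansible_default_ipv4"]["address"]
        for h in all_hosts
        if "ansible_default_ipv4" in hostvars.get(h, {})
    ]
    return "allow_ip = " + " ".join(addresses)


def demo():
    """Compare a full run with a run where only host1 has facts."""
    hosts = ["host1", "host2"]
    facts = {
        "host1": {"ansible_default_ipv4": {"address": "192.0.2.1"}},
        "host2": {"ansible_default_ipv4": {"address": "192.0.2.2"}},
    }
    full = render_allow_ip(hosts, facts)
    # With --limit host1, facts are gathered for host1 only.
    limited = render_allow_ip(hosts, {"host1": facts["host1"], "host2": {}})
    return full, limited
```

The two renderings differ, so every switch between a full and a limited run rewrites the config file and fires its handlers.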
- name: Add host to config
  lineinfile: path=/etc/foo.conf line="host {{(hostvars[item]).ansible_default_ipv4.address}}"
  when: (hostvars[item]).ansible_default_ipv4 is defined
  with_items: '{{groups.all}}'
Good: survives --limit with no changes or broken config
Bad: old values are not removed
Note: can be used only if the config uses one IP per line
Forceful fact gathering
- setup: gather_subset=network
  delegate_to: '{{item}}'
  delegate_facts: yes
  with_items: '{{groups.all}}'
  when: (hostvars[item]).ansible_default_ipv4 is not defined
  tags:
    - always
    - gather_facts
Good:
- no random ‘changed’
- always full config
- removes old values
- fast (see ‘when’ part)
Bad:
- fails if any host is down or is not provisioned yet
Fact cache
● Do as in forceful fact gathering
● Set fact caching in ansible.cfg
● Hope it will be there
- Works most of the time
always - most = bugs sometime
External database
● Register each host in etcd/consul
● Query data on each run
Works with --limit
External service dependency (down/provision)
Removal of the old entities is a problem
Skip if not full run
- name: Configure foo
  template: src=foo.conf.j2 dest=/etc/foo.conf
  when: full_run
full_run: '{{play_hosts == groups.all}}'
- Works perfectly with --limit
- Won’t fail if some host is down and --limit was used
- Fast
- Updates and removes old data as needed on each full run
- Does not update config if --limit
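The full-run test itself is a one-liner. A hedged sketch follows; the function name `is_full_run` is made up. It compares sets rather than lists, because a plain list comparison such as `play_hosts == groups.all` also depends on the order of hosts.

```python
def is_full_run(play_hosts, all_hosts):
    """True when the play targets every host in the inventory.

    Compared as sets so that host ordering does not matter.
    """
    return set(play_hosts) == set(all_hosts)
```

With `--limit host1` the play host list shrinks, the check turns false, and the template task is skipped instead of writing an incomplete config.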
Template & task relationship
● Keep templates as simple as possible
● Use ‘vars:’ section for explicit variable declaration
● Never use global variables in a template. Exceptions:
○ Iterations over all hosts
○ Ansible built-in variables
○ A special global variable documented in a project and in a role
○ Very complicated queries. Use comments in the task to list used
variables inside the template.
If a template is small, use ‘copy’ with the ‘content’ argument to inline it
- copy:
    dest: /etc/foobar.conf
    content: |
      source_ip = {{ansible_default_ipv4.address}}
Debugging templates: variables
- debug: var={{item}}
  with_items:
    - myvar1
    - myvar2
    - ansible_default_ipv4
    - all_other_variables_in_template
Debugging templates: Jinja2
Explicit templatization in a separate playbook (f.e. temp.yaml)
- template:
dest: /tmp/foo.conf
delegate_to: localhost
transport: local
- some_var
- another_var
Templates everywhere
You don’t need to use ‘template’ to use jinja2. Every variable is a {{template}}.
- copy
- lineinfile
- blockinfile
- all file names for all copy/stat/file modules
- arguments to shell and command modules
- all other modules (apt, postgres_user, etc)
External Jinja2
- name: Ugly example
  argument: '{{(hostvars[var1]).cust_facts[3]|json_query("[?name="+ ..
- name: Better example
  foo: argument={{foo_argument}}
  vars:
    foo_argument: "{{lookup('template', 'foo_arguments.j2')}}"
Roles: structure
1. Use defaults for rarely changed values. Do not use hard-coded constants.
2. Split role in parts
3. Allow to call role parts independently
4. Allow to reuse part of the role
5. Use call caching
Nginx: install + configure site
- import_tasks: install.yaml
- import_tasks: configure_site.yaml
- import_role:
    name: nginx
  vars:
    nginx_site: ...
- name: install nginx
  apt: name=nginx state=present
  when: nginx_installed is not defined
  register: nginx_installed
Files in roles: vendor in role
- Easy to do: copy: src=myfile dest=/var/lib/foo/myfile
- Single authority
- Versions
- Keep golden artifacts in the ansible repo
Files in roles: external source
- A tidy git.
- Need external storage.
- Version control.
private apt repo || private git repo || swift container (bad!)
Wrapper role
We have application server foo which should reside behind nginx.
● Foo wants a database IP, plus an address and port to listen on
● Nginx needs a port to proxy_pass to, a domain, and ssl settings
Role foo configures foo only.
Role nginx configures any nginx site and needs a bunch of additional variables.
The wrapper role glues them together, but does not change anything in foo or nginx.
Wrapper role
- name: Configure foo for {{foo_source_ip}}
include_role: name=foo tasks_from=configure_foo
local_api_ip: '{{foo_local_ip}}'
local_api_port: '{{foo_local_port}}'
- name: Configure nginx for {{foo_source_ip}}
include_role: name=nginx tasks_from=configure_site
- name: 'rttgod_{{foo_source_ip}}'
listen_address: '{{foo_source_ip}}'
port: '{{foo_external_api_port}}'
proxy_pass: 'http://{{foo_local_ip}}:{{foo_local_port}}'
Include_role VS import_role
- Works as if the role were written in place of the import statement.
- Can override handlers
- Defaults are respected
(the imported role uses its own defaults, but does not change the parent’s defaults)
- Does not support loops
- Supports conditions:
- A condition is applied to each task in the imported role.
Include_role VS import_role
- Supports loops
- Absolute mess
- Broken in each new ansible release in a new way (hello, 2.5):
- Delegation
- Handlers
- Defaults vs set_fact
- Parent’s variable access
- include_tasks is much more reasonable, but requires more files and lines.
A proper looping with an include in a role
- name: Loop over something
  include_tasks: per_something.yaml
  with_items: '{{something}}'
- name: in per_something.yaml
  import_role: name=foo
  vars:
    var1: '{{item}}'
- name: A task in role ‘foo’
  foo: arg={{var1}}
Works in ansible 2.5!
● Avoid cross-role handlers (except for wrapper roles)
● Use meta: flush_handlers
At least once persistent handlers
- name: setup foo
  apt: name=foo state=present
  notify: foo installed
- … other tasks here…
- meta: flush_handlers
- name: check if restart is needed
  stat: path={{foo_flag}}
  register: foo_restart_flag
- block:
    - name: Restart foo
      service: name=foo state=restarted
    - name: cleanup restart flag
      file: path={{foo_flag}} state=absent
  when: foo_restart_flag.stat.exists
- name: foo installed
  file:
    path: '{{foo_flag}}'
    state: touch
foo_flag: /var/run/foo-inst.flag
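The flag file turns an "at most once" handler into an "at least once" one: the pending restart is recorded on disk and is forgotten only after the restart succeeds. A rough sketch of that logic, with made-up function names and a deliberately flaky restart:

```python
import os
import tempfile


def notify_installed(flag_path):
    """Handler: leave a flag on disk saying a restart is still owed."""
    with open(flag_path, "w"):
        pass


def restart_if_flagged(flag_path, restart):
    """Restart when the flag exists; drop the flag only after success."""
    if not os.path.exists(flag_path):
        return False
    restart()  # if this raises, the flag stays for the next run
    os.remove(flag_path)
    return True


def demo():
    """Fail the first restart, then show the next run retries it."""
    attempts = []

    def flaky_restart():
        attempts.append(1)
        if len(attempts) == 1:
            raise RuntimeError("service failed to restart")

    with tempfile.TemporaryDirectory() as workdir:
        flag = os.path.join(workdir, "foo-inst.flag")
        notify_installed(flag)
        try:
            restart_if_flagged(flag, flaky_restart)  # first run fails
        except RuntimeError:
            pass
        survived = os.path.exists(flag)
        second = restart_if_flagged(flag, flaky_restart)  # retried on next run
        third = restart_if_flagged(flag, flaky_restart)  # nothing left to do
        return survived, second, third, len(attempts)
```

A plain notify would have lost the restart after the first failure. Here the flag survives, the second run retries, and the third run has nothing to do.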
Plugin types
module ≠ plugin
- lookup_plugins/
- Load data from external sources
- Perform calculations and queries
- Iterate
- action_plugins/
- Do stuff on hosts
- vars_plugins
- inventory_plugins
All plugins are written in Python, and can be stored in ‘*_plugins/’ directory near a
playbook, or within a role.
Lookup plugins
1. Try to do it with ansible.
2. Try to do it with in-line jinja2 template
3. Try to do it with in-line json_query
4. Try to do it with external jinja2_template
5. If not, write a plugin
Rule of thumb: if the jinja2 template is more than ⅓ the size of the plugin (and its tests), write a plugin. If less, use jinja2.
Python in ansible complicates reading! A lot.
A plugin without tests is worse than jinja2 of any complexity.
Lookup plugins: an example
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type
from ansible.plugins.lookup import LookupBase
import copy

class LookupModule(LookupBase):
    def run(self, terms, **kwargs):
        data = terms or kwargs
        assigned_something = data['assigned_something']
        assigned_others = data['assigned_others']
        somethings = data['somethings']
        foo_source_ips = []
        for something in somethings:
            for data in something.get('datas', []):
                if data['other'] in assigned_others:
                    # The body of this branch was lost in the slide export;
                    # collecting the matching entry is an assumption.
                    foo_source_ips.append(data)
        return foo_source_ips
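Since a plugin without tests is worse than jinja2, the selection logic is worth keeping in a plain function that can be tested without Ansible. This is a sketch under assumptions: the name `select_datas` is invented, the data shape (a list of dicts with `name` and `datas`, each entry carrying `other`) is inferred from the slides, and the filter on `name` is borrowed from the json_query variant rather than from the plugin itself.

```python
def select_datas(assigned_something, assigned_others, somethings):
    """Return the 'datas' entries that belong to the assigned items."""
    selected = []
    for something in somethings:
        # Assumed filter: keep only the block whose name was assigned.
        if something.get("name") != assigned_something:
            continue
        for entry in something.get("datas", []):
            if entry.get("other") in assigned_others:
                selected.append(entry)
    return selected


def demo():
    """Select entries for 'alpha' limited to the 'x' group."""
    somethings = [
        {"name": "alpha", "datas": [
            {"other": "x", "foo_source_ip": "192.0.2.10"},
            {"other": "y"},
        ]},
        {"name": "beta", "datas": [{"other": "x"}]},
    ]
    return select_datas("alpha", ["x"], somethings)
```

The `run` method of a lookup plugin can then be a thin wrapper around such a function, and the function gets ordinary unit tests.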
Lookup plugins: an example
- name: Register IP
  uri:
    method: PUT
    url: '{{url}}'
    body_format: json
    body: '{"something": "{{item["something"]}}","other": "{{item["other"]”[data"]}}}"}'
    status_code:
      - 200
      - 201
      - 304
  register: reg_status
  changed_when: reg_status.status in [200, 201]
  with_my_custom_filter: '{{something}}'
Lookup plugins: json_query equivalent
- name: looping over
  include_tasks: process_other.yaml
  with_items: '{{selected_datas}}'
  loop_control:
    loop_var: data
    label: '{{other}} @ {{data.foo_source_ip|default("no ip")}}'
  when: data.foo_source_ip is defined and data.other in assigned_others
  vars:
    somethings: '{{global_config["somethings"]}}'
    query: "[?name=='{{assigned_something}}'].datas"
    selected_datas: '{{global_config.somethings|json_query(query)}}'
    foo_source_ip: '{{data.foo_source_ip}}'
    something: '{{assigned_something}}'
    other: '{{data.other}}'
Other plugins
I have no experience with them, sorry.
Key ideas for action plugins, when to write them:
- Too many too complicated command/shells in a playbook/role
- Needed reusability
- Better test coverage
- Complicated data types in use
Adding features Cleaning up the mess
Refactoring when adding features
● Use small steps
● Write a plan for refactoring before changing anything
● Paper drawing is advised.
● Use the ‘not changed’ status to confirm that refactoring did not change anything
● Use ansible-playbook --check --diff
● Refactor in two steps:
○ Change internals without changing the result
○ Then make small, simple changes that change the result
● Do not forget to add cleanup code if needed
○ Drop it later
● Each step should have separate commit with a multi-line description
○ You can do this, I believe in you!
Refactoring when cleaning up mess
- Find scenarios for execution
- Eliminate false ‘changed’
- Reduce spread between files (no hostvars!)
- Split plays into playbooks
- Split tasklist into roles
- Replace hardcoded values with variables
- In templates too!
- Do you remember about staging?
- Reduce complexity of queries and iterations
- Replace ‘shell/command’ with modules
- Ansible-lint
Refactoring example: Scraps from my table
● Write all ideas, even
● Write all variables and file names you’ve introduced or
● Draw arrows between objects
Final advice:
● Every role and every playbook cuts corners.
● Cut as few corners as possible.
● Each ‘cut corner’ has consequences.
● Amount of time dedicated to a role or to a playbook is a function of its
Be safe, be reasonable, and let ansible-lint be with you.

Automating MySQL operations with Puppet
03 ansible towerbestpractices-nicholas
03 ansible towerbestpractices-nicholas03 ansible towerbestpractices-nicholas
03 ansible towerbestpractices-nicholas
How I hack on puppet modules
How I hack on puppet modulesHow I hack on puppet modules
How I hack on puppet modules
Introduction to Ansible - Peter Halligan
Introduction to Ansible - Peter HalliganIntroduction to Ansible - Peter Halligan
Introduction to Ansible - Peter Halligan
Automation and Ansible
Automation and AnsibleAutomation and Ansible
Automation and Ansible
Getting big without getting fat, in perl
Getting big without getting fat, in perlGetting big without getting fat, in perl
Getting big without getting fat, in perl
Go replicator
Go replicatorGo replicator
Go replicator
Replication using PostgreSQL Replicator
Replication using PostgreSQL ReplicatorReplication using PostgreSQL Replicator
Replication using PostgreSQL Replicator
05. haskell streaming io
05. haskell streaming io05. haskell streaming io
05. haskell streaming io

Recently uploaded

Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance

Recently uploaded (20)

Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf

Best practices for ansible

  • 2. Introduction What is Ansible? ● A configuration management system ● Agentless design: the ‘controller’ (the admin’s localhost) supervises everything ● No mandatory data server to work with. ● Uses ssh as the primary transport, but many other transports are available.
  • 3. An example nginx: ● Install ● Configure reverse-proxy for an application
  • 5. Name of things ● Task + task + task => tasklist ● Tasks + vars + defaults => role ● Tasklist + hosts => play ● Play + play + … = playbook ● Playbooks + inventories = ansible repo (unofficial)
  • 6. modules ● Each module configures a specific thing on the host ● Examples: ○ template ○ apt ○ systemd ○ stat ○ postgresql_user ○ object_storage ○ cron ○ crm_resource ○ … ○ ~ 2200 modules in ansible 2.4
  • 7. variables & templates Ansible lets you use variables to pass arguments to modules. - Each variable is processed with the jinja2 template engine - Tasks can register variables; there is also a set_fact module - Each task, play and role may have its own local-scoped variables - Nested definitions are OK - Recursion is prohibited - Variables are expanded at the moment of use (in modules and conditions) - Dedicated templates for configs are processed the same way as variables
  • 8. handlers ● Are called if the notifying task reported ‘changed’ ● Are called once per play ● Can be flushed (called) earlier with meta: flush_handlers ● Have play visibility ● Roles can notify each other’s handlers: ○ It’s complicated. Try to avoid this. ● Can listen to other handlers’ notifications ● Are called in order of declaration, not in order of notification ● Error handling/retry policy: at most once ○ This is bad
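A minimal sketch of the handler features listed on this slide (notify, listen, and an early flush). The host group, file names and the topic name "restart web stack" are assumptions made for illustration, not taken from the talk.

```yaml
- hosts: web                      # hypothetical group
  become: yes
  tasks:
    - name: Configure nginx
      template:
        src: nginx.conf.j2        # hypothetical template
        dest: /etc/nginx/nginx.conf
      notify: restart web stack   # fires only if this task reports 'changed'

    # Run pending handlers now instead of waiting for the end of the play
    - meta: flush_handlers

  handlers:
    - name: Restart nginx
      service:
        name: nginx
        state: restarted
      listen: restart web stack   # subscribes to the notification topic
```

If the template task reports no change on a second run, the handler is not called, which is why idempotent tasks matter for handlers.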
  • 9. handlers and includes
                          | include_role                                        | import_role
                          | inner action | outer action | inner + outer action | inner action | outer action | inner + outer action
    INNER + OUTER handler | outer only   | outer only   | outer only           | inner only   | inner only   | inner only
    INNER handler only    | inner        | NOT FOUND    | NOT FOUND            | inner        | inner        | inner
    OUTER handler only    | outer        | outer        | outer                | outer        | outer        | outer
  • 10. Conditionals ● evaluated at the moment of execution ● Evaluated on every iteration for loops ● Separately for each entry in ‘block’ ● Have a special hack for ‘is defined’
  • 11. Loops
    - All of them are slow and clumsy.
    - Ansible 2.5: with_items → loop.
    - Complicated branching is bad.
    - Complexity is bad.
        loop_control:
          loop_var: user
          label: '{{user.short_name}} at {{user_department}}'
  • 12. idempotency ● Each task either fails, changes something, succeeds with no change, or is skipped ● A task should report ‘changed’ only if changes were actually made. ● A second run of the same task should yield ‘no change’ Important for: - Testing - Stability and audit - Handler calls
  • 13. Ansible is not a programming language. ansible developer
  • 14. What does ‘big’ mean for an ansible project? Kubespray ● 911 files ● 49132 lines Openstack-ansible ● 1196 files ● 52504 lines Openshift-ansible ● 1668 files ● 175745 lines ● Estimated yaml multiplier for line count: ~x3
  • 15. Not-a-code consequences ● Global variables everywhere ● foo: ‘{{foo + 1}}’ is officially broken. Forever. ● A practical call stack depth: 3-5 ● It’s hard to change values in dictionaries and lists ● Data queries are crazy and complicated (json_query filter in Jinja2):
  • 16. Sources of pain ● Dependencies ● Slow execution over ssh ● Memory hogging on includes (partially fixed in 2.4.3 and 2.5) ● Data query ● Rudimentary modularity ● Name conflicts ● Non-typed interfaces between roles ● Horrible error reporting for jinja2 templates/filters ● Unpredictable visibility for global variables ● Variable precedence is complicated and is broken in include_role.
  • 17. Ansible is a muscle, not a skeleton ● Everything is permitted ● Most errors are detected at runtime ○ Or even silently succeed with incorrect behavior ● No universally accepted style guide (* try ansible-lint) ● No well-known design patterns ● Best practices are at elementary-school level Why do we still use Ansible? Because it’s the best we have so far.
  • 18. Some bones to build a skeleton 1. Execution flow: tasks and roles are assigned to hosts 2. Hosts are first-class objects to work with 3. Groups and group inheritance to keep relations between hosts 4. Group variables 5. A simple iteration over lists 6. Transparent access to hosts ‘by ansible magic’ ... I wish this list were longer...
  • 20. No overengineering It’s not java or python. Every act of overengineering bites you badly. ● Play is better than role ● Role is better than play, repeated twice in two different playbooks ● Tasklist in a role is better than a second role ● If you can join two roles through a play, use the play ○ If you can’t - use a wrapper role ● Play for host is better than delegate_to in task ● Delegate_to is better than poking into hostvars of other host ● Every time you iterate over hosts in a group, God kills a cat
  • 21. Project layout: partitioning ● Common basics: users, basic packages (vim/iptables), hostname, ssh keys ● Project-specific simple configuration (standard software && simple configs) ● Non-trivial configuration for standard software: e.g. databases, pacemaker ● Non-standard software (custom apps, git deploy, venv, etc) ● Ad-hoc scripts, cron jobs, etc ● Monitoring ● Bootstrap code (run-once tasks, initialization, etc) ● Upgrade procedure(s) ● Recovery procedures
  • 22. Project layout Included in site.yaml ● Users and basic software ● Software installation and configuration ● Database creation ● Monitoring Used separately: ● Bootstrap ● Update procedure ● Recovery procedure ● Helper scripts for staging ○ Copy data from production ○ Tests for recovered system ○ Creation/teardown for staging ● Inventory update/generation
  • 23. Scope reduction Each piece of code should work within its own domain: If we configure application foo we shouldn’t touch random bits outside of foo: ❌ NO ● add nginx configuration for foo ● use this magic query to find database IP ● transform list of users from global userlist to foo format ✅ YES ● Use wrapper role to configure nginx (include_role, import_role) ● Use role to search database IP ● Pass userlist explicitly from playbook or another wrapper role
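The ‘YES’ column above could look roughly like the play below. This is a sketch under assumptions: the group foo_servers, the role foo and its parameters foo_db_ip and foo_users are hypothetical, while search-for-database-ip, db_ip and user_list follow names used elsewhere in these slides.

```yaml
# site.yaml fragment (illustrative names)
- hosts: foo_servers
  tasks:
    # The lookup lives in its own role; db_ip is its documented interface
    - import_role:
        name: search-for-database-ip

    # foo receives everything it needs explicitly and touches nothing else
    - import_role:
        name: foo
      vars:
        foo_db_ip: '{{ db_ip }}'
        foo_users: '{{ user_list }}'
```

The role foo then reads only foo_db_ip and foo_users, so it can be tested or reused without the global user list or a database query.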
  • 24. Explicit dependencies There is no sane way to describe dependencies. - Old style (dependencies in meta) does not work and is being deprecated. - New style include_role/import_role ignores meta dependencies. The only way to create a dependency is to do it manually: - import_role when role_foo_called is not defined - set_fact: role_foo_called inside the role Or just call it twice if it’s fast.
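A minimal sketch of the manual guard described on this slide. The role names foo and bar and the file paths in the comments are hypothetical; role_foo_called is the marker variable named on the slide.

```yaml
# roles/bar/tasks/main.yaml: bar needs foo to have run first
- import_role:
    name: foo
  when: role_foo_called is not defined

# roles/foo/tasks/main.yaml: keep this as the LAST task of foo
- name: Mark role foo as applied on this host
  set_fact:
    role_foo_called: true
```

With import_role the `when` condition is copied onto every task of foo, so the marker should be set last; setting it earlier would make the remaining tasks of the same call skip themselves.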
  • 25. Name it! Name it right! Examples: ● Everything should have a hypernym (a common name for a few things) ○ F.e. ‘configuration playbooks’ VS ‘script playbooks’ ○ Configuration playbooks should be linted to perfection ○ Script playbooks may have unconditional ‘command/shell’ with ‘changed always’ status ● Different types of groups ○ F.e. ‘Execution groups’ VS ‘groups for variables’ ○ Groups for variables should never have assigned tasks (f.e. hosts: database_settings) ● Name your components! ○ F.e. ‘bgp-push’ VS ‘bgp-pull’, ‘agents’, ‘central’, ‘external_access’, etc. “Naming things” is the 2nd hard problem in computing
  • 27. Simple tricks ● ansible -i staging --list-hosts all ● ansible-playbook -i staging site.yaml --list-tags ○ Tags should have meaning! ● ansible-playbook -i staging site.yaml --check --diff
  • 28. Ansible-lint !!!!!!!!!!111 one one one ● Points to subtle errors in the playbook code ● Best practices (handlers vs “when: foo|changed” filter) ● Clarity. If lint understands that, people understand that. ● Forces more semantics on shell/command How much time does it take? ● ~ 30 lint warnings per hour. ● I cleared my project within 4 hours. There were 3 real-life bugs and 10 minor improvements, all found by ansible-lint
  • 29. Shell and command modules ● The main source of chaos if used carelessly ● Rules: ○ If they gather information: changed_when: False ○ If they are idempotent: find a way to report changes. ○ If they are not idempotent: use only after a query: ■ when: ‘foo’ in previous_query.stdout ■ when: previous_query.rc == 2 ● You can refactor if those modules are idempotent ● You cannot refactor if those modules are not idempotent
  • 30. shell drama And if I can’t detect changes or failure? You are doing it wrong. Find a way.
  • 31. shell example
    The ip link set up command always returns 0 and never gives output.
    ❌ NO
        - name: Link up
          shell: |
            ip link set up dev {{dev}}
    ✅ YES
        - name: Check link status
          command: ip link show {{dev}}
          register: link_status
          changed_when: False
        - name: Link up
          command: ip link set up dev {{dev}}
          when: "'UP' not in link_status.stdout"
  • 32. shell example #2
    foobar does not report failures at all. We want to execute foobar add, and we can run foobar list.
    ❌ NO
        - name: Add to foobar
          shell: |
            foobar add {{obj}}
    ✅ YES
        - name: Check foobar status
          command: foobar list
          register: old_foobar_output
          changed_when: False
        - name: Add to foobar
          shell: |
            foobar add {{obj}} && foobar list
          register: new_foobar
          when: obj not in old_foobar_output.stdout
          failed_when: obj not in new_foobar.stdout
  • 33. Apt: update_cache
    Theoretical question: is it a change or not? For practical reasons the answer is: no change.
    Option 1: integrate into install
        - name: Install foo
          become: yes
          apt:
            name: foo
            state: '{{foo_install_state}}'
            update_cache: '{{apt_update_cache}}'
            cache_valid_time: '{{apt_cache_valid_time}}'
    Option 2: use without changes
        - name: Update apt cache
          become: yes
          apt:
            update_cache: yes
            cache_valid_time: '{{cache_time}}'
          changed_when: False
  • 35. Staging MUST HAVE STAGING AT ANY COST Staging: ● Finds your bugs before production ● Helps to refactor ● Forces you to think of modularity
  • 36. Development environment Primary staging: ● Virtual machines or real servers. Imitate production as closely as possible Development environment(s): ● Almost like staging, but faster and with omissions ● LXC (or docker) on localhost speeds up runs by ~30-50% ● Deploy containers with Ansible, drop them with Ansible ● Automate rebuild
  • 37. CI/CD ● Delegate all Ansible tasks to CI/CD server (Jenkins?) ● One job for production, one for staging ● Software updates and other workflow tasks - separate jobs ● Production should be updated only through CI/CD server ○ Keep logs ○ Keep last deployed commit* in those logs ● *Do you use git for your playbooks? You should. ● Run production ‘full ansible run’ often. ○ Make it safe. Second full run = zero changes. Mandatory to have. ● Run staging ‘full ansible run’ before production for all changes. ○ It guards production and saves your face.
  • 38. New and reinstalled servers Bootstrap.yaml: ● Forget old ssh keys ● Remember new ones ● Install python and ssh keys, create users ● Install all upgrades, restart server
  • 39. Per role tests + Ansible way to test roles + Easier to debug - Time consuming - No inter-role integration - Often meaningless without a context
  • 41. Places to hide a variable ● Inventory (host, group_name:vars) ● inventory/host_vars ● inventory/group_vars ● host_vars ● group_vars [all.yaml, group_name.yaml] ● roles/default ● roles/vars ● ‘vars:’ in any task or role ● register in any task ● include_vars ● defaults/vars of imported role Ansible variables without supervision
  • 42. Rules to keep sanity ● host_vars are banned anywhere except an inventory ● Roles/vars should be avoided ● Roles should avoid exposing variables to other roles in the same play(book) ○ Reduce global state, OK? ○ If they do - this is called an ‘interface’. Document it. ■ Example: search-for-database-ip can set a variable db_ip. ● Environment-specific variables are kept in the inventory ● Project-specific variables are kept in group_vars ● Roles should use defaults for rarely changed variables ● Use local ‘vars:’ statement for task-local calculations
  • 43. Variables and environments Environments: ● production/ ● staging/ ● lab1/ Variables: ● user_list -> group_vars/all.yaml ● domain_prefix -> inventory/group_vars/all.yaml ● foo_listen_port -> group_vars/foo.yaml ● db_password -> inventory/group_vars/dbaccess.yaml ● retry_timeout -> roles/foo/defaults/main.yaml Rule of thumb You must be able to add another environment by creating a new inventory (file/directory) with no changes outside that inventory.
  • 44. How long to think before adding a variable
    roles/foo/tasks/*.yaml (vars section for task)                       | 5 seconds   | no docs
    roles/foo/defaults/main.yaml                                         | 30 seconds  | role docs
    roles/foo/tasks/*.yaml (register)                                    | 1 minute    | no docs
    roles/foo/tasks/*.yaml (set_fact, role-internal)                     | 1 minute    | no docs
    group_vars                                                           | 10 minutes  | role or project docs
    Inventory                                                            | 30 minutes  | role or project docs
    roles/foo/tasks/*.yaml (set_fact, external use outside of the role)  | 60+ minutes | role and project docs. Mandatory!
    For use in a command line (ansible-playbook -e)                      | 60+ minutes | role and project docs. Mandatory!
  • 45. Assertions and validations
    fail module with ‘when’ (from ceph-ansible):
        - name: validating variables
          fail:
            msg: "please choose scenario"
          when:
            - osd_group_name is defined
            - osd_group_name in group_names
            - not containerized_deployment
            - osd_scenario == 'dummy'
    assert module:
        - name: Check ansible version
          run_once: True
          assert:
            that: "ansible_version.full|version_compare('2.4','>=')"
            msg: >
              "You must update Ansible to at least 2.4"
          delegate_to: localhost
          tags:
            - always
  • 46. Tags
  • 47. Tags proliferation - name: Configure foo template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo
  • 48. Tags proliferation - name: Configure foo template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure
  • 49. Tags proliferation - name: Configure foo template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure - restart
  • 50. Tags proliferation - name: Configure foo become: yes template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure - restart - become
  • 51. Tags proliferation - name: Configure foo become: yes template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure - restart - become - ip
  • 52. Tags proliferation - name: Configure foo become: yes template: src=foo.conf.j2 dest=/etc/foo.conf notify: restart foo tags: - foo - configure - restart - become - ip - dont_do_like_this
  • 53. Concise tags
    Including tags: ● One tag - one scenario ● --tags your_tag should either: ○ Finish successfully for a new installation ○ Finish successfully for an existing installation ● If you have some tag for a few plays in a playbook, maybe it’s better to split them into a separate playbook and use import_playbook.
    Excluding tags: ● Should be used with --skip-tags ● For long or complicated operations only. ● Each ‘always’ tag should have an additional tag for skipping:
        - debug: var=foo
          tags:
            - always
            - debug_foo
  • 54. tag examples - apt (all operations with apt, in all roles) - registrations (all operations with registration in a project API, in all roles) - foo_upgrade (all apt operations to install components of foo project) - git (all operations related to git pull/clone) - ip (all operations related to adding/removing IP addresses on server) - discovery ( all ‘search-for-*-ip’ roles) - services (tasks to configure shinken services, ~80 of them, shinken only) - drop (specific for copy-database.yaml, tasks to drop database)
  • 56. To limit or not to limit?
    Line in a template:
        allow_ip = {% for h in groups.all %} {{(hostvars[h]).ansible_default_ipv4.address}} {% endfor %}
    ansible-playbook -i inventory test.yaml ✅
    ansible-playbook -i inventory test.yaml --limit host1 ❌
        fatal: [host2]: FAILED! => {"changed": false, "msg": "dict has no element ansible_default_ipv4"}
  • 57. Solutions We need information about all hosts, but we have used --limit 1. Forbid the use of --limit in the project 😟 2. Write partial content 😓 3. Lineinfile on a per-host basis 😦 4. Gather facts for all hosts forcefully 😥 5. Use fact cache 😕 6. Use external database 😖 7. Skip task if not a full run 🤔
  • 58. Partial content
        {% for h in groups.all %}
        {% if (hostvars[h]).ansible_default_ipv4 is defined %}
        {{(hostvars[h]).ansible_default_ipv4.address}}
        {% endif %}
        {% endfor %}
    Good: none
    Bad:
    - incomplete config
    - ‘changed’ on every run with a different --limit ❌
  • 59. Lineinfile
        - name: Add host to config
          lineinfile:
            path: /etc/foo.conf
            line: "host {{(hostvars[item]).ansible_default_ipv4.address}}"
          when: (hostvars[item]).ansible_default_ipv4 is defined
          with_items: '{{groups.all}}'
    Good: survives --limit with no spurious changes or broken config
    Bad: old values are not removed
    Note: can be used only if the config uses one IP per line
  • 60. Forceful fact gathering
        - setup:
            gather_subset: network
          delegate_to: '{{item}}'
          delegate_facts: yes
          with_items: '{{groups.all}}'
          when: (hostvars[item]).ansible_default_ipv4 is not defined
          tags:
            - always
            - gather_facts
    Good:
    - no random ‘changed’
    - always full config
    - removes old values
    - fast (see ‘when’ part)
    Bad:
    - fails if any host is down or is not provisioned yet
  • 61. Fact cache ● Do as in forceful fact gathering ● Set fact caching in ansible.cfg ● Hope it will be there Good: - Works most of the time Bad: - ‘always’ minus ‘most of the time’ = bugs some of the time
  • 62. External database ● Register each host in etcd/consul ● Query data on each run Good: Works with --limit Bad: External service dependency (down/provision) Removal of the old entities is a problem
  • 63. Skip if not full run
        - name: Configure foo
          template: src=foo.conf.j2 dest=/etc/foo.conf
          when: full_run
          vars:
            full_run: '{{play_hosts == groups.all}}'
    Good:
    - Works perfectly with --limit
    - Won’t fail if some host is down and --limit was used
    - Fast
    - Updates and removes old data as needed on each full run
    Bad:
    - Does not update the config if --limit is used ✅
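The `full_run` check is a plain comparison of two host lists. As a hedged illustration in Python (not Ansible code), the sketch below uses a hypothetical helper `is_full_run`. Note one deliberate difference from the slide: the slide compares the lists directly, while this sketch compares sets so that host ordering cannot affect the result. Whether ordering can actually differ between `play_hosts` and `groups.all` is not something the slides state.

```python
def is_full_run(play_hosts, all_hosts):
    """Return True when the play targets every host in the inventory.

    Sets are compared instead of lists, so ordering is ignored
    (an assumption of this sketch, not taken from the slide).
    """
    return set(play_hosts) == set(all_hosts)


# Hypothetical inventory.
inventory = ["host1", "host2", "host3"]

full = is_full_run(["host3", "host1", "host2"], inventory)  # no --limit
limited = is_full_run(["host1"], inventory)                  # --limit host1
```

With `--limit host1` the check is false, so the template task is skipped and the existing config stays intact until the next full run.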
  • 65. Template & task relationship
    ● Keep templates as simple as possible
    ● Use the ‘vars:’ section for explicit variable declaration
    ● Never use global variables in a template. Exceptions:
      ○ Iterations over all hosts
      ○ Ansible built-in variables
      ○ A special global variable documented in a project and in a role
      ○ Very complicated queries. Use comments in the task to list the variables used inside the template.
  • 66. Simplify
    If a template is small, use ‘copy’ with the ‘content’ argument to inline it:
        - copy:
            dest: /etc/foobar.conf
            content: |
              source_ip = {{ansible_default_ipv4.address}}
  • 67. Debugging templates: variables
        - debug: var={{item}}
          with_items:
            - myvar1
            - myvar2
            - ansible_default_ipv4
            - all_other_variables_in_template
  • 68. Debugging templates: Jinja2
    Explicit templatization in a separate playbook (e.g. temp.yaml):
        - template: src=roles/somerole/templates/foo.conf.j2 dest=/tmp/foo.conf
          delegate_to: localhost
          connection: local
          vars:
            - some_var
            - another_var
  • 69. Templates everywhere
    You don’t need to use ‘template’ to use jinja2. Every variable is a {{template}}.
    - copy
    - lineinfile
    - blockinfile
    - all file names for all copy/stat/file modules
    - arguments to shell and command modules
    - all other modules (apt, postgresql_user, etc)
  • 70. External Jinja2
        - name: Ugly example
          foo:
            argument: '{{(hostvars[var1]).cust_facts[3]|json_query("[?name="+ ..
        - name: Better example
          foo: argument={{foo_argument}}
          vars:
            foo_argument: "{{lookup('template', 'foo_arguments.j2')}}"
  • 71. Roles
  • 72. Roles: structure
    1. Use defaults for rarely changed values. Do not use hard-coded constants.
    2. Split the role into parts
    3. Allow role parts to be called independently
    4. Allow parts of the role to be reused
    5. Use call caching
    Nginx: install + configure site
    roles/nginx/tasks/main.yaml:
        - import_tasks: install.yaml
        - import_tasks: configure_site.yaml

        - import_role:
            name: nginx
            tasks_from: configure_site.yaml
          vars:
            nginx_site: ...

        - name: install nginx
          apt: name=nginx state=installed
          when: nginx_installed is not defined
          register: nginx_installed
  • 73. Files in roles: vendor in role
    Good:
    - Easy to do: copy: src=myfile dest=/var/lib/foo/myfile
    - Single authority
    - Versions
    Bad:
    - Golden artifacts are kept in the ansible repo
  • 74. Files in roles: external source
    Good:
    - A tidy git.
    Bad:
    - Need external storage.
    - Version control.
    Examples: private apt repo || private git repo || swift container (bad!)
  • 75. Wrapper role
    We have an application server foo which should reside behind nginx.
    ● Foo wants the database IP, and a port and address to listen on
    ● Nginx needs a port to proxy_pass to, a domain, and ssl settings
    Role foo configures foo only. Role nginx configures any nginx site and needs a bunch of additional variables. The wrapper role glues them together, but does not change anything in foo or nginx.
  • 76. Wrapper role
        - name: Configure foo for {{foo_source_ip}}
          include_role: name=foo tasks_from=configure_foo
          vars:
            local_api_ip: '{{foo_local_ip}}'
            local_api_port: '{{foo_local_port}}'
        - name: Configure nginx for {{foo_source_ip}}
          include_role: name=nginx tasks_from=configure_site
          vars:
            nginx_sites:
              - name: 'rttgod_{{foo_source_ip}}'
                listen_address: '{{foo_source_ip}}'
                port: '{{foo_external_api_port}}'
                locations:
                  proxy_pass: 'http://{{foo_local_ip}}:{{foo_local_port}}'
  • 77. Include_role VS import_role
    import_role:
    - Behaves as if the role's tasks were written in place of the import.
    - Can override handlers
    - Defaults are respected (the imported role uses its own defaults, but does not change the parent's defaults)
    - Does not support loops
    - Supports conditions:
      - A condition is applied to each task in the imported role.
  • 78. Include_role VS import_role
    include_role:
    - Supports loops
    - Absolute mess
    - Broken in each new ansible release in a new way (hello, 2.5):
      - Delegation
      - Handlers
      - Defaults vs set_fact
      - Parent’s variable access
    - include_tasks is much more reasonable, but requires more files and lines.
  • 79. Proper looping with an include in a role
        - name: Loop over something
          include_tasks: per_something.yaml
          with_items: '{{something}}'

        - name: in per_something.yaml
          import_role: name=foo
          vars:
            var1: '{{item}}'

        - name: A task in role ‘foo’
          foo: arg={{var1}}
          delegate_to:
    Works in ansible 2.5!
  • 81. handlers ● Avoid cross-role handlers (except for wrapper roles) ● Use meta: flush_handlers
  • 82. At least once persistent handlers
    role/tasks/main.yaml:
        - name: setup foo
          apt: name=foo state=installed
          notify: foo installed
        - … other tasks here…
        - meta: flush_handlers
        - name: check if restart is needed
          stat: path={{foo_flag}}
          register: foo_restart_flag
        - block:
            - name: Restart foo
              service: name=foo state=restarted
            - name: cleanup restart flag
              file: path={{foo_flag}} state=absent
          when: foo_restart_flag.stat.exists
    handlers/main.yaml:
        - name: foo installed
          file:
            path: '{{foo_flag}}'
            state: touch
    role/vars/main.yaml:
        foo_flag: /var/run/foo-inst.flag
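Why a flag file turns an ‘at most once’ handler into ‘at least once’ can be shown with a small simulation. This is a sketch in plain Python, not Ansible code: `run_role`, `RestartFailed`, and the two restart callables are hypothetical stand-ins for the tasks on the slide, and a temporary directory replaces /var/run.

```python
import os
import tempfile


class RestartFailed(Exception):
    """Raised by the simulated restart when the service does not come up."""


def run_role(flag_path, package_changed, restart):
    """One simulated run of the role from the slide.

    package_changed: whether the install task reported 'changed'.
    restart: callable that restarts the service and may raise.
    """
    if package_changed:
        # Handler 'foo installed': touch the flag file.
        open(flag_path, "a").close()
    # 'check if restart is needed' plus the block guarded by 'when'.
    if os.path.exists(flag_path):
        restart()             # may raise; the flag then stays on disk
        os.remove(flag_path)  # cleanup happens only after a successful restart


restarts = []


def failing_restart():
    raise RestartFailed("service did not come up")


def working_restart():
    restarts.append("foo")


with tempfile.TemporaryDirectory() as tmp:
    flag = os.path.join(tmp, "foo-inst.flag")
    # First run: the package is installed, but the restart fails.
    try:
        run_role(flag, package_changed=True, restart=failing_restart)
    except RestartFailed:
        pass
    flag_after_failure = os.path.exists(flag)
    # Second run: nothing changed, yet the leftover flag still triggers a restart.
    run_role(flag, package_changed=False, restart=working_restart)
    flag_after_retry = os.path.exists(flag)
```

The second run reports no change in the install task, but the restart still happens because the flag survived the failed run.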
  • 84. Plugin types
    module ≠ plugin
    - lookup_plugins/
      - Load data from external sources
      - Perform calculations and queries
      - Iterate
    - action_plugins/
      - Do stuff on hosts
    - vars_plugins
    - inventory_plugins
    All plugins are written in Python, and can be stored in a ‘*_plugins/’ directory near a playbook, or within a role.
  • 85. Lookup plugins
    1. Try to do it with ansible.
    2. Try to do it with an in-line jinja2 template
    3. Try to do it with an in-line json_query
    4. Try to do it with an external jinja2 template
    5. If not, write a plugin
    Rule of thumb: if the jinja2 template is more than ⅓ of the size of the plugin (and its tests), write a plugin. If less, use jinja2.
    Python in ansible complicates reading! A lot.
    A plugin without tests is worse than jinja2 of any complexity.
  • 86. Lookup plugins: an example
        from __future__ import (absolute_import, division, print_function)
        __metaclass__ = type

        from ansible.plugins.lookup import LookupBase
        import copy


        class LookupModule(LookupBase):
            def run(self, terms, **kwargs):
                data = terms or kwargs
                assigned_something = data['assigned_something']
                assigned_others = data['assigned_others']
                somethings = data['somethings']
                foo_source_ips = []
                for something in somethings:
                    for data in something.get('datas', []):
                        if data['other'] in assigned_others:
                            foo_source_ips.append(data['foo_source_ip'])
                return foo_source_ips
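Since a plugin without tests is worse than jinja2, it helps to keep the selection logic in a plain function that can be tested without Ansible. The following is a minimal sketch under that assumption: `select_source_ips` and the sample data are hypothetical, and the function only mirrors the nested loop of the plugin above.

```python
def select_source_ips(somethings, assigned_others):
    """Collect foo_source_ip of every data entry whose 'other' is assigned."""
    source_ips = []
    for something in somethings:
        # Entries without a 'datas' key contribute nothing.
        for entry in something.get("datas", []):
            if entry["other"] in assigned_others:
                source_ips.append(entry["foo_source_ip"])
    return source_ips


# Hypothetical input data shaped like the structure the plugin expects.
sample = [
    {"name": "a", "datas": [
        {"other": "x", "foo_source_ip": "192.0.2.10"},
        {"other": "y", "foo_source_ip": "192.0.2.11"},
    ]},
    {"name": "b"},
    {"name": "c", "datas": [{"other": "x", "foo_source_ip": "192.0.2.12"}]},
]

selected = select_source_ips(sample, assigned_others=["x"])
```

The plugin's `run` method could then delegate to such a function, which keeps the Ansible-specific part thin.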
  • 87. Lookup plugins: an example
        - name: Register IP
          uri:
            method: PUT
            url: '{{url}}'
            body_format: json
            body: '{"something": "{{item["something"]}}","other": "{{item["other"]["data"]}}"}'
            status_code:
              - 200
              - 201
              - 304
          register: reg_status
          changed_when: reg_status.status in [200, 201]
          with_my_custom_filter: '{{something}}'
  • 88. Lookup plugins: json_query equivalent
        - name: looping over
          include_tasks: process_other.yaml
          with_items: '{{selected_datas}}'
          loop_control:
            loop_var: data
            label: '{{other}} @ {{data.foo_source_ip|default("no ip")}}'
          when: data.foo_source_ip is defined and data.other in assigned_others
          vars:
            somethings: '{{global_config["somethings"]}}'
            query: "[?name=='{{assigned_something}}'].datas"
            selected_datas: '{{global_config.somethings|json_query(query)}}'
            foo_source_ip: '{{data.foo_source_ip}}'
            something: '{{assigned_something}}'
            other: '{{data.other}}'
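What the query and the loop select can be spelled out step by step. The sketch below is plain Python written for illustration only; it does not call json_query or Ansible, and the helper names and sample data are hypothetical. It assumes three behaviours: the filter projection keeps the `datas` value of each element whose name matches and drops missing values, `with_items` flattens one level of nesting, and `when` is evaluated per item.

```python
def selected_datas(somethings, assigned_something):
    """Mimic the query [?name=='...'].datas: project 'datas' of matching elements."""
    return [
        s["datas"]
        for s in somethings
        if s.get("name") == assigned_something and s.get("datas") is not None
    ]


def flatten_one_level(items):
    """Mimic with_items, which flattens one level of nested lists."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


def looped_items(somethings, assigned_something, assigned_others):
    """Items for which the include would actually run (the 'when' condition)."""
    candidates = flatten_one_level(selected_datas(somethings, assigned_something))
    return [
        d for d in candidates
        if "foo_source_ip" in d and d.get("other") in assigned_others
    ]


# Hypothetical data.
somethings = [
    {"name": "a", "datas": [
        {"other": "x", "foo_source_ip": "192.0.2.20"},
        {"other": "x"},                                   # no IP: skipped
        {"other": "z", "foo_source_ip": "192.0.2.21"},    # not assigned: skipped
    ]},
    {"name": "b", "datas": [{"other": "x", "foo_source_ip": "192.0.2.22"}]},
]

result = looped_items(somethings, "a", ["x"])
```

Unlike the plugin on slide 86, this variant filters by name first, so only the `datas` of the assigned element are considered.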
  • 89. Other plugins
    I have no experience with them, sorry.
    Key ideas for action plugins, and when to write them:
    - Too many overly complicated command/shell calls in a playbook/role
    - Reusability is needed
    - Better test coverage
    - Complicated data types in use
  • 92. Refactoring when adding features
    ● Use small steps
    ● Write a plan for refactoring before changing anything
    ● Paper drawing is advised.
    ● Use the ‘not changed’ status to check that refactoring does not change anything
    ● Use ansible-playbook --check --diff
    ● Refactor in two steps:
      ○ Change internals without changing the result
      ○ Make small, simple changes that change the result
    ● Do not forget to add cleanup code if needed
      ○ Drop it later
    ● Each step should have a separate commit with a multi-line description
      ○ You can do this, I believe in you!
  • 93. Refactoring when cleaning up a mess
    - Find scenarios for execution
    - Eliminate false ‘changed’
    - Reduce spread between files (no hostvars!)
    - Split plays into playbooks
    - Split tasklists into roles
    - Replace hardcoded values with variables
      - In templates too!
      - Do you remember about staging?
    - Reduce complexity of queries and iterations
    - Replace ‘shell/command’ with modules
    - Ansible-lint
  • 94. Refactoring example: Scraps from my table ● Write all ideas, even discarded. ● Write all variables and file names you’ve introduced or changed ● Draw arrows between objects
  • 95. THE END
    Final advice:
    ● Every role and every playbook cuts corners.
    ● Cut as few corners as possible.
    ● Each ‘cut corner’ has consequences.
    ● The amount of time dedicated to a role or to a playbook is a function of its importance.
    Be safe, be reasonable, and let ansible-lint be with you.

Editor's Notes

  1. - about ansible, pre 2.0, bad 2.3, 2.4, small revolution at 2.5 - about my experience - expectation on audience. Someone knew some things better than me - some I stole from others, some are my own inventions Not in this presentation: vault, tower, network
  2. Why it’s simple Why it’s complicated
  3. A play or a playbook can not be in a role!
  4. Few examples here, they cover almost everything.
  5. Origin of Jinja Explain ‘moment of usage’
  6. Will explain ‘at least once’ VS ‘at most once’
  7. - delegate_to/include/loop will be explained later
  8. 2.5 - just cosmetics
  9. It’s bad. Too many places, too many ways of thinking
  10. Why so much on tags? Because tags are useful, but ansible gives no hint on how to use them and when to stop. I wanted to give counterexamples, but they are hard to show because it’s hard to show inconsistency on a short slide
  11. It should be in the refactoring part too. Pay attention to this.
  12. It doesn’t matter what this photo is about. The key is the spirit: what to do. There are many objects and their relationships are complicated. Draw them.