The document discusses automation best practices for companies. It recommends starting with simple automation that evolves over time, identifying common tasks for automation through issue tracking, and spending time automating processes for non-engineering teams. Automation should aim to make teams more self-sufficient through tools like chat interfaces that provide visibility, auditing, and enforce policies. Keeping automation interfaces simple but with detailed logging is advised.
2. Overview
1. Where to begin with automation (10m)
2. Tools and tips for building automation (15m)
3. Monitoring and maintaining tools you’ve built (15m)
3. If you want to build an engineering company,
hire more than 50% engineers.
5. Evolution of Automation
Automated systems evolve over time. Start simple.
Phase 1: Manually build a cluster
Phase 2: Automatically build a cluster
Phase 3: Self-service creation of a cluster
6. Start with the most common
Issue tracking tools help tell you where to automate
Look at the most common types of requests coming
into a team and begin automating those:
Deploys, Database Migrations, Access Requests,
Service Provisioning, etc.
If you’re small enough you should have a good sense
of what you spend the most time doing repetitively.
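As a rough illustration of finding the most common requests, the sketch below counts request types from an issue tracker export. The `tickets` list and the `top_requests` helper are hypothetical; a real export would come from whatever tracker you use.

```python
from collections import Counter

# Hypothetical export from an issue tracker: one request type per ticket.
tickets = [
    "deploy", "access request", "deploy", "database migration",
    "deploy", "access request", "service provisioning",
]


def top_requests(request_types, n=3):
    """Return the n most frequent request types with their counts."""
    return Counter(request_types).most_common(n)


# Print the request types that are the best candidates for automation.
for request_type, count in top_requests(tickets):
    print(f"{request_type}: {count}")
```

The request types at the top of the list are usually the first candidates to automate.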
7. Always Be An Unblocker
Identify Where Teams Are Waiting On Others
Before: New Feature Developed (Dev) -> Run Testing (QA) -> Deploy Code (Ops)
After: New Feature Developed (Dev) -> Run Testing (Dev) -> Deploy Code (Dev)
8. Automation Is Company Wide
Work Closely With All Of Your Teams
Too often, automation is prioritized only for engineering teams.
All teams benefit from even the simplest improvements to process.
Spend time with non-engineering teams and understand their workflows.
Small investments in these processes early on reduce the need to hire people to
do this work manually.
It’s generally never too early to automate, but
usually too late once it’s needed.
9. Section Review
Key Takeaways On Where To Begin
1. Automation evolves over time. Start simple, and build on those ideas.
2. Identify the most common requests through ticketing systems and
frequent asks by other teams.
3. Become an “Unblocker” to help speed up productivity across your
company.
4. Spend time talking across other teams to find out what automation
would make their lives easier.
13. Phase 3
Self Service Automated Deploys: 2 Minutes
Developer needs deploy -> Self-service deploys (2 minutes)
14. Phase 3
Feedback Of Status/Health Is Essential
Developer needs deploy -> Self-service deploys (2 minutes) -> Problems? Failures? Error rates?
15. Creating Self Service Tools
Multiple ways to build tools
Interface: Command Line
  Pros: Easy to use.
  Cons: Difficult to maintain versioning across many systems. Library dependencies (w/o compiled binaries).
Interface: API
  Pros: Easy for building complex tooling.
  Cons: Requires detailed documentation.
Interface: Web Interface
  Pros: Easy to interact with. Easy secondary execution methods in emergency.
  Cons: Have to also maintain a website/interface. Generally interfaces with APIs/Command Line tools in the backend.
Interface: Chat Interface
  Pros: Easy to use and easy to onboard new engineers. Easy secondary execution methods in emergency.
  Cons: Depends on API, Command Line tools, and config management on the back end.
16. Creating Self Service Tools
Chat is a powerful way to build tooling
Some people create web interfaces, but these become difficult to maintain,
and now you have to manage additional services and resources for the
portal.
The portal becomes one big project that must be managed as it grows and takes resources
from development teams.
Chat is a simple interface to develop for and maintain.
17. Advantages Of Chat Based
Don’t reinvent the wheel if you don’t have to
Visibility about what is happening in the environment across teams
Audit trail for quicker troubleshooting and added compliance
Self service to help teams work faster without being dependent on
others
Easy to enforce policy and good practice
18. Advantages Of Using Chat Automation
Visibility On What’s Happening In The Company
[Mackenzie Kosut 2:18 PM] @mrbot deploy production
[MrBot 2:18 PM] Deploying git commit 577f84e37 from Staging to Production for @mackenzie
[MrBot 2:19 PM] Deploy 577f84e37 out to 10% of Production
[MrBot 2:21 PM] Error levels healthy (0.02% down from 0.021%), proceeding with rest of deploy of 577f84e37 to Production
[MrBot 2:22 PM] Deploy 577f84e37 out to 100% of Production
[MrBot 2:27 PM] Deploy 577f84e37 health is good (0.019% down from 0.021%). Deploy 577f84e37 successful for @mackenzie
1. Developer runs deploy
2. Code is deployed to 10% of Production
3. Monitoring verifies code is healthy
4. System automatically finishes the deploy
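The four steps above can be sketched as a staged deploy that only continues when the error rate stays at or below a baseline. This is a minimal sketch, not the bot's actual implementation: `staged_deploy`, `deploy_to`, and `error_rate` are hypothetical names, and the two callables stand in for whatever deploy and monitoring systems you already run.

```python
from typing import Callable, List


def staged_deploy(
    commit: str,
    deploy_to: Callable[[str, int], None],  # hypothetical hook: deploy commit to N% of hosts
    error_rate: Callable[[], float],        # hypothetical hook: current error rate as a fraction
    baseline: float,
    canary_percent: int = 10,
) -> List[str]:
    """Deploy to a small slice first; continue only if errors stay at or below baseline."""
    messages = []
    deploy_to(commit, canary_percent)
    messages.append(f"Deploy {commit} out to {canary_percent}% of Production")

    current = error_rate()
    if current > baseline:
        # Stop here so the remaining hosts keep running the previous version.
        messages.append(
            f"[FAIL] Error rate {current:.3%} above baseline {baseline:.3%}; "
            f"halting deploy of {commit}"
        )
        return messages

    messages.append(
        f"Error levels healthy ({current:.3%}); proceeding with rest of deploy of {commit}"
    )
    deploy_to(commit, 100)
    messages.append(f"Deploy {commit} out to 100% of Production")
    return messages


# Usage with stand-in hooks; a chat bot would post each message to the channel.
for line in staged_deploy("577f84e37", lambda c, p: None, lambda: 0.0002, baseline=0.00021):
    print(line)
```

Returning the messages instead of printing inside the function keeps the deploy logic separate from the chat interface that reports it.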
19. Advantages Of Using Chat Automation
Ability To Audit & Backtrack Incidents
[Mackenzie Kosut 2:18 PM] @mrbot run migration #183
[MrBot 2:18 PM] Running migration #183 for @mackenzie
[MrBot 2:19 PM] [FAIL] MySQL Error Message: Table '%s' was locked with a READ lock and can't be updated
[AlertSystem 2:19 PM] [FAIL] Production Web (mysite.com) HTTPS failed healthcheck
[MrBot 2:21 PM] Migration failed
1. Developer runs automated migration using chat command
2. Migration fails
3. Website health check fails
4. Migration aborts
20. Advantages Of Using Chat Automation
Easy To Enforce Policy & Best Practice
1. Developer attempts unauthorized after-hours deploy
2. System prevents deploy and educates user on how to override
[Mackenzie Kosut 4:51 AM] @mrbot deploy production
[MrBot 4:51 AM] Sorry Mackenzie, only authorized users can run
production deploys after hours. Re-run with +escalate to initiate
approval process.
* Good practice is to always have any production change, regardless of size or complexity, reviewed by two people.
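A policy check like the one in the exchange above might look roughly like this. It is a sketch under stated assumptions: the business hours, the `AFTER_HOURS_USERS` set, and the `check_deploy` function are all invented for illustration, and a real bot would look these up from your own access system.

```python
from datetime import time

# Assumed policy values, for illustration only.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)
AFTER_HOURS_USERS = {"oncall"}  # hypothetical users allowed to deploy after hours


def check_deploy(user: str, now: time, escalate: bool = False) -> str:
    """Return the bot's reply to a production deploy request."""
    in_hours = BUSINESS_START <= now < BUSINESS_END
    if in_hours or user in AFTER_HOURS_USERS:
        return f"Deploying to production for @{user}"
    if escalate:
        # The approval workflow itself is out of scope for this sketch.
        return f"Approval requested for after-hours deploy by @{user}"
    return (
        f"Sorry {user}, only authorized users can run production deploys "
        "after hours. Re-run with +escalate to initiate approval process."
    )


print(check_deploy("mackenzie", time(4, 51)))
```

Because the refusal message names the override, the policy teaches the user the approved path instead of just blocking them.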
21. Large Number of Open Source Projects
Don’t reinvent the wheel if you don’t have to
Leverage many of the available tools for
building chat interfaces.
Many come with a large number of add-ons
and capabilities out of the box.
Large number of users also means a
bigger community for support and
guidance.
22. Building Chat Automation Is Easy
Create Automation With Familiar Tools
Deploy Production -> Ansible Wrapper -> deploy.yml
You don’t need to be a CoffeeScript expert. Simply use it
as a wrapper to execute code in your languages of choice.
23. Simplify The Interface For Familiar Tools
Example of an Ansible playbook that Hubot calls via CoffeeScript
# Launch EC2 instances from the control machine; runs only when ebs_optimized.stdout is 'true'.
- name: Launch instance with ebs
  local_action:
    ec2 keypair={{ keypair }}
    instance_type={{ instance_type.stdout }}
    image={{ base_ami_201704.stdout }}
    region={{ region }}
    count={{ count }}
    tenancy={{ tenancy }}
    wait=true
    wait_timeout={{ wait_timeout }}
    vpc_subnet_id={{ availability_zone.stdout }}
    ebs_optimized={{ ebs_optimized.stdout }}
    group=ansible_default,ansible_{{primary_role}}
  # Save the module result so later tasks can reference the new instances.
  register: ec2
  when: ebs_optimized.stdout == 'true'
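The slides use CoffeeScript as the thin wrapper around Ansible. The same idea is sketched here in Python instead, only to show how little the wrapper has to do: turn a chat message into an `ansible-playbook` command line. The function names and the `target_env` variable are hypothetical; `--extra-vars` is the standard `ansible-playbook` flag for passing variables.

```python
import shlex
from typing import Dict, List


def build_playbook_command(playbook: str, extra_vars: Dict[str, str]) -> List[str]:
    """Build an ansible-playbook argument list, one --extra-vars flag per variable."""
    command = ["ansible-playbook", playbook]
    for key, value in sorted(extra_vars.items()):
        command += ["--extra-vars", f"{key}={value}"]
    return command


def handle_chat_command(text: str) -> List[str]:
    """Map a chat message such as 'deploy production' to a playbook command."""
    words = shlex.split(text)
    if len(words) == 2 and words[0] == "deploy":
        # target_env is an assumed variable name that deploy.yml would have to define.
        return build_playbook_command("deploy.yml", {"target_env": words[1]})
    raise ValueError(f"Unrecognized command: {text!r}")


print(handle_chat_command("deploy production"))
```

To actually run the playbook, the returned list could be passed to `subprocess.run(command, check=True)`. Passing a list rather than a shell string avoids shell quoting problems with user-supplied text.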
24. Building a Chat Bot
Build automation around processes you understand
Instead of trying to build a messenger bot
from scratch, understand how your users
would interact with it.
It’s informative to act as a “mechanical turk” and
manually handle responses at first. Find the most
common requests and begin automating
those.
This works when launching smaller
services. Once at scale, you need a
different approach.
25. Section Review
Key Takeaways On Self Service Tools
1. Self service tools allow teams to move faster without being blocked on
others.
2. Teams need confidence to be self service, which requires proper
metrics and feedback on how their process is performing.
3. Automated processes should have a manual fallback mode in the event
of an emergency.
4. Spend time talking across other teams to find out what automation
would make their lives easier.
27. Self Service Is Useful Everywhere
Data Science & Analytics Is A Great Example
Before: Requests Data (Sales) -> Researches Request (Data) -> Provides Report (Data)
After: Requests Data (Sales) -> Views Report, Tweaks As Needed (Sales)
29. Self Service Is Useful Everywhere
Data Science & Analytics Is A Great Example
User -> SQL Client -> Database (Managing Users & Rights), Database (Managing Users & Rights)
Managing multiple users and grants across multiple databases can be frustrating.
30. Self Service Is Useful Everywhere
Data Science & Analytics Is A Great Example
User -> Abstracted Reporting Layer -> Database (Group User Per Role/Team), Database (Group User Per Role/Team)
Moving database access behind an abstraction layer allows centralized control and auditing.
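One way to picture the reporting layer is a single object that knows which role each user has, which datasets each role may read, and records every access decision. This is a minimal sketch with invented names (`ReportingLayer`, `can_read`, the sample roles and datasets), not a description of any particular product.

```python
from typing import Dict, List, Optional, Set


class ReportingLayer:
    """Central place to decide, and record, who may read which dataset."""

    def __init__(self, role_grants: Dict[str, Set[str]], user_roles: Dict[str, str]):
        self.role_grants = role_grants  # role/team -> datasets it may read
        self.user_roles = user_roles    # user -> role/team
        self.audit_log: List[str] = []

    def can_read(self, user: str, dataset: str) -> bool:
        role: Optional[str] = self.user_roles.get(user)
        allowed = dataset in self.role_grants.get(role, set())
        # Every decision is logged, whether access was granted or denied.
        self.audit_log.append(
            f"user={user} role={role} dataset={dataset} allowed={allowed}"
        )
        return allowed


layer = ReportingLayer({"sales": {"orders"}}, {"ana": "sales"})
print(layer.can_read("ana", "orders"))
print(layer.audit_log[-1])
```

Grants are managed per role in one place, so adding a user means assigning a role rather than editing grants on every database.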
32. FAA Automation Paradox
This Applies To The World Of Infrastructure
https://www.faa.gov/c/content/dam/faa/regulations-policies/documents/rmh_ch07.pdf
“...because the pilot lacked critical skills and the flight crew
relied too heavily on an automated system it did not fully
understand.”
1. Build automation tools but understand how they work, what they
do, and when to use them.
2. Don’t build tools dependent on other tools. Dependency chains
like these complicate debugging.
3. Production automation systems should be simple to understand,
simple to run, and simple to troubleshoot.
33. Keep Your Automation Simple
Follow The Unix Tools Philosophy
Make each program do one thing well. To do a new job, build afresh rather than complicate old programs
by adding new "features".
Expect the output of every program to become the input to another, as yet unknown, program.
Don't clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don't
insist on interactive input.
Design and build software, even operating systems, to be tried early, ideally within weeks. Don't hesitate
to throw away the clumsy parts and rebuild them.
Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build
the tools and expect to throw some of them out after you've finished using them.
https://en.wikipedia.org/wiki/Unix_philosophy
34. Output Status & Failure
Keep Interface Simple, But Logging Detailed
1. Keep the output to the end user simple and readable
2. Always ensure you have detailed logging written somewhere
3. Failure messages should explain why the task failed and what action to take next.
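The three points above can be sketched with Python's standard `logging` module: the user sees one short line, the log gets the detail, and a failure carries its own next step. `TaskFailure`, `run_task`, and the logger name are hypothetical names chosen for this example.

```python
import logging

logger = logging.getLogger("automation")  # hypothetical logger name


class TaskFailure(Exception):
    """A failure that knows why it happened and what the user should do next."""

    def __init__(self, reason: str, next_step: str):
        super().__init__(reason)
        self.reason = reason
        self.next_step = next_step


def run_task(name, task):
    """Run task(); return a short user-facing message and log the details."""
    try:
        task()
    except TaskFailure as failure:
        # The full traceback goes to the log, not to the user.
        logger.error("task=%s failed reason=%s", name, failure.reason, exc_info=True)
        return f"[FAIL] {name}: {failure.reason}. Next step: {failure.next_step}"
    logger.info("task=%s succeeded", name)
    return f"[OK] {name}"


def locked_migration():
    raise TaskFailure("table is locked", "retry after the lock clears")


print(run_task("migration 183", locked_migration))
```

Where the detailed log ends up (file, syslog, a log service) is a handler configuration choice and is left out of the sketch.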
35. Don’t Underestimate Macros
Simple Local Tools Can Be Powerful To Your Organization
Manually: Open Ticket (ZenDesk) -> Look Up User ID In Admin Tool -> Search Site Usage In Analytics Tool -> Respond With Standard Response (step timings shown: 30s, 30s, 30s)
Macros: Open Ticket (ZenDesk) -> Look Up User In Admin Tool -> Search Site Usage In Analytics Tool -> Respond With Standard Response (step timings shown: 5s, 5s)
37. VoiceOps with Alexa
Looking forward at what’s possible
Alexa..
.. Ask production for error rate?
.. Ask production to ban IP 14.12.13.5
.. Ask production to deploy master
.. Ask development to spin down unused instances
.. Ask production to verify backups
.. Ask staging for current testing status
38. Finding Time To Build Automation
Let’s Be Honest, There Is Never A Right Time
1. Dedicate time weekly to work on automation. 2-4 hours, same day and
time, sit down and spend time just automating.
2. Bounce ideas off others about what to build next and how to
improve on what you have.
3. Try to automate ahead of problems, before the tasks become too
great. Dedicating time helps give you this advantage.
4. Think of the 2-4 hours of time as an investment. You will get multiple
times this free time back once better systems are in place.
39. Summary
Keep Interface Simple, But Logging Detailed
1. When building automation, build it around a self-service model.
2. Don’t reinvent the wheel. Build with the tools & frameworks that already
exist.
3. Keep your tools simple, not dependent on others, and easy to
understand.
4. Build easy to understand interfaces but retain detailed logging
5. Automation helps you build controls & policy as you grow
6. Don’t underestimate simple automation tools
“Always be unblocking others.”