This document provides an introduction to cloud engineering and site reliability engineering (SRE). It begins with a brief history of cloud computing and its emergence in the 2000s with services from Amazon, Google, Microsoft and others. It describes the roles of operators in ensuring stability and developers in pursuing agility. It defines key concepts like DevOps, SRE, and cloud computing. It discusses how DevOps breaks down silos between operators and developers. It also provides examples to illustrate why cloud infrastructure is important and offers resources for learning more.
2. A Bit Of History
● Universities and companies rented out computation time on mainframe computers. At the time, renting was one
of the few ways to access computing resources, as computers were too large and expensive to be
owned or managed by individuals.
● Cloud computing, however, didn’t become a mainstream reality and a popular term until the first decade of the
21st century.
● This decade saw the launch of cloud services like Amazon’s Elastic Compute Cloud (EC2) and Simple Storage Service
(S3) in 2006, Heroku in 2007, Google Cloud Platform in 2008, Alibaba Cloud in 2009, Windows Azure (now
Microsoft Azure) in 2010, IBM’s SmartCloud in 2011, and DigitalOcean in 2011.
https://do.co/2MTDQBT
4. Cloud Computing
- I don’t want to buy RAM, CPUs, hard
disks, power supplies, network cables, etc.
- I also need to run this JavaScript
code, currently running on my laptop,
somewhere that everybody across the
globe can access it.
- I also need a database to store my
users.
- Welcome Google Cloud, AWS, Azure,
DigitalOcean, Heroku.
Definitions
DevOps
- Developers want to push features
faster.
- Operators do not want new changes to
break production.
- This creates conflicting interests,
even though all parties serve the same
customers.
- DevOps is a set of practices that
breaks down the silos between developers,
operators, and other parts of the
organization.
- Key practices: agile methods, implementing
gradual changes, tooling and automation,
and measuring everything.
Site Reliability Engineering
- Originated at Google and is a
prescriptive way of implementing
DevOps principles.
- class SRE implements DevOps
- Share ownership of production with
developers, including tooling.
- Hold blameless postmortems and ensure
a failure doesn't happen the same way
again.
- Leverage CI/CD and canary
deployments, and automate yourself out
of the job.
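The canary deployment idea above can be sketched in a few lines: send a small share of traffic to the new version, compare its error rate with the stable version, and roll back if it is clearly worse. This is a minimal illustration, not a description of any particular platform. The function names `choose_version` and `canary_decision` and the 1% tolerance are assumptions made up for this example.

```python
import random


def choose_version(canary_percent: float, rng: random.Random) -> str:
    """Route one request: 'canary' for roughly canary_percent of traffic, else 'stable'."""
    return "canary" if rng.random() * 100 < canary_percent else "stable"


def canary_decision(
    stable_errors: int,
    stable_total: int,
    canary_errors: int,
    canary_total: int,
    max_extra_error_rate: float = 0.01,  # hypothetical tolerance, tune per service
) -> str:
    """Return 'wait', 'promote', or 'rollback' based on observed error rates."""
    if canary_total == 0:
        return "wait"  # no canary traffic observed yet
    stable_rate = stable_errors / stable_total if stable_total else 0.0
    canary_rate = canary_errors / canary_total
    # Roll back only if the canary is worse than stable by more than the tolerance.
    if canary_rate > stable_rate + max_extra_error_rate:
        return "rollback"
    return "promote"


if __name__ == "__main__":
    rng = random.Random(42)
    routed = [choose_version(5, rng) for _ in range(1000)]
    print("canary share:", routed.count("canary") / len(routed))
    print(canary_decision(10, 1000, 5, 100))
```

In practice the decision would usually be driven by monitoring data and wired into the CI/CD pipeline, so that promotion and rollback happen without manual steps.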
5. Operators - Stability / Developers - Agility
Developers:
- Algorithms
- Writing software, etc.
Operators:
- Network topologies
- Fault recovery
- High availability, etc.
DevOps Broke The Silo
- Operators and Developers
work in the same office.
- Joint Daily standups.
- More visibility
- Involve the
security/marketing/PR team
early in the process
6. Some Buzz Words - Don’t get distracted by them.
Infrastructure as a Service (IaaS)
Platform as a Service (PaaS)
Infrastructure as Code (IaC)
Domain Name System (DNS)
Software as a Service (SaaS)
Autoscaling
Load Balancing
Continuous Integration / Continuous Deployment (CI/CD)
Private Cloud
Cloud Native
Containers
Docker
Kubernetes
Virtual Machines
Linux
Serverless
Hybrid Cloud
Public Cloud
7. Why Is Cloud Important?
With the recent sanctions
Twitter is placing on
individuals, many people have
started moving to other
platforms like
https://parler.com and
https://gab.com. Gab runs on
bare metal, and this is what
the CEO had to say.
In the cloud, all you need
to do is click a button 💁
8. In your SRE journey, accept
failure as normal. Only then can
you plan for failure even before
it happens.
Being able to dive deep,
troubleshoot, and replicate
issues is a skill you need to
thrive in this space.
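One small, concrete way to plan for failure before it happens is to assume that calls to other services will sometimes fail and retry them with increasing delays. The sketch below is only an illustration of that mindset. The helper name `retry_with_backoff` and its default values are assumptions for this example, not something prescribed by SRE practice.

```python
import time


def retry_with_backoff(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on failure wait base_delay, then 2x, 4x, ... and try again.

    Re-raises the last exception once max_attempts is reached.
    The sleep parameter is injectable so the behaviour can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky():
        # Hypothetical operation that fails twice before succeeding.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("temporary failure")
        return "ok"

    print(retry_with_backoff(flaky, base_delay=0.1))
```

Retries only help with temporary failures. Being able to reproduce and troubleshoot the underlying issue remains the more important skill.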