Presenter: Perry Statham
SRE Squad Leader with IBM Cloud DevOps Services
In this presentation, the IBM DevOps Services SRE team will give a brief introduction to Site Reliability Engineering, then show how they adopted its principals in their existing enterprise organization.
An overview of Google's Site Reliability Engineering with a view toward possible incorporation in the IEEE P2675 DevOps security standard. (Creative Commons with credit.)
Getting started with Site Reliability Engineering (SRE)Abeer R
"Getting started with Site Reliability Engineering (SRE): A guide to improving systems reliability at production"
This is an intro guide to share some of the common concepts of SRE to a non-technical audience. We will look at both technical and organizational changes that should be adopted to increase operational efficiency, ultimately benefiting for global optimizations - such as minimize downtime, improve systems architecture & infrastructure:
- improving incident response
- Defining error budgets
- Better monitoring of systems
- Getting the best out of systems alerting
- Eliminating manual, repetitive actions (toils) by automation
- Designing better on-call shifts/rotations
How to design the role of the Site Reliability Engineer (who effectively works between application development teams and operations support teams)
Site Reliability Engineering (SRE) - Tech Talk by Keet SugathadasaKeet Sugathadasa
When it comes to Site Reliability Engineering, short for SRE, the resources available online are only limited to the books published by Google themselves. They do share some useful case studies that will help us understand what SRE is, and how to understand the concepts given in it, but they do not clearly explain how to build your own SRE team for your organization. The concept of SRE was cooked fresh within the walls of Google and later released to the general public as a practice for anyone to follow.
In this presentation I would like to give a brief introduction to SRE and why it is important to any Software Engineering organization. This is based on my experiences and learnings from leading a Site Reliability Engineering team for leading organizations in the US and Norway.
This presentation was conducted by me as a Tech Talk as an Associate Technical Lead at Creative Software Sri Lanka.
Overview of Site Reliability Engineering (SRE) & best practicesAshutosh Agarwal
In any software organization, stability & innovation are always at loggerheads - the faster you move, the more things will break. This talk defines what SRE org looks like at high-tech organizations (Google, Uber).
SRE (service reliability engineer) on big DevOps platform running on the clou...DevClub_lv
SRE (service reliability engineer). The talk is to explain the SRE philosophy and the principles of production engineering and operations in clouds.
(Language – English)
Pavlo is ADOP (Accenture DevOps Platform) Service Reliability Team Lead, SRE practitioner. Has more then 18 years of IT experience in Ops and Dev.
An overview of Google's Site Reliability Engineering with a view toward possible incorporation in the IEEE P2675 DevOps security standard. (Creative Commons with credit.)
Getting started with Site Reliability Engineering (SRE)Abeer R
"Getting started with Site Reliability Engineering (SRE): A guide to improving systems reliability at production"
This is an intro guide to share some of the common concepts of SRE to a non-technical audience. We will look at both technical and organizational changes that should be adopted to increase operational efficiency, ultimately benefiting for global optimizations - such as minimize downtime, improve systems architecture & infrastructure:
- improving incident response
- Defining error budgets
- Better monitoring of systems
- Getting the best out of systems alerting
- Eliminating manual, repetitive actions (toils) by automation
- Designing better on-call shifts/rotations
How to design the role of the Site Reliability Engineer (who effectively works between application development teams and operations support teams)
Site Reliability Engineering (SRE) - Tech Talk by Keet SugathadasaKeet Sugathadasa
When it comes to Site Reliability Engineering, short for SRE, the resources available online are only limited to the books published by Google themselves. They do share some useful case studies that will help us understand what SRE is, and how to understand the concepts given in it, but they do not clearly explain how to build your own SRE team for your organization. The concept of SRE was cooked fresh within the walls of Google and later released to the general public as a practice for anyone to follow.
In this presentation I would like to give a brief introduction to SRE and why it is important to any Software Engineering organization. This is based on my experiences and learnings from leading a Site Reliability Engineering team for leading organizations in the US and Norway.
This presentation was conducted by me as a Tech Talk as an Associate Technical Lead at Creative Software Sri Lanka.
Overview of Site Reliability Engineering (SRE) & best practicesAshutosh Agarwal
In any software organization, stability & innovation are always at loggerheads - the faster you move, the more things will break. This talk defines what SRE org looks like at high-tech organizations (Google, Uber).
SRE (service reliability engineer) on big DevOps platform running on the clou...DevClub_lv
SRE (service reliability engineer). The talk is to explain the SRE philosophy and the principles of production engineering and operations in clouds.
(Language – English)
Pavlo is ADOP (Accenture DevOps Platform) Service Reliability Team Lead, SRE practitioner. Has more then 18 years of IT experience in Ops and Dev.
How to bootstrap an SRE team into your company. How to hire them, what to have them work on and how to interact with them as a team. Finally some thought on general practices to consider before your SREs arrive. There are also kitten pictures.
<p>From <a href="https://en.wikipedia.org/wiki/Site_reliability_engineering" target="_blank">Wikipedia</a>: Site reliability engineering (SRE) is a discipline that incorporates aspects of software engineering and applies that to operations whose goals are to create ultra-scalable and highly reliable software systems.<p>
<p>Over the past year Acquia has built their own SRE team to help their products and services scale with the demand of our growing number of customers. We wish to share our experience so that others are enabled to do the same and reap the rewards.</p>
<p>This presentation will discuss how the SRE team came about at Acquia, what achievements we have made so far, and the lessons we have learned along the way. We will then show the steps on how to introduce SRE to your workplace so you can deliver more reliable and scalable services to your customers! We will specifically cover:</p>
<ul>
<li>SRE's basic concepts and history from Google</li>
<li>The management support you will need to get started</li>
<li>Introducing the idea of service level objectives and error budgets</li>
<li>Operational Responsibility Assessments as a tool to measure risk</li>
<li>Creating a Launch Readiness Checklist to standardize and improve product launches</li>
<li>Finding ideal candidates for your SRE team</li></ul>
<p>The intended audience are software engineers, system administrators, and managers that have a desire to improve how they do their work and how their products/services perform.</p>
SRE-iously! Defining the Principles, Habits, and Practices of Site Reliabilit...Tori Wieldt
How do you make DevOps magic when you aren’t Google? This talk will help whether you’re still figuring out how to create a site reliability practice at your company or you’re trying to improve the processes and habits of an existing SRE team.
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain what is and isn't toil, how to identify, measure and eliminate them.
Youtube channel here: https://youtu.be/EgpCw15fIK8
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain the term SRE (Site Reliability Engineering) and introduce key metrics for an SRE team SLI, SLO, and SLA.
Youtube Channel here: https://www.youtube.com/playlist?list=PLm_COkBtXzFq5uxmamT0tqXo-aKftLC1U
Independently from the DevOps movement but starting from the same problems, Google developed its own strategy defining a new specific role called SRE (Site Reliability Engineer). This introduction tries to explain the history and the concept of this methodology and to compare it with the DevOps manifesto to understand what does it mean to adopt DevOps and what does it mean to be an SRE and what the two things are sharing and where they diverge.
SRE-iously: Defining the Principles, Habits, and Practices of Site Reliabilit...New Relic
No matter how you define it, the Site Reliability Engineer (SRE) role is clearly expanding into more and more companies. To be effective in this new role, SREs must possess a depth of understanding of how different systems work together, how they fail, how they can be improved, and how they can best be designed and monitored.
How Small Team Get Ready for SRE (public version)Setyo Legowo
How Urbanindo small team engineering team implement Site Reliability Engineering (SRE) in their daily work life and why we choose SRE instead of ordinary DevOps.
Managing a team and project are quite synonymous. Especially, teams require effective distribution of responsibility / roles. Once that is setup, a proper process guides people to make progress. All this fits into a product lifecycle, which is essential to develop the right product, in the right way, and deliver it at the right time.
Site Reliability Engineer (SRE), We Keep The Lights On 24/7NUS-ISS
There are many phases in the software development cycle, from requirements to development and testing, but at the tail of the process, is an often overlooked aspect: deployment and delivery. With the paradigm shift of delivering on-site software to offering software-as-a-service, Site Reliability Engineering is beginning to take a greater role in product delivery.
This session aims to give a glimpse of the work that goes into site reliability engineering (SRE) and effort that goes into keeping a service going 24/7.
In this presentation I will speak how are the SRE and DevOps, what is a reliability. Also about the reliability approach in Competitive Gaming in Wargaming and show a few cases.
How to bootstrap an SRE team into your company. How to hire them, what to have them work on and how to interact with them as a team. Finally some thought on general practices to consider before your SREs arrive. There are also kitten pictures.
<p>From <a href="https://en.wikipedia.org/wiki/Site_reliability_engineering" target="_blank">Wikipedia</a>: Site reliability engineering (SRE) is a discipline that incorporates aspects of software engineering and applies that to operations whose goals are to create ultra-scalable and highly reliable software systems.<p>
<p>Over the past year Acquia has built their own SRE team to help their products and services scale with the demand of our growing number of customers. We wish to share our experience so that others are enabled to do the same and reap the rewards.</p>
<p>This presentation will discuss how the SRE team came about at Acquia, what achievements we have made so far, and the lessons we have learned along the way. We will then show the steps on how to introduce SRE to your workplace so you can deliver more reliable and scalable services to your customers! We will specifically cover:</p>
<ul>
<li>SRE's basic concepts and history from Google</li>
<li>The management support you will need to get started</li>
<li>Introducing the idea of service level objectives and error budgets</li>
<li>Operational Responsibility Assessments as a tool to measure risk</li>
<li>Creating a Launch Readiness Checklist to standardize and improve product launches</li>
<li>Finding ideal candidates for your SRE team</li></ul>
<p>The intended audience are software engineers, system administrators, and managers that have a desire to improve how they do their work and how their products/services perform.</p>
SRE-iously! Defining the Principles, Habits, and Practices of Site Reliabilit...Tori Wieldt
How do you make DevOps magic when you aren’t Google? This talk will help whether you’re still figuring out how to create a site reliability practice at your company or you’re trying to improve the processes and habits of an existing SRE team.
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain what is and isn't toil, how to identify, measure and eliminate them.
Youtube channel here: https://youtu.be/EgpCw15fIK8
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain the term SRE (Site Reliability Engineering) and introduce key metrics for an SRE team SLI, SLO, and SLA.
Youtube Channel here: https://www.youtube.com/playlist?list=PLm_COkBtXzFq5uxmamT0tqXo-aKftLC1U
Independently from the DevOps movement but starting from the same problems, Google developed its own strategy defining a new specific role called SRE (Site Reliability Engineer). This introduction tries to explain the history and the concept of this methodology and to compare it with the DevOps manifesto to understand what does it mean to adopt DevOps and what does it mean to be an SRE and what the two things are sharing and where they diverge.
SRE-iously: Defining the Principles, Habits, and Practices of Site Reliabilit...New Relic
No matter how you define it, the Site Reliability Engineer (SRE) role is clearly expanding into more and more companies. To be effective in this new role, SREs must possess a depth of understanding of how different systems work together, how they fail, how they can be improved, and how they can best be designed and monitored.
How Small Team Get Ready for SRE (public version)Setyo Legowo
How Urbanindo small team engineering team implement Site Reliability Engineering (SRE) in their daily work life and why we choose SRE instead of ordinary DevOps.
Managing a team and project are quite synonymous. Especially, teams require effective distribution of responsibility / roles. Once that is setup, a proper process guides people to make progress. All this fits into a product lifecycle, which is essential to develop the right product, in the right way, and deliver it at the right time.
Site Reliability Engineer (SRE), We Keep The Lights On 24/7NUS-ISS
There are many phases in the software development cycle, from requirements to development and testing, but at the tail of the process, is an often overlooked aspect: deployment and delivery. With the paradigm shift of delivering on-site software to offering software-as-a-service, Site Reliability Engineering is beginning to take a greater role in product delivery.
This session aims to give a glimpse of the work that goes into site reliability engineering (SRE) and effort that goes into keeping a service going 24/7.
In this presentation I will speak how are the SRE and DevOps, what is a reliability. Also about the reliability approach in Competitive Gaming in Wargaming and show a few cases.
S.R.E - create ultra-scalable and highly reliable systemsRicardo Amaro
Site Reliability Engineering enables agility and stability.
SREs use Software Engineering to automate themselves out of the Job.
My advice, if you want to implement this change in your company is to start with action items, alter your training and hiring, implement error budgets, do blameless postmortems and reduce toil.
https://events.drupal.org/dublin2016/sessions/sre-create-ultra-scalable-and-highly-reliable-systems
Prometheus is a next-generation monitoring system. It lets you see you not just what your systems look like from the outside, but also gives visibility into the internals and business aspects of your systems. This allows everyone to benefit, including both operations and developers. This talk will look at the concepts behind monitoring with Prometheus, how it's designed, why it's suitable for Cloud Native environments and how you can get involved.
Eoin Woods, CTO at Endava, provides insights into what we mean by agility and explores why successful Agile Transformation initiatives go beyond the development teams, in a whitepaper that discusses the six aspects of an organisation that need to evolve to achieve true agility.
Agile Project Failures: Root Causes and Corrective ActionsTechWell
Agile initiatives always begin with the best of intentions—accelerate delivery, better meet customer needs, or improve software quality. Unfortunately, some agile projects do not deliver on these expectations. If you want help to ensure the success of your agile project or get an agile project back on track, this session is for you. Jeff Payne discusses the most common causes of agile project failure and how you can avoid these issues—or mitigate their damaging effects. Poor project management, ineffective requirements development, failed communications, software development problems, and (non)agile testing can all contribute to project failure. Learn practical tips and techniques for identifying early warning signs that your agile project might be in trouble and how you can best get your project back on track. Gain the knowledge you need to guide your organization toward agile project implementations that serve the business and the stakeholders.
Top DevOps Best Practices for a Successful Transition in 2023SofiaCarter4
How to make a successful transition to DevOps in 2023. Explore 12 top DevOps Best Practices for a successful transition in 2023. https://bit.ly/3uDL2Vj
SRE Roundtable with 4 DevOps Ambassadors
A roundtable conversation about Site Reliability Engineering (SRE).
Please join us as four of the DevOps Institute's Ambassadors discuss SRE.
DevOps and SRE (and a little history of Google and SRE) - Helen Beal
SRE and ITIL - Donna Knapp
Benefits of SRE - Craig Pearson
SRE and Security - Niladri Choudhuri
And then the team will answer questions for 30+ minutes. We are very interested in hearing your questions. You can tweet them to @ITSMAcademy, add to the registration form, or bring them with you and chat in during the session.
How to improve Customer and Employee Experience with IT Service ManagementITSM Academy, Inc.
Chris Gallacher, Principal Consultant, Forrester Research
How to improve Customer and Employee Experience with ITSM
Please join us as Chris Gallacher provides the latest insights on how to assess and mature your IT services and capabilities by adopting industry best practices to enhance your Customer’s Experience by evaluating and addressing your Employee’s Experience.
Donna Knapp, Curriculum Development Manager, ITSM Academy
How to Create a Great Customer Experience
A key activity in the ITIL 4 service value chain is 'engage'. One reason why this activity is particularly important is that it represents the start of the customer journey. The most successful organizations understand and master the customer journey; often by walking in their customers’ shoes and experiencing the end-to-end journey for themselves.
In this session, Donna Knapp introduces concepts from the new ITIL® 4: Drive Stakeholder Value publication including ways to optimize the customer journey and create a great customer experience.
Donna Knapp, Curriculum Development Manager, ITSM Academy
Digital has changed everything! It has enabled organizations to introduce new business models and to significantly change how they do business. Most importantly, it has changed expectations regarding the development, delivery, and use of digital technologies. Speed is crucial, but not at the expense of quality and resilience. In this session, Donna Knapp introduces concepts from the new ITIL® 4 High Velocity IT publication including new ways of thinking and working when speed (across the organization, not just in IT) is key.
Mario Vivas, CEO, River Horse
ITIL4 is out and everyone is eager to learn about the updates this release introduces. This session will summarize the key changes that ITIL4 presents with a focus on the more operational processes that organizations deliver on a day to day basis (Incident, Problem, Change and so on). The ServiceNow platform has a powerful set of baseline features and optional plugin functions that can help an organization align with the recommendations of ITIL4.
Please join us and Mario to learn about how you can start applying ITIL4 concepts in your ServiceNow implementations!
Vicki Rogers, Senior Manager of Change, Georgia Tech
Learn how Georgia Tech adopted the IT change management process (AKA change enablement practice in ITIL v4), designed to help control the life cycle of strategic, tactical, and operational changes to IT services through standardized procedures.
In this webinar, host Vicki Rogers will briefly describe and define change management and how it fits into the ITSM model, touching on changes with ITIL v4.
Please join us as Vick
Is this the End of ITIL? NO, it is the end-to-end of ITIL!ITSM Academy, Inc.
Paul Wilkinson, GamingWorks
Is this the End of ITIL? NO, it is the end-to-end of ITIL!
As we celebrate the first year anniversary of ITIL 4, enterprises are working their way to understand and clarify the full impact and value of the update. Do these questions sound familiar to you (or have you asked them yourself?):
"ITIL 4 is theoretical, how do we translate theory into practice?"
"How do you demonstrate the relevance of ITIL 4?"
"How do we get buy-in from the other stakeholders in the value chain?"
"How do we get the business to buy-in to apply effective IT governance to prioritize scarce resources against exploding demands?"
"Do we need to shift from SLAs to XLAs?"
In this session we will explore how the MarsLander Experiential Learning Workshop has been effectively used to address these challenges and show concrete, pragmatic takeaways.
And yes, Paul will be talking about their simulation, but will also unveil lots of pragmatic advice you can use to better understand the value proposition of ITIL 4.
Artificial Intelligence (AI) & The Future of Employee ServiceITSM Academy, Inc.
Dan Turchin, Astound
he bots are coming… but not to take your job. Learn how and why AI and machine learning are making humans better and how organizations like McDonald’s and adidas are delivering better service today.
Astound co-founder Dan Turchin will discuss the future of AI in IT and provide actionable tips that will guarantee your AI initiatives succeed.
Key takeaways:
How artificial intelligence is impacting IT
Why machine learning accelerates shift left strategies
How AI and natural language processing (NLP) are used to improve KPIs like MTTR, FCR, cost per ticket, and customer satisfaction
How AI-driven automation benefits the entire service lifecycle from provisioning and monitoring to incident, problem, and change management
Greg Sanker, CIO, Author, speaker and practitioner with over 30 years of global IT experience
Every organization makes changes on a daily basis, and every change has the potential to go wrong and potentially do great harm to the business. How do you balance the need to manage risk, without slowing everything to a grinding halt to go to CAB?
Thankfully, ITIL 4 has some great news, and this webinar will get you up to speed on Change Control.
What’s in it for you?
Join if you:
Want to know what’s new in ITIL4 Change Control
Have an unpopular CAB
Need help doing change management in a DevOps world
Far more than a new name for the same old thing, Change Control helps you manage changes at the speed of business.
Introducing “The V*A*L*U*E Formula: Do more with less and reduce stress"ITSM Academy, Inc.
Ken Wendle, Author, speaker, consultant and instructor
A common thread shared by almost every best practice or methodology – from ITIL to Lean to DevOps - is VALUE. Understanding, focusing on and improving our value ultimately is the most important thing any organization can do.
The V*A*L*U*E Formula - an exciting new book by Ken Wendle - explains and explores the “five essential elements of value” which work together to help individuals and organizations define, focus, augment, differentiate and deliver their full value potential.
This webinar will introduce and discuss “The V*A*L*U*E Formula” and model.
Topics and Key Takeaways:
It comes down to VALUE
Crafting a compelling Value Vision
Aspects of Alignment
Facets of Leverage and Uniqueness
Unleashing value through effective Execution
Alan Nance, FISM, Managing Partner CitrusCollab LLLP ... thought leader, innovator, creative disruptor
Service Management exists to guarantee a valuable experience to customers and colleagues. Despite years of implementing best practices, the reputation of most technology departments is below par in the eyes of business leaders.
90% of CEOs feel they aren’t meeting their customer needs.
85% of CEOs don’t think technology is performing critical functions.
One of the core reasons is that technology teams are often trapped into measuring output rather than outcome, and KPIs on activities rather than XPIs that guarantee experience.
Luckily, ITIL 4 now connects to the world of design thinking and experience management with its focus on co-creation and outcomes. But how can we include eXperience Level Agreements (XLA) effectively and quickly?
In this presentation Alan will explain all-things XLA.
DJ Schleen, DevSecOps Evangelist
You wouldn’t make changes in a production environment without testing first, so why make changes to a production pipeline and risk breaking the CI/CD process? This talk will introduce the concept of Blue/Green *pipelines* to optimize flow and continuously experiment with security toolsets without interrupting critical DevSecOps infrastructure. If you are a Continuous Delivery Architect, or are interested in minimizing interruption while integrating security toolsets this talk will resonate with you.
Mark Blanke, OwlPoint
Whether you are new to ITIL, Captains of ITIL 3 - or somewhere in between, this webinar is for you. Join us as Mark Blanke, President of OwlPoint, shares with us 5 practical steps to map where your organization is today and how to plan for your journey.
Donna Knapp, ITSM Academy's Curriculum Development Manager / Author
ITSM Academy was so excited to introduce ITIL 4 in our January webinar. We can tell our 300+ audience members were excited as well by the number of questions we did get to answer. Join us in February for an extended Q&A session and the latest ITIL 4 news.
Donna Knapp
Join us for an introduction to ITIL 4. We’ll share – at a high level – key ITIL 4 concepts and how those concepts support the management of modern IT services. If your organization is leveraging agile, lean and DevOps practices, you will like what you see in this current evolution of the ITIL framework.
Mike Orzen, Lean Technologist / Author, Donna Knapp, Curriculum Development Manager / Author
It’s been said that where there is demand there is a value stream. It seems you can’t visit an IT website these days without reading about the importance of value stream management. As Agile, Lean and DevOps practices accelerate our ability to satisfy customer demand, it’s important that ITSM professionals understand the relationship between value streams and processes.
Join Mike Orzen, author of Lean IT and The Lean IT Field Guide and Donna Knapp, author of The ITSM Process Design Guide, for a conversation about value streams and the benefits of incorporating Lean thinking into our continuous improvement efforts. Mike and Donna will set the stage and then ‘open up the mics’ for your questions about anything related to value streams and value stream mapping.
Sean Mack, CEO, xOps
There is a prevalent myth that DevOps and IT Service Management (ITSM) and the IT Infrastructure Library (ITIL) are incompatible. However this supposition has very little basis. ITIL is a framework from which you can take or leave portions you like and, in fact, this framework provides many useful paradigms for DevOps help implementations.
There’s actually wide synergy between ITIL and DevOps. If we understand ITIL as a process framework and see DevOps as, primarily, a culture of collaboration, there is no reason we cannot have a process framework integrate very well with a culture of collaboration.
This talk looks at the overlap of ITIL and DevOps and outlines some practical ways in which the DevOps philosophy can be applied to ITIL and operations process management.
Modernizing Service Management Processes with Self-Service AccessITSM Academy, Inc.
Webinar - Nick Schneider - Driven by a range of initiatives: DevOps, Digital Transformation, Modern Operations, etc - today Infrastructure and Operations organizations are experiencing growing pressure to modernize established ITSM processes and practices. The impetus is help streamline and accelerate software and service delivery to safely improve their organization’s speed, agility and responsiveness to the market. This presentation will provide real world experiences and practical guidance on how to modernize key processes with Self Service Access to safely empower software developers with autonomy, feedback loops and visibility across the end to end SDLC (Software Development Lifecycle).
Service Portfolio - Preparing for the Future of your OrganizationITSM Academy, Inc.
Presenters: Michael Cardinal
The key to effective Service Management is the definition and management of services. And the key to managing services is a robust Service Portfolio and Service Portfolio Management process. But how do you go about setting up and implementing a Service Portfolio?
This webinar helps answer that question and provide other hints and tips around Service Portfolio Management.
Status Quo or Status Whoa? with Brad Utterback, an ITSM Academy WebinarITSM Academy, Inc.
Presenter: Brad Utterback, ITSM Consultant and Trainer
Whoa is often used to "stop" motion. But it is also an idiom for something that causes us to pay attention. For example, something that's new, creative and a game changer for the business. Continual Service Improvement is a set of best practices that can help you move off the " status quo" and on to the "status WHOA" by ensuring the service organization grows and changes with the business and ensuring your people are enabled to innovate and create new ways of being effective and efficient.
Please join us as we explore in more detail the benefits of Continual Service Improvement.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Site Reliability Engineering: An Enterprise Adoption Story (an ITSM Academy Webinar)
1. Site Reliability Engineering:
An Enterprise Adoption Story
Perry Statham
SRE Squad Leader, Bluemix DevOps Services
Ritchie Schacher
STSM, SRE Architect, Bluemix DevOps Services
2. Introductions
2
Perry Statham leads the SRE squad for the IBM®
Bluemix® Continuous Delivery offering. Since
joining IBM in 1998, he has been principle
developer and team leader of several application
management offerings; deployment architect of the
Jazz for Service Management tool set; lead
deployment outside-in designer; and technical
leader of an IaaS environment hosting more than
2000 virtual and bare metal machines.
Prior to IBM, Perry worked for many years with a
healthcare IT startup company where he did both
development and operations long before it became
known as DevOps.
Perry Statham
SRE Squad Leader
Bluemix DevOps Services
Ritchie Schacher is a Senior Technical Staff
Member with IBM® Bluemix® DevOps Services,
and lead architect for SRE. He has a background in
developer tools and team collaboration. He has
broad experience in all aspects of the software
development lifecycle, including product
conception, planning and managing, design, coding,
testing, deploying, hosting, releasing, and
supporting products.
For the last 3 years he has been a part of an exciting
team developing and supporting cloud-based SaaS
offerings for Bluemix®. In his spare time, Ritchie
enjoys playing classical guitar.
Ritchie Schacher
STSM SRE, Architect
Bluemix DevOps Services
3. Once upon a time…
…there were two teams in a large company.
3
One of the teams developed
software products.
01001001
01000010
01001101
The other team operated
hosted instances of those
products.
wtf?
4. Then along came DevOps. Yay!
4
Build it. Deploy it. Run it. Manage it.
01001001
01000010
01001101
5. It was a huge improvement, but…
§ How do we balance change velocity vs. availability, reliability, security, and other
operational attributes? For example:
– Do we fix a bug in the UI, or do we fix a bug that intermittently requires the backend app to be
restarted?
– Do we add a great new feature, or do we automate backups?
– Do we improve an app’s performance, or do we reduce its mean time to recovery (MTTR)?
– Do we improve deployment automation, or do we improve standards compliance?
§ When each service squad has it’s own DevOps process:
– How do we encourage process consistency and discipline?
– How do we scale specialized reliability engineering skills?
– How do we avoid incident response burnout?
5
6. Note: With some our own additions and interpretations, this
section largely comes from the Google SRE Book [Book]. All
quotes without a reference are from the same book.
Any misunderstandings are ours and not the fault of the book
authors.
Site Reliability Engineering
§ “One could … view SRE as a specific implementation of DevOps with some idiosyncratic
extensions.”
§ “In general, an SRE team is responsible for the availability, latency, performance, efficiency,
change management, monitoring, emergency response, and capacity planning of their
service(s).”
6
“What is Site Reliability Engineering as it has become defined at
Google? My explanation is simple: SRE is what happens when
you ask a software engineer to design an operations team.” –
Benjamin Traynor Sloss, Vice President, Engineering, Google
7. SRE Principles and Tenets
7
§ Know the Service Level
§ Embrace Risk
§ Eliminate Toil
§ Know What's Broken and Why
§ Stuff Happens
§ Automate [Almost] Everything
§ Reliable Releases
§ Keep it Simple
8. Know the Service Level
§ Indicators (SLI): a carefully defined quantitative measure of some aspect of the level of
service that is provided
– Examples include availability, request latency, error rate, and system throughput.
– Be wary of using the mean (average) of a set of metrics collected over some period of time,
since outliers are often hidden.
– Percentiles usually provide much better metric aggregations.
– No more than a handful. Not every metric makes a good SLI.
§ Objectives (SLO): a target value or range of values for a service level that is measured
by an SLI
§ Agreements (SLA): an explicit or implicit contract with your users that includes
consequences of meeting (or missing) the SLOs they contain
8
9. Embrace Risk
§ Goal is to explicitly align the risk taken by a given service with the risk the business is
willing to bear.
– Strive to make a service reliable enough, but no more reliable than it needs to be.
§ Which availability metric?
– Time Based: Uptime / (Uptime + Downtime)
– Request Based: Successful Requests / (Total Requests)
– Something else?
9
10. Embrace Risk: Error Budgets
§ Error Budgets balance the goals of
– Product development teams, which are largely evaluated on product velocity, which creates an incentive
to push new code as quickly as possible; and
– SRE teams, which are evaluated based upon reliability of a service, which implies an incentive to push
back against a high rate of change
§ A service’s error budget is 100% minus the SLO (e.g.100 - 99.95% = .05%)
– In general, for any software service or system, 100% is not the right reliability target because no user
can tell the difference between a system being 100% available and 99.999% available.
§ SRE’s goal is no longer “zero outages”; rather, SREs and product developers aim to spend the error budget
getting maximum feature velocity.
§ An outage is no longer a “bad” thing – it is an expected part of the process of innovation, and an occurrence that
both development and SRE teams manage rather than fear.
§ Whenever a service exceeds their error budget, ALL WORK MUST be related to improving
availability.
10
11. Embrace Risk: Error Budgets
11
“An error budget stems from this basic observation: 100% is the wrong reliability target for basically
everything. The business or the product must establish what the availability target is for the system. If the
service natively sits there and throws errors, you know, .01% of the time, you're blowing your entire
unavailability budget on something that gets you nothing. So you have an incentive in both the development
world and the SRE team to improve the service's native stability so that you'll have budget left to spend on
things you do want, like feature launches.” – Ben Traynor [Tra17]
“The other crucial advantage of this is that SRE no longer has to apply any judgment about what the
development team is doing. SRE measures and enforces, but we do not assess or judge. Our take is "As long
as your availability as we measure it is above your Service Level Objective (SLO), you're clearly doing a good
job. You're making accurate decisions about how risky something is, how many experiments you should run,
and so on. So knock yourselves out and launch whatever you want. We're not going to interfere." And this
continues until you blow the budget.”
– Ben Traynor [Tra17]
12. Eliminate Toil
§ In SRE, we want to spend time on long-term engineering project work instead of
operational work.
– Because the term operational work maybe misinterpreted, we use a specific word: toil.
– SREs should spend less than X% of their time on toil and the rest on coding
§ Excess toil is redirected to development team
§ Toil is mundane, repetitive operational work providing no enduring value, which scales
linearly with service growth.
§ The work of reducing toil and scaling up services is the “Engineering” in Site Reliability
Engineering
12
“If a human operator needs to touch your
system during normal operations, you have a
bug. The definition of normal changes as your
systems grow. - Carla Geisser, Google SRE
13. Know What’s Wrong and Why
§ If you can’t monitor a service, then you don’t know what’s happening, and if you’re blind to what’s
happening, then you can’t be reliable.
§ Monitoring a complex application is a significant engineering endeavor in and of itself.
§ Monitoring should answer the questions:
– What’s wrong?
– Why is it wrong?
§ Use the right output:
– Alerts: Signify that a human needs to take action immediately in response to something that is either
happening or about to happen.
– Tickets: Signify that a human needs to take action, but not immediately.
– Logging: No one needs to look at this information, but it is recorded for diagnostic or forensic purposes.
§ Metrics that are useful for overall system operation, but do not influence major user interactions,
are not usually SLIs.
– For example, disk space metrics are not usually web UI application indicators.
13
“…a system that requires a human to read an email and
decide whether or not some type of action needs to be taken
in response is fundamentally flawed. Monitoring should never
require a human to interpret any part of the alerting domain.
Instead, software should do the interpreting, and humans
should be notified only when they need to take action.”
14. Stuff Happens, So Reduce Repair Time
§ Reliability is a function of mean time to failure (MTTF) and mean time to repair (MTTR)
[Sch15]
– The most relevant metric in evaluating the effectiveness of emergency response is how
quickly the response team can bring the system back to health – that is, the MTTR.
§ Humans add latency
– Even if a given system experiences more actual failures, a system that can avoid emergencies
that require human intervention will have higher availability than a system that requires hands-
on intervention.
14
15. Automate [Almost] Everything
§ Automation provides
– Consistency as systems scale
– A platform for extending to other systems
– Faster repairs for common problems
– Faster action than humans
– Time savings by decoupling operator from operation
§ But it’s not a panacea
– It can hide systemic problems.
§ For example, auto-restarting a process periodically can hide memory leaks
– It can perform the wrong, or even a damaging, operation because of a poor design or implementation
§ Remember the old adage “Garbage-In, Garbage-Out”
15
“If we are engineering processes and solutions that are
not automatable, we continue having to staff humans to
maintain the system. If we have to staff humans to do the
work, we are feeding the machines with the blood, sweat,
and tears of human beings. Think The Matrix with less
special effects and more pissed off System
Administrators.” - Joseph Bironas, Google SRE
16. Reliable Releases
§ Running reliable services requires reliable release processes.
§ Continuously build and deploy, including
– Automating check gates
– A/B deployments and other methods for checking sanity
§ As an SRE, don’t be afraid to roll-back a problem release.
§ Use engineering principles to manage configuration, including
– Treating configuration as code, with
§ version control
§ reviews and checks
§ testing
§ change management
– Automating configuration “deployment”
16
17. Keep it Simple
§ Types of complexity:
– Essential complexity is the complexity inherent in a given situation that cannot be removed
from a problem definition.
– Accidental complexity is more fluid and can be resolved with engineering effort
§ SREs minimize accidental complexity
– They push back when it’s introduced into systems
– They constantly strive to eliminate complexity in systems they onboard and for which they
assume operational responsibility
§ Every new line of code written is a liability.
– SRE promotes practices that make it more likely that all code has an essential purpose, such
as scrutinizing code to make sure that it actually drives business goals, routinely removing
dead code, and building bloat detection into all levels of testing.
17
The price of reliability is the pursuit of the utmost
simplicity. - C.A.R. Hoare, Turing Award lecture
18. Our Adoption Story
A version of the rest of our slides was shown at SRECon17 Americas during the session
I’m an SRE Lead! Now What?
How to Bootstrap and Organize Your SRE Team.
The recording is available at https://www.usenix.org/conference/srecon17americas/program/presentation/schacher
18
19. A Staged Approach
1. Define Scope and Obtain Stakeholder Buy-in
2. Define Tools, Processes, and Backlog
3. Build and Organize the SRE Team
4. Implement and Evolve
19
Maturity/Phase
Scale
Velocity
Impact
Iterate along the way
20. Define Scope and Obtain Stakeholder Buy-in
Build the leadership team and the mission
§ SRE Technical leads, Service Managers, Architects
§ Subject matter experts
§ Define core values
Define the roles and responsibilities (SRE - Dev)
Define the Benchmarks
§ Scorecards and checklists
§ Promotion gates and approvals
Ensure stakeholder buy-in
§ Executives
§ Peer service managers
§ Service tech leads
20
21. Example: SRE Values
21
We are obsessed with availability and reliability
We are an Engineering team
§ We promote a high-performance culture
§ We fill our skills gaps as necessary
§ We have an automation mindset
§ We use metrics, monitoring and measurement to ensure
accountability
§ We use Agile to manage our workload
We shape direction and Positively Influence Outcomes for our services
§ Our RCA actions, checklists, and gap analyses directly feed into
development prioritization
We partner with the service teams in all respects
§ We understand and can use our services and capabilities
§ We focus on the customer experience
§ We discourage an ”us vs. them” mindset
§ We align our priorities
We respect the autonomy of the service squads
22. Example: SRE Responsibilities and Control Points
22
Responsibilities
§ Maintain Stated Availability of Service
§ Incident, Problem, Capacity, Security, and Change Management
§ Development of features that enhance Site Availability
§ Automation and Monitoring Development
§ Creation and Publishing of Dashboards and Availability Metrics
Control Points
§ Provide Architectural Review and Signoff on a Service based on ability to achieve availability targets
§ Accept or Reject services based on their ability to achieve SLA
§ Own the Pipeline and Deployments to production
§ Gate changes into a production Service contingent on error budget
§ Use Error Budget to balance development prioritization between SRE content and new features
Decisions and recommendations of the SRE team are independent of functional or date commitments
23. Define Tools, Processes, and Backlog
Know, define or redefine tools and processes
§ Tracking and Planning
§ Monitoring and Availability Reporting
§ Recovery Procedures & Automation
§ Incident Management
§ RCAs (with traceability and reporting)
§ Deployments
§ Test reporting
§ Collaboration
23
24. 24
24x7
Technical
Operations
Center
First Responders
Escalation / SWAT
Communication /
Actions
Automation
Event
Mgmt
System
Monitoring
Sources
Support
Tickets
Incident Response Manager
SME Engineer
SME Engineer
Engineer SRE Response Team
Manager
Collaboration
ToolMonitoring Sources
Support Tickets
Squad Leader
Squad Engineer
Squad Engineer
Bots
SRE & Dev
Community
Platform
Engineering
Automatio
n
Response Mgr
Engineer
Monitoring Sources
Velocity
• Incidents can be resolved by multiple (e.g. TOC, SRE,
Dev) teams via multiple entry points
• Consistent data flows ensure consistent metrics
Multi-Level Response
• Lowers cost for incidents that do not require deep skills
• Layered response ensures incidents are always handled
Correlation /
Enrichment
Dependency
Identification
Impact Assessment
Exception(s)
Alert Notification
System
Reactive Proactive
DevOps Squad(s)
Example: Incident Operational Model
25. Define Tools, Processes, and Backlog
Establish tracking and planning first; everything else becomes
a prioritized work item
Organize SRE backlog
Ensure prioritization of SRE work items is consistent with
feature/function of service
Determine control points for adjusting work priorities, e.g.
§ 7 day / 30 day availability metrics
§ Business priorities of the service
§ Security compliance
Establish collaboration communication and consistent information
sharing
§ Where do you go to find specific information types?
§ Expedite information access and team communication
during incident recovery
25
26. Build and Organize the Team
Build the engineering team
§ New hires or rotational assignments?
§ Evaluate candidate skills and experience
§ Ensure broad skill coverage across all aspects of SRE
Organize vertically and horizontally: by service and by discipline
Designate SMEs for each service
§ Avoid Human SPoF
Develop domain expertise (in the services)
§ Architecture and topology, development environments, microservices, deployments
§ Acquire skills to develop SRE enhancements to service
Define and train call out groups for incident resolution
Cultivate the SRE mindset
§ Leadership coaching, all hands meetings, reinforce SRE values, conduct retrospectives
26
28. Implement and Evolve
Used a phased implementation for SRE ownership of a Service
§ Level I: Monitoring and reporting tool frameworks, first responder service, Incident
Manager support, RCA leadership
§ Level II: Incident Mgmt, Deployment Mgmt, Availability, Security, Compliance and
BC/DR
Onboard services to entry level of SRE ownership (Level 1)
Conduct gap analysis of services to:
§ Assess the maturity of the service
§ Identify gaps gating full SRE take over
Define a roadmap of activities required for SRE Ownership and collaborate with
service development to build a plan to address
Define service SLOs and SLIs
Implement SLI monitoring and reporting
SRE Take over of services (Level II Full SRE Ownership)
28
29. Avoiding Pitfalls and Lessons Learned
Manage expectations
§ Agree on the mission and scope first
§ Keep it grounded in reality (e.g. development is NOT off the hook)
§ Review and negotiate the roles and responsibilities
§ Get buy-in from everyone
Don’t throw an army of engineers at the problem before you have your mission, scope and backlog
defined
§ Engineers will appreciate a plan and structure with clearly communicated expectations
§ Executives and peer development managers will know what their role is and what to expect
Crawl, Walk, Run
§ Tackle the obvious items first (e.g RCAs, monitoring gaps, runbooks, processes, reducing MTTA/MTTR)
§ Show early success, incremental progress
§ Build on success with improved metrics and automation
Ruthlessly prioritize the backlog
§ Better to select a handful of items to focus on and do well, than try to do everything with slow velocity
§ Review the priorities monthly with stakeholders
§ Practice Agile!
29
30. Avoiding Pitfalls and Lessons Learned
DevOps and SRE are more than just rebranding of Dev and Operations
§ Push the engineers out of an ops mindset, comfort zone
§ Watch for an ”us vs. them” culture
§ Be pragmatic vs. purist, without sacrificing principles
§ Push for automation
SRE doesn’t come for free
§ This is a focus that did not exist before; either pull resources from services or hire new
§ Short term velocity will be slow but will improve
§ Good thing is that once established, this approach scales
Not enough engineers, not right skills
§ SRE must be seeded with engineers experienced in the services
§ Establish a resource rotation model with development to balance out the team
§ Negotiate with the stakeholders; know who they are sending over
30
33. Open Toolchain
33
A sample open toolchain for building, and deploying and
managing three microservices
Toolchains provide an integrated set of tools that
support the best practices to build, deploy and
manage your apps.
You can create toolchains that include Bluemix
services, open source tools, and third-party tools that
make development and operations repeatable and
easier to manage.
Rapidly instantiate new toolchains from templates to
on-board new teams quickly.
Create and manage toolchains of best-of-breed industry tools
bluemix.net/devops
34. Integrated Delivery Pipeline
34
Easy Setup
• Deploy an application from a Git repository in a
few clicks.
Continuous Integration
• Automate builds and deployments for many
types of code, running builds automatically
when code changes.
Continuous Testing
• Integrate automated unit tests as part of your
builds.
Continuous Deliver to Multiple Cloud
Platforms
• Deploy applications to one or many Cloud
Foundry or IBM Containers on Bluemix
environments.
36. References
§ [Book] B. Beyer, et. al., “Site Reliability Engineering: How Google Runs Production Systems”,
ISBN: 9781491929124
§ [Try17] B. Traynor, “What is ‘Site Reliability Engineering’?”, blog post, captured 15 April 2017.
§ [Sch15] B. Schwartz, “The Factors That Impact Availability, Visualized”, blog post, 21 December
2015.
36