Anurag Gupta spoke at the AWS Big Data Meetup in Palo Alto and described the AWS DevOps culture. In the talk he gave pointers on how service owners can set up monitoring that continually reduces operational burden.
Anurag Gupta's talk on DevOps at AWS. Nov 17 at the Palo Alto AWS Big Data Meetup
2. Dev/Ops
How I learned to stop worrying and love my pager
Dev/Ops - your dev org is your ops org
I get a pager! You get a pager! Everyone gets a pager!
Why would I possibly want this?
It motivates design for operability
It aligns your interests w/ your customer experience
It improves the feedback loop to customer needs
3. Monitor everything
Every API call to your service,
Every API call you make to a dependent service
Canary traffic for things that vary (eg SQL statements)
Most of the metrics won’t be meaningful. That’s OK
Page on your high signal-to-noise metrics
Monitor these metrics during deployments
Median/Average, Fleet-wide, coarse time grain are obscuring
Measure TP90, TP99 (99th percentile response time)
Measure at finer and finer grain
Evaluate per-customer metrics
Look for the needles in the haystack
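A minimal sketch of why median/average and fleet-wide views are obscuring, using made-up latency numbers. The `percentile` helper (nearest-rank method), the customer names, and the sample data are illustrative assumptions, not part of the talk.

```python
import math
from collections import defaultdict
from statistics import mean, median

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def tp99_by_customer(records):
    """Group (customer, latency_ms) records and compute TP99 per customer."""
    grouped = defaultdict(list)
    for customer, latency_ms in records:
        grouped[customer].append(latency_ms)
    return {customer: percentile(values, 99) for customer, values in grouped.items()}

# Hypothetical data: one customer is always fast, the other sees a few very slow calls.
records = [("acme", 20)] * 50 + [("globex", 20)] * 47 + [("globex", 2000)] * 3
latencies_ms = [latency for _, latency in records]

fleet = {
    "mean": mean(latencies_ms),
    "median": median(latencies_ms),
    "tp90": percentile(latencies_ms, 90),
    "tp99": percentile(latencies_ms, 99),
}
per_customer = tp99_by_customer(records)
print(fleet)
print(per_customer)
```

In this invented data set the fleet-wide median and TP90 look healthy, while TP99 and the per-customer breakdown expose the slow calls that only one customer experiences.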
4. Correction-of-Error (COE) Reporting
Meet weekly on operations (execs, service operators)
Review each issue that happened.
“Spin the wheel” to review a service’s metrics
Support a “truth-seeking” culture
Looking for data, process improvements
COE
- Customer impact
- Timeline: incident to detection to response to resolution
- 5 Whys? Get to actionable changes to extinguish cause
- Actions
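One way the COE timeline could be recorded so that each phase is measurable is sketched below. The `CoeTimeline` class, its field names, and the timestamps are hypothetical; the slides do not prescribe a format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class CoeTimeline:
    """Hypothetical record of the four timeline points named on the slide."""
    incident: datetime
    detected: datetime
    responded: datetime
    resolved: datetime

    def intervals(self) -> dict:
        """Duration of each phase, plus the total customer-impacting window."""
        return {
            "time_to_detect": self.detected - self.incident,
            "time_to_respond": self.responded - self.detected,
            "time_to_resolve": self.resolved - self.responded,
            "total_impact": self.resolved - self.incident,
        }

# Invented timestamps for illustration only.
timeline = CoeTimeline(
    incident=datetime(2015, 6, 1, 3, 0),
    detected=datetime(2015, 6, 1, 3, 12),
    responded=datetime(2015, 6, 1, 3, 20),
    resolved=datetime(2015, 6, 1, 4, 5),
)
intervals = timeline.intervals()
for name, duration in intervals.items():
    print(f"{name}: {duration}")
```

Splitting the timeline this way shows whether the longest phase was detection, response, or the fix itself, which can help steer the "5 Whys" discussion.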
5. Ops is Dev
Humans are fallible
circa 1% defect injection rate
Error rate changes based on time of day (3am vs 3pm)
New ones show up, have unique issues
Limit human access to machines
Use code/scripts/tools instead
Scripts are code
unit test, code review, deploy, automate
6. Ops load correlates to business growth
As your business does well, your operations need to become great
Growing 100-200% YoY is hard.
Improving ops 100-200% YoY is really hard.
Improving ops 2% each week is possible.
Use Pareto analysis to prioritize work
Bonus – each customer gets a better experience even as your own ops load stays constant
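The arithmetic behind the weekly target can be checked with a short calculation. It assumes the 2% improvements compound multiplicatively over 52 weeks, which is an interpretation and not something the slide states; `compounded_improvement` is a hypothetical helper.

```python
def compounded_improvement(weekly_rate: float, weeks: int = 52) -> float:
    """Overall improvement factor if each week is weekly_rate better than the last."""
    return (1 + weekly_rate) ** weeks

factor = compounded_improvement(0.02)
print(f"{factor:.2f}x after 52 weeks, about {100 * (factor - 1):.0f}% better")
```

Under that assumption, a year of 2% weekly gains comes to roughly 2.8x, which falls inside the 100-200% YoY range the slide calls really hard to achieve in one step.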
Amazon Redshift has grown rapidly since it became generally
available in February 2013. While our guiding principles have
served us well over the past two years, we now manage many
thousands of database instances and below offer some lessons we
have learned from operating databases at scale.
Design escalators, not elevators: Failures are common when
operating large fleets with many service dependencies. A key
lesson for us has been to design systems that degrade on failures
rather than losing outright availability. These are a common
design pattern when working with hardware failures, for example,
replicating data blocks to mask issues with disks. They are less
common when working with software or service dependencies,
though still necessary when operating in a dynamic environment.
Amazon overall (including AWS) had 50 million code
deployments over the past 12 months. Inevitably, at this scale, a
small number of regressions will occur and cause issues until
reverted. It is helpful to make one’s own service resilient to an
underlying service outage. For example, we support the ability to
preconfigure nodes in each data center, allowing us to continue to
provision and replace nodes for a period of time if there is an
Amazon EC2 provisioning interruption. One can locally increase
replication to withstand an Amazon S3 or network interruption.
We are adding similar mitigation strategies for other external
understanding that, even if not a widespread concern, each issue is
meaningful to the customer experiencing it. In Figure 5, Sev 2
refers to a severity 2 alarm that causes an engineer to get paged.
This means operational load roughly correlates to business
success. Within Amazon Redshift, we collect error logs across our
fleet and monitor tickets to understand top ten causes of error,
with the aim of extinguishing one of the top ten causes of error
each week.
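A rough sketch of the top-ten analysis described above, assuming each error log entry or ticket has already been labeled with a single cause string. The `pareto` function and the cause labels are invented for illustration; the source does not describe the actual tooling.

```python
from collections import Counter

def pareto(causes, top_n=10):
    """Return (cause, count, cumulative_share) rows for the most frequent causes."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for cause, count in counts.most_common(top_n):
        cumulative += count
        rows.append((cause, count, cumulative / total))
    return rows

# Hypothetical ticket causes collected over one week.
tickets = ["disk_full"] * 5 + ["timeout"] * 3 + ["bad_config"] * 2

for cause, count, share in pareto(tickets):
    print(f"{cause:12} {count:3d}  cumulative {share:.0%}")
```

The cumulative share column shows how much of the ticket volume would disappear if the top causes were extinguished, which is what makes the ranking useful for picking each week's target.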
Figure 5: Tickets per cluster over time
Pareto analysis is equally useful in understanding customer
functional requirements. However, it is more difficult to collect.
7. Escalators, not elevators
Failures happen.
Durability failures are “easy”
mirroring, quorums, well understood techniques
Availability failures are “hard” –
want to degrade on unavailability not cascade failures
tolerate 1-2 hours of unavailability (time to detect, fix)
- eg caching IP addresses when DNS is unavailable
- eg maintaining instance warm pools rather than provisioning
- eg losing the ability to restore a backup, not lose writes
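The first example, caching IP addresses when DNS is unavailable, might look like the following sketch. `CachingResolver` is a hypothetical class, not how any AWS service implements it; the host name and address are placeholders, and a simulated lookup stands in for real DNS so the example runs without a network.

```python
import socket

class CachingResolver:
    """Resolve host names, falling back to the last known address if lookup fails."""

    def __init__(self, lookup=socket.gethostbyname):
        self._lookup = lookup   # injectable so an outage can be simulated
        self._last_good = {}    # host -> most recent successfully resolved address

    def resolve(self, host):
        try:
            address = self._lookup(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            if host in self._last_good:
                return self._last_good[host]  # degrade: serve a possibly stale address
            raise  # nothing cached, so the failure cannot be masked
        self._last_good[host] = address
        return address

# Simulated DNS: the first call succeeds, every later call fails.
calls = {"n": 0}

def flaky_lookup(host):
    calls["n"] += 1
    if calls["n"] > 1:
        raise socket.gaierror("simulated DNS outage")
    return "203.0.113.10"

resolver = CachingResolver(lookup=flaky_lookup)
first = resolver.resolve("db.example.internal")
second = resolver.resolve("db.example.internal")  # served from cache during the outage
print(first, second)
```

This sketch never expires cached entries. A fuller version would likely bound staleness, for instance to the 1-2 hour detect-and-fix window the slide mentions.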
8. Ship often
Continuous delivery should be to the customer
Benefits
Customers prefer small patches
Rollback is easier
Rollback is less likely
Faster response to customer issues
We push a new database engine version, including both features and bug fixes, every two weeks.
dependencies that can fail independently from the database itself.
Continuous delivery should be to the customer: Many
engineering organizations now use continuous build and
automated test pipelines to a releasable staging environment.
However, few actually push the release itself at a frequent pace.
While customers would prefer small patches to large ones for the
same reasons engineering organizations prefer to build and test
continuously, patching is an onerous process. This often leads to
special-case, one-off patches per customer that are limited in
scope – while necessary, they make patching yet more fragile.
Figure 4: Cumulative features deployed over time
Amazon Redshift is set up to automatically patch customer
clusters on a weekly basis in a 30-minute window specified by the