This presentation talks about three topics related to monitoring. The first is a brief history and future forecast of monitoring trends. The second is a second look at the inputs, outputs, and techniques for setting SLOs. The third sets some basic tenets one should always follow when monitoring systems.
There are many modern techniques for identifying anomalies in datasets. There are fewer that work as online algorithms suitable for application to real-time streaming data. What’s worse? Most of these methodologies require a deep understanding of the data itself. In this talk, we tour what the options are for identifying anomalies in real-time data and discuss how much we really need to know beforehand to guess at the ever-useful question: is this normal?
Craftsmanship in software tends to erode as team sizes increase. This can be due to a large variety of reasons, but is often dependent on code base size, team size, and autonomy. In this session I'll talk about some of the challenges companies face as these things change and how to manipulate teams, architectures and how people work to maintain software craftsmanship while still delivering product.
There are two common tenets of operations: "hell is other people's software," and "better software is produced by those forced to operate it." In this session I'll take a fly-by-tour of two pieces of software that were built from the ground up for operability from the hard-earned teachings of their inoperable predecessors: a distributed datastore replacing PostgreSQL, and a message queue replacing RabbitMQ.
We'll discuss specific design aspects that increase resiliency in the event of failure and observability at all times.
Chaos Engineering, When should you release the monkeys?Thoughtworks
Chaos Engineering is listed as 'Trial' in the ThoughtWorks Tech Radar, but what is it really and how is it different from traditional testing? When and why should you get started with Chaos Engineering and is Chaos Monkey the right place to start when you do?
Shift Left. Wait, what? No, Shift Right!!!Phillip Maddux
Presented on November 7, 2018 at Triangle DevOps (https://www.meetup.com/triangle-devops/).
Recently in the DevSecOps world there has been a call to shift left. However, application security has been shifting left for years already. What we should be doing is shifting application security to the right (production). This can be done by instrumenting applications for security.
The left is not wrong, just not right; It's time to shift right!Phillip Maddux
In the last few years of AppSec and DevOps, we've heard the calls to shift left. But how far left can we go, and is it really going to help eliminate exploitable bugs or scale your AppSec program? What if we consider a different direction, shifting right! Can a focus on shifting to the right be more effective in mitigating real-world threats and prioritization? In this presentation, I'll explore these questions and propose concepts that show why shifting right is right!
Overview of Site Reliability Engineering (SRE) & best practicesAshutosh Agarwal
In any software organization, stability & innovation are always at loggerheads - the faster you move, the more things will break. This talk defines what an SRE org looks like at high-tech organizations (Google, Uber).
General overview of what "Chaos Engineering" is, the current "perturbation models" available, and the benefits of Chaos Engineering to Customers, Business and Tech.
Arranging software development around a wide mix of programming languages and technologies offers both challenges and rewards. In this talk Ryan will explore the pros and cons that the 6Wunderkinder team found when working with over 10 different programming languages in a single product.
CSA Raleigh application security and deception in the cloudPhillip Maddux
Presented on January 17, 2019 at Raleigh/Durham/RTP - Cloud Security Alliance Chapter (https://www.meetup.com/Raleigh-Durham-RTP-Cloud-Security-Alliance-Chapter/).
Over the last several years there has been a steady and increasing march towards shifting applications to the cloud. To keep pace with this cloud adoption, in some cases multi-cloud adoption, security teams need consistent real-time threat visibility over their production web applications. In this talk, we'll discuss some of the foundational concepts that comprise a practical approach to threat visibility and securing applications in the cloud. In addition, extending visibility to breaches in progress, deception can be a valuable layer in your defense-in-depth strategy. We'll discuss the concept of deception and how it can be deployed in the cloud. Overall, the audience will gain a greater insight into application security and deception for the cloud. As we head into 2019, we need to prepare for a year that will prove these concepts are critical for defending deployments in the cloud.
Curious about how chaos engineering can make your systems more resilient?
Get a comprehensive introduction to the history, principles, and practice of chaos engineering
You will walk away from this session with an in-depth understanding of what chaos engineering is, why it’s crucial to prevent outages, and how you can use it to build resilience into your own systems.
This presentation was given at LinkedIn. It is a collection of guidelines and wisdom for re-thinking how we do engineering for massively scalable systems. Useful for anyone who cares about Big Data, Distributed Computing, Hadoop, and more.
Building a culture where software projects get donethegdb
(My slides from http://qconsf.com/presentation/building-culture-where-software-projects-get-done.)
Software engineering has existed as a discipline for over fifty years, and still people continue to fall into all of the classic engineering fallacies. Rewrites fail with high frequency, feature creep smothers nascent projects, and engineering timelines get repeatedly pushed back until they are all but meaningless. None of these statements are surprising, but what's surprising is that great engineers who are very aware of these pitfalls can be ensnared by them anyway.
In this talk, we'll go over how we've structured the culture at Stripe to avoid these traps. We're certainly not immune to the classic fallacies, but each time a project goes poorly we adapt our workflow to mitigate the relevant failure mode in the future. We'll describe the principles driving how we approach software projects (and the history of how we discovered them), as well as why it's important to bake these ideas directly into the culture.
Presentation from Smart ERP Solutions covering effective ways to work with Oracle's PeopleSoft support. Includes guidance on SR escalations and techniques to help expedite resolutions.
Talk given by David Lucey, Systems Engineering Architect at Salesforce, at Open Network Users Group in May 2016
“Livestock, not Pets.” We’ve all heard the phrase, but it seems to be so much harder in practice. It’s even worse when applications are developed over decades.
Well, the Salesforce application suite has been developed over decades, with a massive number of products, features, and offerings within its own ecosystem. Come see how Salesforce wrangles that livestock and handles their scale of infrastructure at a high velocity – all while maintaining their high level of security.
A Common Sense Guide to Agile Development and Testing that might just change your Agile approach forever.
Answering the 9 most common questions asked about Agile Testing:
- What is Agile Testing?
- Do we still need testers in Agile?
- What is an Agile Tester?
- What does a Software Tester Actually Do?
- Should we automate our testing?
- What tools should we use for our Agile Testing?
- How Much Should we Automate?
- How can we automate and still finish the sprint?
- How can we finish all our testing in the sprint?
A high quality download of the 9 points as a free "Print out and Keep" Poster is available at http://eviltester.com/agile
Abstract
More and more, the world runs on software, and software is increasingly controlling devices in the real world. Software failures can now have a greater impact than just loss of data; physical damage and injury are now concerns. While many high-reliability specifications exist, such as MISRA and DO-178B, they can be too “heavy” for many projects, are typically domain specific (automotive and airborne systems respectively), and so often go unused.
This presentation explores various software techniques that can be used to harden a software system and make it more reliable. The presentation also covers key questions to be answered when developing software that interacts with the real world.
Specifically we will be looking at cases where the software needs to be more reliable than “average” but does not justify investment in a formal specification such as MISRA or DO-178B.
Bio
Lloyd Moore is the founder and owner of CyberData Corporation, which provides consulting services in the robotics, machine vision and industrial automation fields. Lloyd has worked in the software industry for 25 years. His formal training is in biological-based artificial intelligence, electronics, and psychology. Lloyd is also currently the president of the Northwest C++ User’s Group and an organizer of the Seattle Robotics Society Robothon event.
Advanced Maintenance And Reliability (Maintenance and Reliability Best Pract...Ricky Smith CMRP, CMRT
Maintenance and reliability practitioners have taken great strides toward managing asset reliability, finding that applying known best practices can optimize reliability while reducing total cost and risk. However, many if not most organizations are still trapped in the old way of thinking.
As Albert Einstein once said: "the significant problems we face cannot be solved with the same level of thinking we were at when we created them".
The anonymised slides from an old (but hopefully still relevant) talk on the case for placing a strategic focus on design testability. The material covers the technical, process and organisational considerations arising from such a strategy and is predominantly a summary of the ideas presented in Brett Pettichord's 2001 "Design For Testability" paper. The presentation makes a case for why a high level of design testability can be seen as a critical success factor in achieving sustained agility.
Introduction to Software Engineering and Software Process Modelssantoshkawade5
S/W Engineering
Software Engineering Fundamentals: Introduction to software engineering, The Nature of Software, Defining Software, Software Engineering Practice.
A Generic Process Model, defining a Framework Activity, Identifying a Task Set, Process Patterns, Process Assessment and Improvement, Prescriptive Process Models, The Waterfall Model, Incremental Process Models, Evolutionary Process Models, Concurrent Models, A Final Word on Evolutionary Processes. Unified Process, Agile software development: Agile methods, plan driven and agile development.
A tour of challenges today's software engineers will face (and material they should familiarize themselves with) to cope with the issues that arise due to the distributed nature of today's applications.
Technology changes and process changes in how people build and manage Internet systems have driven a need for a new approach to monitoring. We talk about why, what and how.
One of the dying skill sets in today’s engineering teams is the multi-disciplinary analyst who can truly dissect dysfunction in the radically complex architectures of today. As tools emerge that connect the dots, it might be faster to collect the data needed for analysis and decision making, but the knowledge and techniques to actually make the assessments needed are hard to come by.
In this session, we’ll walk through a complex architecture and discuss what an engineer in this role really needs to understand. We’ll analyze a few anecdotal problems and see why this world of magical automation and elastic deployments will never really displace the need for root on a production box, a debugger, and the ability to move fast, take risks and destroy performance problems.
In this session we’ll leave the need for performance a foregone conclusion and take a whirlwind tour through the complexity of modern Internet architectures. The complexities lead to evil optimization problems and significant challenges troubleshooting production issues to a speedy and successful end.
Starting with the simple facts that you can’t fix what you can’t see and you can’t improve what you can’t measure, we’ll discuss what needs monitoring and why. We’ll talk about unlikely allies in the fight for time and budget to instrument systems, applications and processes for observability.
You’ll leave the session with a better understanding of what it looks like to troubleshoot the storm of a malfunctioning large architecture and some tools and techniques you can use to not be swallowed by the Kraken.
User generated data is an old problem. Systems and network telemetry, page analytics and application state combine to form an ever growing mountain of data collected by today's tools. Collecting and storing this data requires more than just a single application; having no single point where the user touches the system and gets an answer makes debugging a nightmare and reproducing the error intractable. Distributed systems require a clear perspective on production systems and access to data in real time to have any hope of solving complex problems related to state, all while not impacting user experience.
We will explain the problem, the pains and how we solved them. Develop in production; push code to development.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll know how to organize and improve your code review process.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc. I didn't get rich from it, but the extensions did have 63K downloads (possibly powering tens of thousands of websites).
Understanding Globus Data Transfers with NetSageGlobus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
An Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which helps enhance productivity. Here is a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
A Comprehensive Look at Generative AI in Retail App Testing.pdfkalichargn70th171
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
Globus Compute with IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work, the team is investigating ways to speed up the time to solution for many different parts of the DIII-D workflow, including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks, and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTier1 app
Even though at surface level ‘java.lang.OutOfMemoryError’ appears as one single error, there are 9 underlying types of OutOfMemoryError. Each type of OutOfMemoryError has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
1. Put some SRE
in your microservices
Hard-won lessons from the world of SRE…
can be applied to the nascent world of microservices.
2. The many faces of
Theo Schlossnagle
@postwait
CEO Circonus
3. The nature of the problem
Software Sucks
Once you’ve run software at scale,
you have a deep understanding of
how it is all tied together with
loose string and hope.
4. All software will fail, but
good software
fails well
• Consider the phrase:
“have you used X in anger.”
5. Never undervalue grace in failure.
Rule . 𝛌1 Crash landings should be both
fast and controlled.
6. What it means to
fail quickly & safely
• The scope of failure should
collapse completely.
• The time to failure should be
measured in small multiples of
normal service time
• Nothing outside the scope of
failure should be impacted.
https://www.youtube.com/watch?v=5SL1A2d2e7M
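One way to read "the time to failure should be measured in small multiples of normal service time" is as a deadline on every call. The sketch below is only an illustration of that idea, not anything shown in the talk: `call_with_deadline`, `FailedFast`, and the default multiple of 3 are hypothetical names and values.

```python
import concurrent.futures


class FailedFast(Exception):
    """Raised when a call does not answer within its deadline."""


def call_with_deadline(fn, normal_service_time, multiple=3.0):
    """Run fn(), giving up after a small multiple of its normal service time.

    normal_service_time is in seconds; multiple=3.0 is an assumed default.
    """
    deadline = normal_service_time * multiple
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return future.result(timeout=deadline)
    except concurrent.futures.TimeoutError:
        # The caller gets a quick, explicit failure instead of hanging.
        raise FailedFast(f"no answer within {deadline:.3f}s") from None
    finally:
        # Do not block on the abandoned call; the worker thread still runs
        # to completion in the background, which real code must account for.
        pool.shutdown(wait=False)
```

The caller fails quickly, but the abandoned work is not cancelled, so this bounds the time to failure without by itself collapsing the scope of failure.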
8. Pragmatic analysis is required to
understand failure’s
true nature
• Post-mortem analysis is critical
• Stack traces
• Forensic logs
• Images (cores, dumps, etc.)
9. The difference between a shock and electrocution is real.
Rule . 𝛌3 Use circuit breakers.
10. Circuit breakers are designed to
avoid
cascading failure
• it’s not all about,
especially with microservices
• protect yourselves and others
• circuit breakers of many types
• timing
• queue depth
• concurrency
http://melissaomarkham.com
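A minimal circuit breaker sketch follows. It trips on consecutive failures, which is a common textbook trigger and a substitution for the timing, queue depth, and concurrency triggers listed on the slide. The class name, thresholds, and injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised instead of calling the dependency while the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast: protect the caller and the struggling dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

While the circuit is open, callers get an immediate `CircuitOpenError` rather than queuing behind a failing service, which is what keeps one failure from cascading.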
11. You cannot understand what you cannot measure.
Rule . 𝛌4 Behavior is complex.
Understand it.
12. Don’t measure to assess availability
measure to understand
Build robust models of behavior
Understand performance changes
Don’t use averages
Don’t use percentiles alone
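The warning about averages and percentiles can be illustrated with made-up numbers. In this sketch the two nodes and their latencies (in milliseconds) are invented, and `percentile` is a simple nearest-rank helper written for the example rather than a library call.

```python
from collections import Counter


def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p*n/100
    return ordered[rank - 1]


# Invented data: node A is healthy, node B has a slow tail.
node_a = [10] * 100
node_b = [10] * 90 + [1000] * 10
merged = node_a + node_b

mean = sum(merged) / len(merged)  # 59.5 ms, a latency no request saw
avg_of_p99 = (percentile(node_a, 99) + percentile(node_b, 99)) / 2  # 505.0
true_p99 = percentile(merged, 99)  # 1000
histogram = Counter(merged)  # keeps the full shape: two distinct modes

print(mean, avg_of_p99, true_p99, dict(histogram))
```

The mean describes no real request, and averaging per-node percentiles gives a number that is not the percentile of the combined traffic. Keeping the histogram lets the distributions be merged and any percentile recomputed afterwards.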
14. It’s easy to demand perfection; it’s also stupid.
Rule . 𝛌5 Have a failure budget.
15. Avoiding failure is simply impossible,
expect and manage
failure
• use failure budgets
• set expectations reasonably
• define and reward successes on
improvement and competency,
not just uptime.
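A failure budget can be made concrete with a little arithmetic. The sketch below assumes the budget is derived from a success-rate objective over a request count. That framing, the function name, and the example numbers are assumptions, not figures from the talk.

```python
def failure_budget(objective, total_requests, failed_requests):
    """Return how much failure was allowed, used, and left.

    objective is the target success ratio, e.g. 0.999 for "three nines".
    """
    allowed = (1.0 - objective) * total_requests
    return {
        "allowed": allowed,                       # failures we planned for
        "remaining": allowed - failed_requests,   # negative means overspent
        "burned": failed_requests / allowed if allowed else float("inf"),
    }


# Hypothetical month: 1,000,000 requests against a 99.9% objective.
report = failure_budget(0.999, 1_000_000, 250)
print(report)  # roughly 1000 allowed, 750 remaining, 25% burned
```

A report like this supports the slide's point about expectations: a team that has burned a quarter of its budget has room to take risks, while one that has overspent has a reason to slow down.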
16. Justice should be blind; operations should not.
Rule . 𝛌6 Instrumentation &
Observability have no equals.
17. For every “I wonder what X is right now?”
in production,
you must have answers
DTrace
eBPF
Instrument code for observability
https://www.pinterest.com/pin/441775044670412234/
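"Instrument code for observability" could look like the sketch below at its simplest: a decorator that records call counts, error counts, and latencies in process. The decorator name and the in-memory dictionaries are illustrative assumptions. A real system would export these measurements to a monitoring backend, and DTrace or eBPF answer similar questions from outside the process.

```python
import functools
import time
from collections import defaultdict

# In-memory stores keyed by function name (an assumption for this sketch).
calls = defaultdict(int)
errors = defaultdict(int)
latencies = defaultdict(list)


def instrumented(fn):
    """Record every call, every error, and every latency of fn."""
    name = fn.__name__

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        except Exception:
            errors[name] += 1
            raise
        finally:
            calls[name] += 1
            latencies[name].append(time.perf_counter() - start)

    return wrapper


@instrumented
def lookup(key):
    """Hypothetical operation used only to exercise the decorator."""
    if key is None:
        raise KeyError("missing key")
    return key.upper()
```

With this in place, "how many lookups have failed so far?" has an answer at any moment: `errors["lookup"]`.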