According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain distributed monitoring concepts.
Youtube channel here: https://youtu.be/EgpCw15fIK8
SRE Demystified - 16 - NALSD - Non-Abstract Large System Design - Dr Ganesh Iyer
This document discusses Non-abstract Large System Design (NALSD), an iterative process for designing distributed systems. NALSD involves designing systems with realistic constraints in mind from the start, and assessing how designs would work at scale. It describes taking a basic design and refining it through iterations, considering whether the design is feasible, resilient, and can meet goals with available resources. Each iteration informs the next. NALSD is a skill for evaluating how well systems can fulfill requirements when deployed in real environments.
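The NALSD habit of checking a design against realistic constraints can be sketched as a back-of-envelope calculation. All figures below are purely hypothetical assumptions for illustration, not numbers from the talk:

```python
import math

# Hypothetical capacity estimate in the NALSD style.
PEAK_QPS = 100_000          # assumed peak requests per second
QPS_PER_MACHINE = 2_000     # assumed throughput of one replica
REPLICATION = 3             # copies kept for resilience
UTILIZATION_TARGET = 0.6    # headroom so failures don't overload the rest

def machines_needed(peak_qps, qps_per_machine, replication, utilization):
    """Machines required to serve peak load with headroom and redundancy."""
    serving = peak_qps / (qps_per_machine * utilization)
    return math.ceil(serving) * replication

print(machines_needed(PEAK_QPS, QPS_PER_MACHINE, REPLICATION, UTILIZATION_TARGET))  # → 252
```

Each NALSD iteration would revisit these assumptions (is 2,000 QPS per machine feasible? is 3x replication enough?) and refine the design accordingly.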
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain various practical alerting considerations and views from Google.
Youtube channel here: https://youtu.be/EgpCw15fIK8
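One practical alerting idea covered in Google's SRE material is alerting on error-budget burn rate rather than raw error counts. A minimal sketch, where the SLO, window, and threshold are illustrative assumptions:

```python
# Error-budget burn-rate check. A burn rate of 1.0 means the service is
# consuming its budget exactly on schedule; higher burns it faster.

def burn_rate(errors, total, slo=0.999):
    """How fast the error budget is being consumed."""
    if total == 0:
        return 0.0
    error_ratio = errors / total
    budget = 1.0 - slo          # allowed error ratio, e.g. 0.001
    return error_ratio / budget

def should_page(errors_1h, total_1h, threshold=14.4):
    """Page if a fast window burns budget 14.4x too fast (a commonly
    cited threshold: it would exhaust a 30-day budget in about 2 days)."""
    return burn_rate(errors_1h, total_1h) > threshold

print(should_page(errors_1h=20, total_1h=1000))   # 2% errors vs 0.1% budget → True
```

Real deployments typically combine several windows (e.g. 1 h and 6 h) so that alerts are both fast and resistant to brief blips.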
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain the different SLIs typically associated with a system, covering availability, latency, and quality SLIs in brief.
Youtube channel here: https://youtu.be/EgpCw15fIK8
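Availability and latency SLIs can be sketched as simple ratios over request records. The record format and the 300 ms threshold below are assumptions for illustration:

```python
# Computing two common SLIs from a batch of request records.
requests = [
    {"status": 200, "latency_ms": 120},
    {"status": 200, "latency_ms": 480},
    {"status": 500, "latency_ms": 30},
    {"status": 200, "latency_ms": 90},
]

def availability_sli(reqs):
    """Fraction of well-formed requests served without a server error."""
    ok = sum(1 for r in reqs if r["status"] < 500)
    return ok / len(reqs)

def latency_sli(reqs, threshold_ms=300):
    """Fraction of requests served faster than the threshold."""
    fast = sum(1 for r in reqs if r["latency_ms"] < threshold_ms)
    return fast / len(reqs)

print(availability_sli(requests))  # → 0.75
print(latency_sli(requests))      # → 0.75
```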
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain continuous release engineering and configuration management.
Youtube channel here: https://youtu.be/EgpCw15fIK8
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain the term SRE (Site Reliability Engineering) and introduce the key metrics for an SRE team: SLI, SLO, and SLA.
Youtube Channel here: https://www.youtube.com/playlist?list=PLm_COkBtXzFq5uxmamT0tqXo-aKftLC1U
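The arithmetic linking an SLO to its error budget can be shown in a small sketch (the 30-day window is an assumption):

```python
# Turning an availability SLO into an error budget: the amount of
# unreliability the target permits over a period.

def error_budget_minutes(slo, period_days=30):
    """Minutes of total downtime an availability SLO permits per period."""
    return (1.0 - slo) * period_days * 24 * 60

print(round(error_budget_minutes(0.999), 1))   # 99.9%  → 43.2 minutes
print(round(error_budget_minutes(0.9999), 2))  # 99.99% → 4.32 minutes
```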
SRE Demystified - 12 - Docs that matter - 1 - Dr Ganesh Iyer
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain the important documents required for onboarding new services, running services, and supporting products in production.
Youtube video here: https://youtu.be/Uq5jvBdox48
SRE aims to balance system stability and agility by pursuing simplicity. The key aspects of simplicity according to SRE are minimizing accidental complexity, reducing software bloat by removing unnecessary lines of code, designing minimal yet effective APIs, creating modular systems, and implementing single changes in releases so their impact is easy to measure. The ultimate goal is reliable systems that allow for developer agility.
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain what is and isn't toil, and how to identify, measure, and eliminate it.
Youtube channel here: https://youtu.be/EgpCw15fIK8
Parameter Sniffing is usually thought of as the bad guy, associated with performance problems in your database. Contrary to popular belief, Parameter Sniffing is usually the good guy, continuously working under the hood to help your database applications run faster. However, it can sometimes go wrong, causing severe performance degradation of your queries.
In this session we will discuss the workings of Parameter Sniffing and demonstrate how it helps improve the performance of your database applications. We will also explore how Parameter Sniffing can go wrong and its impact. Several ways to fix bad Parameter Sniffing will be demonstrated to help make an appropriate choice for your scenario.
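The mechanism can be illustrated with a toy simulation; this is a rough sketch of plan caching for illustration only, not real SQL Server behavior:

```python
# Toy parameter sniffing: the "optimizer" compiles a plan for the first
# parameter value it sees ("sniffs" it) and caches that plan for reuse.

PLAN_CACHE = {}

def pick_plan(selectivity):
    # A selective predicate favors an index seek; a broad one favors a scan.
    return "index_seek" if selectivity < 0.1 else "table_scan"

def run_query(query_id, selectivity):
    """Compile on first execution, reuse the cached plan afterwards."""
    if query_id not in PLAN_CACHE:
        PLAN_CACHE[query_id] = pick_plan(selectivity)   # sniffed plan
    return PLAN_CACHE[query_id]

# First caller is selective, so a seek plan gets cached...
print(run_query("q1", selectivity=0.01))   # → index_seek
# ...and a later broad caller is stuck with it: "bad" sniffing.
print(run_query("q1", selectivity=0.9))    # → index_seek

# One common fix is recompiling per call (akin to OPTION (RECOMPILE)):
def run_query_recompile(selectivity):
    return pick_plan(selectivity)

print(run_query_recompile(0.9))            # → table_scan
```

Other fixes discussed in sessions like this one include optimizing for a representative value and splitting the query by parameter shape.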
The document discusses performance monitoring, which analyzes live data beyond initial performance testing. It outlines key parameters to monitor like uptime, page speed, and memory usage. Tools like Splunk can centralize monitoring across distributed systems. The document provides a case study of high page load times and explores potential root causes like slow API calls or custom UI code. Analyzing log messages and database queries can help identify specific problems.
Ride the database in JUnit tests with Database Rider - Mikalai Alimenkou
For a long time, DB-related testing in the Java world has been a real pain, and most developers tried to reduce the number of such tests as much as possible. With good in-memory database implementations like H2, schema migration solutions like Liquibase or Flyway, and containerization with libraries like Testcontainers, database management is now much simpler. But test data management is still a pain. Some developers use SQL dumps, others insert data via JPA/JDBC or rely on prepared data sets. Good old DBUnit may be a good option, but it is not very developer friendly and not well adapted to the modern annotation-driven development style. Database Rider closes the gap between the modern Java development environment and DBUnit, bringing DBUnit closer to your JUnit tests, so database testing will feel like a breeze. In addition to flexible data set management, this library provides other useful features: programmatic data set definition, leak hunting, data set export, constraint management, etc. As a contributor and loyal user for many years, I would like to share my experience with Database Rider and demonstrate how to make database testing fun again!
- The document discusses various processes and tools to help system administrators and project managers improve project workflow. It focuses on recovery, deployment, and profiling.
- Key processes discussed include backups, version control, system deployment, project task management, load testing, and keeping statistics on performance, traffic, and servers. Tools mentioned include Chef, Puppet, Jenkins, Cacti, Ganglia, Munin, Google Analytics, and Piwik.
- The main benefits are improved continuity, ability to recover from disasters, saving time, and reducing headaches through establishing standard processes. Recovery and deployment processes provide the most return on time invested while profiling offers the least.
Modern CI/CD in the microservices world with Kubernetes - Mikalai Alimenkou
In this talk, we will go through the design process of modern CI/CD for the microservices-based system with Kubernetes support. We will discuss how to verify consistency between microservices, apply different levels of quality gates and promote artifacts between environments. Thanks to Kubernetes we will review different approaches of environment resources optimization for development needs during CI/CD cycles.
Actually using metrics for making meaningful decisions
Several development teams have metrics and analysis tools available that could support important decisions, but very few of them actually use this information. This talk will introduce software analytics, a technique for using this information to make decisions and take action. It will also present the Lean Software Analytics Canvas, a simple tool that small and medium agile teams can use to incorporate software analytics into their development process. Realistic examples and results based on experience applying the canvas in real teams will also be presented.
Improving SBHS Remote-triggered Server and Website - Devdutt Kumar
This document discusses improvements made to the SBHS remote-triggered virtual lab server and website. It introduces the SBHS project, which uses a LAB-IN-A-BOX setup for control systems experiments. Python, Django, and MySQL were used as technologies. Issues addressed include the Git code not pulling updates, errors preventing server connections, and implementing automated slot booking and health monitoring of the server by regularly checking equipment and logging status.
Grails has great performance characteristics but as with all full stack frameworks, attention must be paid to optimize performance. In this talk Lari will discuss common missteps that can easily be avoided and share tips and tricks which help profile and tune Grails applications.
Baltimore SharePoint User Group June 2015 - Toby McGrail
This document summarizes a presentation on troubleshooting SharePoint 2010/2013. It discusses correlation IDs, diagnostic logging, usage and health data collection, the health analyzer, developer dashboard, tools like the SharePoint diagnostic toolkit, PowerShell commands, and demos of tools like the ULS viewer. Specific topics covered include log levels, trace logging, monitoring crawl logs, IIS app pools, counters and thresholds, network troubleshooting, and client browser issues. The presentation aims to provide tips and tricks for monitoring and troubleshooting a SharePoint environment.
Application Performance Troubleshooting 1x1 - Part 2 - Noch mehr Schweine und... - rschuppe
Application performance doesn't come easy. How do you find the root cause of performance issues in modern and complex applications when all you have to start with is a complaining user?
In this presentation (mainly in German, but understandable for English speakers) I reprise the fundamentals of troubleshooting and give some new examples of how to tackle issues.
Follow-up to the presentation "Performance Trouble Shooting 101 - Schweine, Schlangen und Papierschnitte".
This document discusses techniques for improving web performance. It begins with research on how caching and cookies impact performance. It then outlines 14 rules for optimizing performance, such as making fewer HTTP requests, using content delivery networks, gzipping components, placing scripts at the bottom of pages, and avoiding redirects. Case studies demonstrate how following these rules can significantly improve page load times. The document emphasizes starting performance improvements by focusing on front-end optimizations and advocates evangelizing best practices.
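The gzipping rule is easy to verify with a quick measurement. A minimal sketch; the sample markup below is a stand-in for a real page:

```python
import gzip

# Repetitive HTML, as real pages tend to be, compresses extremely well.
html = ("<html><body>"
        + "<div class='row'>hello world</div>" * 500
        + "</body></html>").encode("utf-8")

compressed = gzip.compress(html)
print(len(html), len(compressed))   # compressed size is a small fraction
```

This is why serving text components (HTML, CSS, JavaScript) gzipped is one of the highest-leverage front-end rules: the saving costs one header and a little CPU.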
The document provides an overview of using Perfmon and Profiler to monitor SQL Server performance. It discusses capturing key metrics with Perfmon including memory, storage, CPU and connection metrics. Traces can then be captured with Profiler to analyze queries and correlate them with resource usage. The metrics and traces can help identify performance issues which can then be mitigated through methods like adding indexes, memory, storage or changing queries. Examples are provided of using this process to solve problems related to disk queues, memory pressure and transaction log backups. Resources for further information are also listed.
The 5S Approach to Performance Tuning by Chuck Ezell - Datavail
If your organization relies on data, optimizing the performance of your database can increase your earnings and savings. Many factors, large and small, can affect performance, so fine-tuning your database is essential. Performance tuning expert and Senior Applications Tuner for Datavail, Chuck Ezell, sheds light on the right questions to ask to get the answers that will help you move forward, using a defined approach referred to as 5S.
This performance tuning white paper addresses each stage of this novel approach, as well as key performance issues: SQL, Space, Sessions, Statistics, and Scheduled Processes.
The rise of cloud and containers has led to systems that are much more distributed and dynamic in nature. Highly elastic microservice and serverless architectures mean containers spin up on demand and scale to zero when that demand goes away. This generates a continuous stream of infrastructure data.
On the business side, we have started storing a lot of data, and this data contains an enormous amount of information; especially when married with infrastructure data, it gives a holistic picture of the health of the entire platform. We will talk about how to achieve this kind of fine-grained observability at scale in real time.
The document discusses techniques for improving web page performance, including making fewer HTTP requests through image maps, CSS sprites, inline images, and combined scripts and stylesheets. It also covers using a content delivery network, adding expiration headers, CSS/JavaScript optimization, parallel downloads, cookies, browser caching, and more. Case studies and experiments demonstrate the impact of these techniques on major websites. The goal is to help web developers optimize front-end performance.
DockerCon SF 2019 - Observability Workshop - Kevin Crawley
This document contains the slides from a workshop on observability presented by Kevin Crawley of Instana and Single Music. The workshop covered distributed tracing using Jaeger and Prometheus, challenges with open source monitoring tools, and advanced use cases for distributed tracing demonstrated through Single Music's experience. The agenda included labs on setting up Kubernetes and applications, monitoring metrics with Grafana and Prometheus, distributed tracing with Jaeger, and analytics use cases.
The document summarizes a presentation about using Zend Server's monitoring and profiling features to diagnose performance problems in applications. It begins with an introduction to the speaker and an overview of the session. It then discusses common issues like slow performance, errors and high memory usage that are difficult to reproduce. The presentation demonstrates how to use Zend Server's monitoring and code tracing to diagnose examples of these types of problems in a sample BeerIOU application. It also covers the performance impact and advantages of Zend Server's monitoring features.
The document provides tips to improve web application performance. It recommends minimizing HTTP requests by combining images, CSS, and JavaScript files. Other tips include enabling HTTP compression, using appropriate image formats, compressing assets, placing CSS at the top and JavaScript at the bottom of pages, using a content delivery network, caching appropriately, and reducing cookie size. The document emphasizes reducing the number of server roundtrips to improve response time.
Navigating SAP’s Integration Options (Mastering SAP Technologies 2013) - Sascha Wenninger
Provides an overview of popular integration approaches, maps them to SAP's integration tools and concludes with some lessons learnt in their application.
From Zero to Performance Hero in Minutes - Agile Testing Days 2014 Potsdam - Andreas Grabner
As a tester you need to level up. You can do more than functional verification or reporting response times.
In my Performance Clinic workshops I show you real-life examples of why applications fail and what you can do to find these problems when you are testing those applications.
I am using free tools for all of these exercises - especially Dynatrace, which gives full end-to-end visibility (browser to database). You can test and download Dynatrace for free @ http://bit.ly/atd2014challenge
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro... - Yahoo Developer Network
This document discusses the challenges of operationalizing big data applications and how full stack performance intelligence can help DataOps teams address issues. It describes how intelligence can provide automated diagnosis and remediation to solve problems, automated detection and prevention to be proactive, and automated what-if analysis and planning to prepare for future use. Real-life examples show how intelligence can help with proactively detecting SLA violations, diagnosing Hive/Spark application failures, and planning a migration of applications to the cloud.
This document summarizes Mike Solomon's experience scaling YouTube's architecture using Python. It discusses:
1) Starting simply with all services on the same boxes before gradually separating services like search and thumbnails onto their own machines.
2) Systematically removing bottlenecks revealed by user demand growth and replacing components to maintain scalability.
3) Using caching, database partitioning, and bulk data migration techniques to scale the MySQL database horizontally as user traffic increased.
4) Favoring simplicity, customizing open source software, and taking a "Pythonic" approach to build scalable, transparent APIs.
This document summarizes Mike Solomon's experience scaling YouTube's architecture using Python. It discusses:
1) Starting simply with all services on the same boxes before gradually separating services like search and thumbnails onto specialized machines.
2) Systematically removing bottlenecks revealed by user demand growth and replacing components to continuously improve.
3) Using Python's flexibility to enable migrating databases and services with minimal disruption.
4) Balancing machine resources by overlaying orthogonal tasks on the same hardware.
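The database partitioning mentioned above can be sketched as a simple hash-based shard router; the shard count, hash choice, and key format are illustrative assumptions, not YouTube's actual scheme:

```python
import hashlib

# Horizontal partitioning: route each user's rows to one of several
# database shards via a stable hash of the user key.
SHARDS = ["db0", "db1", "db2", "db3"]

def shard_for(user_id: str) -> str:
    """Stable mapping from a user key to one shard."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The same key always routes to the same shard, keeping a user's data together.
print(shard_for("user-42") == shard_for("user-42"))   # → True
```

Schemes like this keep any single MySQL instance's working set small enough to cache, which is the point of partitioning as traffic grows; resharding (changing the shard count) is the hard part such talks usually dwell on.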
Orion Network Performance Monitor (NPM) Optimization and Tuning Training - SolarWinds
For more information on NPM, visit: http://www.solarwinds.com/network-performance-monitor.aspx
During this Orion NPM training session, we'll demonstrate the processes for optimizing Orion's performance. We'll cover tuning parameters and procedures for:
• Orion's website
• Orion's SQL database backend
• Orion's data collection services
The information covered in this class will be helpful for those administering Orion servers of all sizes, small and large, who are interested in receiving optimal performance.
Application Performance Troubleshooting 1x1 - Von Schweinen, Schlangen und Pa... - rschuppe
Application performance doesn't come easy. How do you find the root cause of performance issues in modern and complex applications when all you have to start with is a complaining user?
In this presentation (mainly in German, but understandable for English speakers) I present the fundamentals of troubleshooting and give concrete examples of how to tackle issues.
Empowering Customers with Personalized InsightsCloudera, Inc.
Opower, a Cloudera customer, discusss how they implemented a scalable energy analysis platform that generates personalized insights for millions of people. To date, Opower’s insights have collectively saved over 5 terawatt hours of energy and $500 million in energy bills.
Web applications are becoming increasingly data intensive and complex. Yet, users demand a great user experience, including blazingly fast speeds, across many device types. In this talk, we will show you how you can dramatically improve the performance of your web applications by using Sencha Ext JS and Ext Speeder. You will learn how to: accelerate your back-end data requests up to 10x by leveraging sophisticated in-memory, object-oriented techniques, significantly improve application responsiveness without making any modifications to your client Ext JS application, and quickly get started with database acceleration in standard J2EE environments.
3. Why Monitor?
• Analyzing long-term trends: How big is my database and how fast is it growing? How quickly is my daily-active user count growing?
• Comparing over time or across experiments: Are queries faster with Acme Bucket of Bytes 2.72 versus Ajax DB 3.14? How much better is my memcache hit rate with an extra node? Is my site slower than it was last week?
• Alerting: Something is broken, and somebody needs to fix it right now! Or, something might break soon, so attention is needed soon.
• Building dashboards: Dashboards should answer basic questions about your service, and normally include some form of the four golden signals.
• Conducting ad hoc retrospective analysis: Our latency just shot up; what else happened around the same time?
https://landing.google.com/sre/sre-book/chapters/monitoring-distributed-systems/
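The four golden signals mentioned in the dashboard bullet above can be sketched with plain Python over a request log. Everything here is invented for illustration: the record layout, the numbers, and the variable names are assumptions, not anything from the slides.

```python
from statistics import median

# Hypothetical request records: (timestamp_s, latency_ms, status_code).
requests = [
    (0.0, 120, 200), (0.4, 95, 200), (1.1, 2300, 500),
    (1.5, 88, 200), (2.0, 143, 404), (2.2, 101, 200),
]

window_s = requests[-1][0] - requests[0][0]

# Traffic: demand on the system, here as requests per second.
traffic_qps = len(requests) / window_s

# Errors: fraction of requests that fail (HTTP 5xx here).
error_rate = sum(1 for _, _, code in requests if code >= 500) / len(requests)

# Latency: look at the distribution, not just the average; a single
# slow request can hide behind a healthy-looking mean.
latencies = sorted(lat for _, lat, _ in requests)
median_ms = median(latencies)
worst_ms = latencies[-1]

# Saturation (how "full" the service is) needs a system-level measure
# such as CPU load or queue depth, so it is omitted from this sketch.
print(f"traffic={traffic_qps:.2f} qps, errors={error_rate:.1%}, "
      f"median={median_ms} ms, worst={worst_ms} ms")
```

Note how the median (110.5 ms) looks healthy while the worst request took 2.3 s; this is why tail latency deserves its own signal.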
4. Setting Reasonable Expectations for Monitoring
• Prefer simpler and faster monitoring systems, with better tools for post hoc analysis.
• Other uses of monitoring data, such as capacity planning and traffic prediction, can tolerate more fragility, and thus more complexity.
• To keep noise low and signal high, the elements of your monitoring system that direct to a pager need to be very simple and robust.
• Rules that generate alerts for humans should be simple to understand and represent a clear failure.
https://landing.google.com/sre/sre-book/chapters/monitoring-distributed-systems/
5. Example symptoms and causes
Symptom: I’m serving HTTP 500s or 404s.
Cause: Database servers are refusing connections.

Symptom: My responses are slow.
Cause: CPUs are overloaded by a bogosort, or an Ethernet cable is crimped under a rack, visible as partial packet loss.

Symptom: Users in Antarctica aren’t receiving animated cat GIFs.
Cause: Your Content Distribution Network hates scientists and felines, and thus blacklisted some client IPs.

Symptom: Private content is world-readable.
Cause: A new software push caused ACLs to be forgotten and allowed all requests.

https://landing.google.com/sre/sre-book/chapters/monitoring-distributed-systems/
6. Black-box monitoring vs. white-box monitoring
Black-box monitoring:
• Restricted to externally visible service behaviour
• e.g. ping, HTTP probes
White-box monitoring:
• Based on internal metrics and logs that surface system issues
• e.g. request_errors_total, request_processing_time
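A minimal white-box sketch: the service itself records errors and processing times from the inside. The `Metrics` class and `handle` function are invented for illustration; only the metric names `request_errors_total` and `request_processing_time` come from the slide. A black-box check, by contrast, would run outside the service, e.g. a periodic ping or HTTP probe against the public endpoint.

```python
import time

class Metrics:
    """Illustrative in-process metrics the service would export."""
    def __init__(self):
        self.request_errors_total = 0      # monotonically increasing error count
        self.request_processing_time = []  # per-request wall time in seconds

    def observe(self, handler, *args):
        # Wrap a request handler, recording its duration and failures.
        start = time.monotonic()
        try:
            return handler(*args)
        except Exception:
            self.request_errors_total += 1
            raise
        finally:
            self.request_processing_time.append(time.monotonic() - start)

metrics = Metrics()

def handle(x):
    # A stand-in request handler that rejects "bad" input.
    if x < 0:
        raise ValueError("bad request")
    return x * 2

result = metrics.observe(handle, 21)   # successful request
try:
    metrics.observe(handle, -1)        # failing request
except ValueError:
    pass

print(result, metrics.request_errors_total, len(metrics.request_processing_time))
```

The point of the contrast: white-box data like this tells you *why* something is wrong internally, while a black-box probe only tells you *that* users are affected right now.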
9. Choosing an Appropriate Resolution for Measurements
• Observing CPU load over the time span of a minute won’t reveal even quite long-lived spikes that drive high tail latencies.
• For a web service targeting no more than 9 hours of aggregate downtime per year (99.9% annual uptime), probing for a 200 (success) status more than once or twice a minute is probably unnecessarily frequent.
• Similarly, checking hard drive fullness more than once every 1–2 minutes for a service targeting 99.9% availability is probably unnecessary.
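The 9-hours figure above follows from simple arithmetic: a 99.9% target leaves 0.1% of the year as permissible downtime. A quick sketch (the helper name is invented, not from the slides):

```python
# Aggregate downtime a service may accrue while still meeting
# an availability target over a given period (default: one year).
def allowed_downtime_hours(availability, period_hours=365 * 24):
    return period_hours * (1 - availability)

print(round(allowed_downtime_hours(0.999), 2))   # 99.9% over a year
print(round(allowed_downtime_hours(0.9999), 2))  # 99.99% over a year
```

Each extra "nine" shrinks the budget tenfold, which is why probe frequency, detection time, and measurement resolution all need to be matched to the target rather than set as aggressively as possible.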