This talk will provide motivation for the extensive instrumentation of complex computer systems and make the argument that such instrumentation is essential. It will offer practical starting points in Erlang projects and maintain a perspective on the human organization around the computer system. Brian will focus on getting started with instrumentation in a systematic way and follow up with the challenge of interpreting and acting on metrics emitted from a production system in a way which does not overwhelm operators’ ability to effectively control or prioritize faults in the system. He’ll use historical examples and case studies from his work to keep the talk anchored in the practical.
Talk objectives:
Brian hopes to convince the audience of two things:
* that monitoring and instrumentation are essential components of any long-lived system and
* that it's not so hard to get started, after all.
He’ll keep a clear-eyed view of what works and what is difficult in practice so that the audience can make a reasoned decision after the talk.
Target audience:
This talk would appeal to engineers responsible for long-running production deployments, operations folks, and Erlangers in general.
Monitoring Complex Systems - Chicago Erlang, 2014 - Brian Troutwine
Imagine being responsible for monitoring 100 servers. Now imagine 1000. Each server has 100 different things to keep track of. What do you pay attention to and what do you ignore? What is important? In this talk Brian will show how Erlang can be used to capture more information without compromising clarity — i.e. to keep track of the forest without losing sight of the trees!
10 Billion a Day, 100 Milliseconds Per: Monitoring Real-Time Bidding at AdRoll - Brian Troutwine
This is the talk I gave at Erlang Factory SF Bay Area 2014. In it I discussed the instrumentation-by-default approach taken by the AdRoll real-time bidding team, the technical details of the libraries we use, and lessons learned in adapting an organization to deal with the onslaught of data from instrumentation.
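"Instrumentation by default" can be approximated in any language by making measurement the path of least resistance. A minimal Python sketch of the idea — the names `METRICS`, `timed`, and `evaluate_bid` are illustrative inventions, not AdRoll's actual API:

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process metrics store: call counters and latency samples.
METRICS = {"counters": defaultdict(int), "timers": defaultdict(list)}

def timed(name):
    """Decorator: every call increments a counter and records its latency,
    so code is instrumented by default rather than as an afterthought."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                METRICS["counters"][name] += 1
                METRICS["timers"][name].append(time.perf_counter() - start)
        return wrapper
    return decorate

@timed("bid.evaluate")
def evaluate_bid(price):
    return price * 0.95  # stand-in for real bidding logic

evaluate_bid(100)
evaluate_bid(200)
print(METRICS["counters"]["bid.evaluate"])  # 2
```

In a real system the `METRICS` dict would be replaced by a library such as exometer, which also handles aggregation and shipping samples off-box.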
From list sorting to network routing, and from hash tables to capacity planning, a programmer's daily work is filled with probability. We use probabilistic algorithms, data structures, and systems constantly, often without even thinking about it. Experienced engineers reach for probabilistic algorithms frequently and intentionally, especially when building systems of serious scale. How do probabilistic algorithms actually work in practice? And how do we know they'll be safe and reliable in our critical production systems? We'll address those questions, explore a few algorithms, and see why "with high probability" is often better than "exactly".
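A canonical example of "with high probability" beating "exactly" is the Bloom filter: set membership in constant space, at the cost of occasional false positives (but never false negatives). A minimal sketch; real deployments would size `m` and `k` from the desired error rate:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: membership tests may return false positives,
    but an item that was added is always found."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k        # m bits, k hash functions
        self.bits = [False] * m

    def _positions(self, item):
        # Derive k independent positions by salting one strong hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = True

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter()
bf.add("alice")
bf.add("bob")
print("alice" in bf)  # True: added items are always found
```

The one-sided error is the whole trick: a "no" is certain, a "yes" is merely very likely, and the false-positive rate is tunable by spending more bits.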
Strata 2014 Talk: Tracking a Soccer Game with Big Data - Srinath Perera
Mobile devices, sensors and GPS are driving the demand to handle big data in both batch and real time. This presentation discusses how we used complex event processing (CEP) and MapReduce based technologies to track and process data from a soccer match as part of the annual DEBS event processing challenge. In 2013, the challenge included a data set generated by a real soccer match in which sensors were placed in the soccer ball and players’ shoes. This session will review how we used CEP to implement the DEBS challenge and achieved throughput in excess of 100,000 events/sec. It also will examine how we extended the solution to conduct batch processing using business activity monitoring (BAM) within the same framework, enabling users to obtain both instant analytics and more detailed batch-processing-based results.
Deadlock is a situation where a set of processes are blocked because each process is holding a resource and waiting for another resource acquired by some other process.
Mutual Exclusion: one or more resources are non-sharable (only one process can use a resource at a time)
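The classic two-lock deadlock — each thread holding one resource while waiting for the other's — and the lock-ordering discipline that prevents it can be sketched in Python:

```python
import threading

# Two non-sharable resources. Deadlock arises when thread A holds lock_a
# and waits for lock_b while thread B holds lock_b and waits for lock_a.
# The fix below breaks the "circular wait" condition: every thread
# acquires locks in one agreed-upon global order.
lock_a, lock_b = threading.Lock(), threading.Lock()
results = []

def worker(name):
    # Always acquire lock_a before lock_b, in every thread.
    with lock_a:
        with lock_b:
            results.append(name)

threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # both workers finish; no circular wait is possible
```

Reversing the acquisition order in one of the threads would reintroduce the possibility of deadlock, which is why lock ordering is usually enforced as a project-wide convention.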
Polyglot Persistence in the Real World: Cassandra + S3 + MapReduce - thumbtacktech
This talk focuses on building a system from scratch, showing how to perform analytical queries in near real time and still get the benefits of Cassandra’s high-performance database engine. The key subjects of my speech are:
● The splendors and miseries of NoSQL
● Apache Cassandra use-cases
● Difficulties of using MapReduce directly in Cassandra
● Amazon cloud solutions: Elastic MapReduce and S3
● “real-enough” time analysis
In particular the talk dives into ways of handling different kinds of semi-ad-hoc queries when using Cassandra and the pitfalls of designing a schema around a specific analytics use case. Some attention will be paid to dealing with time-series data in particular, which can present a real problem when using Column-Family or Key-Value store databases.
The Cassandra database is an excellent choice when you need scalability and high availability without compromising performance. Cassandra’s linear scalability, proven fault tolerance and tunable consistency, combined with its being optimized for write traffic, make it an attractive choice for performing structured logging of application and transactional events. But using a columnar store like Cassandra for analytical needs poses its own problems, problems we solved by careful construction of Column Families combined with diplomatic use of Hadoop.
This tutorial focuses on building a similar system from scratch, showing how to perform analytical queries in near real time while still getting the benefits of Cassandra’s high-performance database engine. The key subjects are:
• The splendors and miseries of NoSQL
• Apache Cassandra use cases
• Difficulties of using Map/Reduce directly in Cassandra
• Amazon cloud solutions: Elastic MapReduce and S3
• “Real-enough” time analysis
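One recurring schema trick for time-series data in Column-Family stores is bucketing: folding part of the timestamp into the partition key so no single row grows without bound. A minimal sketch of the idea — the daily bucket and key format are illustrative assumptions, not the talk's actual schema:

```python
from datetime import datetime, timezone

def row_key(sensor_id, ts, bucket="%Y-%m-%d"):
    """Compose a partition key of (sensor, day) so each day's events land
    in a bounded row instead of one ever-growing row per sensor."""
    return f"{sensor_id}:{ts.strftime(bucket)}"

ts = datetime(2014, 3, 7, 14, 30, tzinfo=timezone.utc)
print(row_key("sensor-17", ts))  # sensor-17:2014-03-07
```

Within each bucketed row, events are then clustered by timestamp, which keeps both writes and range scans over a time window cheap.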
Where the wild things are - Benchmarking and Micro-Optimisations - Matt Warren
You don’t want to prematurely optimise, but sometimes you do want to optimise; the question is where to start. Profiling and benchmarking can help you figure out what your application is doing and where performance problems could arise - allowing you to find (and fix!) them before your customers do.
If you aren’t already benchmarking your code this talk will offer some starting points. We’ll look at how to accurately benchmark in .NET and things to avoid. Along the way we’ll also discover some surprising code optimisations!
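The talk's examples are .NET, but the core discipline — repeat the measurement, take the best run, and beware timer resolution and dead-code elimination — is language-agnostic. A minimal sketch using Python's standard-library `timeit`:

```python
import timeit

def join_strings(n=1000):
    # The code under measurement; returning the result guards against
    # the work being optimised away.
    return "".join(str(i) for i in range(n))

# repeat() runs the timing loop several times; the *minimum* is the usual
# estimate, since noise (GC, scheduling) only ever makes runs slower.
samples = timeit.repeat(join_strings, number=100, repeat=5)
best = min(samples) / 100  # seconds per call
print(f"best: {best * 1e6:.1f} us/call")
```

Dedicated harnesses (BenchmarkDotNet in .NET, pyperf in Python) add warmup, statistical analysis, and protection against more of the pitfalls the talk covers.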
In this talk about performance for the 2018-01-25 Dallas Oracle Users Group meeting, Cary Millsap talks about application design, ways to control Oracle sql_trace, measurement intrusion, and function interposition.
Feature Engineering - Getting most out of data for predictive models - TDC 2017 - Gabriel Moreira
How should data be preprocessed for use in machine learning algorithms? How do you identify the most predictive attributes of a dataset? What features can we generate to improve the accuracy of a model?
Feature Engineering is the process of extracting and selecting, from raw data, features that can be used effectively in predictive models. As the quality of the features greatly influences the quality of the results, knowing the main techniques and pitfalls will help you to succeed in the use of machine learning in your projects.
In this talk, we will present methods and techniques that allow us to extract the maximum potential of the features of a dataset, increasing the flexibility, simplicity and accuracy of the models. We will cover the analysis of the distribution of features and their correlations, and the transformation of numeric attributes (scaling, normalization, log-based transformation, binning), categorical attributes (one-hot encoding, feature hashing), temporal attributes (date/time), and free-text attributes (text vectorization, topic modeling).
Python, Scikit-learn, and Spark SQL examples will be presented, along with how to use domain knowledge and intuition to select and generate features relevant to predictive models.
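Two of the transformations listed above — one-hot encoding and a log-based transform — can be sketched in plain Python; real projects would typically use Scikit-learn's equivalents instead:

```python
import math

def one_hot(value, vocabulary):
    """Encode a categorical value as a 0/1 vector over a fixed vocabulary."""
    return [1 if value == v else 0 for v in vocabulary]

def log_transform(x):
    """Compress a heavy-tailed non-negative numeric feature;
    log1p keeps x = 0 valid (log1p(0) == 0)."""
    return math.log1p(x)

colors = ["red", "green", "blue"]
print(one_hot("green", colors))    # [0, 1, 0]
print(log_transform(0))            # 0.0
```

The point of both is the same: reshape raw attributes into a form where the model's assumptions (linearity, distance metrics, roughly symmetric distributions) actually hold.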
(Moonconf 2016) Fetching Moths from the Works: Correctness Methods in Software - Brian Troutwine
We live in a nice world. There’s a wealth of historical thought on achieving correctness in software–shipping code that does only what is intended, not less and not more–and there are a whole bunch of methods available to us as practitioners. Some of these are hard to apply, some are easy. For instance, case testing is widely used and considered standard practice. Property testing is understood to exist but not widely used. The application of advanced logics? Way out there.
If you look around you’ll find a lot of software fails a lot of the time. Why is that?
In this talk I’ll give an overview of the methods for producing correct systems and will discuss each in its historical context. With each method, we’ll keep an eye out for present applications and the difficulty of doing so. We’ll discuss why there’s so much buggy software in the world. I expect there will be talk of spaceships a bit. By the end of this talk you ought to be able to make reasoned decisions about applying correctness methods in your own work and have a good shot at building better software.
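Property testing, mentioned above as understood-but-underused, needs no framework to demonstrate: generate random inputs and check invariants that must hold for all of them. A hand-rolled sketch (tools like QuickCheck or Hypothesis add generation strategies and failure shrinking on top):

```python
import random

def naive_sort(xs):
    # The code under test; a stand-in for any sort implementation.
    return sorted(xs)

def check_sort_properties(trials=200):
    """Random inputs, universal invariants: ordered output that is a
    permutation of the input. Fails loudly on the first counterexample."""
    for _ in range(trials):
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        out = naive_sort(xs)
        assert all(a <= b for a, b in zip(out, out[1:]))  # ordered
        assert sorted(out) == sorted(xs)                  # same multiset
    return True

print(check_sort_properties())  # True
```

Compare with case testing, which checks a handful of hand-picked inputs: properties quantify over the input space, which is why they catch edge cases the author never thought to write down.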
Getting Uphill on a Candle: Crushed Spines, Detached Retinas and One Small Step - Brian Troutwine
Looking back through history, we often view NASA’s early mission in terms of “getting to the Moon”, discussing how this or that program served the purpose of answering Kennedy’s challenge. This is wrong-headed. In this talk I will discuss aeronautics research beginning with the Wright Brothers and ending with the first Shuttle launch in 1981. We’ll see how NASA is an organization whose primary mission is basic research and development in aeronautics for the benefit of the public at large and space exploration. We’ll see how the Lunar Program was a focusing of research to a practical, political aim which built off decades of basic research and necessarily side-lined other programs. It’s my aim to convince you that Moonshot projects cannot be considered independently of their organizations and their history.
The Charming Genius of the Apollo Guidance Computer - Brian Troutwine
The Apollo Project was the first flight system to deploy with a digital, general-purpose computer made of integrated circuits at its core: the Apollo Guidance Computer (AGC). It was a complete research project: no IC computer had run continuously for more than a few hours, sophisticated programming techniques were unknown, and the interactive human/computer interface had to be invented and made to appeal to astronauts opposed to machine interference in flight operations.
In this talk I'll give the historical context for the AGC, discuss its initial design and the evolution of this design as the Apollo Project progressed. We'll do a deep-dive on the machine architecture and note how tight integration with a special-purpose vehicle admitted incredibly sophisticated behaviour from a primitive machine. We'll further discuss the human/computer interface for the AGC, how the astronaut's flight roles dictated the computer's role and vice versa. Motivating examples from select Apollo flights will be used.
Throughout, we'll keep an eye on lessons to be gleaned from the experience of engineering the AGC and how we can adapt these lessons to modern computer systems in mission-critical deployments.
Fault-tolerance on the Cheap: Making Systems That (Probably) Won't Fall Over - Brian Troutwine
Building computer systems that are reliable is hard. The functional programming community has invested a lot of time and energy into up-front-correctness guarantees: types and the like. Unfortunately, absolutely correct software is time-consuming to write and expensive as a result. Fault-tolerant systems achieve system-total reliability by accepting that sub-components will fail and planning for that failure as a first-class concern of the system. As companies embrace the wave of "as-a-service" architectures, failures of sub-systems become a more pressing concern. Using examples from heavy industry, aeronautics and telecom systems, this talk will explore how you can design for fault-tolerance and how functional programming techniques get us most of the way there.
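Planning for sub-component failure can be sketched outside Erlang too: a supervisor that restarts a failing task up to a limit, on the assumption that many faults are transient. This is a toy analogue of OTP's one_for_one strategy, not the talk's actual code:

```python
def supervise(task, max_restarts=3):
    """Run task; on a crash, restart it, up to max_restarts times.
    Failure of the child is expected and planned for, not exceptional."""
    attempts = 0
    while True:
        try:
            return task()
        except Exception:
            attempts += 1
            if attempts > max_restarts:
                raise  # escalate: let the layer above decide

calls = {"n": 0}

def flaky():
    # Fails twice with a transient fault, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient fault")
    return "ok"

print(supervise(flaky))  # "ok", after two restarts
```

The restart limit matters as much as the restart itself: a fault that survives several clean restarts is not transient, and the right move is to escalate rather than loop forever.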
Let it crash! The Erlang Approach to Building Reliable Services - Brian Troutwine
In this talk, using the Erlang hacker's semi-official motto "Let it Crash!" as a lens, I'll speak to how radical simplicity of implementation, straightforward runtime characteristics and discoverability of the running system lead to computer systems which have great success in networked, always-on deployments.
I will argue that while Erlang natively implements many features which aid the construction of such systems--functional programming language semantics, lack of global mutable state, first-class networking, for instance--these characteristics can be replicated in any computer system, as a part of initial design of new systems or the gradual evolution of an existing project. I'll discuss common design patterns and anti-patterns, using my own work at AdRoll and project experience reports from a variety of fields--both successes and failures--to advance my argument.
This will be a very practical talk and will be accessible to engineers of all backgrounds.
Automation With Humans in Mind: Making Complex Systems Predictable, Reliable ... - Brian Troutwine
I believe that our current approach to designing software systems is driving society in a bad direction. In particular, I believe we are creating a society predicated on automation which is oriented to be serviced by humans or, requiring no service, is simply in control of humans. Ignoring the dystopian overtones of this, I argue that this is a technically flawed approach, that such automation is less reliable, less flexible and less robust through time than a system designed with humans as the controlling party in mind. I will argue--with a mix of personal experience, reference to academic literature and historical examples--that complex systems designed with human control in mind are more lasting through time, more technically excellent and just generally more useful. I will further argue that a re-orientation toward human supremacy in computer systems is especially important as we begin to tightly couple western civilization's technology to the internet, being the Internet of Things. I'll talk a bit about the political and social implications, as well, after I've made a purely technical argument.
Instrumentation as a Living Documentation: Teaching Humans About Complex Systems - Brian Troutwine
Instrumentation of complex systems is necessary and addresses the issues of static documentation of said systems. Instrumentation has flaws of its own, but they are resolvable with an intentional kind of documentation.
Given at Write the Docs, Portland OR 2014.
Presentation slides given at Erlounge Bay Area, January 2014. The deck is a brief introduction to the Erlang library exometer and gives an overview of my work at AdRoll to increase monitoring and insight of the running real-time bidding system.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTier1 app
Even though at surface level ‘java.lang.OutOfMemoryError’ appears as one single error; underlyingly there are 9 types of OutOfMemoryError. Each type of OutOfMemoryError has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
13-15. The Problem Domain
• Low latency (< 100 ms per transaction)
• Firm real-time system
• Highly concurrent (~2 million transactions per second, peak)
• Global, 24/7 operation
19. Humans are bad at predicting the performance of complex systems (…). Our ability to create large and complex systems fools us into believing that we're also entitled to understand them.
Carlos Bueno, "Mature Optimization Handbook"
42. Important Terms
• metric: a measurement
• entry: a receiver and aggregator of metrics
• reporter: that which samples entries periodically and ships them to another system
• subscription: the definition of the regular interval on which reporters sample entries
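In exometer these terms map onto a handful of calls. A sketch follows; the metric name, reporter options, datapoint and interval are illustrative assumptions, not values from the talk:

```erlang
%% Create an entry and feed it metrics (measurements).
exometer:new([bid, latency], histogram),
exometer:update([bid, latency], 42),

%% Attach a reporter (here the bundled statsd reporter; the host and
%% port are placeholder values).
exometer_report:add_reporter(exometer_report_statsd,
                             [{hostname, "localhost"}, {port, 8125}]),

%% A subscription: sample the 95th-percentile datapoint of the entry
%% every 10 seconds and ship it via the reporter.
exometer_report:subscribe(exometer_report_statsd,
                          [bid, latency], 95, 10000).
```

Note how each concept is decoupled: the entry aggregates updates on its own, and reporters only see whatever datapoints their subscriptions sample.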
43. exometer
• Responsive upstream (Ulf Wiger never sleeps?)
• Metric collection, aggregation and reporting decoupled.
• Static and dynamic configuration.
• Very low, predictable runtime overhead.
58. Creating Reporters
• Very easy to add your own reporters and entries.
• Reporters and entries can be proprietary. They just have to be loaded at runtime.
• Authors are responsive to issues.
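To give a sense of how small a custom reporter can be, here is a sketch of one that simply prints each sampled datapoint. The module name and output format are invented for illustration; the callback shapes follow exometer's `exometer_report` behaviour:

```erlang
-module(console_reporter).
-behaviour(exometer_report).

%% The behaviour also requires exometer_subscribe/5, exometer_unsubscribe/4,
%% exometer_info/2, exometer_call/3, exometer_cast/2, exometer_terminate/2,
%% exometer_setopts/4 and exometer_newentry/2, which can be simple no-ops
%% for a reporter this small; they are elided here.
-export([exometer_init/1, exometer_report/5]).

%% Called once when the reporter is added via exometer_report:add_reporter/2.
exometer_init(Opts) ->
    {ok, Opts}.

%% Called on each subscription interval with the sampled value.
exometer_report(Metric, DataPoint, _Extra, Value, State) ->
    io:format("~p ~p = ~p~n", [Metric, DataPoint, Value]),
    {ok, State}.
```

Because reporters are just modules loaded at runtime, a proprietary one can ship the same way, pointed at whatever internal system you need.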
73. • Scheduler threads were locked to CPUs
• A background process runs every 20 minutes and consumes a lot of CPU time
• No cpu-shield was set up on our production systems
• The OS bumped a scheduler thread off its CPU, backing up its run-queue