1) Monitoring availability is important for Elixir applications. Running applications in two AWS regions can, in theory, provide roughly 99.99998% availability, but failover between regions is not instant because of DNS TTLs and health-check delays.
2) 70% of outages are caused by changes to live systems, so monitoring release processes is key. Logging requests with request IDs and aggregating the data helps monitor availability and failures.
3) Measuring business and system metrics along with VM metrics provides full visibility. Bleacher Report's monitoring includes logging to Logz, metrics to Datadog, release tracking with Jenkins, and alerting with Opsgenie.
Empex - Monitoring Elixir Applications
5. “Evidence for the long-term operational stability of the system had also not
been collected in any systematic way. For the Ericsson AXD301 the only
information on the long-term stability of the system came from a power-point
presentation showing some figures claiming that a major customer had run an
11 node system with a 99.9999999% reliability, though how these figure had
been obtained was not documented.”4
11. 2 Region AWS Availability
99.95% = .9995
Errors = .0005
.0005 * .0005 = 0.00000025
1 - 0.00000025 = 0.99999975
~99.99998% (more than six nines, almost seven!)
~8s downtime in a year7,8 (but…)
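The arithmetic on this slide can be checked with a minimal sketch. It assumes, as the slide does, that the two regions fail independently, so the system is down only when both are down at once. The function names are illustrative. Note that the unrounded figure (0.99999975) works out to roughly 8 seconds a year; the 6 second figure follows from the rounded 99.99998%.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def combined_availability(per_region: float, regions: int = 2) -> float:
    """Availability when the system is down only if every region is down at once.

    Assumes regions fail independently of each other.
    """
    return 1 - (1 - per_region) ** regions


def yearly_downtime_seconds(availability: float) -> float:
    """Expected seconds of downtime per year at the given availability."""
    return (1 - availability) * SECONDS_PER_YEAR


two_region = combined_availability(0.9995)
print(f"combined availability: {two_region:.8f}")
print(f"yearly downtime: {yearly_downtime_seconds(two_region):.1f} s")
```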
12. Failover between Regions isn’t free
DNS TTL + (Interval * Threshold)6
60s + (5 * 10s) = 110s
Now we’re in the minutes range
Let’s say 1 region fails 3 times in a year10
That would be 5.5 minutes
13. Uh oh, 5.26 minutes a year is the whole downtime budget for 5 9s
and we have more problems...
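A rough sketch of the failover math from slides 12 and 13, using the slide's own numbers (60s TTL, 10s health check interval, threshold of 5, 3 regional failures a year). The function names are made up for illustration, and the year is taken as 365.25 days, which is what gives the 5.26 minute figure.

```python
def failover_seconds(dns_ttl_s: float, check_interval_s: float, failure_threshold: int) -> float:
    """Worst-case time to fail over: DNS TTL + (interval * threshold)."""
    return dns_ttl_s + check_interval_s * failure_threshold


def nines_budget_minutes(nines: int) -> float:
    """Yearly downtime budget, in minutes, for the given number of nines."""
    minutes_per_year = 365.25 * 24 * 60  # 525,960
    return minutes_per_year * 10 ** -nines


per_failover_s = failover_seconds(dns_ttl_s=60, check_interval_s=10, failure_threshold=5)
yearly_minutes = 3 * per_failover_s / 60  # 3 regional failures in a year
budget_minutes = nines_budget_minutes(5)

print(f"per failover: {per_failover_s:.0f} s")
print(f"yearly failover downtime: {yearly_minutes:.2f} min")
print(f"five nines budget: {budget_minutes:.2f} min")
print("over budget" if yearly_minutes > budget_minutes else "within budget")
```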
14. Some ISPs don’t honor DNS TTLs11
(so some long tail will hit the region with the outage)
15. I haven’t even touched the issue of
data replication across geographically
distant data centers (regions)
Mention B/R and consumer team
Lead on monitoring
Mention the initial focus on availability
Mention the shift then to how to monitor with examples from BR
The first metric to discuss
What is it?
Is it a useful metric?
Deep dive on availability but this same process should be applied to all metrics you rely on
Conventional definition
Erlang’s nine nines
Where does that figure come from?
What does it mean?
I looked for the source
This quote comes from the 2003 thesis that I believe introduced the nine nines
Having seen the source, how reputable is this claim?
The name "cloud" evokes this nice simple visualization; in reality TCP/IP in a virtualized environment is very complex
Virtualized environments add a lot more complexity
Talk about our AWS setup
Multiple availability zones
Load balancers
When you enable an Availability Zone for your load balancer, Elastic Load Balancing creates a load balancer node in the Availability Zone. If you register instances in an Availability Zone but do not enable the Availability Zone, these registered instances do not receive traffic. Note that your load balancer is most effective if you ensure that each enabled Availability Zone has at least one registered instance.
Say over 3 nines
This is a contrived worst case. Financially incentivised to not go lower. Of course it could drop lower and it could be much higher
“Region Unavailable” and “Region Unavailability” mean that more than one Availability Zone in which you are running an instance, within the same Region, is “Unavailable” to you.
Our Elixir apps therefore are limited to 99.95% availability
Can we do better?
We’re in a single region, but let’s explore going multi-region
What kind of complexity does this introduce?
Route 53
Failover Routing Policy (active-passive failover)
There are others
Each AWS region is a completely independent stack of services, totally isolated from other regions. https://aws.amazon.com/blogs/developer/working-with-different-aws-regions/
This assumes regions are independent, which is what AWS states
This downtime is the downtime where both systems are down simultaneously.
Walk through this slide
What does the interval mean?
TTL is how long the caching name servers cache the record from the authoritative name server
What does the threshold mean?
Why not threshold 1?
Quick slide
Depending on persistence layer (Kafka, Cassandra) vs MySQL, Postgres
Possible mitigation vectors for this
Burstable instances, container orchestration, serverless
What else
Quick slide
From Google: how reliable is the cellular network? 1 out of 100 or 1 out of 1,000 requests fail
Quick slide
Google introduces Aggregate Availability in their SRE book
Contrast with traditional uptime version
Can you accurately measure time in a cloud system? VMs preempted for tens of milliseconds
What we use at BR is a modified version
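To make the contrast concrete, here is a minimal sketch of the two definitions: the traditional time-based one (uptime over total time) and the request-based aggregate availability from the SRE book (successful requests over total requests). This is the plain textbook form, not the modified version used at B/R, which these notes do not spell out. The request counts are made-up numbers.

```python
def time_based_availability(uptime_s: float, downtime_s: float) -> float:
    """Traditional definition: fraction of time the service was up."""
    return uptime_s / (uptime_s + downtime_s)


def aggregate_availability(successful_requests: int, total_requests: int) -> float:
    """Request-based definition: fraction of requests that succeeded."""
    if total_requests == 0:
        # Assumption for this sketch: no traffic counts as fully available.
        return 1.0
    return successful_requests / total_requests


# Hypothetical day of traffic: 2,500,000 requests, 1,250 of them failed.
total = 2_500_000
failed = 1_250
print(f"aggregate: {aggregate_availability(total - failed, total):.4%}")

# Hypothetical day with 43.2 seconds of measured downtime.
print(f"time based: {time_based_availability(86_400 - 43.2, 43.2):.4%}")
```

The request-based form sidesteps the question of how accurately time can be measured, since it only needs counts.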
Let’s revisit the architecture of 1 of B/R’s many elixir services
We collect metrics from the LB nodes and send them to Datadog via CloudWatch
WE CHEAT
USAGE OF CDN AND THIS IS ORIGIN
We didn’t spend 2 years switching to get worse!
Quick slide
Ruby apps were under less active development and being sunsetted in favor of Elixir apps
Ruby apps aren’t changed except for exceptional cases
This is a comparison of microservices
The Elixir apps see deploys daily or even multiple times a day
Version Control - Software
Version Control - Infrastructure
Record releases with date, time, and releaser
Planning for rollback
We don’t deploy after 3pm*
What is happening in my system?
ELK
Logz
plug_logger_json
ecto_logger_json
Bug me on the docs
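As a rough illustration of the idea behind JSON request logging, the sketch below emits one structured log line per request, tagged with a request ID so lines can be aggregated and correlated later. It is written in Python for brevity, and the field names are hypothetical; this is not the output format of plug_logger_json or ecto_logger_json.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def request_log_line(method: str, path: str, status: int,
                     duration_ms: float, request_id: Optional[str] = None) -> str:
    """Build one JSON log line for a handled request.

    All field names here are illustrative assumptions.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Reuse the caller's request ID if given, otherwise generate one.
        "request_id": request_id or uuid.uuid4().hex,
        "method": method,
        "path": path,
        "status": status,
        "duration_ms": duration_ms,
    }
    return json.dumps(entry)


line = request_log_line("GET", "/articles", 200, 12.5, request_id="abc123")
print(line)
```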
Is my system working correctly?
When something goes wrong? How do I know?
The danger of false positives. Start conservative
Alerting and avoiding midnight wake-up calls
Opsgenie
1 week shifts
Different severities / different response times
Anyone in the company can submit an issue, not just automation
We have a Slack channel for these alerts
An alert is then split out into a dedicated channel with stakeholders
Goal is to triage and try to solve yourself but have permission to drag someone not on call in if necessary.
Management also gets paged.
Phone calls if not answered within 2 minutes
When something goes wrong? How do I know?
The danger of false positives. Start conservative
Needed for capacity planning and figuring out user behaviors
There have been many studies on the importance of latency with regard to customer behavior. A 100ms difference can have a measurable impact on revenue or satisfaction
Percentiles not averages!!!!
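A small sketch of why percentiles matter more than averages. The latency samples are invented: 95 fast requests and 5 very slow ones. The percentile function uses the nearest-rank method, which is one of several common definitions and is an assumption of this sketch.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 95 fast requests, 5 very slow ones.
latencies_ms = [20] * 95 + [2000] * 5

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

The mean lands at a latency no request actually had, while p50 and p99 show both the typical request and the slow tail.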
Clocks - how accurately can you actually measure latency?
We tend to shove business metrics into a separate analytics bucket. But what is analytics? Is that not business logic monitoring?
How many users are using this piece of software, etc.
CPU, open file descriptors, memory, disk space, network
These can creep up over time
Check time
Run Queue
Run Queue
100% CPU - after a few minutes, the run queue would increase to 20ish
Infinite loop caused by a missed pattern match case
Run Queue
You need to deep dive like we did with availability for all the metrics you monitor