The document discusses several challenges for large-scale systems operating in cloud environments, including failure, inconsistency, continuous deployment, and installation errors. It describes tactics used by companies like Google and Netflix to address issues like fault tolerance, eventual consistency, rolling upgrades of loosely coupled services, and error diagnosis across distributed systems. The document also outlines research at NICTA to build process models for installation, analyze configuration errors, and develop new tactics for managing changes in complex cloud applications.
VMworld 2013: VMware NSX: A Customer’s Perspective VMworld
VMworld 2013
Taruna Gandhi, VMware
Jason Puig, Symantec
Richard Sillito, WestJet
Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare
Learn IP Address Management Best Practices including: challenges, IPv6: what you need to know today, IPv6 benefits, recommendations, and what to look for when evaluating an IPAM Solution.
VMworld 2013: VMware NSX: A Customer’s Perspective VMworld
VMworld 2013
Taruna Gandhi, VMware
Jason Puig, Symantec
Richard Sillito, WestJet
Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare
Learn IP Address Management Best Practices including: challenges, IPv6: what you need to know today, IPv6 benefits, recommendations, and what to look for when evaluating an IPAM Solution.
SolarWinds Product Management Technical Drilldown on Deep Packet Inspection a...SolarWinds
In this webinar, Group Product Manager Rob Hock and Network Management Head Geek Leon Adato will show how SolarWinds deep packet inspection and analysis and the new Quality of Experience dashboard within Network Performance Monitor (NPM) version 11 can help you solve both common and complex application and network performance issues.
Your Applications Are Distributed, How About Your Network Analysis Solution?Savvius, Inc
Watch the full OnDemand Webcast: http://bit.ly/wcappdistributed
Corporate computing has crossed the threshold to highly distributed, and it doesn’t matter if your corporation has 50 or 50,000 employees. The reasons are many, including cloud usage, SAAS subscriptions, or just highly distributed enterprise networks and users. Have your network analysis capabilities evolved with your corporate computing infrastructure?
Traditional network analysis and troubleshooting solutions were centralized, and back when all corporate computing assets were located in and routed through a central data center this approach worked well. But with more and more applications migrating from your data center to remote locations (cloud or otherwise), your ability to monitor, analyze and troubleshoot mission critical application performance is compromised by a centralized network analysis approach. It’s imperative that you also distribute your network analysis capabilities. More than that, you need to evolve from a "point and shoot" network troubleshooting approach to a distributed, 24x7 network monitoring approach. Distributing your network analysis capabilities and monitoring 24x7 allows you to respond to network issues before they become full blown problems and to analyze and troubleshoot issues proactively vs. reactively. Join us to examine the technologies and approaches to best deal with today’s highly distributed network applications.
In this webcast, we will cover:
Current trends in enterprise application distribution
Key performance indicators when monitoring overall application performance
Network analysis architectures for today’s distributed application environment
Benefits of employing distributed, 24x7 solutions for application and network performance monitoring
What you will learn:
How to effectively use network analysis solutions for application performance monitoring
How to deploy network analysis solutions to ensure adequate coverage of your network
Why "point and shoot" network troubleshooting approaches are no longer enough
Government and Education: IT Tools to Support Your Hybrid WorkforceSolarWinds
During this interactive webinar, our presenter discussed how to leverage IT tools to improve operations management and support for on-site and remote workers.
Attendees learned about:
• Improving network monitoring with ipMonitor® and configuration management with Kiwi CatTools®
• Centralizing and simplifying log message management across network devices and servers
• Monitoring logs with Kiwi Syslog® Server
• Utilizing IT service management tools like SolarWinds Web Help Desk® or SolarWinds Service Desk to improve resolution rates and provide self-service
• Leveraging Dameware® tools to support users and manage systems remotely
This presentation discusses the problems faced with managing a branch office infrastructure. It looks at current technologies for resolving these issues and gives a quick introduction of what to expect in the near future with Windows 7 and Windows Server 2008 R2.
NER & Emerson Infrastructure Optimization Capabilties StoryboardGreg Stover
NER Data Corporation is the Premier National Value Added Master Distributuor of Emerson, Liebert and Avocent Technology Solutions.
No other Distributor knows Emerson Cooling, Power, Monitoring and DCIM Solutions better, has better programs, has better pricing and/or has better Technical Consultiing capability then NER.
NER will show you how to optimize you existing Emerson, Liebert and Avocent investments while ensuring you are properly positioned for the future!
The Economics of Scale: Promises and Perils of Going DistributedTyler Treat
What does it take to scale a system? We'll learn how going distributed can pay dividends in areas like availability and fault tolerance by examining a real-world case study. However, we will also look at the inherent pitfalls. When it comes to distributed systems, for every promise there is a peril.
Government and Education Webinar: Recovering IP Addresses on Your NetworkSolarWinds
In this webinar, we discussed how to recover IP addresses on your network, whether abandoned, static, or reserved.
During this interactive webinar, attendees learned about:
Gain a more complete view of your network and find abandoned IP addresses
Obtain accurate, current information about the IP addresses in use on your network
Easily change the address status from “Used” to “Available”
Receive alerts when DHCP address pools exceed utilization thresholds
>>All-In-One Solution Situational Awareness, Fault Tolerance,
Centralized Monitoring and Cyber Security Readiness Solution.
>>Real-time situational awareness
>>Powerful enterprise network management and centralized monitoring
>>Robust cyber security readiness
>>Fault tolerance and increased ROI
>>NeuralStar serves as the military-grade network monitoring components at the heart of some of the world’s most sophisticated and secure networks.
SolarWinds Product Management Technical Drilldown on Deep Packet Inspection a...SolarWinds
In this webinar, Group Product Manager Rob Hock and Network Management Head Geek Leon Adato will show how SolarWinds deep packet inspection and analysis and the new Quality of Experience dashboard within Network Performance Monitor (NPM) version 11 can help you solve both common and complex application and network performance issues.
Your Applications Are Distributed, How About Your Network Analysis Solution?Savvius, Inc
Watch the full OnDemand Webcast: http://bit.ly/wcappdistributed
Corporate computing has crossed the threshold to highly distributed, and it doesn’t matter if your corporation has 50 or 50,000 employees. The reasons are many, including cloud usage, SAAS subscriptions, or just highly distributed enterprise networks and users. Have your network analysis capabilities evolved with your corporate computing infrastructure?
Traditional network analysis and troubleshooting solutions were centralized, and back when all corporate computing assets were located in and routed through a central data center this approach worked well. But with more and more applications migrating from your data center to remote locations (cloud or otherwise), your ability to monitor, analyze and troubleshoot mission critical application performance is compromised by a centralized network analysis approach. It’s imperative that you also distribute your network analysis capabilities. More than that, you need to evolve from a "point and shoot" network troubleshooting approach to a distributed, 24x7 network monitoring approach. Distributing your network analysis capabilities and monitoring 24x7 allows you to respond to network issues before they become full blown problems and to analyze and troubleshoot issues proactively vs. reactively. Join us to examine the technologies and approaches to best deal with today’s highly distributed network applications.
In this webcast, we will cover:
Current trends in enterprise application distribution
Key performance indicators when monitoring overall application performance
Network analysis architectures for today’s distributed application environment
Benefits of employing distributed, 24x7 solutions for application and network performance monitoring
What you will learn:
How to effectively use network analysis solutions for application performance monitoring
How to deploy network analysis solutions to ensure adequate coverage of your network
Why "point and shoot" network troubleshooting approaches are no longer enough
Government and Education: IT Tools to Support Your Hybrid WorkforceSolarWinds
During this interactive webinar, our presenter discussed how to leverage IT tools to improve operations management and support for on-site and remote workers.
Attendees learned about:
• Improving network monitoring with ipMonitor® and configuration management with Kiwi CatTools®
• Centralizing and simplifying log message management across network devices and servers
• Monitoring logs with Kiwi Syslog® Server
• Utilizing IT service management tools like SolarWinds Web Help Desk® or SolarWinds Service Desk to improve resolution rates and provide self-service
• Leveraging Dameware® tools to support users and manage systems remotely
This presentation discusses the problems faced with managing a branch office infrastructure. It looks at current technologies for resolving these issues and gives a quick introduction of what to expect in the near future with Windows 7 and Windows Server 2008 R2.
NER & Emerson Infrastructure Optimization Capabilties StoryboardGreg Stover
NER Data Corporation is the Premier National Value Added Master Distributuor of Emerson, Liebert and Avocent Technology Solutions.
No other Distributor knows Emerson Cooling, Power, Monitoring and DCIM Solutions better, has better programs, has better pricing and/or has better Technical Consultiing capability then NER.
NER will show you how to optimize you existing Emerson, Liebert and Avocent investments while ensuring you are properly positioned for the future!
The Economics of Scale: Promises and Perils of Going DistributedTyler Treat
What does it take to scale a system? We'll learn how going distributed can pay dividends in areas like availability and fault tolerance by examining a real-world case study. However, we will also look at the inherent pitfalls. When it comes to distributed systems, for every promise there is a peril.
Government and Education Webinar: Recovering IP Addresses on Your NetworkSolarWinds
In this webinar, we discussed how to recover IP addresses on your network, whether abandoned, static, or reserved.
During this interactive webinar, attendees learned about:
Gain a more complete view of your network and find abandoned IP addresses
Obtain accurate, current information about the IP addresses in use on your network
Easily change the address status from “Used” to “Available”
Receive alerts when DHCP address pools exceed utilization thresholds
>>All-In-One Solution Situational Awareness, Fault Tolerance,
Centralized Monitoring and Cyber Security Readiness Solution.
>>Real-time situational awareness
>>Powerful enterprise network management and centralized monitoring
>>Robust cyber security readiness
>>Fault tolerance and increased ROI
>>NeuralStar serves as the military-grade network monitoring components at the heart of some of the world’s most sophisticated and secure networks.
A Software Architect's View On Diagrammingmeghantaylor
Diagramming is an important tool to have in one’s repertoire but how can one go about learning to do it effectively? This presentation will shed some light on some use cases plus share some research.
Learn about different types of software diagrams, the different diagramming tools available, and Visio tips & tricks to make your diagrams pretty.
Architectural Patterns and Software Architectures: Client-Server, Multi-Tier,...Svetlin Nakov
Few days ago I gave a talk about software architectures. My goal was to explain as easy as possible the main ideas behind the most popular software architectures like the client-server model, the 3-tier and multi-tier layered models, the idea behind SOA architecture and cloud computing, and few widely used architectural patterns like MVC (Model-View-Controller), MVP (Model-View-Presenter), PAC (Presentation Abstraction Control), MVVM (Model-View-ViewModel). In my talk I explain that MVC, MVP and MVVM are not necessary bound to any particular architectural model like client-server, 3-tier of SOA. MVC, MVP and MVVM are architectural principles applicable when we need to separate the presentation (UI), the data model and the presentation logic.
Additionally I made an overview of the popular architectural principals IoC (Inversion of Control) and DI (Dependency Injection) and give examples how to build your own Inversion of Control (IoC) container.
A presentation on layered software architecture that goes through logical layering and physical layering, the difference between those two and a practical example.
Microservices are small services with independent lifecycles that work together. There is an underlying tension in that definition – how independent can you be when you have to be part of a whole? I’ve spent much of the last couple of years trying to understand how to find the right balance, and in this talk/tutorial I’ll be presenting the core seven principles that I think represent what makes microservices tick.
After a brief introduction of what microservices are and why they are important, we’ll spend the bulk of the time looking at the principles themselves, wherever possible covering real-world examples and technology:
- Modelled around business domain – using techniques from domain-driven design to find service boundaries leads to better team alignment and more stable service boundaries, avoiding expensive cross-service changes.
- Culture of automation – all organisations that use microservices at scale have strong cultures of automation. We’ll look at some of their stories and think about which sort of automation is key.
- Hide implementation details – how do you hide the detail inside each service to avoid coupling, and ensure each service retains its autonomous nature?
- Decentralize all the things! – we have to push power down as far as we can, and this goes for both the system and organisational architecture. We’ll look at everything from autonomous self-contained teams and internal open source, to using choreographed systems to handle long-lived business transactions.
- Deploy independently – this is all about being able to deploy safely. So we’ll cover everything from deployment models to consumer-driven contracts and the importance of separating deployment from release.
- Isolate failure – just making a system distributed doesn’t make it more stable than a monolithic application. So what do you need to look for?
- Highly observable – we need to understand the health of a single service, but also the whole ecosystem. How?
In terms of learning outcomes, beginners will get a sense of what microservices are and what makes them different, whereas more experienced practitioners will get insight and practical advice into how to implement them.
Presentazione dello speech tenuto da Carmine Spagnuolo (Postdoctoral Research Fellow - Università degli Studi di Salerno/ ACT OR) dal titolo "Technology insights: Decision Science Platform", durante il Decision Science Forum 2019, il più importante evento italiano sulla Scienza delle Decisioni.
Visualizing Your Network Health - Driving Visibility in Increasingly Complex...DellNMS
Dell Performance Monitoring Network Management solutions can provide your IT department with the affordable, in-depth visibility and actionable monitoring needed to manage network infrastructure complexity.
Join our webcast to learn how:
• Dynamic discovery of equipment provides the ability to map current location, configuration and interdependencies.
• Real-time visibility across network infrastructures can help ensure availability and performance.
• Actionable information about network health, faults, bandwidth hogs and performance issues reduces the mean-time-to-resolution.
• Proactive analysis can pinpoint the root cause of intermittent, hard to find problems.
Visualizing and optimizing your network is easier than you think
Cloud-Native Patterns and the Benefits of MySQL as a Platform Managed ServiceVMware Tanzu
You can’t have cloud-native applications without a modern approach to databases and backing services. Data professionals are looking for ways to transform how databases are provisioned and managed.
In this webinar, we’ll cover practical strategies you can employ to deliver improved business agility at the data layer. We’ll discuss the impact that microservices are having in the enterprise, and what this means for MySQL and other popular databases. Join us and learn the answers to these common questions:
● How can you meet the operational challenge of scaling the number of MySQL database instances and managing the fleet?
● Adding to this scale challenge, how can your MySQL instances maintain availability in a world where the underlying IT infrastructure is ephemeral?
● How can you secure data in motion?
● How can you enable self-service while maintaining control and governance?
We’ll cover these topics and share how enterprises like yours are delivering greater outcomes with our Pivotal Platform managed MySQL.
Now you can scale without fear of failure.
Presenters:
Judy Wang, Product Management
Jagdish Mirani, Product Marketing
Visualizing Your Network Health - Know your NetworkDellNMS
An old adage states that you cannot manage what you don’t know. Do you know what devices are on your network, where they are located, how they are configured, what they are connected to, and how they are affected by changes and failures?
Today’s network infrastructure is becoming more and more complex, while demands on the Network Administrator to ensure network availability and performance are higher than ever. Business critical systems depend upon you managing your entire network infrastructure and delivering high-quality service 24/7, 365 days a year. So how do you keep the pace?
Learn how real-time visibility into your entire network infrastructure provides the power to manage your assets with greater control.
Software Defined Networking: Network VirtualizationNetCraftsmen
SDN has the potential to revolutionize the way networks are designed, sold, and operated. This presentation describes SDN, discusses what it can do, and presents use cases. It also talks about the current and potential impact of SDN on the networking industry.
Kent Melville and Annie Wise from Inductive Automation, and water/wastewater controls professionals Henry Palechek and Jason Hamlin, cover 10 steps for building a sustainable SCADA system that survives and even thrives using only your operational expenditure budget.
You'll learn about:
• What type of hardware and operating systems to use
• Utilizing smart devices and MQTT
• The advantages of server-centric architecture and web-based deployment
• Rapid development with templates and UDTs
• Powerful alarming and reporting tools
• And more
Kent Melville and Annie Wise from Inductive Automation, and water/wastewater controls professionals Henry Palechek and Jason Hamlin, cover 10 steps for building a sustainable SCADA system that survives and even thrives using only your operational expenditure budget.
You'll learn about:
• What type of hardware and operating systems to use
• Utilizing smart devices and MQTT
• The advantages of server-centric architecture and web-based deployment
• Rapid development with templates and UDTs
• Powerful alarming and reporting tools
• And more
Cloud computing is a type of Internet-based computing that provides shared computer processing resources and data to computers and other devices on demand. It is a model for enabling ubiquitous, on-demand access to a shared pool of configurable computing resources (e.g., computer networks, servers, storage, applications and services),
An introduction to the technology underlying blockchains. If you want to hear a voice over, you can find it at http://presentationtube.com/watch?v=tCZDIEezrpbhttp://presentationtube.com/watch?v=tCZDIEezrpb
A collection of exercises to build a simple deployment pipeline. This comes from the course I have taught in DevOps and is targeted at instructors or individuals who want to learn the basics of a pipeline.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Architectural Tactics for Large Scale Systems
1. NICTA Copyright 2012 From imagination to impact
Architecture Tactics for
Large-Scale Systems
to Manage Changes
Len Bass
2. NICTA Copyright 2012 From imagination to impact
About NICTA
National ICT Australia
• Federal and state funded research
company established in 2002
• Largest ICT research resource in
Australia
• National impact is an important
success metric
• ~700 staff/students working in 5 labs
across major capital cities
• 7 university partners
• Providing R&D services, knowledge
transfer to Australian (and global) ICT
industry
NICTA technology is
in over 1 billion mobile
phones
3. NICTA Copyright 2012 From imagination to impact
WICSA 2014 is in Sydney!!
Working IEEE/IFIP Conference on Software
Architecture (WICSA) is the pre-eminent
software architecture conference
April 7-11, 2014
4. NICTA Copyright 2012 From imagination to impact
Traditional View of Large Scale Systems
Application
Cloud
Environment
Traditionally, the software engineering community
has viewed systems as being developed for users
and existing in an environment. With this world view,
the motivating questions have been: how can
development costs be reduced and run time quality
improved?
End users
Developers
5. NICTA Copyright 2012 From imagination to impact
A Broader View
Application
Cloud
Environment
Applications are not only affected by the behavior of the
end users but also by actions of operators who control
the environment for a consumer’s application.
Consumer
Operator
End users
Developers
6. NICTA Copyright 2012 From imagination to impact
My Message: Applications must
respond to change caused by the
environment and the operators as well
as new processes used during
development.
Application
Cloud
Environment
Consumer
Operator
End users
Developers
7. NICTA Copyright 2012 From imagination to impact
Applications must be aware of
• Failure and its causes
• Consistency issues
• Continuous deployment practices
• Multiple simultaneous versions active
• The remainder of this talk will discuss why
applications should have this kind of awareness
and what tactics are used to address the
problems.
8. NICTA Copyright 2012 From imagination to impact
Failure and its causes
A year in the life of a Google data center (from Jeff Dean)
• ~0.5 overheating (power down most machines in <5 mins, ~1-2 days
to recover)
• ~1 PDU failure (~500-1000 machines suddenly disappear, ~6 hours
to come back)
• ~20 rack failures (40-80 machines instantly disappear, 1-6 hours to
get back)
• ~5 racks go wonky (40-80 machines see 50% packet loss)
• ~12 router reloads (takes out DNS and external vips for a couple
minutes)
• ~3 router failures (have to immediately pull traffic for an hour)
• ~dozens of minor 30-second blips for DNS
• ~1000 individual machine failures
• ~thousands of hard drive failures
• slow disks, bad memory, misconfigured machines, flaky
machines, etc.
9. NICTA Copyright 2012 From imagination to impact
Consequence for cloud consumers
• Failure is pervasive.
• Cloud as a whole is reliable (99.5% availability)
but any particular physical component is not.
• This means applications must be aware of the
possibility of virtual machine failure.
• Applications must be constructed to be fault
tolerant.
10. NICTA Copyright 2012 From imagination to impact
Detection of fault
• Two techniques
– Heartbeat – component sends periodic messages
indicating that it is alive
– Timeout – client of component sets a deadline after
which
• The component is assumed to have failed.
• Messages are assumed to have been lost.
• Netflix (US video streaming service) advocates fast
failure.
– Clients set short timeout.
– Results in better response time if component failed
– May result in “false positive” whereby component is
assumed to have failed but, in reality, is still alive.
– If client retries request, it may be executed twice.
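The heartbeat and timeout techniques above can be sketched as follows. This is a minimal illustration under stated assumptions, not Netflix's implementation: the class name `HeartbeatMonitor`, the component names, and the explicit `now` arguments (used instead of a real clock so the example is deterministic) are all hypothetical.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per component and flags overdue ones."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record_heartbeat(self, component, now):
        # A component calls this periodically to signal that it is alive.
        self.last_seen[component] = now

    def suspected_failures(self, now):
        # Any component silent for longer than the timeout is assumed failed.
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=2.0)
monitor.record_heartbeat("service-a", now=0.0)
monitor.record_heartbeat("service-b", now=0.0)
monitor.record_heartbeat("service-a", now=1.5)  # service-b stays silent
suspects = monitor.suspected_failures(now=3.0)

# A shorter timeout detects failure sooner but can flag a live component.
impatient = HeartbeatMonitor(timeout_seconds=1.0)
impatient.record_heartbeat("service-a", now=1.5)
false_positive = impatient.suspected_failures(now=3.0)
```

The second monitor shows the trade-off named on the slide: with a 1-second timeout, `service-a` is suspected even though it reported only 1.5 seconds earlier.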
11. NICTA Copyright 2012 From imagination to impact
Recovery from fault
• Redundancy of computation and data
– Redundancy of data will be discussed in next section on
consistency
– Redundancy of computation is typically achieved by making
services stateless.
• Can send failed messages to new instance. Need to be
concerned about second execution if first message was, in
fact, acted on
• Can instantiate new copy of service if failure is caused by
overloading.
• Alternative means for accomplishing service
– Some services can be accomplished using different
mechanisms. Consider one mechanism as a fallback to a
primary.
– Degraded service might be possible.
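Two of the recovery ideas above lend themselves to a short sketch: guarding against a second execution when a message is resent, and falling back to an alternative mechanism. The names `IdempotentService`, `call_with_fallback`, and the request-ID scheme are assumptions made for illustration. For a truly stateless service the table of completed requests would have to live in a shared store rather than in the instance.

```python
class IdempotentService:
    """Remembers results by request ID so a retried request is not re-executed."""

    def __init__(self):
        self.completed = {}   # request_id -> result (assumed shared store)
        self.executions = 0

    def handle(self, request_id, amount):
        if request_id in self.completed:
            # The first message was in fact acted on; return the saved result.
            return self.completed[request_id]
        self.executions += 1
        result = f"charged {amount}"
        self.completed[request_id] = result
        return result


def call_with_fallback(primary, fallback):
    """Try the primary mechanism; on any failure use the fallback."""
    try:
        return primary()
    except Exception:
        return fallback()


def personalised_list():
    raise TimeoutError("recommendation service did not answer")


def popular_list():
    return ["popular-1", "popular-2"]   # degraded but usable answer


service = IdempotentService()
first = service.handle("req-1", 10)
retry = service.handle("req-1", 10)     # client timed out and retried
degraded = call_with_fallback(personalised_list, popular_list)
```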
12. NICTA Copyright 2012 From imagination to impact
Undo
• After performing an operation in AWS, may want to go
back to original state – i.e. Undo the operation
• Not always that straight-forward:
– Attaching volume is no problem while the instance is
running, detaching might be problematic
– Creating / changing auto-scaling rules has effect on
number of running instances
• Cannot terminate additional instances, as the rule would
create new ones!
– Deleted / terminated / released resources are gone!
13. NICTA Copyright 2012 From imagination to impact
Undo using transaction approach
[Diagram labels: Administrator; begin-transaction; do (three operations); rollback; + commit; + pseudo-delete]
14. NICTA Copyright 2012 From imagination to impact
Approach
[Diagram labels: Administrator; begin-transaction; do (three operations); rollback; Undo System; Sense cloud resources states]
15. NICTA Copyright 2012 From imagination to impact
Approach
[Diagram labels: Administrator; begin-transaction; do (three operations); rollback; Undo System; Sense cloud resources states; Initial state; Goal state]
16. NICTA Copyright 2012 From imagination to impact
Approach
[Diagram labels: Administrator; begin-transaction; do (three operations); rollback; Undo System; Sense cloud resources states; Initial state; Goal state; Plan; Generate code; Execute; Set of actions]
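The planning step in the undo approach can be illustrated with a small sketch. It assumes, hypothetically, that sensed cloud resource states can be represented as dictionaries mapping a resource name to a state string, and that the goal of a rollback is the state captured at begin-transaction. The function `plan_undo` and the action names are invented for the example and are not the NICTA undo system's API.

```python
def plan_undo(checkpoint, current):
    """Return a list of actions that takes `current` back to `checkpoint`."""
    actions = []
    for resource in sorted(set(checkpoint) | set(current)):
        before = checkpoint.get(resource)
        after = current.get(resource)
        if before == after:
            continue                      # nothing to undo for this resource
        if before is None:
            # Resource was created during the transaction.
            actions.append(("remove", resource))
        elif after is None:
            # Resource was deleted; only recoverable if it was pseudo-deleted.
            actions.append(("restore", resource, before))
        else:
            actions.append(("set", resource, before))
    return actions


checkpoint = {"vol-1": "attached", "asg-rule": "min=2"}
current = {"vol-1": "detached", "asg-rule": "min=2", "i-9": "running"}
plan = plan_undo(checkpoint, current)
```

A real planner would also have to order the actions, for example changing an auto-scaling rule before terminating the instances it created.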
17. NICTA Copyright 2012 From imagination to impact
Report fault
• Through logs.
– Correlating logs can be difficult
– Tracking logs to root causes can be very difficult.
• Through reporting to parent service.
– It, in turn, may have alternative means of achieving its
goals, including undo.
18. NICTA Copyright 2012 From imagination to impact
Consistency issues
• Data is frequently replicated.
– NoSQL databases all replicate data
• Replication takes time.
– Means that inconsistent versions of data may exist
• One (or more) that has been updated
• One (or more) that has not yet received the updates.
– Leads to phenomenon known as “eventual
consistency”
– May take ½ second to become consistent.
19. NICTA Copyright 2012 From imagination to impact
Characterising Eventual Consistency in
Amazon SimpleDB
• The probability of reading updated data in SimpleDB in US West
– An application reads data X ms after it has written data
• SimpleDB has two read operations
– Eventually Consistent Read
– Consistent Read
• This pattern is consistent regardless of the time of day
[Figure panels: Eventually Consistent Read; Consistent Read]
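The difference between the two kinds of read can be shown with a toy model. This is not SimpleDB's API or replication protocol: `ReplicatedStore`, its method names, the single replica, and the fixed half-second replication delay are assumptions chosen only to make the inconsistency window visible.

```python
class ReplicatedStore:
    """Toy store: writes hit the primary at once and the replica after a delay."""

    def __init__(self, replication_delay):
        self.replication_delay = replication_delay
        self.primary = {}
        self.replica = {}
        self.pending = []   # (visible_at, key, value) updates still in flight

    def write(self, key, value, now):
        self.primary[key] = value
        self.pending.append((now + self.replication_delay, key, value))

    def _apply_replication(self, now):
        still_pending = []
        for visible_at, key, value in self.pending:
            if visible_at <= now:
                self.replica[key] = value
            else:
                still_pending.append((visible_at, key, value))
        self.pending = still_pending

    def eventually_consistent_read(self, key, now):
        # Served by the replica, which may not have received the update yet.
        self._apply_replication(now)
        return self.replica.get(key)

    def consistent_read(self, key, now):
        # Served by the primary, so it always reflects the latest write.
        return self.primary.get(key)


store = ReplicatedStore(replication_delay=0.5)
store.write("x", 1, now=0.0)
stale = store.eventually_consistent_read("x", now=0.1)
fresh = store.consistent_read("x", now=0.1)
converged = store.eventually_consistent_read("x", now=0.6)
```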
20. NICTA Copyright 2012 From imagination to impact
Other types of inconsistency
• Configuration parameters
– All instances should have same settings in terms of
security, locality, etc.
• Synchronization locks. Locks shared across distributed
instances may not be in a consistent state.
• One mechanism is to have consistency manager.
– Complicated since centralized consistency manager
may fail and distributed consistency managers must
be coordinated.
– Zookeeper is an open source tool that manages
consistency for distributed applications at a small cost
in latency.
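For the configuration-parameter case, a simple check for instances whose settings have drifted can be sketched as below. This is an illustrative stand-in, not how ZooKeeper works: the function names, the instance names, and the choice to treat the majority configuration as the reference are all assumptions.

```python
import hashlib
import json
from collections import Counter


def config_fingerprint(config):
    """Stable hash of a configuration dictionary."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(instance_configs):
    """Return the instances whose configuration differs from the majority."""
    fingerprints = {
        name: config_fingerprint(config)
        for name, config in instance_configs.items()
    }
    majority, _ = Counter(fingerprints.values()).most_common(1)[0]
    return sorted(name for name, fp in fingerprints.items() if fp != majority)


instance_configs = {
    "web-1": {"region": "us-west", "tls": True},
    "web-2": {"tls": True, "region": "us-west"},   # same settings, other key order
    "web-3": {"region": "us-west", "tls": False},  # drifted security setting
}
drifted = find_drift(instance_configs)
```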
21. NICTA Copyright 2012 From imagination to impact
Continuous deployment practices
• Many organizations have developers deploy
after changes tested
– Google
– Amazon
– LinkedIn
– Netflix
• Leads to following types of problems
– Multiple simultaneous versions active
– Errors occurring during installation
22. NICTA Copyright 2012 From imagination to impact
Various Upgrade Strategies
• How many at once?
– One at a time (rolling upgrade)
– Groups at a time (staged upgrade, e.g. canaries)
– All at once (big flip)
• What happens to old versions?
– Replaced en masse
– Maintained for some period for compatibility purposes
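The "how many at once?" choice can be expressed as a batch size. The sketch below is only an illustration of that idea; `upgrade_batches` and the instance names are hypothetical.

```python
def upgrade_batches(instances, batch_size):
    """Split instances into consecutive upgrade batches of at most batch_size."""
    return [
        instances[i:i + batch_size]
        for i in range(0, len(instances), batch_size)
    ]


instances = ["i-1", "i-2", "i-3", "i-4", "i-5"]
rolling = upgrade_batches(instances, batch_size=1)                # one at a time
staged = upgrade_batches(instances, batch_size=2)                 # groups at a time
big_flip = upgrade_batches(instances, batch_size=len(instances))  # all at once
```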
23. NICTA Copyright 2012 From imagination to impact
Services Can be Bundled in Two Fashions
• Tightly Coupled
– Google
– Facebook
• Loosely Coupled
– Amazon
– LinkedIn
24. NICTA Copyright 2012 From imagination to impact
Tightly Coupled Services
• Deployment unit is tier
• A tier bundles multiple services into one virtual
machine
• Tier 1
• Tier 2
25. NICTA Copyright 2012 From imagination to impact
Loosely Coupled Services
• Deep service dependency hierarchy – may be 70 deep
• Upgrading one service in this hierarchy
• Need to consider both service and its clients
• Each service is a Virtual Machine
Figure from Netflix Tech Blog
26. NICTA Copyright 2012 From imagination to impact
Comparing Two Options
• Both options provide for horizontal scaling based
on load
• Both options provide for failure recovery
– Tightly coupled option will replace tier
– Loosely coupled option will replace service
– Failure recovery assumes stateless Virtual Machines
• They differ in
– How updates and canaries are managed (I will
discuss in a moment)
– How unwanted dependencies are avoided
• Tightly coupled option depends on developer discipline
• Loosely coupled option avoids unwanted dependencies
through information hiding.
27. NICTA Copyright 2012 From imagination to impact
Common upgrade strategy
• Require all versions to be backward compatible
with previous versions
• Require changes associated with new version to
be software switchable.
• Clients of a service must be version aware in
order to know whether to utilize new
functionality.
• Once all instances have been upgraded to new
versions, send signal to turn on changes both in
the new version and their clients.
• When using canaries only turn on changes for a
subset of services and their clients.
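A minimal sketch of "software switchable" changes follows. It assumes a hypothetical `ServiceInstance` class with an integer version and a `client_version` field in each request; none of these names come from the talk. New behaviour stays off until every instance runs the new version, and even then only version-aware clients receive it.

```python
class ServiceInstance:
    def __init__(self, version):
        self.version = version
        self.new_features_on = False   # the software switch

    def handle(self, request):
        # New behaviour only when switched on AND the client is version aware.
        if self.new_features_on and request.get("client_version", 1) >= 2:
            return {"result": "ok", "extra": "new-field"}
        return {"result": "ok"}        # backward compatible response


def switch_on_if_fully_upgraded(fleet, new_version):
    """Turn the new features on only once all instances are upgraded."""
    if all(instance.version >= new_version for instance in fleet):
        for instance in fleet:
            instance.new_features_on = True
        return True
    return False


fleet = [ServiceInstance(2), ServiceInstance(1)]
switched_early = switch_on_if_fully_upgraded(fleet, new_version=2)
fleet[1] = ServiceInstance(2)          # the rolling upgrade finishes
switched_late = switch_on_if_fully_upgraded(fleet, new_version=2)
```

For a canary, the same switch would be flipped for only a chosen subset of instances and their clients.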
28. NICTA Copyright 2012 From imagination to impact
Current state of major internet provider
• Each service has an owner
• Every service instance is instrumented
• When a canary is deployed, service owner
examines monitoring data (next slide) and uses
judgment to decide when to move to production.
• Canary testing is currently based on
functionality. No stress testing of canaries.
29. NICTA Copyright 2012 From imagination to impact
Netflix Monitoring Sequence
• Client outbound (start/end)
• Network (start/end)
• Service network (inbound start/end)
• Service processing (start/end)
• Service outbound (start/end)
• Network (start/end)
• Client inbound (start/end)
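Given a start and end timestamp for each stage in the sequence above, per-stage latency falls out directly. The sketch below uses invented millisecond values and hypothetical function names. The two "Network" stages are labelled as request and response here only to keep the dictionary keys distinct.

```python
def stage_durations(spans):
    """Map each stage to its duration, given (start, end) timestamps in ms."""
    return {stage: end - start for stage, (start, end) in spans.items()}


def slowest_stage(spans):
    """Name of the stage that took the longest."""
    durations = stage_durations(spans)
    return max(durations, key=durations.get)


# Illustrative timestamps for a single request (not measured data).
spans = {
    "client outbound": (0, 2),
    "network (request)": (2, 12),
    "service network inbound": (12, 13),
    "service processing": (13, 58),
    "service outbound": (58, 59),
    "network (response)": (59, 70),
    "client inbound": (70, 71),
}
durations = stage_durations(spans)
bottleneck = slowest_stage(spans)
```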
30. NICTA Copyright 2012 From imagination to impact
General picture for version aware loosely coupled
services
[Diagram labels: Client; Top level load balancer; Second level load balancer (Server for Version A, Server for Version A, Server for Version B); Second level load balancer (Server for Version A, Server for Version B)]
Client
• Version aware
• Must know about new versions in order to take advantage of new functionality
• May be implicitly version aware based on, e.g., cluster
• Version unaware clients will only use old functionality and these can be served by any server since services are backward compatible.
In addition:
• Load variation may trigger elasticity rules.
• Deciding whether to load new version or old version raises other issues.
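The routing rule described above (version-unaware clients can be served by any server, version-aware clients need a server that has the new version) can be sketched as follows. Versions A and B are encoded as the integers 1 and 2, and the function and server names are hypothetical.

```python
def eligible_servers(servers, required_version=None):
    """servers is a list of (name, version) pairs.

    A version-unaware client passes required_version=None and may be routed
    to any server, because every version is backward compatible.
    """
    if required_version is None:
        return [name for name, _ in servers]
    return [name for name, version in servers if version >= required_version]


VERSION_A, VERSION_B = 1, 2
servers = [
    ("server-1", VERSION_A),
    ("server-2", VERSION_A),
    ("server-3", VERSION_B),
]
for_old_clients = eligible_servers(servers)
for_new_clients = eligible_servers(servers, required_version=VERSION_B)
```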
31. NICTA Copyright 2012 From imagination to impact
Canary Issues
• Canaries are a form of live testing. Put a new
version into limited production to test its
correctness.
• Issues
– How long are new versions tested to determine
correctness?
• Period based – for some period of time
• Load based – under some utilization assumptions
• Result based – until some criterion is met
– How are clients of new version chosen and how is
this choice enforced?
– How are the canaries deployed?
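The three ways of deciding how long to test a canary can be written as simple predicates. The fields of `CanaryStats`, the thresholds, and the use of an error rate as the result criterion are assumptions for illustration. The talk does not say which measurements are used.

```python
from dataclasses import dataclass


@dataclass
class CanaryStats:
    elapsed_seconds: float
    peak_utilization: float   # fraction between 0.0 and 1.0
    error_rate: float         # fraction of failed requests


def period_based_done(stats, min_seconds):
    """Period based: the canary has run for long enough."""
    return stats.elapsed_seconds >= min_seconds


def load_based_done(stats, min_utilization):
    """Load based: the canary has been observed under enough load."""
    return stats.peak_utilization >= min_utilization


def result_based_done(stats, max_error_rate):
    """Result based: the chosen criterion (here, error rate) is met."""
    return stats.error_rate <= max_error_rate


stats = CanaryStats(elapsed_seconds=3600, peak_utilization=0.4, error_rate=0.001)
period_ok = period_based_done(stats, min_seconds=1800)
load_ok = load_based_done(stats, min_utilization=0.7)
result_ok = result_based_done(stats, max_error_rate=0.01)
```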
32. NICTA Copyright 2012 From imagination to impact
Use of canaries with tightly coupled services
• Version awareness does not need to extend to
load balancers
– Services and clients are bound into VM
– Services and clients that are used to test new version
are in single VM and have no need for version aware
load balancers.
33. NICTA Copyright 2012 From imagination to impact
More Detail on Upgrade Process
• Canaries are deployed and allowed to run for a
period without turning on new features.
• This is to test backward compatibility.
• Once canaries pass this test, then the new
features are turned on.
34. NICTA Copyright 2012 From imagination to impact
Installation Motivating Scenario
• You change the operating environment for an
application
– Configuration change
– Version change
– Hardware change
• Result is degraded performance
• When the software stack is deep with portions
from different suppliers, the result is frequently:
35. NICTA Copyright 2012 From imagination to impact
Why is Installation Error Prone?
• Installation is complicated.
– Installation guides for SAS 9.3 Intelligence, IBM i, Oracle 11g for
Linux are ~250 pages each
– Apache description of addresses and ports (one out of 16
descriptions) has following elements:
• Choosing and specifying ports for the server to listen to
• IPv4 and IPv6
• Protocols
• Virtual Hosts
– The number of configuration options that must be set can be
large
• Hadoop has 206 options
• HBase has 64
– Many dependencies are not visible until execution
36. NICTA Copyright 2012 From imagination to impact
Installation Processes
• Processes may be
– Undocumented
– Out of date
– Insufficiently detailed
• Our goal is to build process model including
error recovery mechanisms
37. NICTA Copyright 2012 From imagination to impact
Our Activities
• Create up to date process models for installation
processes. Information sources are
– Process discovery from logs
– Process formalization from existing written
descriptions.
• Process descriptions can be used to
– Make trade offs
– Make recommendations in real time to operations
staff
– Recommend setting checkpoints for potential later
undo, before a risky part of a process is entered
– Assist in the detection of errors
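Process discovery from logs is often introduced through a directly-follows count: how often one step is immediately followed by another across recorded runs. The sketch below shows that general idea only. It is not the NICTA team's tooling, and the step names and traces are invented.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often each step is immediately followed by each other step."""
    counts = Counter()
    for trace in traces:
        for current_step, next_step in zip(trace, trace[1:]):
            counts[(current_step, next_step)] += 1
    return counts


# Each trace is the ordered list of steps seen in one installation run.
traces = [
    ["download", "configure", "start", "verify"],
    ["download", "configure", "start", "verify"],
    ["download", "configure", "start", "rollback"],
]
relations = directly_follows(traces)
```

Rare transitions such as `("start", "rollback")` hint at exception paths, which are the parts the slides describe as poorly documented.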
38. NICTA Copyright 2012 From imagination to impact
Hard Problems
• Creating accurate process models
– Exception handling mechanisms are not well
documented
– Noisy logs
– Our approach
• Top down modeling using process modeling formalism
• Bottom up process mining from error logs
• Diagnosing errors
39. NICTA Copyright 2012 From imagination to impact
Why is Error Diagnosis Hard?
In a distributed computing
environment, when an error
occurs during operations, it is
difficult and time consuming to
diagnose it.
Diagnosis involves correlating
messages from
• different distributed servers
• different portions of the
software stack
and determining the root
cause of the error.
The root cause, in turn, may
be within a portion of the stack
that is different from where the
error is observed.
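The correlation step can be sketched under a strong simplifying assumption: that every log record carries a shared request ID, a timestamp, and a level. Real logs across a stack are often not this consistent. The record fields, layer names, and messages below are invented for illustration.

```python
from collections import defaultdict


def correlate(log_records):
    """Group records from all servers and layers by request, in time order."""
    by_request = defaultdict(list)
    for record in log_records:
        by_request[record["request_id"]].append(record)
    for records in by_request.values():
        records.sort(key=lambda r: r["timestamp"])
    return dict(by_request)


def first_error(records):
    """Earliest ERROR in a request's timeline: a candidate root cause."""
    for record in records:
        if record["level"] == "ERROR":
            return record
    return None


log_records = [
    {"request_id": "r1", "timestamp": 30, "layer": "application",
     "level": "ERROR", "message": "write failed"},
    {"request_id": "r1", "timestamp": 10, "layer": "storage",
     "level": "ERROR", "message": "disk quota exceeded"},
    {"request_id": "r1", "timestamp": 20, "layer": "database",
     "level": "WARN", "message": "retrying flush"},
    {"request_id": "r2", "timestamp": 15, "layer": "application",
     "level": "INFO", "message": "ok"},
]
timeline = correlate(log_records)
candidate = first_error(timeline["r1"])
```

Here the error is observed in the application layer, but the earliest error in the correlated timeline comes from a different layer of the stack.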
40. NICTA Copyright 2012 From imagination to impact
Test Bed
Our current test bed is the HBase stack
41. NICTA Copyright 2012 From imagination to impact
Currently Performing Analysis of
Configuration Errors
• Cross stack errors may take hours to diagnose
– Log files are inconsistent
– Error message may not give context necessary to
determine root cause.
42. NICTA Copyright 2012 From imagination to impact
Summary
• The modern cloud environment and modern
development practices have introduced new
problems or made old problems more important.
• Tactics exist to deal with some of these
problems.
• Developing tactics for other problems is a matter
of research.
43. NICTA Copyright 2012 From imagination to impact
NICTA Team
• Anna Liu
• Alan Fekete
• Min Fu
• Jim Zhanwen Li
• Qinghua Lu
• Sherif Sakr
• Hiroshi Wada
• Ingo Weber
• Xiwei Xu
• Liming Zhu