This document summarizes research analyzing the behavior of DIRAC, a distributed system for particle physics experiments, using model checking with process algebra. Key points:
1) DIRAC's shared-memory paradigm and many components running in parallel make it difficult to reason about and trace unexpected behavior.
2) The study abstracts DIRAC's implementation in Python into a model using the process algebra mCRL2, representing agents and storage as processes.
3) The model is verified through model checking, automatically probing properties for safety, progress, and deadlocks and providing counter-examples for violations.
Poster chep2012 reduced_original1
Analysing DIRAC's Behavior using Model Checking with Process Algebra
Daniela Remenska - Jeff Templon - Tim Willemse - Henri Bal - Kees Verstoep - Wan Fokkink
Philippe Charpentier - Ricardo Graciani - Elisa Lanciotti - Krzysztof Daniel Ciba - Stefan Roiser
Motivation

DIRAC background
▪ production activities and user analysis for LHCb
▪ distributed services and light-weight agents
"blackboard" or "shared-memory" paradigm

Figure 1: DIRAC subsystems

▪ jobs often get into incorrect (or inconsistent) states
▪ staging requests become stuck
▪ difficult to trace the root of such unexpected behavior
  many scenarios and components
▪ manual intervention necessary

There are formal or systematic approaches to tackle this!

From DIRAC to mCRL2

DIRAC (Python): ~150000 loc
Abstracting the implementation depends on the focus of the analysis.
Check for race conditions.
Agents update the state of shared entities.
Systems: Storage and Workload Management
Entities: Jobs, Cache-Replicas, Tasks

Figure 2: Job state machine

Agents and storage become processes.
Control-flow is abstracted using mCRL2 non-deterministic choice and if-then-else constructs.
States of entities are described using custom abstract data types.

Verification

Properties (Safety / Progress / Deadlock)
▪ Model-checker automatically probes them.
▪ Property violated: counter-example trace is provided.

Figure 6: Violation of progress and safety requirements
Figure 7: "Zombie" job starts running after being killed
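The idea of turning agents into processes and letting a checker search every interleaving for a property violation can be illustrated with a minimal sketch. It is written in Python rather than mCRL2 and is not the authors' model: the names JobState, racy_transitions, atomic_transitions, is_safe and find_violation are hypothetical, and the two-step "read, then write" scheduler is an assumed simplification of how an agent might update a shared job entity. The search returns a shortest counter-example trace when the safety property "a killed job never runs" is violated.

```python
from collections import deque
from enum import Enum


class JobState(Enum):
    """Custom data type for the state of a shared job entity (hypothetical)."""
    WAITING = "Waiting"
    RUNNING = "Running"
    KILLED = "Killed"


# Global state: (job state, scheduler's cached copy or None, kill was requested).
INITIAL = (JobState.WAITING, None, False)


def racy_transitions(state):
    """Assumed model: the scheduler reads and writes the job in two separate steps."""
    job, cached, killed = state
    if cached is None:
        yield "scheduler.read", (job, job, killed)
    if cached is JobState.WAITING:
        # Acts on the cached value, which may be stale by now.
        yield "scheduler.set_running", (JobState.RUNNING, JobState.RUNNING, killed)
    if not killed:
        yield "user.kill", (JobState.KILLED, cached, True)


def atomic_transitions(state):
    """Variant where the scheduler checks and updates the job in one atomic step."""
    job, cached, killed = state
    if job is JobState.WAITING:
        yield "scheduler.start", (JobState.RUNNING, JobState.RUNNING, killed)
    if not killed:
        yield "user.kill", (JobState.KILLED, cached, True)


def is_safe(state):
    """Safety property: a job that was killed must never be running."""
    job, _cached, killed = state
    return not (killed and job is JobState.RUNNING)


def find_violation(initial, transitions, safe):
    """Breadth-first search over all interleavings.

    Returns a shortest counter-example trace (list of action labels),
    or None if every reachable state satisfies the property.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not safe(state):
            trace = []
            while parent[state] is not None:
                state, action = parent[state]
                trace.append(action)
            return trace[::-1]
        for action, nxt in transitions(state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    return None


print(find_violation(INITIAL, racy_transitions, is_safe))
print(find_violation(INITIAL, atomic_transitions, is_safe))
```

In the racy variant the search reports the interleaving in which the kill request lands between the scheduler's read and its write, which is a toy analogue of the "zombie" job in Figure 7. The atomic variant has no such trace. A real model checker works on a far richer model and property language; this only shows the shape of the search.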
Why Formal Methods?

Based on process algebra laws
no ambiguity
Model checking tools
Automatically explore the entire state-space and check if some "interesting" properties hold.
Stronger than testing

Some drawbacks...

Abstraction of the "real" behavior is needed. This means one must build a sound model.
Expertise in formal methods and the system domain is necessary.
The state-space of the model can explode.

Language & Toolset

Actions: atomic building blocks, can carry data parameters
Processes: composed of actions, using algebra operators
Built-in data types: integers, booleans, lists, sets, bags
Abstract data types

State-space generation

Full control over the execution of parallel processes. This way one gains more insight into the system behavior.

Figure 3: State-space visualisation with LTSView

Analysis & Issues

Problems can be discovered while building and debugging the model:

Figure 4a: XSim simulator trace of a job workflow
Figure 4b: DIRAC logging info of a job workflow
Figure 5: State-transition visualisation with DiaGraphica

Conclusions

Distributed systems are difficult to reason about; many components, all run in parallel.
Formal methods are a more rigorous addition to testing, as a way to improve software quality.
A sound model needs to be written manually. This requires experience and can be error-prone.
Similar techniques can be re-applied to similar systems, once the learning curve has been overcome.

Future Work

Automate (to some degree) the translation from code to model.
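The drawback that the state-space of the model can explode can be made concrete with a small sketch. It is an illustration under stated assumptions, not a measurement of the DIRAC model: count_states is a hypothetical helper, and the processes are assumed to be fully independent, each stepping through a fixed chain of local states, with one process moving at a time (interleaving).

```python
from collections import deque


def count_states(num_processes, local_states):
    """Count reachable global states of independent processes in parallel.

    Each process steps through local states 0 .. local_states - 1, and
    exactly one process moves per global transition.
    """
    initial = (0,) * num_processes
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for i, local in enumerate(state):
            if local + 1 < local_states:
                nxt = state[:i] + (local + 1,) + state[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return len(seen)


for n in range(1, 6):
    print(n, count_states(n, 3))
```

With independent processes the count is local_states raised to the power num_processes, so each added agent multiplies the number of global states. Real models have synchronisation and data that change the exact figure, but the growth is why abstraction of the "real" behavior is needed before exhaustive exploration is feasible.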