7. Dev ? Ops ?
… but they are only in approximate balance
IT would be easy if systems were deterministic ...
10. There is no such thing as an immutable system
No intended change is not the same as no change at all
12. “The history of attempts to prevent cracks from spreading or evade their consequences is almost the history of engineering.”
-- J.E. Gordon, The New Science of Strong Materials
Failures are semantic
Processes are dynamic
13. Harp not on that keyboard!
Orchestration?
Choreography?
Control is ...
This year is a bit special for me. It's the 20th anniversary of CFEngine, and I've been putting together a story that I've wanted to tell for many years now. It's the story of how, as a physicist, I came to rediscover my own subject in the guise of computers and information systems, and was able to use that knowledge of physics to build better systems that could scale and remain stable under pressure.
I've been putting down these ideas in a book, called “In Search of Certainty” which I'll show you at the end, and I hope this will help to dispel some of the misconceptions around how we understand systems and their configurations. As a side effect, it is also the story of CFEngine in many ways.
Obviously, I am not going to be able to tell you everything in the book. And I am not the only person who has had these insights. Today, it pleases me to see quite a few names in our circles who believe in the same principles of systems. But back in the 1990s when I started, it was considered heresy. All of this is in support of developing a modern, post-industrial information infrastructure.
So I'd like to tell you a little bit of that story today and what I think it means for the future of our industry.
To our shame, one of the rarely voiced complaints we could and should level at the IT service industry is that we don't really know how to make promises we can keep. In fact, we've not designed technology to 'keep' anything; the main focus of almost everything we design lies in building, tweaking and then fire-fighting, working reactively, all within an increasingly fast-moving and disposable culture.
We should ask why this is so. Why is it so hard to keep promises?
One answer is that we don't practice a method and learn from mistakes.
We don't know how to keep promises because we don't understand the relationship between what we want to happen and what actually happens.
We've been taught to accept a simple untruth about computers. It probably wasn't even spoken out loud, but we've come to believe it anyway -- and it's this: computers are just machines, and machines just do what we tell them to. The most important thing we will ever learn is why this simple idea is pure rubbish.
If we are going to understand systems properly, then we could do worse than to look at physics. Physics stands for “describing stuff that happens”. Its methods have been applied to moving objects, force fields, network behaviour and any number of things. And physics has been one of the most successful applications of the scientific method. It has led to the most accurate predictions we have about the natural world.
What is its secret?
One is that it tries to get to the bottom of cause and effect. Another is that it is not afraid to model things approximately, under idealized conditions. Then finally, it has an inbuilt concept of how to handle uncertainty about the environment at different SCALES.
Computer science, and the science of machines in general, on the other hand, is not very well developed. It tends to focus on only part of the problem – how to make the machine do what we want it to do. That is a human concern. It has no real model for uncertainty. It has software testing – a fairly primitive and incomplete theory of intended behaviour.
Why can't we keep promises? You could say it's sabotage! – by the environment.
Complex system dynamics – HIGH INFORMATION content
In technology we generally ignore the majority of influence from the environment. We might think about gravity when making a bridge, but we don't think much about chemical corrosion. We might think about the curvature of the Earth when designing satellite systems, but we don't think much about flying space debris.
We grow machines in captivity and release them into the wild. Then we are surprised when they exhibit behaviours that we didn't plan for. “Unintended behaviour”.
In software engineering, we believe we can handle this with testing. But this is almost useless. There is unit testing, which is verification of intended behaviour. But how do you test for something when you don't know what to look for? Well, you leave it running ... in isolation? No, in an environment. Oh, but how shall we test all possible environments? We can't. Heaven forbid – we might actually need a theoretical model of those environments and their influence on a system.
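The contrast can be sketched in a few lines. This is a toy of my own invention (the `service` function and its disk-space threshold are made up for illustration): a unit test confirms the intended behaviour under one idealized environment, while randomizing a single environmental variable immediately turns up an unintended failure that the unit test could never see.

```python
import random

def service(config, disk_free_mb):
    """A toy service: behaves as intended unless the environment misbehaves."""
    if disk_free_mb < 10:          # environmental influence nobody tested for
        raise RuntimeError("log write failed: disk full")
    return config["greeting"].upper()

# Unit testing: verification of *intended* behaviour in one fixed environment.
assert service({"greeting": "hi"}, disk_free_mb=500) == "HI"

# Environment sweep: randomize a condition we normally hold constant.
random.seed(0)
failures = 0
for _ in range(1000):
    env = random.randint(0, 100)   # crude model of one environmental variable
    try:
        service({"greeting": "hi"}, disk_free_mb=env)
    except RuntimeError:
        failures += 1

print(failures > 0)   # the environment finds what the unit test never looked for
```

Even this crude sweep only covers one dimension of one environment; the point of the talk stands, since real environments vary in dimensions we have no model for at all.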
So one answer is SYSTEM THINKING
People talk about system thinking a lot. Perhaps DEVOPS itself is a kind of system thinking. But what does it mean? I believe it means having a unified understanding of two different aspects of technology, which I shall call semantics and dynamics.
Dynamics describes the behaviours that exist. It is a parameterizable model for describing what happens – it's physics! There is a well established theory of separation of concerns in physics for DYNAMICS, not just semantics. This is about predictability.
Semantics describes how we interpret behaviours, and what they mean. This is a purely human invention. It can be ad hoc or model based.
The environment has no semantics – only dynamics.
Of course the laws of dynamics are not legal laws. No one will sue gravitation if a hammer fails to fall. But they will sue you for semantics.
GALILEO's EXPT
Galileo is famous for his experiment in which he climbed the Leaning Tower of Pisa.
dynamics is how influence is transmitted.
I would like to suggest that testing is an almost useless activity unless you have a quite detailed model of a system. It's like trying to pin the tail on a donkey, blindfolded: you don't know where the tail is supposed to go, or even what a donkey is, if you have no expectations. Those expectations come from modelling. From theory.
First of all, we are not the only voice telling machines what to do.
But let's think about what this means for technology – for intended behaviour.
We intend: solve hunger and save people – transport logistics
Technology is not exactly like science – it's harder. For that reason, it tends to ignore science and focus on semantics. If something falls apart in use, we blame it on poor design, human error and so on, with perfect hindsight.
The semantics are an illusion – we invent them as part of our modelling. The universe doesn't really know that systems have a purpose – that is a fiction that we maintain. But we RELY on them. That means they can fail.
Success and failure are semantic qualities. Stability is a dynamical one.
What happens: we ship supplies through a hostile environment
Top down (idealized) / bottom up (reality)
So what is the problem with semantics? Why is it so hard to be certain about behaviour? SCALE comes into the issue.
The indeterminism of the environment is a giant search algorithm, much more efficient than any deterministic test harness. The thing that separates P from NP is non-deterministic influence. It tells you that the environment can generate exponential complexity in linear time ... more than a match for brute force determinism.
Death by a thousand ant stings.
The hard problems of the industrial revolution were about steam-powered industrialization: creating engines of great power to amplify the human ability to dominate the forces of nature. To conquer by force.
We can no longer afford to think in terms of brute force. Rather than conquering environments, we try to work with the environments – recognizing that we are tiny up against the forces at large.
What does this mean for IT? It means finding a way to go beyond the simplistic ideas about reasoning that we have today. It's not about how we do programming, in one language or another – but rather how we use programming in highly unstable ways....
This brings up another issue. LOGIC is vulnerable to complexity.
LOGIC, as we understand reasoning, is a model of artificial simplicity that lives in a complex world. In order for logic to be reasonable (pun intended), it has to be based on stable assumptions and rules.
CS assumes the stability of these basic atoms. And that is why we make so many mistakes.
Why is this a problem? The answer is programmability (APIs)
You cannot argue with the human desire for freedom. Hence the explosion of tampering that we've unleashed today by providing APIs as a playground for everything.
In order to scale influence, we need to automate the dynamics. In order to achieve predictable semantics, we need to base them on stable behaviours.
In a system, something can be true and false at the same time.
We need to avoid logical inference based on unstable environments.
Now imagine this as a flood of web requests ...
Continuous delivery requires continuous maintenance – a “detailed balance” – not ad hoc, power-assisted interventions.
Let's compare what this means. Here is a parody of the way we handle unintended and unexpected change.
To automate it – intend it! Promise it. You should be able to promise it continuously too (on the same timescale as other change processes) – not just promise to jump in and fix it when it fails.
The drain is a perfect example of dynamics being designed to have specific semantics. There is no additional logic or reasoning – no choices, just unilateral action. The sole behaviour IS the purpose.
This is how you embed certainty at a particular scale. It doesn't get better than this.
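That kind of embedded certainty – unilateral, convergent action with no reasoning in the loop – can be sketched as a desired-state repair loop. This is a hypothetical miniature of what convergent agents like CFEngine do against files, processes and packages; the resource names here are invented:

```python
# A minimal convergent "promise": repeatedly bring actual state toward
# desired state, regardless of how the drift happened.

desired = {"service": "running", "mode": "0644"}

def repair(actual, desired):
    """Idempotent: applying it once or many times yields the same end state."""
    for key, want in desired.items():
        if actual.get(key) != want:
            actual[key] = want          # unilateral corrective action, no inference
    return actual

state = {"service": "stopped", "mode": "0777", "owner": "root"}
repair(state, desired)                  # the environment drifted; we converge
repair(state, desired)                  # a second pass changes nothing
print(state["service"], state["mode"])  # running 0644
```

Because the repair is idempotent, it can be promised continuously, on the same timescale as the change processes it counteracts – like the drain, its behaviour is its purpose.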
But we continue dreaming of the myth of deterministic control ...
Recently the Continuous Delivery advocates have implied that we don't need to worry about anything, because we can make systems into closures. This is a semantic illusion, caused by drinking too much lambda calculus.
Functions could be closures. Isolated atoms could be closures, but intricate operating systems connected to unpredictable reservoirs of complex influence cannot be meaningfully regarded as closures.
This is confusing intended change with actual change. Computers are rife with channels for bringing about unintended change.
Tamper-free does not mean immutable. Tamper-free means that there is no intention to change. It does not mean that you don't have to maintain the drains.
The very ability to learn comes from mutability.
Every logical predicate in a program partitions the world into essentially separate realities, each with maximally uncertain, incomplete information about the environment.
We have to recognize that limitation and use it sensibly. Like the storm drain. Different observers at different times will end up in different worlds, because the environment is sliced differently through tiny differences.
The tree branches in both time and space.
There are annoying side effects of this.
Eventual consistency is not a defect of noSQL, it is a hard reality about the physical world. Information travels at finite speed, and forms event horizons beyond which information is unobtainable. Those horizons shrink as computational speeds approach the speed of light.
Even if you can use partitions as approximate closures, what does that win you? You trade leaky data for isolation and alternate realities.
Infrastructures – these realities are not really independent, because users can actually experience more than one of them.
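Under assumptions of my own (a toy last-writer-wins register, not modelled on any particular noSQL store), the alternate realities and their eventual reconciliation look like this:

```python
# Two replicas of a last-writer-wins register, partitioned from each other.
# Each accepts writes locally; while partitioned, "the value" is both things.

class Replica:
    def __init__(self):
        self.value, self.stamp = None, 0

    def write(self, value, stamp):
        self.value, self.stamp = value, stamp

    def merge(self, other):
        # Keep whichever write carries the later timestamp.
        if other.stamp > self.stamp:
            self.value, self.stamp = other.value, other.stamp

a, b = Replica(), Replica()
a.write("blue", stamp=1)       # a client in one partition
b.write("green", stamp=2)      # a client in the other partition

divergent = (a.value != b.value)   # both answers are "true" for their observers

a.merge(b); b.merge(a)             # the partition heals: exchange state
print(divergent, a.value == b.value == "green")  # True True
```

The merge rule makes the replicas eventually consistent, but note what was traded away: the write "blue" was a real event that some user may have observed, and after convergence that reality has simply been discarded.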
What happens when you try to put semantics ahead of dynamics? You get FAILures and errors – promises are not kept.
Let me give you an example
Avalanche instabilities are the natural result of strong coupling and rigid systems.
In the Comet disaster of 1954 – the first jet airliner – microscopic cracks precipitated an avalanche failure so fast that nothing could have prevented it from happening. In physical terms, we would say that the rate of reaction dominated any process capable of preventing it.
You might think that this is just another one of those annoying flight disaster analogies that we blame on human error. But I'd rather point out that all we know about the deepest roots of physics tells us that this failure of design semantics (fitness for purpose) is directly analogous to the failure of any other information system. The physical world is an information system – just at a much larger scale. The bigger we make systems, the more similarities we are going to see between phenomena such as cracks.
BUT Cracks can occur in both semantics and dynamics.
What does this mean for infrastructures – and what behaviours does it imply? We can learn a lot from the lessons of material science, when force meets structure. There might not be reasoning in any sense, but there are both semantics and dynamics in materials.
We cannot be held hostage to the notion that reasoning is all computing is about.
What does this mean about our ability to control systems and provide a reliable infrastructure that society can rely on?
We are obsessed with orchestration as narrative – with the story.
We have to stop confusing post hoc narrative with the imagination of control. The present is a branching, converging process of overlapping influences. Only hindsight causes us to imagine that there was a simple storyline to it.
How do we arrange cooperation?
Get people off keyboards: we can't type in the outcome
We need to find the right roles for humans, not wielding power tools, but architecting change and avoiding crises by modelling.
What we call orchestration today is a travesty. We need to go from gaming narratives to a proper score for distributed, architectural cooperation.
IT has many technical books. I wrote my book from a cultural perspective.
The great physicist Ludwig Boltzmann used the new statistics, kindled in the 1800s for dealing with the uncertainties of steam industrialization, to analyze autonomous state machines. In the same era, George Boole took a theory of logic and probability and applied it to larger questions of reasoning, building on ideas that went back to Thomas Hobbes.
In 1998, I argued that we dance for our computers. Today we are trying to make computers dance for us, with a strong arm, using crude remote-controlled power suits. If we are ever going to master automation, we need to cut the strings that connect the human need for narrative control to realtime reactive machinery, and think again about how humans and technology should coexist.
The keyword is to embrace autonomy.