Managing a railway on historical data is like …

I was recently looking on the web for any news on the Galicia rail crash. I didn’t find anything current but came across this old item from The Guardian (London). It mentioned in passing that consortia tendering for a new high speed railway in Brazil were excluded if they had been involved in the operation of a high speed line that had had an accident in the previous five years.

Well, I don’t think that there is necessarily anything wrong with that in itself. But it is important to remember that a rail accident is not necessarily a Signal (sic). Rail accidents worldwide are often a manifestation of what W Edwards Deming called “a stable system of trouble”: a system that features only Noise but which cannot deliver the desired performance. An accident-free record of five years is a fine thing, but there is nothing about a stable system of trouble that says it can’t have long incident-free periods.

In order to turn that incident-free five years into evidence about likely future safety performance we also need hard evidence, statistical and qualitative, about the stability and predictability of the rail operator’s processes. Procurement managers are often far less diligent in looking for, and at, this sort of data. In highly sophisticated industries such as automotive it is routine to demand capability data and evidence of process surveillance from a potential supplier. Without that, past performance is of no value whatever in predicting future results.

[Image: Rearview]

The cyclist on the railway crossing – a total failure of risk perception

This is a shocking video. It shows a cyclist wholly disregarding warnings and safety barriers at a railway crossing in the UK. She evaded death, and the possible derailment of the train, by the thinnest of margins imaginable.

In my mind this raises fundamental questions, not only about risk perception, but also about how we can expect individuals to behave in systems not of their own designing. Such systems, of course, include organisations.

I was always intrigued by John Adams’ anthropological taxonomy of attitudes to risk (taken from his 1995 book Risk).

[Image: AdamsTaxonomy1 – Adams’ taxonomy of attitudes to risk]

Adams identifies four attitudes to risk found at large. Each is entirely self-consistent within its own terms. The egalitarian believes that human and natural systems inhabit a precarious equilibrium. Any departure from the sensitive balance will propel the system towards catastrophe. However, the individualist believes the converse, that systems are in general self-correcting. Any disturbance away from repose will be self-limiting and the system will adjust itself back to equilibrium. The hierarchist agrees with the individualist up to a point but only so long as any disturbance remains within scientifically drawn limits. Outside that lies catastrophe. The fatalist believes that outcomes are inherently uncontrollable and indifferent to individual ambition. Worrying about outcomes is not the right criterion for deciding behaviour.

Without an opportunity to interview the cyclist it is difficult to analyse what she was up to. Even then, I think that it would be difficult for her recollection to escape distortion by some post hoc and post-traumatic rationalisation. I think Adams provides some key insights but there is a whole ecology of thoughts that might be interacting here.

Was the cyclist a fatalist resigned to the belief that no matter how she behaved on the road injury, should it come, would be capricious and arbitrary? Time and chance happeneth to them all.

Was she an individualist confident that the crossing had been designed with her safety assured and that no mindfulness on her part was essential to its effectiveness? That would be consistent with Adams’ theory of risk homeostasis. Whenever a process is made safer on our behalf, we have a tendency to increase our own risk-taking so that the overall risk is the same as before. Adams cites the example of seatbelts in motor cars leading to more aggressive driving.

Did the cyclist perceive any risk at all? Wagenaar and Groeneweg (International Journal of Man-Machine Studies, 1987, 27, 587) reviewed something like 100 shipping accidents and came to the conclusion that:

Accidents do not occur because people gamble and lose, they occur because people do not believe that the accident that is about to occur is at all possible.

Why did the cyclist not trust that the bells, flashing lights and barriers had been provided for her own safety by people who had thought about this a lot? The key word here is “trust” and I have blogged about that elsewhere. I feel that there is an emerging theme of trust in bureaucracy. Engineers are not used to mistrust, other than from accountants. I fear that we sometimes assume too easily that anti-establishment instincts are constrained by the instinct for self preservation.

However we analyse it, the cyclist suffered from a near fatal failure of imagination. Imagination is central to risk management: the richer the spectrum of futures anticipated, the more effectively risk management can be designed into a business system. To the extent that our imagination is limited, we are hostage to our agility in responding to signals in the data. That is what the cyclist discovered when she belatedly spotted the train.

Economist G L S Shackle made this point repeatedly, especially in his last book Imagination and the Nature of Choice (1979). Risk management is about getting better at imagining future scenarios but still being able to spot when an unanticipated scenario has emerged, and being excellent at responding efficiently and timeously. That is the big picture of risk identification and risk awareness.

That then leads to the question of how we manage the risks we can see. A fundamental question for any organisation is what sort of risk takers inhabit its ranks. Risk taking is integral to pursuing an enterprise. Each organisation has its own risk profile and it is critical that individual decision makers are aligned to it. Some will have an instinctive affinity for the corporate philosophy. Others can be aligned through regulation, training and leadership. Some others will not respond to guidance. It is the latter category who must be placed only in positions where the organisation knows that it can benefit from their personal risk appetite.

If you think this an isolated incident and that the cyclist doesn’t work for you, you can see more railway crossing incidents here.

The Monty Hall Problem redux

This old chestnut refuses to die and I see that it has turned up again on the BBC website. I have been intending for a while to blog about this so this has given me the excuse. I think that there has been a terrible history of misunderstanding this problem and I want to set down how the confusion comes about. People have mistaken a problem in psychology for a problem in probability.

Here is the classic statement of the problem that appeared in Parade magazine in 1990.

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?

The rational way of approaching this problem is through Bayes’ theorem. Bayes’ theorem tells us how to update our views as to the probability of events when we have some new information. In this problem I have never seen anyone start from a position other than that, before any doors are opened, no door is more probably hiding the car than the others. I think it is uncontroversial to say that for each door the probability of its hiding the car is 1/3.

Once the host opens door No. 3, we have some more information. We certainly know that the car is not behind door No. 3 but does the host tell us anything else? Bayes’ theorem tells us how to ask the right question. The theorem can be illustrated like this.
[Diagram: Bayes’ theorem]

The probability of observing the new data, if the theory is correct (the green box), is called the likelihood and plays a very important role in statistics.

Without giving the details of the mathematics, Bayes’ theorem leads us to analyse the problem in this way.

[Diagram: MH1 – Bayes’ theorem applied to the Monty Hall problem]

We can work this out arithmetically but, because all three doors were initially equally probable, the matter comes down to deciding which of the two likelihoods is greater.
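The arithmetic itself can be sketched in a few lines of Python. The likelihoods here are illustrative assumptions: they correspond to a host who always reveals a goat, choosing at random when both unopened doors hide goats.

```python
def posterior(priors, likelihoods):
    # Bayes' theorem: posterior is proportional to prior times likelihood,
    # normalised so the posterior probabilities sum to one
    joint = {door: priors[door] * likelihoods[door] for door in priors}
    total = sum(joint.values())
    return {door: p / total for door, p in joint.items()}

# Before any door is opened, each door is equally likely to hide the car
priors = {1: 1/3, 2: 1/3, 3: 1/3}

# Probability of the observed event -- host opens door 3 revealing a goat --
# given each hypothesis about the car's location, ASSUMING the host always
# reveals a goat and picks at random when he has a choice
likelihoods = {1: 1/2, 2: 1, 3: 0}

print(posterior(priors, likelihoods))
# Door 2 ends up twice as probable as door 1, so the contestant should switch
```

Change the likelihoods, though, and the answer changes. That is the nub of the problem.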

[Diagram: MH2 – comparison of the two likelihoods]

So what are the respective probabilities of the host behaving in the way he did? Unfortunately, this is where we run into problems because the answer depends on the tactic that the host was adopting.

And we are not given that in the question.

Consider some of the following possible tactics the host may have adopted.

  1. Open an unopened door hiding a goat; if both unopened doors have goats, choose between them at random.
  2. If the contestant chooses door 1 (or 2, or 3), always open door 3 (or 1, or 2), whether or not it contains a goat.
  3. Open either unopened door at random, but only if the contestant has chosen the door with the prize; otherwise do not open a door (the devious strategy, suggested to me by a former girlfriend as the obviously correct answer).
  4. Choose an unopened door at random. If it hides a goat, open it; otherwise do not open a door (not the same as Tactic 1).
  5. Open either unopened door at random, whether or not it contains a goat.

There are many more. All these various tactics lead to different likelihoods.

| Tactic | Probability that the host revealed a goat at door 3, given that the car is at 1 | Probability that the host revealed a goat at door 3, given that the car is at 2 | Rational choice |
| --- | --- | --- | --- |
| 1 | ½ | 1 | Switch |
| 2 | 1 | 1 | No difference |
| 3 | ½ | 0 | Don’t switch |
| 4 | ½ | ½ | No difference |
| 5 | ½ | ½ | No difference |
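The consequences of these likelihoods for the switching decision can be checked empirically. Here is a minimal Monte Carlo sketch in Python; the tactic labels are my own, only Tactics 1, 3 and 5 are modelled, and the contestant is assumed, as in the puzzle, to pick door 1.

```python
import random

def play(tactic, switch, rng):
    """One round: contestant picks door 1; return True if they win the car.
    Returns None when the observed event (host opens another door and
    reveals a goat) did not occur, so that round must be discarded."""
    doors = [1, 2, 3]
    car = rng.choice(doors)
    pick = 1
    others = [d for d in doors if d != pick]
    if tactic == "always-goat":        # Tactic 1: always reveal a goat
        opened = rng.choice([d for d in others if d != car])
    elif tactic == "devious":          # Tactic 3: open only if pick hides the car
        if car != pick:
            return None                # host opens nothing
        opened = rng.choice(others)
    elif tactic == "random":           # Tactic 5: open either other door at random
        opened = rng.choice(others)
        if opened == car:
            return None                # a car, not a goat, was revealed
    final = [d for d in doors if d not in (pick, opened)][0] if switch else pick
    return final == car

def win_rate(tactic, switch, n=50_000, seed=1):
    rng = random.Random(seed)
    rounds = [play(tactic, switch, rng) for _ in range(n)]
    valid = [r for r in rounds if r is not None]
    return sum(valid) / len(valid)
```

Under Tactic 1, `win_rate("always-goat", switch=True)` comes out near 2/3; under the devious Tactic 3 the switcher always loses; under Tactic 5 switching makes no difference, just as the table indicates.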

So if we were given this situation in real life we would have to work out which tactic the host was adopting. The problem is presented as though it is a straightforward maths problem but it critically hinges on a problem in psychology. What can we infer from the host’s choice? What is he up to? I think that this leads to people’s discomfort and difficulty. I am aware that even people who start out assuming Tactic 1 struggle but I suspect that somewhere in the back of their minds they cannot rid themselves of the other possibilities. The seeds of doubt have been sown in the way the problem is set.

A participant in the game show would probably have to make a snap judgment about the meaning of the new data. This is the sort of thinking that Daniel Kahneman calls System 1 thinking. It is intuitive, heuristic and terribly bad at coping with novel situations. Fear of the devious strategy may well prevail.

A more ambitious contestant may try to embark on more reflective analytical System 2 thinking about the likely tactic. That would be quite an achievement under pressure. However, anyone with the inclination may have been able to prepare himself with some pre-show analysis. There may be a record of past shows from which the host’s common tactics can be inferred. The production company’s reputation in similar shows may be known. The host may be displaying signs of discomfort or emotional stress, the “tells” relied on by poker players.

There is a lot of data potentially out there. However, that only leads us to another level of statistical, and psychological, inference about the host’s strategy, an inference that itself relies on its own uncertain likelihoods and prior probabilities. And that then leads to the level of behaviour and cognitive psychology and the uncertainties in the fundamental science of human nature. It seems as though, as philosopher Richard Jeffrey put it, “It’s probabilities all the way down”.

Behind all this lies a perennially useful piece of advice: having once taken a decision, revise it only in the light of genuinely new data that is surprising given our initial thinking.

Economist G L S Shackle long ago lamented that:

… we habitually and, it seems, unthinkingly assume that the problem facing … a business man, is of the same kind as those set in examinations in mathematics, where the candidate unhesitatingly (and justly) takes it for granted that he has been given enough information to construe a satisfactory solution. Where, in real life, are we justified in assuming that we possess ‘enough’ information?

Walkie-Talkie “death ray” and risk identification

News media have been full of the tale of London’s Walkie-Talkie office block raising temperatures on the nearby highway to car melting levels.

The full story of how the architects and engineers created the problem has yet to be told. It is certainly the case that similar phenomena have been reported elsewhere. According to one news report, the Walkie-Talkie’s architect had worked on a Las Vegas hotel that caused similar problems back in September 2010.

More generally, an external hazard from a product’s optical properties is certainly something that has been noted in the past. It appears from this web page that domestic low-emissivity (low-E) glass was suspected of setting fire to adjacent buildings as long ago as 2007. I have not yet managed to find the Consumer Product Safety Commission report into low-E glass but I now know all about the hazards of snow globes.

The Walkie-Talkie phenomenon marks a signal failure in risk management and it will cost somebody to fix it. It is not yet clear whether this was a miscalculation of a known hazard or whether the hazard was simply neglected from the start.

Risk identification is the most fundamental part of risk management. If you have failed to identify a risk you are not in a position to control, mitigate or externalise it in advance. Risk identification is also the hardest part. In the case of the Walkie-Talkie, modern materials, construction methods and aesthetic tastes have conspired to create a phenomenon that was not, at least as an accidental feature, present in structures before this century. That means that risk identification is not a matter of running down a checklist of known hazards to see which apply. Novel and emergent risks are always the most difficult to identify, especially where they involve the impact of an artefact on its environment. This is, as Daniel Kahneman would put it, a real System 2 task. The standard checklist propels it back to the flawed System 1 level. As we know, even when we think we are applying a System 2 mindset, we may subconsciously be loafing in a subliminal System 1.

It is very difficult to spot when something has been missed out of a risk assessment, even in familiar scenarios. In a famous 1978 study, Fischhoff, Slovic and colleagues showed college students fault trees analysing potential causes of a car’s failure to start. Some of the fault trees had been “pruned”: one branch, representing say “battery charge”, had been removed. The subjects were very poor at spotting that a major, and well known, source of failure had been omitted from the analysis. Where failure modes are unfamiliar, it is even more difficult to identify the lacuna.

Even where failure modes are identified, if they are novel then they still present challenges in effective design and risk management. Henry Petroski, in Design Paradigms, his historical analysis of human error in structural engineering, shows how novel technologies present challenges for the development of new engineering methodologies. As he says:

There is no finite checklist of rules or questions that an engineer can apply and answer in order to declare that a design is perfect and absolutely safe, for such finality is incompatible with the whole process, practice and achievement of engineering. Not only must engineers preface any state-of-the-art analysis with what has variously been called engineering thinking and engineering judgment, they must always supplement the results of their analysis with thoughtful and considered interpretations of the results.

I think there are three principles that can help guard against an overly narrow vision. Firstly, involve as broad a selection of people as possible in hazard identification; perhaps take a diagonal slice of the organisation. Do not put everybody in a room together where they can converge rapidly. This is probably a situation where some variant of the Delphi method can be justified.

Secondly, be aware that all assessments are provisional. Make design assumptions explicit. Collect data at every stage, especially on your assumptions. Compare the data with what you predicted would happen. Respond to any surprises by protecting the customer and investigating. Even if you’ve not yet melted a Jaguar, if the glass is looking a little more reflective than you thought it would be, take immediate action. Do not wait until you are in the Evening Standard. There is a reputation management side to this too.

Thirdly, as Petroski advocates, analysis of case studies and reflection on the lessons of history helps to develop broader horizons and develop a sense of humility. It seems nobody’s life is actually in danger from this “death ray” but the history of failures to identify risk leaves a more tangible record of mortality.

Trust in data – II

I just picked up on this, now not so recent, news item about the prosecution of Steven Eaton. Eaton was gaoled for falsifying data in clinical trials. His prosecution was pursuant to the Good Laboratory Practice Regulations 1999. The Regulations apply to chemical safety assessments and come to us, in the UK, from that supra-national body the OECD. Sadly I have managed to find few details other than the press reports. I have had a look at the website of the prosecuting Medicines and Healthcare Products Regulatory Agency but found nothing beyond the press release. I thought about a request under the Freedom of Information Act 2000 but wonder whether an exemption is being claimed pursuant to section 31.

It’s a shame because it would have been an opportunity to compare and contrast with another notable recent case of industrial data fabrication, that concerning BNFL and the Kansai Electric contract. Fortunately, in that case, the HSE made public a detailed report.

In the BNFL case, technicians had fabricated measurements of the diameters of fuel pellets in nuclear fuel rods, it appears principally out of boredom at doing the actual job. The customer spotted it, BNFL didn’t. The matter caused huge reputational damage to BNFL and resulted in the shipment of nuclear fuel rods, necessarily under armed escort, being turned around mid-ocean and returned to the supplier.

For me, the important lesson of the BNFL affair is that businesses must avoid a culture where employees decide for themselves what parts of the job are important and interesting, what is called intrinsic motivation. Intrinsic motivation is related to a sense of cognitive ease. That sense rests, as Daniel Kahneman has pointed out, on an ecology of unknown and unknowable beliefs and prejudices. No doubt the technicians had encountered nothing but boringly uniform products. They took that, and felt a sense of cognitive ease in doing so, as a signal to stop measuring and to conceal the fact that they had stopped.

However, nobody in the supply chain is entitled to ignore the customer’s wishes. Businesses need to foster the extrinsic motivation of the voice of the customer. That is what defines a job well done. Sometimes it will be irksome and involve a lot of measuring pellets whose dimensions look just the same as the last batch. We simply have to get over it!

The customer wanted the data collected, not simply as a sterile exercise in box-ticking, but as a basis for diligent surveillance of the manufacturing process and as a critical component of managing the risks attendant in real world nuclear industry operations. The customer showed that a proper scrutiny of the data, exactly what they had thought that BNFL would perform as part of the contract, would have exposed its inauthenticity. BNFL were embarrassed, not only by their lack of management control of their own technicians, but by the exposure of their own incapacity to scrutinise data and act on its signal message. Even if all the pellets were of perfect dimension, the customer would be legitimately appalled that so little critical attention was being paid to keeping them so.

Where data is properly scrutinised, as part of a system of objective process management and with the correct statistical tools, any fabrication will readily be exposed. That is part of incentivising technicians to do the job diligently. Dishonesty must not be tolerated. However, it is essential that everybody in an organisation understands the voice of the customer and understands the particular way in which they themselves add value. A scheme of goal deployment weaves the threads of the voice of the customer together with those of individual process management tactics. That is what provides an individual’s insight into how their work adds value for the customer. That is what provides the “nudge” towards honesty.
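To make that concrete, here is a minimal sketch of the sort of statistical scrutiny that exposes fabricated measurements. The numbers and the decision rule are illustrative assumptions, not BNFL’s actual data or the customer’s actual method. Fabricated data typically betrays itself by showing less variation than the real process could ever deliver.

```python
import random
import statistics

def spread_too_small(sample, sigma, n_sim=20_000, alpha=0.01, seed=1):
    """Flag a batch whose spread is implausibly small for a process with
    standard deviation sigma. Computes a Monte Carlo p-value: how often
    would an honest batch of the same size show a spread this small?"""
    rng = random.Random(seed)
    s_obs = statistics.pstdev(sample)
    hits = sum(
        statistics.pstdev([rng.gauss(0.0, sigma) for _ in sample]) <= s_obs
        for _ in range(n_sim)
    )
    return hits / n_sim < alpha

# Twenty 'measured' pellet diameters, all identical to the micrometre --
# exactly what a bored technician copying the nominal value might record
fabricated = [8.200] * 20
print(spread_too_small(fabricated, sigma=0.010))   # True: flag for investigation

# A batch with realistic scatter about the nominal diameter is not flagged
honest = [8.19, 8.21, 8.20, 8.22, 8.18] * 4
print(spread_too_small(honest, sigma=0.010))       # False
```

A real surveillance scheme would of course use proper control charts rather than this one-off test, but the principle is the same: the data itself tells you whether anyone is really measuring.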