UK railway suicides – 2017 update

The latest UK rail safety statistics were published on 23 November 2017, again absent much of the press fanfare we had seen in the past. Regular readers of this blog will know that I have followed the suicide data series, and the press response, closely in 2016, 2015, 2014, 2013 and 2012. Again I have re-plotted the data myself on a Shewhart chart.

RailwaySuicides20171

Readers should note the following about the chart.

  • Many thanks to Tom Leveson Gower at the Office of Rail and Road who confirmed that the figures are for the year up to the end of March.
  • Some of the numbers for earlier years have been updated by the statistical authority.
  • I have recalculated natural process limits (NPLs) as there are still no more than 20 annual observations, and because the historical data has been updated. The NPLs have therefore changed but, this year, not by much.
  • Again, the pattern of signals, with respect to the NPLs, is similar to last year.

The current chart again shows two signals, an observation above the upper NPL in 2015 and a run of 8 below the centre line from 2002 to 2009. As I always remark, the Terry Weight rule says that a signal gives us licence to interpret the ups and downs on the chart. So I shall have a go at doing that.
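For readers who want to try this on their own data, here is a minimal Python sketch of how NPLs and the two kinds of signal mentioned above might be computed for an individuals (XmR) chart. It assumes the conventional calculation of the mean plus or minus 2.66 times the mean moving range. The function names and the example counts are invented for illustration; they are not the published figures.

```python
from statistics import mean


def natural_process_limits(values):
    """Return (lower NPL, centre line, upper NPL) for an individuals chart."""
    centre = mean(values)
    # Moving ranges: absolute differences between successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is the conventional scaling constant for individuals charts.
    spread = 2.66 * mean(moving_ranges)
    return centre - spread, centre, centre + spread


def points_outside_limits(values, lower, upper):
    """Indices of observations that fall outside the natural process limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def runs_on_one_side(values, centre, length=8):
    """(start, end) index pairs of runs of at least `length` consecutive
    observations on the same side of the centre line."""
    runs, start, side = [], 0, 0
    # A sentinel equal to the centre line closes any run still open at the end.
    for i, v in enumerate(list(values) + [centre]):
        current = (v > centre) - (v < centre)  # +1 above, -1 below, 0 on the line
        if current != side:
            if side != 0 and i - start >= length:
                runs.append((start, i - 1))
            start, side = i, current
    return runs


# Invented annual counts, for illustration only.
counts = [10, 12, 8, 9, 8, 9, 8, 9, 8, 9, 12, 11, 12, 11, 20, 12]
lower, centre, upper = natural_process_limits(counts)
print(points_outside_limits(counts, lower, upper))  # [14]
print(runs_on_one_side(counts, centre))             # [(2, 9)]
```

The invented series is built to show the same two kinds of signal as the real chart: one observation above the upper NPL and a run of 8 below the centre line.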

It will not escape anybody’s attention that this is now the second year in which there has been a fall in the number of fatalities.

I haven’t yet seen any real contemporaneous comment on the numbers from the press. This item appeared on the BBC, a weak performer in the field of data journalism but clearly with privileged access to the numbers, on 30 June 2017, confidently attributing the fall to past initiatives.

Sky News clearly also had advance sight of the numbers and made the bold claim that:

… for every death, six more lives were saved through interventions.

That item goes on to highlight a campaign to encourage fellow train users to engage with anybody whose behaviour attracted attention.

But what conclusions can we really draw?

In 2015 I was coming to the conclusion that the data increasingly looked like a gradual upward trend. The 2016 data offered a challenge to that but my view was still that it was too soon to say that the trend had reversed. There was nothing in the data incompatible with a continuing trend. This year, 2017, has seen 2016’s fall repeated. A welcome development but does it really show conclusively that the upward trending pattern is broken? Regular readers of this blog will know that Langian statistics like “lowest for six years” carry no probative weight here.

Signal or noise?

Has there been a change to the underlying cause system that drives the suicide numbers? Last year, I fitted a trend line through the data and asked which narrative best fitted what I observed, a continuing increasing trend or a trend that had plateaued or even reversed. You can review my analysis from last year here.

Here is the data and fitted trend updated with this year’s numbers, along with NPLs around the fitted line, the same as I did last year.

RailwaySuicides20172

Let’s think a little deeper about how to analyse the data. The first step of any statistical investigation ought to be the cause and effect diagram.

SuicideCne

The difficulty with the suicide data is that there is very little reproducible and verifiable knowledge as to its causes. I have seen claims, of whose provenance I am uncertain, that railway suicide is virtually unknown in the USA. There is a lot of useful thinking from common human experience and from more general theories in psychology. But the uncertainty is great. It is not possible to come up with a definitive cause and effect diagram on which all will agree, other than from the point of view of identifying candidate factors.

The earlier evidence of a trend, however, suggests that there might be some causes that are developing over time. It is not difficult to imagine that economic trends and the cumulative awareness of other fatalities might have an impact. We are talking about a number of things that might appear on the cause and effect diagram and some that do not, the “unknown unknowns”. When I identified “time” as a factor, I was taking sundry “lurking” factors and suspected causes from the cause and effect diagram that might have a secular impact. I aggregated them under the proxy factor “time” for want of a more exact analysis.

What I have tried to do is to split the data into two parts:

  • A trend (linear simply for the sake of exploratory data analysis (EDA)); and
  • The residual variation about the trend.

The question I want to ask is whether the residual variation is stable, just plain noise, or whether there is a signal there that might give me a clue that a linear trend does not hold.

There is no signal in the detrended data, no signal that the trend has reversed. The tough truth of the data is that it supports either narrative.

  • The upward trend is continuing and is stable. There has been no reversal of trend yet.
  • The data is not stable. True, there is evidence of an upward trend in the past but there is now evidence that deaths are decreasing.

Of course, there is no particular reason, absent the data, to believe in an increasing trend and the initiative to mitigate the situation might well be expected to result in an improvement.

Sometimes, with data, we have to be honest and say that we do not have the conclusive answer. That is the case here. All that can be done is to continue the existing initiatives and look to the future. Nobody ever likes that as a conclusion but it is no good pretending things are unambiguous when that is not the case.

Next steps

Previously I noted proposals to repeat a strategy from Japan of bathing railway platforms with blue light. In the UK, I understand that such lights were installed at Gatwick in summer 2014. In fact my wife and I were on the platform at Gatwick just this week and I had the opportunity to observe them. I also noted, on my way back from court the other day, blue strip lights along the platform edge at East Croydon. I think they are recently installed. However, I have not seen any data or heard of any analysis.

A huge amount of sincere endeavour has gone into this issue but further efforts have to be against the background that there is still no conclusive evidence of improvement.

Suggestions for alternative analyses are always welcomed here.

UK Election of June 2017 – Polling review

Pollin2017Overview

Here are all the published opinion polls for the June 2017 UK general election, plotted as a Shewhart chart.

The Conservative lead over Labour had been pretty constant at 16% from February 2017, after May’s Lancaster House speech. The initial Natural Process Limits (“NPLs”) on the chart extend back to that date. Then something odd happened in the polls around Easter. There were several polls above the upper NPL. That does not seem to fit with any surrounding event. Article 50 had been declared two weeks before and had had no real immediate impact.

I suspect that the “fugue state” around Easter was reflected in the respective parties’ private polling. It is possible that public reaction to the election announcement somehow locked in the phenomenon for a short while.

Things then seem to settle down to the 16% lead level again. However, the local election results at the bottom of the range of polls ought to have sounded some alarm bells. Local election results are not a reliable predictor of general elections but this data should not have felt very comforting.

Then the slide in lead begins. But when exactly? A lot of commentators have assumed that it was the badly received Conservative Party manifesto that started the decline. It is not possible to be definitive from the chart but it is certainly arguable that it was the leak of the Labour Party manifesto that started to shift voting intention.

Then the swing from Conservative to Labour continued unabated to polling day.

Polling performance

How did the individual pollsters fare? I have, somewhat arbitrarily, summarised all polls conducted in the 10 days before the election (29 May to 7 June). Here is the plot along with the actual popular poll result which gave a 2.5% margin of Conservative over Labour. That is the number that everybody was trying to predict.

PollsterPerformance

The red points are the surveys from the 5 days before the election (3 to 7 June). Visually, they seem to be no closer, in general, than the other points (6 to 10 days before). The vertical lines are just an aid for the eye in grouping the points. The absence of “closing in” is confirmed by looking at the mean squared error (MSE) (in %²) for the points over 10 days (31.1) and 5 days (34.8). There is no evidence of polls closing in on the final result. The overall Shewhart chart certainly doesn’t suggest that.
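The MSE here is simply the average squared difference between each poll’s predicted Conservative lead and the actual 2.5% lead. A minimal Python sketch of the calculation, per pollster, might look like the following. The function names and the example polls are hypothetical and are not the real survey results.

```python
from collections import defaultdict

# Actual Conservative lead over Labour, in percentage points (from the text).
ACTUAL_LEAD = 2.5


def mean_squared_error(predicted_leads, actual=ACTUAL_LEAD):
    """Average squared difference between predicted and actual lead, in %²."""
    return sum((p - actual) ** 2 for p in predicted_leads) / len(predicted_leads)


def mse_by_pollster(polls, actual=ACTUAL_LEAD):
    """polls: iterable of (pollster, predicted_lead) pairs. Returns {pollster: MSE}."""
    grouped = defaultdict(list)
    for pollster, lead in polls:
        grouped[pollster].append(lead)
    return {name: mean_squared_error(leads, actual) for name, leads in grouped.items()}


# Hypothetical polls, for illustration only.
example_polls = [("Pollster A", 4.0), ("Pollster B", 1.0), ("Pollster B", 7.0)]
for name, mse in sorted(mse_by_pollster(example_polls).items(), key=lambda kv: kv[1]):
    print(name, mse)
# Pollster A 2.25
# Pollster B 11.25
```

Because the errors are squared, a single poll that misses badly weighs heavily on a pollster’s MSE.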

Taking the polls over the 10 day period, then, here is the performance of the pollsters in terms of MSE. Lower MSE is better.

Pollster         MSE (%²)
Norstat            2.25
Survation          2.31
Kantar Public      6.25
Survey Monkey      8.25
YouGov             9.03
Opinium           16.50
Qriously          20.25
Ipsos MORI        20.50
Panelbase         30.25
ORB               42.25
ComRes            74.25
ICM               78.36
BMG              110.25

Norstat and Survation pollsters will have been enjoying bonuses on the morning after the election. There are a few other commendable performances.

YouGov model

I should also mention the YouGov model (the green line on the Shewhart chart) that has an MSE of 2.25. YouGov conduct web-based surveys against a huge database of around 50,000 registered participants. They also collect, with permission, deep demographic data on those individuals concerning income, profession, education and other factors. There is enough published demographic data from the national census to judge whether that is a representative frame from which to sample.

YouGov did not poll and publish the raw, or even adjusted, voting intention. They used their poll to construct a model, perhaps a logistic regression or an artificial neural network, they don’t say, to predict voting intention from demographic factors. They then input into that model, not their own demographic data but data from the national census. That then gave their published forecast. I have to say that this looks about the best possible method for eliminating sampling frame effects.
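To make the idea concrete, here is a deliberately simplified Python sketch. YouGov have not disclosed their model, so this is not their method: it swaps in the crudest possible “model”, the observed share within each demographic cell, and then weights those cell estimates by census proportions (poststratification). The cells, respondents and census shares are all invented for illustration.

```python
def cell_estimates(respondents):
    """Share of respondents in each demographic cell who support the party.

    respondents: iterable of (cell, supports_party) pairs.
    """
    totals, supporters = {}, {}
    for cell, supports in respondents:
        totals[cell] = totals.get(cell, 0) + 1
        supporters[cell] = supporters.get(cell, 0) + (1 if supports else 0)
    return {cell: supporters[cell] / totals[cell] for cell in totals}


def poststratify(estimates, census_shares):
    """Weight the per-cell estimates by each cell's share of the census population."""
    total = sum(census_shares.values())
    return sum(estimates[cell] * share for cell, share in census_shares.items()) / total


# Invented panel in which graduates are over-represented relative to the census.
panel = [
    ("graduate", True), ("graduate", False), ("graduate", False), ("graduate", False),
    ("non-graduate", True), ("non-graduate", False),
]
census = {"graduate": 0.4, "non-graduate": 0.6}  # invented population shares

raw_share = sum(1 for _, supports in panel if supports) / len(panel)
adjusted_share = poststratify(cell_estimates(panel), census)
print(round(raw_share, 3), round(adjusted_share, 3))  # 0.333 0.4
```

The raw panel share and the census-weighted share differ because the panel’s demographic mix differs from the population’s. A regression model in place of the cell averages would serve the same purpose while coping with many more demographic factors.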

It remains to be seen how widely this approach is adopted next time.

Grenfell Tower – Elites on trial – Trust in bureaucracy revisited

Grenfell Tower fire1

Nobody can react to the Grenfell Tower fire with anything other than horror, sadness, anger and resolve.

Much of that anger is, legitimately, directed at the elite professions who make the decisions on which individual safety turns. I am proud to have been a member of two elite professions during my lifetime: engineering and law. I wanted to say something about the nature of practice, of responsibility and of blame.

It is too early to be confident of causes, remedies or punishments. Those will have to await full investigation but professionals of all disciplines will need little encouragement to spend the coming weeks searching their own souls over their wider obligations to society. For, that is what membership of a profession entails.

The need for bureaucracy

“Bureaucracy” is a word most often used pejoratively, as a rebuke to a turgid rigidity that frustrates spontaneity, creativity, efficiency and expedition. It is that. But once society starts to enjoy systems of reasonable complexity, civil aviation, networked electricity supply, international transport of goods etc., much decision making is going to be reserved to a cadre of experts. Diane Vaughan’s analysis of the Space Shuttle Challenger disaster2 is a relevant and salutary account of engineering as a bureaucratic profession.

Of course, you can embed some of your bureaucracy in software but don’t expect that to improve spontaneity, creativity or flexibility. Efficiency and expedition, perhaps. Even putting a bunch of flowers on your dining table requires this.

I have often, on this blog, cited Robert Michels’ iron law of oligarchy. Michels contended that any team of bureaucrats soon realised the power they held in controlling the levers of policy. A willingness to pull those levers in the direction of their own self-interest, and a jealous protection of their professional status and expertise, soon followed. That sometimes put them at odds with the objectives they were supposed to be implementing on behalf of their principals. As one political scientist put it:3

Many governance dysfunctions arise because the agents have different agendas from the principals, and the problem of institutional design is related to incentivising the agents to do the principal’s bidding.

Max Weber had realised all this earlier in the nineteenth century. Weber was a child of that mother of all bureaucracies, the Prussian civil service. The self interest of managing elites bothered him and he sought to inculcate an ethic of responsibility whereby professionals thought hard about the wider consequences of their decisions.4 That is the ethic that modern professions seek to foster among their members. Engagement in a profession carries responsibilities.

Faith in regulation

So much marginally informed debate among journalists has been about “the building regulations”. I guess that they mean the Building Regulations 2010. You can examine the relevant parts here, for what they are worth. Scroll down to Part B. Of course the Regulations themselves are supported by the statutory guidance. Here that is. The guidance refers to a legion of British Standards. As I learned during my time in the railway industry, the scientific basis of the guidance is not always easy to trace. Expect to hear more of that as inquiries progress.

The fundamental truth of such regulations is that it is the building professionals themselves who write them. Who else? That does not mean that the professionals, even in conclave, are infallible. Nobel laureate psychologist Daniel Kahneman has written extensively about the bounded rationality that limits everybody’s individual, or group, vision beyond a limited range of experiences, values and prejudices. Experts are just as prone as anybody. You and me too.5 It is unlikely that software will do a better job. Expert systems will work, here I go again, in “an environment that is sufficiently regular to be predictable”. I heard Daniel Dennett speak in London recently. Software will provide us with tools not colleagues.

All this feeds into the collateral phenomenon whereby businesses actively use their expert involvement in setting regulations as a strategy to capture market share, promote their own products and erect barriers against entry for would-be competitors. The extreme consequence here is regulator capture, where the regulator becomes so dependent upon the expertise of the regulated that she is glad to let them define the regulatory regime.

To some extent, the self validating nature of expertise is reinforced, at least in the UK, by the approach of the courts. In assessing the negligence of a professional an individual is judged against the standards of his profession. Only where no reasonable member of his profession would have acted as he did is he negligent.6, 7 But the courts have warned that, in some circumstances, they might call into question the standards of a whole profession if there were a failure of logic. That is something that the courts would never do lightly.8 It would be a spectacle indeed.

When it comes to professional responsibility, the courts refuse to be dazzled by statutory regulations or industry standards. Professionals are expected to exercise their judgment and not hide behind mere compliance.9 In 2003, giving judgment against a firm of architects for inadequate fire precautions in a food factory refurbishment, Judge Bowsher QC observed:10

I should add that I was not the slightest impressed by the submission that since the defendants had complied with their statutory requirements … they had fully performed their duties.

This is what a judge said in a different case concerning the safety of a flight of stairs.11

Looking at a photograph of the stairs, I myself would form the view that they are reasonably safe … But it is the fact that the stairs did not comply with the Building Regulations, or the relevant British Standard. That is evidence which we must certainly take into account. It represents the current professional opinion as to what is desirable in order that accidents should be avoided. But it is one thing to lay down regulations and standards, with that objective, and another to define what is reasonably safe in the circumstances of a particular case [emphasis added].

In any event, trying to manage a risk by statutory regulation is not so efficient a means as you might think. Regulations do not always ensure the best outcome for society.12

Trust in elites

That all leaves the elite professions with grave responsibilities. Let none of us deny that another salient feature of the professions is that they are businesses run to make a profit for the professionals. Members get the further reward of status in society. I know that we have all constructed narratives of our own expertise and that challenges, particularly from clients, are not always welcome. We think we know best and we don’t always want to waste the client’s time explaining what to us seems so obvious.

And when things go wrong, and they will, all that is thrown back at us. Quite justifiably. Trust in bureaucracy has been a recurrent theme on this blog. It is a complex matter. When it leads to herd immunity from disease it is good. When it leads to complicity in torture it is bad. The public trust we aspire to is not blind faith. It is collaboration. Blind faith leads to bad consequences, collaboration to an environment where professionals are able to explain and reassure. Reflecting on that, I think there are some things we can all do to improve that relationship of trust.

Listen: The most useful person on a project is often the person who knows nothing about it. She can ask the dumb question. Physicists told Guglielmo Marconi he would not be able to transmit a radio signal across the Atlantic. But he did, not because he knew something the physicists didn’t but because sometimes it takes an unashamed maverick to test an orthodoxy.13 There are sundry examples of rumours and folk tales that have sparked scientific curiosity and discovery. Sometimes data is the plural of anecdote. It’s not even all about testing scientific theories. People sometimes need confidence and reassurance in unfamiliar situations. They need to be told, in language they understand, what is happening and why you think this is a good idea. Their questions and reservations need to be taken seriously.

A signal is a signal: One of the key skills for any professional is being able to distinguish signal from noise. Where there is a signal, a surprise, that suggests an established orthodoxy has stopped working then you must immediately take action to protect those at risk. The “regular environment” you relied on is blown. Don’t wait to see if it happens again. Don’t dismiss it as a “one off” or, heaven forbid, the most useless word in the English language, an “outlier”. It is the signals that contain all the information. Don’t relax when the signal isn’t repeated immediately. That is just regression to the mean. It’s what signals do. Something that you didn’t expect has happened. Pierce the veil of bounded rationality. Protect the client, investigate and look to update your practice.

Noise is noise: The corollary to taking signals seriously is not mistaking noise for signal. When that happens we start looking for causes specific to an individual outcome when the true causes were generic to all outcomes. Professionals also need to know when they are embedded in a “stable system of trouble”. That brings its own challenges, not least of which is the cost and effort of perpetually protecting the client.

Humility: Professionals don’t always get it right. There are individual errors. There are systemic failures of practice. If you have to start hiding behind the shield that you are beyond challenge and that dissenting views are outlawed then you are probably dismissing the best hope you have for avoiding problems.

Continual improvement: We have to keep listening to counsels of despair from politicians about productivity. It is down to us. Continual improvement is not just for our individual domain expertise, it’s also about getting better at listening, distinguishing signal from noise and practising humility. It’s about getting better at improving too.

Professionals have bodies to kick and souls to damn

Over two hundred years ago, English judge Edward Thurlow famously observed that corporations have neither bodies to kick nor souls to damn. I am always baffled by calls, in cases like that of Grenfell Tower, for prosecutions for corporate manslaughter. The calls seem to reflect a mistaken sentiment that corporate manslaughter is some sort of aggravated form of manslaughter. This isn’t just manslaughter, it’s corporate manslaughter. But why would anybody want to relieve individuals of responsibility and impose it on a faceless abstraction?

Part of the deal when seeking certification as a professional is that you assume a responsibility to society. That’s where you get your status from. When you fail you will be held to account. There are always voices calling for an end to blame culture. But have no doubt, it is a professional’s duty to act within the standards she has adopted. If she falls below those standards then reparation is expected, to the extent that it remains possible. Anybody who causes death when they fall sufficiently far below standard can expect to be indicted for manslaughter and, on conviction, punished and shamed. There is a principle in criminal law called fair labelling. The name of a crime must reflect the offence. Manslaughter is a fair label in such cases.

There has been an increasing tendency in the UK for legislators to take power to order reparation away from the civil courts and to attempt to regulate with criminal sanctions. I am not persuaded that is always the right approach.

Trust in elites, bureaucrats, experts, call them what you will, is important. South African statesman Jan Smuts once remarked:

When I look at history I’m a pessimist. When I look at pre-history I’m an optimist.

If you live in the UK then, on any measure you can dream up, life is getting safer and better. That is the triumph of elite engineers, planners, security professionals, physicians … I could go on. If people at large lose faith in professionals then it will be to our common ruin. Only the professionals can work on building the trust we need. Politicians won’t do it.

What did you do today?

References

  1. Wikimedia Commons contributors, “File:Grenfell Tower fire (wider view).jpg,” Wikimedia Commons, the free media repository, https://commons.wikimedia.org/w/index.php?title=File:Grenfell_Tower_fire_(wider_view).jpg&oldid=248417865 (accessed June 25, 2017)
  2. Vaughan, D (1996) The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA, University Of Chicago Press
  3. Fukuyama, F (2012) The Origins of Political Order: From Prehuman Times to the French Revolution, Profile Books, p207
  4. Kim, Sung Ho, “Max Weber”, The Stanford Encyclopedia of Philosophy (Fall 2012 Edition), Edward N. Zalta (ed.)
  5. Kahneman, D (2011) Thinking, Fast and Slow, Allen Lane, pp199-254
  6. Bolam v Friern Hospital Management Committee [1957] 1 WLR 582
  7. Pantelli Associates Ltd v Corporate City Developments Number Two Ltd [2010] EWHC 3189 (TCC)
  8. Bolitho v City and Hackney Health Authority [1996] 4 All ER 771
  9. Charlesworth & Percy on Negligence, 12th ed., 2010 and supplements, 7-46
  10. Sahib Foods Ltd & Ors v Paskin Kyriakides Sands (A Firm) [2003] EWHC 142 (TCC) at [43]
  11. Green v Building Scene Limited [1994] PIQR P259, CA at 269
  12. Coase, R H (1960) “The problem of social cost” Journal of Law and Economics 3, 1-44
  13. Raboy, M (2016) Marconi: The Man Who Networked the World, Oxford, p176

Building targets, constructing behaviour

Recently, the press reported that UK construction company Bovis Homes Group PLC have run into trouble for encouraging new homeowners to move into unfinished homes and have therefore faced a barrage of complaints about construction defects. It turns out that these practices were motivated by a desire to hit ambitious growth targets. Yet it has all had a substantial adverse impact on the company’s trading position and has led to markdowns in Bovis shares.1

I have blogged about targets before. It is worth repeating what I said there about the thoughts of John Pullinger, head of the UK Statistics Authority. He gave a trenchant warning about the “unsophisticated” use of targets. He cautioned:2

Anywhere we have had targets, there is a danger that they become an end in themselves and people lose sight of what they’re trying to achieve. We have numbers everywhere but haven’t been well enough schooled on how to use them and that’s where problems occur.

He went on.

The whole point of all these things is to change behaviour. The trick is to have a sophisticated understanding of what will happen when you put these things out.

That message was clearly one that Bovis didn’t get. They legitimately adopted an ambitious growth target but they forgot a couple of things. They forgot that targets, if not properly risk assessed, can create perverse incentives to distort the system. They forgot to think about how manager behaviour might be influenced. Leaders need to be able to harness insights from behavioural economics. Further, a mature system of goal deployment imposes a range of metrics across a business, each of which has to contribute to the global organisational plan. It is no use only measuring sales if measures of customer satisfaction and input measures about quality are neglected or even deliberately subverted. An organisation needs a rich dashboard and needs to know how to use it.

Critically, it is a matter of discipline. Employees must be left in no doubt that lack of care in maintaining the integrity of the organisational system and pursuing customer excellence will not be excused by mere adherence to a target, no matter how heroic. Bovis was clearly a culture where attention to customer requirements was not thought important by the staff. That is inevitably a failure of leadership.

Compare and contrast

Bovis are an interesting contrast with supermarket chain Sainsbury’s who featured in a law report in the same issue of The Times.3 Bovis and Sainsbury’s clearly have very different approaches as to how they communicate to their managers what is important.

Sainsbury’s operated a rigorous system of surveying staff engagement which aimed to embrace all employees. It was “deeply engrained in Sainsbury’s culture and was a critical part of Sainsbury’s strategy”. An HR manager sent an email to five store managers suggesting that the rigour could be relaxed. Not all employees needed to be engaged, he said, and participation could be restricted to the most enthusiastic. That would have been a clear distortion of the process.

Mr Colin Adesokan was a senior manager who subsequently learned of the email. He asked the HR manager to explain what he had meant but received no response and the email was recirculated. Adesokan did nothing. When his inaction came to the attention of the chief executive, Adesokan was dismissed summarily for gross misconduct.

He sued his employer and the matter ended up in the Court of Appeal, Adesokan arguing that such mere inaction over a colleague’s behaviour was incapable of constituting gross misconduct. The Court of Appeal did not agree. They found that, given the significance placed by Sainsbury’s on the engagement process, the trial judge had been entitled to find that Adesokan had been seriously in dereliction of his duty. That failing constituted gross misconduct because it had the effect of undermining the trust and confidence in the employment relationship. Adesokan seemed to have been indifferent to what, in Sainsbury’s eyes, was a very serious breach of an important procedure. Sainsbury’s had been entitled to dismiss him summarily for gross misconduct.

That is process discipline. That is how to manage it.

Display constancy of purpose in communicating what is important. Do not turn a blind eye to breaches. Do not tolerate those who would turn the blind eye. When you combine that with mature goal deployment and sophistication as to how to interpret variation in metrics then you are beginning to master, at least some parts of, how to run a business.

References

  1. “Share price plunges as Bovis tries to rebuild customers’ trust” (paywall), The Times (London), 20 February 2017
  2. “Targets could be skewing the truth, statistics chief warns” (paywall), The Times (London), 26 May 2014
  3. Adesokan v Sainsbury’s Supermarkets Ltd [2017] EWCA Civ 22, The Times, 21 February 2017 (paywall)

UK railway suicides – 2016 update

The latest UK rail safety statistics were published in September 2016, again absent much of the press fanfare we had seen in the past. Apologies for the long delay but the day job has been busy. Regular readers of this blog will know that I have followed the suicide data series, and the press response, closely in 2015, 2014, 2013 and 2012. Again, I “Cast a cold eye/ On life, on death.” Again I have re-plotted the data myself on a Shewhart chart.

railwaysuicides6

Readers should note the following about the chart.

  • Many thanks to Tom Leveson Gower at the Office of Rail and Road who confirmed that the figures are for the year up to the end of March.
  • Some of the numbers for earlier years have been updated by the statistical authority.
  • I have recalculated natural process limits (NPLs) as there are still no more than 20 annual observations, and because the historical data has been updated. The NPLs have therefore changed in that the 2014 total is no longer above the upper NPL.
  • The observation above the upper NPL in 2015 has not persisted. The latest total is within the NPLs. We have to think about how to interpret this.

The current chart shows two signals, an observation above the upper NPL in 2015 and a run of 8 below the centre line from 2002 to 2009. As I always remark, the Terry Weight rule says that a signal gives us licence to interpret the ups and downs on the chart. So I shall have a go at doing that. Last year I was coming to the conclusion that the data increasingly looked like a gradual upward trend. Has the 2016 data changed that?

The Samaritans posted on their website, “Rail suicides fall by 12%,” and went on to say:

Suicide prevention measures put in place as part of the partnership between Samaritans, Network Rail and the wider rail industry are saving more lives on the railways.

In fairness, the Samaritans qualified their headline with the following footnote.

We must be mindful that suicide data is best understood by looking at trends over longer periods of time, and year-on-year fluctuations may not be indicative of longer term trends. It is however very encouraging to see such a decrease which we would hope to see continuing in future years.

The Huffington Post, no, not sure I really think of them as part of the MSM, were less cautious in banking the 12% by stating, “It is the first time the number has dropped in three years.” True, but #executivetimeseries!

Signal or noise?

What shall we make of the decrease, a decrease to “back within” the NPLs? First, the mere fact that there are fewer suicides is good news. That is a “better” outcome. The question still remains as to whether we are making progress in reducing the frequency of suicides. Has there been a change to the underlying cause system that drives the suicide numbers? We might just be observing noise unrelated to an underlying signal or trend. Remember that extremely high measurements are usually followed by lower ones because of the principle of regression to the mean.1 Such a decrease is no evidence of an underlying improvement but merely a deceptive characteristic of common cause variation.
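Regression to the mean is easy to demonstrate by simulation. The Python sketch below draws independent observations from a stable process, one in which nothing ever changes, and asks how often an unusually high observation is followed by a lower one. The process mean of 250, the standard deviation of 20 and the function name are arbitrary assumptions for illustration and are not estimates from the railway data.

```python
import random


def next_lower_fraction(n=100_000, mean=250.0, sd=20.0, threshold_sds=2.0, seed=1):
    """Simulate a stable process and return the fraction of unusually high
    observations that are immediately followed by a lower one."""
    rng = random.Random(seed)
    series = [rng.gauss(mean, sd) for _ in range(n)]
    cutoff = mean + threshold_sds * sd
    # Pair each unusually high observation with the one that follows it.
    pairs = [(x, nxt) for x, nxt in zip(series, series[1:]) if x > cutoff]
    return sum(1 for x, nxt in pairs if nxt < x) / len(pairs)


print(next_lower_fraction())
```

The fraction printed comes out very close to 1 even though the simulated process never improves. A fall after a high point is exactly what a stable system of common cause variation delivers.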

One thing that I can do is to try to fit a trend line through the data and to ask which narrative best fits what I observe, a continuing increasing trend or a trend that has plateaued or even reversed. As you know, I am very critical of the uncritical casting of regression lines on data plots. However, this time I have a definite purpose in mind. Here is the data with a fitted linear regression line.

railwaysuicides8a

What I wanted to do was to split the data into two parts:

  • A trend (linear, simply for the sake of exploratory data analysis (EDA)); and
  • The residual variation about the trend.

The question I want to ask is whether the residual variation is stable, just plain noise, or whether there is a signal there that might give me a clue that a linear trend does not hold. The way that I do that is to plot the residuals on a Shewhart chart.
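The trend-plus-residuals split can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculation behind the charts here: it fits an ordinary least-squares line and then applies the conventional XmR limits (plus or minus 2.66 times the average moving range) to the residuals. The function names are hypothetical and no real figures are included.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residual_chart(xs, ys):
    """Return (residuals, (lower NPL, upper NPL), x values outside the NPLs)."""
    slope, intercept = fit_line(xs, ys)
    # Residual: what is left of each observation once the trend is removed.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    moving_ranges = [abs(b - a) for a, b in zip(residuals, residuals[1:])]
    half_width = 2.66 * mean(moving_ranges)  # conventional XmR scaling constant
    centre = mean(residuals)  # zero, up to rounding, for a fit with an intercept
    outside = [x for x, r in zip(xs, residuals) if abs(r - centre) > half_width]
    return residuals, (centre - half_width, centre + half_width), outside
```

If the third element returned is empty, and there are no long runs on one side of zero, then nothing in the residuals contradicts a linear trend plus stable noise.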

railwaysuicides7

That shows a stable pattern of residuals. If I try to interpret the chart as a linear trend plus exchangeable noise then nothing in the data contradicts that. The original chart invites an interpretation, because of the signals. I adopt the interpretation of an increasing trend. Nothing in the data contradicts that. I can put the pictures together to show this model.

railwaysuicides8

My opinion is that, when I plot the data that way, I have a compelling picture of a growing trend about which there is some stable common cause variation. Had there been an observation below the lower NPL on the last chart then that could have been evidence that the trend was slowing or even reversing. But not here.

I note that there’s also a report here from Anna Taylor and her colleagues at the University of Bristol. They too find an increasing trend with no signal of amelioration. They have used a different approach from mine and the fact that we have both got to the same broad result should reinforce confidence in our common conclusion.

Measurement Systems Analysis

Of course, we should not draw any conclusions from the data without thinking about the measurement system. In this case there is a legal issue. It concerns the standard of proof that the law requires coroners to apply before finding suicide as the cause of death. Findings of fact in inquests in England and Wales are generally made if they satisfy the civil standard of proof, the balance of probabilities. However, a finding of suicide can only be returned if such a conclusion satisfies the higher standard of beyond reasonable doubt, the typical criminal standard.2 There have long been suggestions that that leads to under reporting of suicides.3 The Matthew Elvidge Trust is currently campaigning for the general civil standard of balance of probabilities to be adopted.4

Next steps

Previously I noted proposals to repeat a strategy from Japan of bathing railway platforms with blue light. In the UK, I understand that such lights were installed at Gatwick in summer 2014 but I have not seen any data or heard anything more about it.

A huge amount of sincere endeavour has gone into this issue but further efforts have to be against the background that there is an escalating and unexplained problem.

References

  1. Kahneman, D (2011) Thinking, Fast and Slow, Allen Lane, pp175-184
  2. Jervis on Coroners 13th ed. 13-70
  3. Chambers, D R (1989) “The coroner, the inquest and the verdict of suicide”, Medicine, Science and the Law 29, 181
  4. “Trust responds to Coroner’s Consultation”, Matthew Elvidge Trust, retrieved 4/1/17

Plan B, gut feel and Shewhart charts

I honestly had the idea for this blog and started drafting it six months ago when I first saw this, now quite infamous, quote being shared around the internet.

The minute you have a back-up plan, you’ve admitted you’re not going to succeed.

Elizabeth Holmes

Good advice? I think not! Let’s review some science.

Confidence and trustworthiness

As far back as the 1970s, psychologists carried out a series of experiments on individual confidence.1 They took a sample of people and set each of them a series of general knowledge questions. The participants were to work independently of each other. The questions were things like What is the capital city of France? The respondents had, not only to do their best to answer the question, but also then to state the probability that they had answered correctly.

As a headline to their results the researchers found that, of all those answers in the aggregate about which people said they were 100% sure that they had answered correctly, more than 20% were answered incorrectly.

Now, we know that people who go around assigning 100% probabilities to things that happen only 80% of the time are setting themselves up for inevitable financial loss.2 Yet, this sort of over confidence in the quality and reliability of our individual, internal cognitive processes has been identified and repeated over multiple experiments and sundry real life situations.
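A little arithmetic shows why. The sketch below uses a hypothetical bet, not anything taken from the experiments or from De Finetti: stake 100 to win 1. To somebody who is 100% sure, the bet looks like free money. At a true hit rate of 80% it loses heavily on average. The function name and the amounts are illustrative assumptions.

```python
def expected_gain(true_frequency, stake, payout):
    """Average gain per bet: win `payout` when right, lose `stake` when wrong."""
    return true_frequency * payout - (1 - true_frequency) * stake


# How the bet looks to somebody who believes they are always right.
believed = expected_gain(1.0, stake=100, payout=1)
# How the same bet performs when they are right only 80% of the time.
actual = expected_gain(0.8, stake=100, payout=1)

print(f"believed: {believed:+.1f} per bet, actual: {actual:+.1f} per bet")
```

The believed gain is +1.0 per bet while the actual average is about −19.2 per bet. Anybody who can offer such bets repeatedly will take the over confident person's money.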

There is even a theory that the only people whose probabilities are reliably calibrated against frequencies are those suffering from clinically diagnosed depression. The theory of depressive realism remains, however, controversial.

Psychologists like Daniel Kahneman have emphasised that human reasoning is limited by a bounded rationality. All our cognitive processes are built on individual experience, knowledge, cultural assumptions, habits for interpreting data (good, bad and indifferent) … everything. All those things are aggregated imperfectly, incompletely and partially. Nobody can take the quality of their own judgments for granted.

Kahneman points out that, in particular, wherever individuals engage sophisticated techniques of analysis and rationalisation, and especially those tools that require long experience, education and training to acquire, there is over confidence in outcomes.3 Kahneman calls this the illusion of validity. The more thoroughly we construct an internally consistent narrative for ourselves, the more we are seduced by it. And it is instinctive for humans to seek such cogent models for experience and aspiration. Kahneman says:4

Confidence is a feeling, which reflects the coherence of the information and the cognitive ease of processing it. It is wise to take admissions of uncertainty seriously, but declarations of high confidence mainly tell you that an individual has constructed a coherent story in their mind, not necessarily that the story is true.

If illusion is the spectre of confidence then having a Plan B seems like a good idea. Of course, Holmes is correct that having a Plan B will tempt you to use it. When disappointments accumulate, in escalating costs, stagnating revenues or emerging political risks, it is very tempting to seek the repose of a lesser ambition or even a managed mitigation of residual losses.

But to proscribe a Plan B in order to motivate success is to display the risk appetite of a Kamikaze pilot. Sometimes reality tells you that your business plan is predicated on a false prospectus. Given the science of over confidence and the narrative of bounded rationality, we know that it will happen a lot of the time.

Holmes is also correct that disappointment is, in itself, no reason to change plan. What she neglects is that there is a phenomenon that does legitimately invite change: a surprise. It is a surprise that alerts us to an inconsistency between the real world and our design. A surprise ought to make us go back to our working business plan and examine the assumptions against the real world data. A switch to Plan B is not inevitable. There may be other means of mitigation: Act, Adapt or Abandon. The surprise could even be an opportunity to be grasped. The Plan B doesn’t have to be negative.

How then are we to tell a surprise from a disappointment? With a Shewhart chart of course. The chart has the benefits that:

  • Narrative building is shared not personal.
  • Narratives are challenged with data and context.
  • Surprise and disappointment are distinguished.
  • Predictive power is tested.

Analysis versus “gut feel”

I suppose that what lies behind Holmes’ quote is the theory that commitment and belief can, in themselves, overcome opposing forces, and that a commitment born of emotion and instinctive confidence is all the more potent. Here is an old LinkedIn post that caught my eye a while ago celebrating the virtues of “gut feel”.

The author believed that gut feel came from experience and individuals of long exposure to a complex world should be able to trump data with their intuition. Intuition forms part of what Kahneman called System 1 thinking which he contrasted with the System 2 thinking that we engage in when we perform careful and lengthy data analysis (we hope).5 System 1 thinking can be valuable. Philip Tetlock, a psychologist who researched the science of forecasting, noted this.6

Whether intuition generates delusion or insight depends on whether you work in a world full of valid cues you can unconsciously register for future use.

In fact, whether the world is full of the sorts of valid clues that support useful predictions is exactly the question that Shewhart charts are designed to answer. Whether we make decisions on data or on gut feel, either can mislead us with the illusion of validity.

Again, what the chart supports is the continual testing of the reliability and utility of intuitions. Gut feel is not forbidden but be sure that the successive predictions and revisions will be recorded and subjected to the scrutiny of the Shewhart chart. Impressive records of forecasting will form the armature of a continually developing shared narrative of organisational excellence. Unimpressive forecasters will have to yield ground.

References

  1. Lichtenstein, S et al. (1982) “Calibration of probabilities: The state of the art to 1980” in Kahneman, D et al. Judgment Under Uncertainty: Heuristics and Biases, Cambridge University Press
  2. De Finetti, B (1974) Theory of Probability: A Critical Introductory Treatment, Vol.1, trans. Machi, A & Smith, A; Wiley, p113
  3. Kahneman, D (2011) Thinking, Fast and Slow, Allen Lane, p217
  4. p212
  5. pp19-24
  6. Tetlock, P (2015) Superforecasting: The Art and Science of Prediction, Crown Publishing, Kindle loc 1031

Productivity and how to improve it: II – Profit = Customer value – Cost

I said I would be posting on this topic way back here. Perhaps that says something about my personal productivity but I have been being productive on other things. I have a day job.

I wanted to start off addressing customer value and waste. Here are a couple of revealing stories from the press.

Blue dollars and green dollars

This story appeared on the BBC website about a pizza restaurant transferring the task of slicing lemons from the waiters to the kitchen staff. As you know I am rarely impressed by standards of data journalism at the state owned BBC. This item makes one of the gravest errors of attempted business improvement. It had been the practice that waiters, as their first job in the morning, would chop lemons for the day’s anticipated drinks orders. A pizza chef commented that chopping was one of the chefs’ trade skills. Lemon chopping should be transferred to the chefs. That would, purportedly, save the waiters from having to “take a break from their usual tasks, wash their hands, clear a space and then clean up after themselves.” The item goes on:

“Just by changing who chops the lemons, we were able to make a significant saving in hours which translates into a significant financial saving,” says Richard Hodgson, Pizza Express’ chief executive.

This looks, to the uncritical eye, like a saving. But it is a saving in what we call blue dollars (or pounds or euros). It appears in blue ink on an executive summary or monthly report. Did Pizza Express actually save any cash or, as we call it, green dollars (or …)? Did the initiative put a ding in the profit and loss account?

Perhaps it did but perhaps not. It is, actually, very easy to eliminate, or perhaps hide or redeploy, tasks or purchases and claim a saving in blue dollars. Demonstrating that this then mapped into a saving in green dollars requires committed analytics and the trenchant criticism of historical data. The blue dollars will turn into green dollars if Pizza Express can achieve a time saving that allows:

  • A reduction in payroll; or
  • Redeployment of time into an activity that creates greater value for the customer.

That is assuming that the initiative did result in a time saving. What it certainly lost was a team building opportunity between waiters and chefs and a signal for waiters to wash their hands.

The jury is out as to whether Pizza Express improved productivity. Translation of blue dollars into green dollars is not easy. It is certainly not automatic. Turning blue dollars into green dollars is the really tricky bit in improvement. The bit that requires all the skill and know-how. It turns on the Nolan and Provost question: How will you know when a change is an improvement? More work is needed here to persuade anybody of anything. More work is certainly needed by the BBC in improving their journalism.

Politicians don’t get it

I asked above if the freed time could be translated into an activity that creates greater value for the customer. The value of a thing is what somebody is willing to pay for it. When we say that an activity creates value we mean that it increases the price at which we can sell output. The importance of price is that it captures a revealed preference rather than just a casual attitude for which the subject will never have to give an account. Any activity that does not create value for the customer is waste. The Japanese word muda has become fashionable. It is at the core of achieving operational excellence that unrelenting, gradual and progressive elimination of waste is a daily activity for everybody in the organisation. Waste, everything that does not create value for the customer. Everything that does not make the customer willing to pay more. If the customer will not pay more there is no value for them.

John Redwood was a middle ranking official in John Major’s government of the 1990s though he had frustrated ambitions for higher office. He offered us his personal thoughts on productivity here. I think he illustrates how poorly politicians understand what productivity is. Redwood thinks that we are over simplifying things when we say that productivity is:

ProductiviityEq1

or, a better definition:

ProductiviityEq2

Redwood thinks that, in the service sector, “labour intensity is often seen as better service rather than as worse productivity”. It may be true but only in so far as the customer sees it as such and is willing to pay proportionately for the staffing. Where the customer will not pay then productivity is reduced and insisting that labour intensity is an inherent virtue is a delusion. I think this is the basis of what Redwood is trying to say about purchasing coffee from a store. The test is that the customer is willing to pay for the experience.

However imperfect the statistics, they do seek to capture what the customers have been willing to pay. The spend at the coffee stand should show up on the aggregated statistics for “customer value created” and so the retail coffee phenomenon will not manifest itself as a decrease in productivity. Redwood has completely misunderstood.

Of course there are measurement issues and they are serious ones. There is nothing though that suggests that the concept or its definition are at fault.

What is worrying is that Redwood’s background is in banking though I certainly know bankers who are less out of touch with the real world. Redwood needs to get that the fundamental theorem of business is that:

profit = price – cost

— and that price is set by the market. There are only two things to do to improve.

  • Develop products that enhance customer value.
  • Eliminate costs that do not contribute to customer value.

UK figures

I could not find a long-term productivity time series on the UK Office for National Statistics website (“ONS”). I think that is shameful. You know that I am always suspicious of politicians’ unwillingness to encourage sharing long term statistical series. I managed to find what I was looking for here at www.tradingeconomics.com. Click on the “MAX” tab on the chart.

That chart gave me a suspicion. The ONS website does have the data from 2008. There is a link to this data after Figure 3 of the ONS publication Labour Productivity: Oct to Dec 2015. However, all the charts in that publication are fairly hideous and lacking in graphical excellence. Here is the 2008 to 2015 data replotted.

Productivity1

I am satisfied that, following the steep drop in UK productivity coinciding with the world financial crisis of 2007/08, there has been a (fairly) steady rise in productivity to the region of pre-crash levels. Confirming that with a Shewhart chart is left as an exercise for the reader. Of course, there is common cause variation around the upward trend. And, I suspect, some special causes too. However, I think that inferences of gloom following the Quarter 4 2015 figures, the last observation plotted, are premature. A bad case of #executivetimeseries.

I think that makes me less gloomy about UK productivity than the press and politicians. I have a suspicion that growth since 2008 has been slower than historically but I do not want to take that too far here.

Coming next: Productivity and how to improve it III – Signal and noise