The latest UK rail safety statistics were published on 23 November 2017, again absent much of the press fanfare we had seen in the past. Regular readers of this blog will know that I have followed the suicide data series, and the press response, closely in 2016, 2015, 2014, 2013 and 2012. Again I have re-plotted the data myself on a Shewhart chart.
Readers should note the following about the chart.
- Many thanks to Tom Leveson Gower at the Office of Rail and Road who confirmed that the figures are for the year up to the end of March.
- Some of the numbers for earlier years have been updated by the statistical authority.
- I have recalculated natural process limits (NPLs) as there are still no more than 20 annual observations, and because the historical data has been updated. The NPLs have therefore changed but, this year, not by much.
- Again, the pattern of signals, with respect to the NPLs, is similar to last year.
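For readers who want to try the recalculation themselves, here is a minimal sketch of how natural process limits are conventionally obtained for an individuals (XmR) chart: the centre line is the mean of the observations and the limits sit 2.66 average moving ranges either side of it. The post does not state which formula was used for its chart, so treat this as an assumption. The function name `natural_process_limits` and the annual counts are made up for illustration and are not the ORR figures.

```python
from statistics import mean


def natural_process_limits(values):
    """Return (lower NPL, centre line, upper NPL) for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    centre = mean(values)
    # Moving ranges: absolute differences between successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is 3 / d2, where d2 = 1.128 for moving ranges of two observations.
    half_width = 2.66 * mean(moving_ranges)
    return centre - half_width, centre, centre + half_width


# Illustrative annual counts only, NOT the published rail data.
counts = [10, 12, 11, 13, 12, 14]
lower, centre, upper = natural_process_limits(counts)
print(f"lower={lower:.3f} centre={centre:.3f} upper={upper:.3f}")
```

Because the limits depend on every observation, revising earlier figures or adding a new year shifts them, which is why the limits on the chart move a little each year.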
The current chart again shows two signals: an observation above the upper NPL in 2015 and a run of 8 below the centre line from 2002 to 2009. As I always remark, the Terry Weight rule says that a signal gives us licence to interpret the ups and downs on the chart. So I shall have a go at doing that.
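The two kinds of signal mentioned above can be checked mechanically. The following is a sketch only, assuming the limits and centre line have already been calculated. The function name `find_signals` and the example numbers are hypothetical, and treating a point exactly on the centre line as breaking a run is one convention among several.

```python
def find_signals(values, lower, centre, upper, run_length=8):
    """Return (indices outside the limits, (start, end) runs on one side of centre)."""
    outside = [i for i, v in enumerate(values) if v > upper or v < lower]
    runs = []
    start, side = 0, 0
    # A sentinel equal to the centre line closes whatever run is still open.
    for i, v in enumerate(list(values) + [centre]):
        current = (v > centre) - (v < centre)  # +1 above, -1 below, 0 on the line
        if current != side:
            if side != 0 and i - start >= run_length:
                runs.append((start, i - 1))
            start, side = i, current
    return outside, runs


# Illustrative values only: eight low years, then one high point.
values = [5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 7, 12, 7]
outside, runs = find_signals(values, lower=2, centre=6, upper=10)
print(outside, runs)
```

The indices returned are positions in the series, so they need mapping back to years before comparing with the chart.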
It will not escape anybody’s attention that this is now the second year in which there has been a fall in the number of fatalities.
I haven’t yet seen any real contemporaneous comment on the numbers from the press. This item appeared on the BBC, a weak performer in the field of data journalism but clearly with privileged access to the numbers, on 30 June 2017, confidently attributing the fall to past initiatives.
Sky News clearly also had advance sight of the numbers and made the bold claim that:
… for every death, six more lives were saved through interventions.
That item goes on to highlight a campaign to encourage fellow train users to engage with anybody whose behaviour attracted attention.
But what conclusions can we really draw?
In 2015 I was coming to the conclusion that the data increasingly looked like a gradual upward trend. The 2016 data offered a challenge to that but my view was still that it was too soon to say that the trend had reversed. There was nothing in the data incompatible with a continuing trend. This year, 2017, has seen 2016’s fall repeated. A welcome development but does it really show conclusively that the upward trending pattern is broken? Regular readers of this blog will know that Langian statistics like “lowest for six years” carry no probative weight here.
Signal or noise?
Has there been a change to the underlying cause system that drives the suicide numbers? Last year, I fitted a trend line through the data and asked which narrative best fitted what I observed, a continuing increasing trend or a trend that had plateaued or even reversed. You can review my analysis from last year here.
Here is the data and fitted trend updated with this year’s numbers, along with NPLs around the fitted line, the same as I did last year.
Let’s think a little deeper about how to analyse the data. The first step of any statistical investigation ought to be the cause and effect diagram.
The difficulty with the suicide data is that there is very little reproducible and verifiable knowledge as to its causes. I have seen claims, of whose provenance I am uncertain, that railway suicide is virtually unknown in the USA. There is a lot of useful thinking from common human experience and from more general theories in psychology. But the uncertainty is great. It is not possible to come up with a definitive cause and effect diagram on which all will agree, other than from the point of view of identifying candidate factors.
The earlier evidence of a trend, however, suggests that there might be some causes that are developing over time. It is not difficult to imagine that economic trends and the cumulative awareness of other fatalities might have an impact. We are talking about a number of things that might appear on the cause and effect diagram and some that do not, the “unknown unknowns”. When I identified “time” as a factor, I was taking sundry “lurking” factors and suspected causes from the cause and effect diagram that might have a secular impact. I aggregated them under the proxy factor “time” for want of a more exact analysis.
What I have tried to do is to split the data into two parts:
- A trend (linear, simply for the sake of exploratory data analysis (EDA)); and
- The residual variation about the trend.
The question I want to ask is whether the residual variation is stable, just plain noise, or whether there is a signal there that might give me a clue that a linear trend does not hold.
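The split into trend and residual can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares line and limits set at 2.66 average moving ranges of the residuals. The post does not spell out how its limits around the fitted line were calculated, so both choices are assumptions, as are the function names and the made-up data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def detrended_limits(xs, ys):
    """Return (fitted values, residuals, half-width of limits about the line)."""
    slope, intercept = fit_line(xs, ys)
    fitted = [intercept + slope * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    # Moving ranges of the residuals measure variation about the trend.
    moving_ranges = [abs(b - a) for a, b in zip(residuals, residuals[1:])]
    return fitted, residuals, 2.66 * mean(moving_ranges)


# Illustrative data only, NOT the published figures.
years = [0, 1, 2, 3, 4, 5]
counts = [10, 13, 12, 15, 14, 17]
fitted, residuals, half_width = detrended_limits(years, counts)
# A residual beyond the half-width would be a signal that the line does not hold.
flagged = [x for x, r in zip(years, residuals) if abs(r) > half_width]
print(flagged)
```

An empty `flagged` list means the residuals look like noise about the line, which is a statement about the absence of a signal rather than proof that the trend is real.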
There is no signal in the detrended data, no signal that the trend has reversed. The tough truth of the data is that it supports either narrative.
- The upward trend is continuing and is stable. There has been no reversal of trend yet.
- The data is not stable. True, there is evidence of an upward trend in the past but there is now evidence that deaths are decreasing.
Of course, there is no particular reason, absent the data, to believe in an increasing trend and the initiative to mitigate the situation might well be expected to result in an improvement.
Sometimes, with data, we have to be honest and say that we do not have the conclusive answer. That is the case here. All that can be done is to continue the existing initiatives and look to the future. Nobody ever likes that as a conclusion but it is no good pretending things are unambiguous when that is not the case.
Next steps
Previously I noted proposals to repeat a strategy from Japan of bathing railway platforms with blue light. In the UK, I understand that such lights were installed at Gatwick in summer 2014. In fact my wife and I were on the platform at Gatwick just this week and I had the opportunity to observe them. I also noted, on my way back from court the other day, blue strip lights along the platform edge at East Croydon. I think they are recently installed. However, I have not seen any data or heard of any analysis.
A huge amount of sincere endeavour has gone into this issue but further efforts have to be made against the background that there is still no conclusive evidence of improvement.
Suggestions for alternative analyses are always welcomed here.
I suggest that if the data had not been updated, there is a case for simply using the linear trend limits from 2015 or 2016. This would help the view that you can meaningfully do such analyses using small amounts of data. No doubt this would not have changed the conclusion.
I think the analysis approach you use is good and I would now stick with the current charts for 2018, although I am still tempted to use the earlier chart. Continually updating the calculations might reduce the chance of detecting a change.
By the way, perhaps the conclusion could be simply this: there had been evidence of an, as yet unexplained, increasing trend; for the moment, we cannot conclude that recent actions have been successful but there is some encouragement for that view.
Terry, many thanks for the comment. I think what you suggest is worthwhile. I am intrigued as to next year’s data. I see that the BBC had this data months ago so I might see what I can do next time to get an early sight.
A couple of thoughts if I may.
Firstly, if there is an upward trend, it may be correlated with an increase in population, so the absolute numbers are driven by an increase in potential suicides.
The trend line on your second graph suggests an increase in the fitted line of roughly 25% since 2002. The ONS data indicate an increase in UK population from 59.4 million in 2002 to 65.6 million in 2016, roughly 10%, which may explain some of the apparent increase in absolute numbers.
My second line of thought was to look at the overall suicide rate, and here the ONS data are essentially flat from 2002 to 2015. However the trends for males and females differ and if data are available, it would be interesting to see if there is a gender factor in rail suicides.
Peter
Peter, thank you for the comment. I agree that plotting the data standardised on population would give an alternative view. It is a good point. My reasons for sticking with absolute numbers for the time being are the following.
1. Absolute deaths are what concerns the railway and the emergency services. They get no comfort from a reducing relative rate.
2. You are right, I think, that the best way of looking at risk would be to standardise on the ‘appropriate’ demographic as to sex, ethnic origin etc. The risk will be a function of demographic factors. I have some guesses as to the most vulnerable demographics but they are no more than that.
3. I looked at rail suicides against suicides in general in my 2015 post and the pattern was different so I was not immediately drawn to a relative risk plot.
I think with next year’s data I may expand the analysis to try to look at these effects. As I always say, “Alternative analyses are available”.
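As a rough check on the population point raised in Peter's comment, the arithmetic can be sketched like this. It uses only the figures quoted in that comment (a population rising from 59.4 million to 65.6 million and a fitted line up by roughly 25%). The function names are hypothetical, and this is back-of-envelope arithmetic rather than a proper demographic standardisation.

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


def per_capita_change(count_change_pct, population_change_pct):
    """Percentage change in a per-head rate, given changes in count and population."""
    ratio = (1 + count_change_pct / 100.0) / (1 + population_change_pct / 100.0)
    return 100.0 * (ratio - 1)


# Figures as quoted in the comment thread, in millions.
population_growth = percent_change(59.4, 65.6)
# "Roughly 25%" is the commenter's reading of the fitted line, not a measured value.
rate_growth = per_capita_change(25.0, population_growth)
print(f"population: {population_growth:.1f}%  per-capita rate: {rate_growth:.1f}%")
```

On those figures, population growth would account for some but not all of the rise in the fitted line, which is consistent with the suggestion that it "may explain some of the apparent increase".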