Trust in forecasting

Stephen King (global economist at HSBC) made some profound comments about forecasting in The Times (London) (paywall) yesterday.

He points out that it is only a year since the International Monetary Fund (IMF) criticised UK economic strategy and forecast 0.7% GDP growth in 2013 and 1.5% in 2014. The latest estimate of growth for 2013 is 1.9%. The IMF now forecasts growth for 2014 at 2.4% and notes the strength of the UK economy. I should note that the UK Treasury’s forecasts were little different from the IMF’s.

Why, asks King, should we take any notice of the IMF forecast, or their opinions, now when they are so unapologetic about last year’s underestimate and their supporting comments?

The fact is that any forecast should come attached to an historic record of previous forecasts and actual outcomes, preferably on a deviation from aim chart. In fact, wherever somebody offers a forecast and there is no accompanying historic deviation from aim chart, I think it a reasonable inference that they have something to hide. The critical matter is that the chart must show a stable and predictable process of forecasting. If it does then we can start to make tentative efforts at estimating accuracy and precision. If not then there is simply no rational forecast. It would be generous to characterise such attempts at foresight as guesses.
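As a minimal sketch of the bookkeeping behind a deviation from aim chart, the Python below records outcome minus forecast for each period. The function name `deviations_from_aim` is my own invention for illustration. The only data used is the single forecast and outcome pair quoted above for 2013, which is of course far too little to show whether a forecasting process is stable.

```python
def deviations_from_aim(forecasts, outcomes):
    """Return outcome minus forecast for each period, in order.

    Hypothetical helper for illustration; not taken from any library.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("need one outcome for every forecast")
    return [outcome - forecast for forecast, outcome in zip(forecasts, outcomes)]


# The only forecast/outcome pair quoted in this post: UK GDP growth for 2013 (%).
imf_2013 = deviations_from_aim([0.7], [1.9])
print(round(imf_2013[0], 1))  # deviation in percentage points
```

A long run of such deviations, plotted in time order, is what would allow a reader to judge whether the forecaster's errors are stable and predictable.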

Despite the experience base, forecasting is all about understanding fundamentals. King goes on to have doubts about the depth of the UK’s recovery and, in particular, concerns about productivity. The ONS data is here. He observes that businesses are choosing to expand by hiring cheap labour and suggests macroeconomic remedies to foster productivity growth such as encouraging small and medium sized enterprises, and enhancing educational effectiveness.

It comes back to a paradox that I have discussed before. There is a well signposted path to improved productivity that seems to remain The Road Not Taken. Everyone says they do it but it is clear from King’s observations on productivity that, in the UK at least, they do not. That would be consistent with the chronically poor service endemic in several industries. Productivity and quality go hand in hand.

I wonder if there is a preference in the UK for hiring state subsidised cheap labour over the rigorous and sustained thinking required to make real productivity improvements. I have speculated elsewhere that producers may feel themselves trading in a market for lemons. The macroeconomic causes of low productivity growth are difficult for non-economists such as myself to divine.

However, every individual company has the opportunity to take its own path and “Put its sticker on a lemon”. Governments may look to societal remedies but as an indefatigable female politician once trenchantly put it:

The individual is the true reality in life. A cosmos in himself, he does not exist for the State, nor for that abstraction called “society,” or the “nation,” which is only a collection of individuals. Man, the individual, has always been and, necessarily is the sole source and motive power of evolution and progress.

Emma Goldman
The Individual, Society and the State, 1940

Do I have to be a scientist to assess food safety?

I saw this BBC item on the web before Christmas: Why are we more scared of raw egg than reheated rice? Just after Christmas seemed like a good time to blog about food safety. Actually, the link I followed asked Are some foods more dangerous than others? A question that has a really easy answer.

However, understanding the characteristic risks of various foods and how most safely to prepare them is less simple. Risk theorist John Adams draws a distinction between readily identified inherent and obvious risks, and risks that can only be perceived with the help of science. Food risks fall into the latter category. As far as I can see, “folk wisdom” is no reliable guide here, even partially. The BBC article refers to risks from rice, pasta and salad vegetables which are not obvious. At the same time, in the UK at least, the risk from raw eggs is very small.

Ironically, raw eggs are one food that springs readily to British people’s minds when food risk is raised, largely due to the folk memory of a high profile but ill thought out declaration by a government minister in the 1980s. This is an example of what Amos Tversky and Daniel Kahneman called an availability heuristic: If you can think of it, it must be important.

Food safety is an environment where an individual is best advised to follow the advice of scientists. We commonly receive this filtered, even if only for accessibility, through government agencies. That takes us back to the issue of trust in bureaucracy on which I have blogged before.

I wonder whether governments are in the best position to provide such advice. It is food suppliers who suffer from the public’s misallocated fears. The egg fiasco of the 1980s had a catastrophic effect on UK egg sales. All food suppliers have an interest in a market characterised by a perception that the products are safe. The food industry is also likely to be in the best position to know what is best practice, to improve such practice, to know how to communicate it to their customers, to tailor it to their products and to provide the effective behavioural “nudges” that promote safe handling. Consumers are likely to be cynical about governments, “one size fits all” advice and cycles of academic meta-analysis.

I think there are also lessons here for organisations. Some risks are assessed on the basis of scientific analysis. It is important that the prestige of that origin is communicated to all staff who will be involved in working with risk. The danger for any organisation is that an individual employee might make a reassessment based on local data and their own self-serving emotional response. As I have blogged before, some individuals have particular difficulty in aligning themselves with the wider organisation.

Of course, individuals must also be equipped with the means of detecting when the assumptions behind the science have been violated and initiating an agile escalation so that employee, customer and organisation can be protected while a reassessment is conducted. Social media provide new ways of sharing experience. I note from the BBC article that, in the UK at least, there is no real data on the origins of food poisoning outbreaks.

So the short answer to the question at the head of this blog still turns out to be “yes”. There are some things where we simply have to rely on science if we want to look after ourselves, our families and our employees.

But even scientists are limited by their own bounded rationality. Science is a work in progress. Using that science itself as a background against which to look for novel phenomena and neglected residual effects leverages that original risk analysis into a key tool in managing, improving and growing a business.

It was 20 years ago today …

Today, 20 December 2013, marks the twentieth anniversary of the death of W Edwards Deming. Deming was a hugely influential figure in management science, in Japan during the 1950s, 1960s and 1970s, then internationally from the early 1980s until his death. His memory persists in a continuing debate about his thinking among a small and aging sector of the operational excellence community, and in a broader reputation as a “management guru”, one of the writers who from the 1980s onwards championed and popularised the causes of employee engagement and business growth through customer satisfaction.

Deming’s training had been in mathematics and physics but in his professional life he first developed into a statistician, largely because of the influence of Walter Shewhart, an early mentor. It was fundamental to Deming’s beliefs that an organisation could only be managed effectively with widespread business measurement and trenchant statistical criticism of data. In that way he anticipated writers of a later generation such as Nate Silver and Nassim Taleb.

Since Deming’s death the operational excellence landscape has become more densely populated. In particular, lean operations and Six Sigma have variously been seen as competitors for Deming’s approach, as successors, usurpers, as complementary, as development, or as tools or tool sets to be deployed within Deming’s business strategy. In many ways, the pragmatic development of lean and Six Sigma has exposed the discursive, anecdotal and sometimes gnomic way Deming liked to communicate. In his book Out of the Crisis: Quality, Productivity and Competitive Position (1982) minor points are expanded over whole chapters while major ideas are finessed in a few words. Having elevated the importance of measurement and a proper system for responding to data he goes on to observe that the most important numbers are unknown and unknowable. I fear that this has often been an obstacle to managers finding the hard science in Deming.

For me, the core of Deming’s thinking remains this. There is only one game in town, the continual improvement of the alignment between the voice of the process and the voice of the customer. That improvement is achieved by the diligent use of process behaviour charts. Pursuit of that aim will collaterally reduce organisational costs.

Deming pursued the idea further. He asked what kind of organisation could most effectively exploit process behaviour charts. He sought philosophical justifications for successful heuristics. It is here that his writing became more difficult to accept for many people. In his last book, The New Economics for Industry, Government, Education, he trespassed on broader issues usually reserved to politics and social science, areas in which he was poorly qualified to contribute. The problem with Deming’s later work is that where it is new, it is not economics, and where it is economics, it is not new. It is this part of his writing that has tended to attract a few persistent followers. What is sad about Deming’s continued following is the lack of challenge. Every seminal thinker’s works are subject to repeated criticism, re-evaluation and development. Not simply development by accumulation but development by revision, deletion and synthesis. It is here that Deming’s memory is badly served. At the top of the page is a link to Deming’s Wikipedia entry. It is disturbing that everything is stated as though a settled and triumphant truth, a treatment that contrasts with the fact that his work is now largely ignored in mainstream management. Managers have found in lean and Six Sigma systems they could implement, even if only partially. In Deming they have not.

What Deming deserves, now that a generation, a global telecommunications system and a world wide web separate us from him, is a robust criticism and challenge of his work. The statistical thinking at the heart is profound. For me, the question of what sort of organisation is best placed to exploit that thinking remains open. Now is the time for the re-evaluation because I believe that out of it we can join in reaching new levels of operational excellence.

Trouble at the EU

I enjoy Metro, the UK national free morning newspaper. It has a very straightforward non-partisan style. This morning there was an article dealing with the European Union’s (EU’s) accounting difficulties. There were a couple of very telling admissions from an EU bureaucrat. We lawyers love an admission.

Aidas Palubinskas, from the European Court of Auditors, … described the error rate as ‘relatively stable from year to year’.

He admits that the EU’s accounting is a stable system of trouble. That is a system where there is only common cause variation, variation common to the whole of the output, but where the system is still incapable of reliably delivering what the customer wants. Recognising that one is embedded in such a problem is the first step towards operational improvement. W Edwards Deming addressed the implications of the stable system and the strategy for its improvement at length in his seminal book Out of the Crisis (1982). The problems are not intractable but the solution demands leadership and adoption of the correct improvement approach.

Unfortunately, the second half of the quote is less encouraging.

He said the errors highlighted in its report were ‘examples of inefficiency, but not necessarily of waste’.

This makes me fear that the correct approach is far off for the EU. Everything that is not efficient, timely and effective delivery of what the customer wants is waste or, as Toyota call it, muda. Waste represents the scope of opportunity for improvement, for improving service and simultaneously reducing its cost. The first step in improvement is taken by accepting that waste is not inevitable and that it can be incrementally eliminated through use of appropriate tools under competent leadership.

The next step to improvement is to commit to the discipline of eliminating waste progressively. That requires leadership. That sort of leadership is often found in successful organisations. The EU, however, faces particular difficulties as an international bureaucracy with a multi-partisan political master and a democratically disengaged public. It is not easy to see where leadership will come from. This is a common problem of state bureaucracies.

Palubinskas is right to seek to analyse the problems as a stable system of trouble. However, beyond that, the path to radical improvement lies in rejecting the casual acceptance of waste and in committing to continual improvement of every process for delivery of service.

Suicide statistics for British railways

I chose a prosaic title because it’s not a subject about which levity is appropriate. I remain haunted by this cyclist on the level crossing. As a result I thought I would delve a little into railway accident statistics. The data is here. Unfortunately, the data only goes back to 2001/2002. This is a common feature of government data. There is no long term continuity in measurement to allow proper understanding of variation, trends and changes. All this encourages the “executive time series” that are familiar in press releases. I think that I shall call this political amnesia. When I have more time I shall look for a longer time series. The relevant department is usually helpful if contacted directly.

However, while I was searching I found this recent report on Railway Suicides in the UK: risk factors and prevention strategies. The report is by Kamaldeep Bhui and Jason Chalangary of the Wolfson Institute of Preventive Medicine, and Edgar Jones of the Institute of Psychiatry, King’s College, London. Originally, I didn’t intend to narrow my investigation to suicides but there were some things in the paper that bothered me and I felt were worth blogging about.

Obviously this is really important work. No civilised society is indifferent to tragedies such as suicide whose consequences are absorbed deeply into the community. The report analyses a wide base of theories and interventions concerning railway suicide risk. There is a lot of information and the authors have done an important job in bringing together and seeking conclusions. However, I was bothered by this passage (at p5).

The Rail Safety and Standards Board (RSSB) reported a progressive rise in suicides and suspected suicides from 192 in 2001-02 to a peak 233 in 2009-10, the total falling to 208 in 2010-11.

Oh dear! An “executive time series”. Let’s look at the data on a process behaviour chart.

[Figure: process behaviour chart of annual railway suicides]

There is no signal, even ignoring the last observation in 2011/2012 which the authors had not had to hand. There has been no increasing propensity for suicide since 2001. The writers have been, as Nassim Taleb would put it, “fooled by randomness”. In the words of Nate Silver, they have confused signal and noise. The common cause variation in the data has been over interpreted by zealous and well meaning policy makers as an upward trend. However, all diligent risk managers know that interpretation of a chart is forbidden if there is no signal. Over interpretation will lead to (well meaning) over adjustment and admixture of even more variation into a stable system of trouble.
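For readers who want to see how such a chart's limits are usually worked out, here is a minimal Python sketch of an individuals (XmR) chart. It assumes the conventional natural process limits of the mean plus or minus 2.66 times the average moving range, and it checks only the simplest signal rule, a point outside the limits. The function names are my own, and the series in the example is invented for illustration. It is not the RSSB data.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower limit, centre line, upper limit) for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    centre = mean(values)
    # Moving ranges are the absolute differences between successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is the conventional scaling constant for moving ranges of size two.
    half_width = 2.66 * mean(moving_ranges)
    return centre - half_width, centre, centre + half_width


def points_outside_limits(values):
    """Return the indices of observations beyond the natural process limits."""
    lower, _, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Invented annual counts, for illustration only (NOT the RSSB figures).
hypothetical_counts = [200, 210, 195, 220, 205, 215, 190, 225, 230, 208]
print(xmr_limits(hypothetical_counts))
print(points_outside_limits(hypothetical_counts))
```

An empty list from `points_outside_limits` means that this one rule has found no signal. Fuller treatments add further rules for runs near the limits or on one side of the centre line.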

Looking at the development of the data over time I can understand that there will have been a temptation to perform a regression analysis and calculate a p-value for the perceived slope. This is an approach to avoid in general. It is beset with the dangers of testing effects suggested by the data and the general criticisms of p-values made by McCloskey and Ziliak. It is not a method that will be a reliable guide to future action. For what it’s worth I got a p-value of 0.015 for the slope but I am not impressed. I looked to see if I could find a pattern in the data then tested for the pattern my mind had created. It is unsurprising that it was “significant”.

The authors of the report go on to interpret the two figures for 2009/2010 (233 suicides) and 2010/2011 (208 suicides) as a “fall in suicides”. It is clear from the process behaviour chart that this is not a signal of a fall in suicides. It is simply noise, common cause variation from year to year.

Having misidentified this as a signal they go on to seek a cause. Of course they “find” a potential cause. A partnership between Network Rail and the Samaritans, Men on the Ropes, had started in January 2010. The programme’s aim was to reduce suicides by 20% over five years. I genuinely hope that the programme shows success. However, the programme will not be assisted by thinking that it has yet shown signs of improvement.

With the current mean annual total at 211, a 20% reduction entails a new mean of 169 annual suicides. That is an ambitious target I think, and I want to emphasise that the programme is entirely laudable and plausible. However, whether it succeeds is to be judged by the figures on the process behaviour chart, not by any post hoc rationalisation. This is the tough discipline of the charts. It is no longer possible to claim an improvement where that is not supported by the data.
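The arithmetic behind that target is simple enough to check in a couple of lines of Python. The function name `target_mean` is purely illustrative.

```python
def target_mean(current_mean, reduction):
    """Return the mean implied by a fractional reduction (0.20 means 20%)."""
    return current_mean * (1 - reduction)


# Figures from the text: a current mean of 211 and a 20% reduction.
print(round(target_mean(211, 0.20)))
```

The unrounded figure is 168.8, which rounds to the 169 quoted above.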

I will come back to this data next year and look to see if there are any signs of encouragement.