UK Election of June 2017 – Polling review

Pollin2017Overview

Here are all the published opinion polls for the June 2017 UK general election, plotted as a Shewhart chart.

The Conservative lead over Labour had been pretty constant at 16% from February 2017, after May’s Lancaster House speech. The initial Natural Process Limits (“NPLs”) on the chart extend back to that date. Then something odd happened in the polls around Easter. There were several polls above the upper NPL. That does not seem to fit with any surrounding event. Article 50 had been declared two weeks before and had had no real immediate impact.

I suspect that the “fugue state” around Easter was reflected in the respective parties’ private polling. It is possible that public reaction to the election announcement somehow locked in the phenomenon for a short while.

Things then seem to settle down to the 16% lead level again. However, the local election results at the bottom of the range of polls ought to have sounded some alarm bells. Local election results are not a reliable predictor of general elections but this data should not have felt very comforting.

Then the slide in lead begins. But when exactly? A lot of commentators have assumed that it was the badly received Conservative Party manifesto that started the decline. It is not possible to be definitive from the chart but it is certainly arguable that it was the leak of the Labour Party manifesto that started to shift voting intention.

Then the swing from Conservative to Labour continued unabated to polling day.

Polling performance

How did the individual pollsters fare? I have, somewhat arbitrarily, summarised all polls conducted in the 10 days before the election (29 May to 7 June). Here is the plot along with the actual popular vote result, which gave a 2.5% margin of Conservative over Labour. That is the number that everybody was trying to predict.

PollsterPerformance

The red points are the surveys from the 5 days before the election (3 to 7 June). Visually, they seem to be no closer, in general, than the other points (6 to 10 days before). The vertical lines are just an aid for the eye in grouping the points. The absence of “closing in” is confirmed by looking at the mean squared error (MSE) (in %²) for the points over 10 days (31.1) and 5 days (34.8). There is no evidence of polls closing in on the final result. The overall Shewhart chart certainly doesn’t suggest that.

Taking the polls over the 10 day period, then, here is the performance of the pollsters in terms of MSE. Lower MSE is better.

Pollster         MSE (%²)
Norstat            2.25
Survation          2.31
Kantar Public      6.25
Survey Monkey      8.25
YouGov             9.03
Opinium           16.50
Qriously          20.25
Ipsos MORI        20.50
Panelbase         30.25
ORB               42.25
ComRes            74.25
ICM               78.36
BMG              110.25

Norstat and Survation pollsters will have been enjoying bonuses on the morning after the election. There are a few other commendable performances.
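
For readers who want to reproduce this kind of figure, here is a minimal sketch of the MSE calculation. The 2.5 point actual lead comes from the text above. The poll leads are hypothetical numbers made up for illustration, not any pollster's real results, and the function name is my own.

```python
def mean_squared_error(predicted_leads, actual_lead):
    """Mean of the squared differences between each poll's lead and the result."""
    squared_errors = [(lead - actual_lead) ** 2 for lead in predicted_leads]
    return sum(squared_errors) / len(squared_errors)


# Conservative lead over Labour in percentage points, as stated in the text.
ACTUAL_LEAD = 2.5

# Hypothetical leads from one imaginary pollster's final polls (not real data).
example_leads = [1.0, 4.0, 6.0]

mse = mean_squared_error(example_leads, ACTUAL_LEAD)
print(f"MSE = {mse:.2f} %²")
```

Because the errors are squared, a single poll that is far out dominates the score, which is why the MSE is expressed in squared percentage points.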

YouGov model

I should also mention the YouGov model (the green line on the Shewhart chart) that has an MSE of 2.25. YouGov conduct web-based surveys against a huge database of around 50,000 registered participants. They also collect, with permission, deep demographic data on those individuals concerning income, profession, education and other factors. There is enough published demographic data from the national census to judge whether that is a representative frame from which to sample.

YouGov did not poll and publish the raw, or even adjusted, voting intention. They used their poll to construct a model, perhaps a logistic regression or an artificial neural network (they don’t say), to predict voting intention from demographic factors. They then input into that model not their own demographic data but data from the national census. That then gave their published forecast. I have to say that this looks about the best possible method for eliminating sampling frame effects.
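
The final step, feeding census data through the fitted model, can be sketched as a weighted average. This is only an illustration of the general idea and not YouGov's actual method, which they have not published in detail. The demographic cells, the predicted shares and the headcounts below are all hypothetical.

```python
# Hypothetical demographic cells. Each entry holds:
#   (share predicted by some fitted model for that cell, census headcount for that cell)
# None of these numbers are real.
cells = {
    "18-34, degree": (0.25, 4_000_000),
    "18-34, no degree": (0.35, 6_000_000),
    "35-64, degree": (0.40, 8_000_000),
    "35-64, no degree": (0.45, 12_000_000),
    "65+": (0.60, 10_000_000),
}


def poststratify(cells):
    """Weight each cell's predicted share by its census headcount."""
    total_population = sum(count for _, count in cells.values())
    weighted_sum = sum(share * count for share, count in cells.values())
    return weighted_sum / total_population


estimate = poststratify(cells)
print(f"Population-level estimate: {estimate:.2%}")
```

The point of the design is that the weights come from the census rather than from whoever happened to answer the survey, so an unrepresentative panel matters less provided the model predicts well within each cell.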

It remains to be seen how widely this approach is adopted next time.


On leadership and the Chinese contract

Between 1958 and 1960, 67 of the 120 inhabitants of the Chinese village of Xiaogang starved to death. But Mao Zedong’s cruel and incompetent collectivist policies continued to be imposed into the 1970s. In December 1978, 18 of Xiaogang’s leading villagers met secretly and illegally to find a way out of borderline starvation and grinding poverty. The first person to speak up at the meeting was Yan Jingchang. He suggested that the village’s principal families clandestinely divide the collective farm’s land among themselves. Then each family should own what it grew. Jingchang drew up an agreement on a piece of paper for the others to endorse. Then he hid it in a bamboo tube in the rafters of his house. Had it been discovered, Jingchang and the village would have suffered brutal punishment and reprisal as “counter-revolutionaries”.

The village prospered under Jingchang’s structure. During 1979 the village produced more than it had in the previous five years. That attracted the attention of the local Communist Party chief who summoned Jingchang for interrogation. Jingchang must have given a good account of what had been happening. The regional party chief became intrigued at what was going on and prepared a report on how the system could be extended across the whole region.

Mao had died in 1976 and, amid the emerging competitors for power, it was still uncertain as to how China would develop economically and politically. By 1979, Deng Xiaoping was working his way towards the effective leadership of China. The report into the region’s proposals for agricultural reform fell on his desk. His contribution to the reforms was that he did nothing to stop them.

I have often found the idea of leadership a rather dubious one and wondered whether it actually described anything. It was, I think, Goethe who remarked that “When an idea is wanting, a word can always be found to take its place.” I have always been tempted to suspect that that was the case with “leadership”. However, the Jingchang story did make me think.1 If there is such a thing as leadership then this story exemplifies it and it is worth looking at what was involved.

Personal risk

This leader took personal risks. Perhaps to do otherwise is to be a mere manager. A leader has, to use the graphic modern idiom, “skin in the game”. The risk could be financial or reputational, or to liberty and life.

Luck

Luck is the converse of risk. Real risks carry the danger of failure and the consequences thereof. Jingchang must have been aware of that. Napoleon is said to have complained, “I have plenty of clever generals but just give me a lucky one.”2 Had things turned out differently with the development of Chinese history, the personalities of the party officials or Deng’s reaction, we would probably never have heard of Jingchang. I suspect though that the history of China since the 1970s would not have been very different.

The more I practice, the luckier I get.

Gary Player
South African golfer

Catalysing alignment

It was Jingchang who drew up the contract, who crystallised the various ideas, doubts, ambitions and thoughts into a written agreement. In law we say that a valid contract requires a consensus ad idem, a meeting of minds. Jingchang listened to the emerging appetite of the other villagers and captured it in a form in which all could invest. I think that is a critical part of leadership. A leader catalyses alignment and models constancy of purpose.

However, this sort of leadership may not be essential in every system. Management scientists are enduringly fascinated by The Morning Star Company, a California tomato grower that functions without any conventional management. The particular needs and capabilities of the individuals interact to create an emergent order that evolves and responds to external drivers. Austrian economist Friedrich Hayek coined the term catallaxy for a self-organising system of voluntary co-operation and explained how such a thing could arise and sustain itself, and what its benefits to society are.3

But sometimes the system needs the spark of a leader like Jingchang who puts himself at risk and creates a vivid vision of the future state against which followers can align.

Deng kept out of the way. Jingchang put himself on the line. The most important characteristic of leadership is the sagacity to know when the system can manage itself and when to intervene.

References

  1. I have this story from Matt Ridley (2015) The Evolution of Everything, Fourth Estate
  2. Apocryphal I think.
  3. Hayek, F A (1982) Law, Legislation, and Liberty, vol.2, Routledge, pp108–9

Why did the polls get it wrong?

This week has seen much soul-searching by the UK polling industry over their performance leading up to the 2015 UK general election on 7 May. The polls had seemed to predict that Conservative and Labour Parties were neck and neck on the popular vote. In the actual election, the Conservatives polled 37.8% to Labour’s 31.2% leading to a working majority in the House of Commons, once the votes were divided among the seats contested. I can assure my readers that it was a shock result. Over breakfast on 7 May I told my wife that the probability of a Conservative majority in the House was nil. I hold my hands up.

An enquiry was set up by the industry led by the National Centre for Research Methods (NCRM). They presented their preliminary findings on 19 January 2016. The principal conclusion was that the failure to predict the voting share was because of biases in the way that the data were sampled and inadequate methods for correcting for those biases. I’m not so sure.

Population -> Frame -> Sample

The first thing students learn when studying statistics is the critical importance, and practical means, of specifying a sampling frame. If the sampling frame is not representative of the population of concern then simply collecting more and more data will not yield a prediction of greater accuracy. The errors associated with the specification of the frame are inherent to the sampling method. Creating a representative frame is very hard in opinion polling because of the difficulty in contacting particular individuals efficiently. It turns out that Conservative voters are harder than Labour voters to get hold of, so that they can be questioned. The NCRM study concluded that, within the commercial constraints of an opinion poll, there was a lower probability that a Conservative voter would be contacted. They therefore tended to be under-represented in the data causing a substantial bias towards Labour.
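
A small calculation illustrates why more data does not cure a frame problem. It is a sketch under simplifying assumptions: a two-party population, a fixed probability of reaching each kind of voter, and numbers that are entirely hypothetical rather than taken from the NCRM study.

```python
import math


def expected_sample_share(true_share_a, contact_prob_a, contact_prob_b):
    """Expected share of group A among respondents in a two-group population."""
    reached_a = true_share_a * contact_prob_a
    reached_b = (1 - true_share_a) * contact_prob_b
    return reached_a / (reached_a + reached_b)


# Hypothetical inputs: group A is 55% of voters but is a little harder to reach.
TRUE_SHARE = 0.55
share_in_sample = expected_sample_share(TRUE_SHARE, contact_prob_a=0.08, contact_prob_b=0.10)
bias = share_in_sample - TRUE_SHARE

for sample_size in (1_000, 100_000):
    # The sampling error shrinks with sample size; the bias does not.
    standard_error = math.sqrt(share_in_sample * (1 - share_in_sample) / sample_size)
    print(f"n={sample_size}: expected share {share_in_sample:.3f}, "
          f"bias {bias:+.3f}, standard error {standard_error:.4f}")
```

The standard error falls as the sample grows but the bias term is untouched, which is the sense in which frame errors are inherent to the sampling method.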

This is a well known problem in polling practice and there are demographic factors that can be used to make a statistical adjustment. Samples can be stratified. NCRM concluded that, in the run up to the 2015 election, there were important biases tending to understate the Conservative vote and the existing correction factors were inadequate. Fresh sampling strategies were needed to eradicate the bias and improve prediction. There are understandable fears that this will make polling more costly. More calls will be needed to catch Conservatives at home.

Of course, that all sounds an eminently believable narrative. These sorts of sampling frame biases are familiar but enormously troublesome for pollsters. However, I wanted to look at the data myself.

Plot data in time order

That is the starting point of all statistical analysis. Polls continued after the election, though with lesser frequency. I wanted to look at that data after the election in addition to the pre-election data. Here is a plot of poll results against time for Conservative and Labour. I have used data from 25 January to the end of 2015.1, 2 I have not managed to jitter the points so there is some overprinting of Conservative by Labour pre-election.

Polling201501

Now that is an arresting plot. Yet again plotting against time elucidates the cause system. Something happened on the date of the election. Before the election the polls had the two parties neck and neck. The instant (sic) the election was done there was clear red/ blue water between the parties. Applying my (very moderate) level of domain knowledge to the data before, the poll results look stable and predictable. There is a shift after the election to a new datum that remains stable and predictable. The respective arithmetic means are given below.

Party          Mean poll before   Election   Mean poll after
Conservative   33.3%              37.8%      38.8%
Labour         33.5%              31.2%      30.9%

The mean of the post-election polls is doing fairly well but is markedly different from the pre-election results. Now, it is trite statistics that the variation we observe on a chart is the aggregate of variation from two sources.

  • Variation from the thing of interest; and
  • Variation from the measurement process.

As far as I can gather, the sampling methods used by the polling companies have not so far been modified. They were awaiting the NCRM report. They certainly weren’t modified in the few days following the election. The abrupt change on 7 May cannot be because of corrected sampling methods. The misleading pre-election data and the “impressive” post-election polls were derived from common sampling practices. It seems to me difficult to reconcile NCRM’s narrative to the historical data. The shift in the data certainly needs explanation within that account.

What did change on the election date was that a distant intention turned into the recall of a past action. What everyone wants to know in advance is the result of the election. Unsurprisingly, and as we generally find, it is not possible to sample the future. Pollsters, and their clients, have to be content with individuals’ perceptions of how they will vote. The vast majority of people pay very little attention to politics at all and the general level of interest outside election time is de minimis. Standing in a polling booth with a ballot paper is a very different matter from being asked about intentions some days, weeks or months hence. Most people take voting very seriously. It is not obvious that the same diligence is directed towards answering pollsters’ questions.

Perhaps the problems aren’t statistical at all and are more concerned with what psychologists call affective forecasting, predicting how we will feel and behave under future circumstances. Individuals are notoriously susceptible to all sorts of biases and inconsistencies in such forecasts. It must at least be a plausible source of error that intentions are only imperfectly formed in advance and mapping into votes is not straightforward. Is it possible that after the election respondents, once again disengaged from politics, simply recalled how they had voted in May? That would explain the good alignment with actual election results.

Imperfect foresight of voting intention before the election and 20/20 hindsight after is, I think, a narrative that sits well with the data. There is no reason whatever why internal reflections in the Cartesian theatre of future voting should be an unbiased predictor of actual votes. In fact, I think it would be a surprise, and one demanding explanation, if they were so.

The NCRM report does make some limited reference to post-election re-interviews of contacts. However, this is presented in the context of a possible “late swing” rather than affective forecasting. There are no conclusions I can use.

Meta-analysis

The UK polls took a horrible beating when they signally failed to predict the result of the 1992 election in under-estimating the Conservative lead by around 8%.3 Things then felt better. The 1997 election was happier, where Labour led by 13% at the election with final polls in the range of 10 to 18%.4 In 2001 each poll managed to get the Conservative vote within 3% but all over-estimated the Labour vote, some pollsters by as much as 5%.5 In 2005, the final poll had Labour on 38% and Conservative, 33%. The popular vote was Labour 36.2% and Conservative 33.2%.6 In 2010 the final poll had Labour on 29% and Conservative, 36%, with a popular vote of 29.7%/36.9%.7 The debacle of 1992 was all but forgotten when 2015 returned to pundits’ dismay.

Given the history and given the inherent difficulties of sampling and affective forecasting, I’m not sure why we are so surprised when the polls get it wrong. Unfortunately for the election strategist they are all we have. That is a common theme with real world data. Because of its imperfections it has to be interpreted within the context of other sources of evidence rather than followed slavishly. The objective is not to be driven by data but to be led by the insights it yields.

References

  1. Opinion polling for the 2015 United Kingdom general election. (2016, January 19). In Wikipedia, The Free Encyclopedia. Retrieved 22:57, January 20, 2016, from https://en.wikipedia.org/w/index.php?title=Opinion_polling_for_the_2015_United_Kingdom_general_election&oldid=700601063
  2. Opinion polling for the next United Kingdom general election. (2016, January 18). In Wikipedia, The Free Encyclopedia. Retrieved 22:55, January 20, 2016, from https://en.wikipedia.org/w/index.php?title=Opinion_polling_for_the_next_United_Kingdom_general_election&oldid=700453899
  3. Butler, D & Kavanagh, D (1992) The British General Election of 1992, Macmillan, Chapter 7
  4. — (1997) The British General Election of 1997, Macmillan, Chapter 7
  5. — (2002) The British General Election of 2001, Palgrave-Macmillan, Chapter 7
  6. Kavanagh, D & Butler, D (2005) The British General Election of 2005, Palgrave-Macmillan, Chapter 7
  7. Cowley, P & Kavanagh, D (2010) The British General Election of 2010, Palgrave-Macmillan, Chapter 7

Royal babies and the wisdom of crowds

In 2004 James Surowiecki published a book with the unequivocal title The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. It was intended as a gloss on Charles Mackay’s 1841 book Extraordinary Popular Delusions and the Madness of Crowds. Both books are essential reading for any risk professional.

I am something of a believer in the wisdom of crowds. The other week I was fretting about the possible relegation of English Premier League soccer club West Bromwich Albion. It’s an emotional and atavistic tie for me. I always feel there is merit, as part of my overall assessment of risk, in checking online bookmakers’ odds. They surely represent the aggregated risk assessment of gamblers if nobody else. I was relieved that bookmakers were offering typically 100/1 against West Brom being relegated. My own assessment of risk is, of course, contaminated with personal anxiety so I was pleased that the crowd was more phlegmatic.

However, while I was on the online bookmaker’s website, I couldn’t help but notice that they were also accepting bets on the imminent birth of the royal baby, the next child of the Duke and Duchess of Cambridge. It struck me as weird that anyone would bet on the sex of the royal baby. Surely this was a mere coin toss, though I know that people will bet on that. Being hopelessly inquisitive I had a look. I was somewhat astonished to find these odds being offered (this was 22 April 2015, ten days before the royal birth).

        Odds   Implied probability
Girl    1/2    0.67
Boy     6/4    0.40
Total          1.07

Here I have used the usual formula for converting between odds and implied probabilities: odds of m / n against an event imply a probability of n / (m + n) of the event occurring. Of course, the principle of finite additivity requires that probabilities add up to one. Here they don’t and there is an overround of 7%. Like the rest of us, bookmakers have to make a living and I was unsurprised to find a Dutch book.
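
The conversion and the overround can be checked with a few lines of code. This is a minimal sketch using the odds quoted above. The function name and the text format for the odds are my own choices for illustration.

```python
from fractions import Fraction


def implied_probability(odds_against):
    """Convert fractional odds 'm/n' against an event into n / (m + n)."""
    m, n = (int(part) for part in odds_against.split("/"))
    return Fraction(n, m + n)


# Odds quoted in the table above.
book = {"Girl": "1/2", "Boy": "6/4"}

probabilities = {outcome: implied_probability(odds) for outcome, odds in book.items()}
total = sum(probabilities.values())
overround = total - 1

for outcome, probability in probabilities.items():
    print(f"{outcome}: {float(probability):.2f}")
print(f"Total: {float(total):.2f}, overround: {float(overround):.1%}")
```

Exact fractions are used so that the total is not disturbed by rounding. The overround is the amount by which the implied probabilities exceed one.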

The odds certainly suggested that the crowd thought a girl manifestly more probable than a boy. Bookmakers shorten the odds on the outcome that is attracting the money to avoid a heavy payout on an event that the crowd seems to know something about.

Historical data on sex ratio

I started, at this stage, to doubt my assumption that boy/ girl represented no more than a coin toss, 50:50, an evens bet. As with most things, sex ratio turns out to be an interesting subject. I found this interesting research paper which showed that sex ratio was definitely dependent on factors such as the age and ethnicity of the mother. The narrative of this chart was very interesting.

Sex ratio

However, the paper confirmed that the sex of a baby is independent of previous births, conditioned on the factors identified, and that the ratio of girls to boys is nowhere and no time greater than 1,100 to 1,000, about 52% girls.

So why the odds?

Bookmakers lengthen the odds on the outcome attracting the smaller value of bets in order to encourage stakes on the less fancied outcomes, on which there is presumably less risk of having to pay out. At odds of 6/4, a punter betting £10 on a boy would receive his stake back plus £15 ( = 6 × £10 / 4 ). If we assume an equal chance of boy or girl then that is an expected return of £12.50 ( = 0.5 × £25 ) for a £10.00 stake. I’m not sure I’d seen such a good value wager since we all used to bet against Tim Henman winning Wimbledon.
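
The arithmetic of that wager can be sketched as follows, using the stake, the odds and the assumed 50:50 chance from the paragraph above. The function name and the break-even calculation are my own additions for illustration.

```python
def expected_return(stake, odds_m, odds_n, win_probability):
    """Expected amount returned on a bet at fractional odds of m/n against."""
    winnings = stake * odds_m / odds_n
    payout_if_win = stake + winnings  # the stake is returned along with the winnings
    return win_probability * payout_if_win


STAKE = 10.0
value = expected_return(STAKE, odds_m=6, odds_n=4, win_probability=0.5)

# Probability of winning at which the expected return just equals the stake.
break_even_probability = STAKE / (STAKE + STAKE * 6 / 4)

print(f"Expected return on a £{STAKE:.2f} stake: £{value:.2f}")
print(f"Break-even probability: {break_even_probability:.2f}")
```

The break-even probability is simply the implied probability of the odds, so any bettor who believes the true chance of a boy is higher than that sees the wager as good value.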

Ex ante there are two superficially suggestive explanations as to the asymmetry in the odds. At least this is all my bounded rationality could imagine.

  • A lot of people (mistakenly) thought that the run of five male royal births (Princes Andrew, Edward, William, Harry and George) escalated the probability of a girl being next. “It was overdue.”
  • A lot of people believed that somebody “knew something” and that they knew what it was.

In his book about cognitive biases in decision making (Thinking, Fast and Slow, Allen Lane, 2011) Nobel laureate economist Daniel Kahneman describes widespread misconceptions concerning randomness of boy/ girl birth outcomes (at p115). People tend to see regularity in sequences of data as evidence of non-randomness, even where patterns are typical of, and unsurprising in, random events.

I had thought that there could not be sufficient gamblers who would be fooled by the baseless belief that a long run of boys made the next birth more likely to be a girl. But then Danny Finkelstein reminded me (The (London) Times, Saturday 25 April 2015) of a survey of UK politicians that revealed their limited ability to deal with chance and probabilities. Are politicians more or less competent with probabilities than online gamblers? That is a question for another day. I could add that the survey compared politicians of various parties but we have an on-going election campaign in the UK at the moment so I would, in the interest of balance, invite my voting-age UK readers not to draw any inferences therefrom.

The alternative is the possibility that somebody thought that somebody knew something. The parents avowed that they didn’t know. Medical staff may or may not have. The sort of people who work in VIP medicine in the UK are not the sort of people who divulge information. But one can imagine that a random shift in sentiment, perhaps because of the misconception that a girl was “overdue”, and a consequent drift in the odds, could lead others to infer that there was insight out there. It is not completely impossible. How many other situations in life and business does that model?

It’s a girl!

The wisdom of crowds or pure luck? We shall never know. I think it was Thomas Mann who observed that the best proof of the genuineness of a prophecy was that it turned out to be false. Had the royal baby been a boy we could have been sure that the crowd was mad.

To be complete, Bayes’ theorem tells us that the outcome should enhance our degree of belief in the crowd’s wisdom. But it is a modest increase (a Bayes factor of 2, about 3 decibans after Alan Turing’s suggestion) and as we were most sceptical before we remain unpersuaded.
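
One way to arrive at those figures is sketched below. It rests on an assumption of mine that the text does not spell out: that a crowd which really knew would have called a girl with certainty, while an ignorant crowd would be right half the time. The prior odds are likewise hypothetical, chosen only to stand for "most sceptical".

```python
import math


def decibans(bayes_factor):
    """Express a Bayes factor on the deciban scale, 10 * log10(factor)."""
    return 10 * math.log10(bayes_factor)


# Assumed likelihoods of the observed outcome (a girl) under each hypothesis.
p_girl_if_crowd_knew = 1.0
p_girl_if_coin_toss = 0.5
bayes_factor = p_girl_if_crowd_knew / p_girl_if_coin_toss

# Hypothetical sceptical prior: odds of 1 to 99 that the crowd knew something.
prior_odds = 1 / 99
posterior_odds = prior_odds * bayes_factor
posterior_probability = posterior_odds / (1 + posterior_odds)

print(f"Bayes factor {bayes_factor:.0f} = {decibans(bayes_factor):.2f} decibans")
print(f"Posterior probability the crowd knew: {posterior_probability:.3f}")
```

Under these assumptions a sceptic who started at one chance in a hundred ends at about two in a hundred, which is the sense in which the evidence moves us only a little.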

In his book, Surowiecki identified five factors that can impair crowd intelligence. One of these is homogeneity. Insufficient diversity frustrates the inherent virtue on which the principle is founded. I wonder how much variety there is among online punters? Similarly, where judgments are made sequentially there is a danger of influence. That was surely a factor at work here. There must also have been an element of emotion, the factor that led to all those unrealistically short odds on Henman at Wimbledon on which the wise dined so well.

But I’m trusting that none of that applies to the West Brom odds.

Trust in forecasting

Stephen King (global economist at HSBC) made some profound comments about forecasting in The Times (London) (paywall) yesterday.

He points out that it is only a year since the International Monetary Fund (IMF) criticised UK economic strategy and forecast 0.7% GDP growth in 2013 and 1.5% in 2014. The latest estimate for 2013 growth is 1.9%. The IMF now forecasts growth for 2014 at 2.4% and notes the strength of the UK economy. I should note that the UK Treasury’s forecasts were little different from the IMF’s.

Why, asks King, should we take any notice of the IMF forecast, or their opinions, now when they are so unapologetic about last year’s underestimate and their supporting comments?

The fact is that any forecast should come attached to an historic record of previous forecasts and actual outcomes, preferably on a deviation from aim chart. In fact, wherever somebody offers a forecast and there is no accompanying historic deviation from aim chart, I think it a reasonable inference that they have something to hide. The critical matter is that the chart must show a stable and predictable process of forecasting. If it does then we can start to make tentative efforts at estimating accuracy and precision. If not then there is simply no rational forecast. It would be generous to characterise such attempts at foresight as guesses.
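
A deviation from aim chart can be sketched in a few lines. The forecasts and outcomes below are hypothetical numbers invented for illustration, not IMF figures. The limits use the usual individuals (XmR) chart scaling of 2.66 times the mean moving range, which is my choice of method since the text does not prescribe one.

```python
def natural_process_limits(values):
    """Return (lower limit, centre line, upper limit) for an individuals chart."""
    centre = sum(values) / len(values)
    moving_ranges = [abs(later - earlier) for earlier, later in zip(values, values[1:])]
    mean_moving_range = sum(moving_ranges) / len(moving_ranges)
    spread = 2.66 * mean_moving_range  # standard XmR scaling constant
    return centre - spread, centre, centre + spread


# Hypothetical growth forecasts and outcomes in percent, in time order (not real data).
forecasts = [2.0, 1.5, 2.5, 1.0, 2.0, 1.5]
outcomes = [2.5, 1.0, 2.0, 1.5, 2.5, 1.0]

deviations = [outcome - forecast for forecast, outcome in zip(forecasts, outcomes)]
lower, centre, upper = natural_process_limits(deviations)
signals = [d for d in deviations if d < lower or d > upper]

print(f"Centre line {centre:+.2f}, limits {lower:+.2f} to {upper:+.2f}")
print(f"Points outside the limits: {signals}")
```

A centre line away from zero would suggest a biased forecaster and the width of the limits gives a feel for precision, but only if the chart shows no points outside the limits or other signals of instability.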

Despite the experience base, forecasting is all about understanding fundamentals. King goes on to have doubts about the depth of the UK’s recovery and, in particular, concerns about productivity. The ONS data is here. He observes that businesses are choosing to expand by hiring cheap labour and suggests macroeconomic remedies to foster productivity growth such as encouraging small and medium sized enterprises, and enhancing educational effectiveness.

It comes back to a paradox that I have discussed before. There is a well signposted path to improved productivity that seems to remain The Road Not Taken. Everyone says they do it but it is clear from King’s observations on productivity that, in the UK at least, they do not. That would be consistent with the chronically poor service endemic in several industries. Productivity and quality go hand in hand.

I wonder if there is a preference in the UK for hiring state subsidised cheap labour over the rigorous and sustained thinking required to make real productivity improvements. I have speculated elsewhere that producers may feel themselves trading in a market for lemons. The macroeconomic causes of low productivity growth are difficult for non-economists such as myself to divine.

However, every individual company has the opportunity to take its own path and “Put its sticker on a lemon”. Governments may look to societal remedies but as an indefatigable female politician once trenchantly put it:

The individual is the true reality in life. A cosmos in himself, he does not exist for the State, nor for that abstraction called “society,” or the “nation,” which is only a collection of individuals. Man, the individual, has always been and, necessarily is the sole source and motive power of evolution and progress.

Emma Goldman
The Individual, Society and the State, 1940

Do I have to be a scientist to assess food safety?

I saw this BBC item on the web before Christmas: Why are we more scared of raw egg than reheated rice? Just after Christmas seemed like a good time to blog about food safety. Actually, the link I followed asked Are some foods more dangerous than others? A question that has a really easy answer.

However, understanding the characteristic risks of various foods and how most safely to prepare them is less simple. Risk theorist John Adams draws a distinction between readily identified inherent and obvious risks, and risks that can only be perceived with the help of science. Food risks fall into the latter category. As far as I can see, “folk wisdom” is no reliable guide here, even partially. The BBC article refers to risks from rice, pasta and salad vegetables which are not obvious. At the same time, in the UK at least, the risk from raw eggs is very small.

Ironically, raw eggs are one food that springs readily to British people’s minds when food risk is raised, largely due to the folk memory of a high profile but ill thought out declaration by a government minister in the 1980s. This is an example of what Amos Tversky and Daniel Kahneman called an availability heuristic: If you can think of it, it must be important.

Food safety is an environment where an individual is best advised to follow the advice of scientists. We commonly receive this filtered, even if only for accessibility, through government agencies. That takes us back to the issue of trust in bureaucracy on which I have blogged before.

I wonder whether governments are in the best position to provide such advice. It is food suppliers who suffer from the public’s misallocated fears. The egg fiasco of the 1980s had a catastrophic effect on UK egg sales. All food suppliers have an interest in a market characterised by a perception that the products are safe. The food industry is also likely to be in the best position to know what is best practice, to improve such practice, to know how to communicate it to their customers, to tailor it to their products and to provide the effective behavioural “nudges” that promote safe handling. Consumers are likely to be cynical about governments, “one size fits all” advice and cycles of academic meta-analysis.

I think there are also lessons here for organisations. Some risks are assessed on the basis of scientific analysis. It is important that the prestige of that origin is communicated to all staff who will be involved in working with risk. The danger for any organisation is that an individual employee might make a reassessment based on local data and their own self-serving emotional response. As I have blogged before, some individuals have particular difficulty in aligning themselves with the wider organisation.

Of course, individuals must also be equipped with the means of detecting when the assumptions behind the science have been violated and initiating an agile escalation so that employee, customer and organisation can be protected while a reassessment is conducted. Social media provide new ways of sharing experience. I note from the BBC article that, in the UK at least, there is no real data on the origins of food poisoning outbreaks.

So the short answer to the question at the head of this blog still turns out to be “yes”. There are some things where we simply have to rely on science if we want to look after ourselves, our families and our employees.

But even scientists are limited by their own bounded rationality. Science is a work in progress. Using that science itself as a background against which to look for novel phenomena and neglected residual effects leverages that original risk analysis into a key tool in managing, improving and growing a business.

It was 20 years ago today …

Today, 20 December 2013, marks the twentieth anniversary of the death of W Edwards Deming. Deming was a hugely influential figure in management science, in Japan during the 1950s, 1960s and 1970s, then internationally from the early 1980s until his death. His memory persists in a continuing debate about his thinking among a small and aging sector of the operational excellence community, and in a broader reputation as a “management guru”, one of the writers who from the 1980s onwards championed and popularised the causes of employee engagement and business growth through customer satisfaction.

Deming’s training had been in mathematics and physics but in his professional life he first developed into a statistician, largely because of the influence of Walter Shewhart, an early mentor. It was fundamental to Deming’s beliefs that an organisation could only be managed effectively with widespread business measurement and trenchant statistical criticism of data. In that way he anticipated writers of a later generation such as Nate Silver and Nassim Taleb.

Since Deming’s death the operational excellence landscape has become more densely populated. In particular, lean operations and Six Sigma have variously been seen as competitors for Deming’s approach, as successors, usurpers, as complementary, as development, or as tools or tool sets to be deployed within Deming’s business strategy. In many ways, the pragmatic development of lean and Six Sigma has exposed the discursive, anecdotal and sometimes gnomic way Deming liked to communicate. In his book Out of the Crisis: Quality, Productivity and Competitive Position (1982) minor points are expanded over whole chapters while major ideas are finessed in a few words. Having elevated the importance of measurement and a proper system for responding to data he goes on to observe that the most important numbers are unknown and unknowable. I fear that this has often been an obstacle to managers finding the hard science in Deming.

For me, the core of Deming’s thinking remains this. There is only one game in town, the continual improvement of the alignment between the voice of the process and the voice of the customer. That improvement is achieved by the diligent use of process behaviour charts. Pursuit of that aim will collaterally reduce organisational costs.

Deming pursued the idea further. He asked what kind of organisation could most effectively exploit process behaviour charts. He sought philosophical justifications for successful heuristics. It is here that his writing became more difficult to accept for many people. In his last book, The New Economics for Industry, Government, Education, he trespassed on broader issues usually reserved to politics and social science, areas in which he was poorly qualified to contribute. The problem with Deming’s later work is that where it is new, it is not economics, and where it is economics, it is not new. It is this part of his writing that has tended to attract a few persistent followers. What is sad about Deming’s continued following is the lack of challenge. Every seminal thinker’s works are subject to repeated criticism, re-evaluation and development. Not simply development by accumulation but development by revision, deletion and synthesis. It is here that Deming’s memory is badly served. At the top of the page is a link to Deming’s Wikipedia entry. It is disturbing that everything is stated as though a settled and triumphant truth, a treatment that contrasts with the fact that his work is now largely ignored in mainstream management. Managers have found in lean and Six Sigma systems they could implement, even if only partially. In Deming they have not.

What Deming deserves, now that a generation, a global telecommunications system and a world wide web separate us from him, is a robust criticism and challenge of his work. The statistical thinking at the heart is profound. For me, the question of what sort of organisation is best placed to exploit that thinking remains open. Now is the time for the re-evaluation because I believe that out of it we can join in reaching new levels of operational excellence.