Here are all the published opinion polls for the June 2017 UK general election, plotted as a Shewhart chart.
The Conservative lead over Labour had been pretty constant at 16% from February 2017, after May’s Lancaster House speech. The initial Natural Process Limits (“NPLs”) on the chart extend back to that date. Then something odd happened in the polls around Easter. There were several polls above the upper NPL. That does not seem to fit with any surrounding event. Article 50 had been triggered two weeks before and had had no real immediate impact.
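For readers unfamiliar with NPLs, here is a minimal sketch of how such limits are commonly computed. It assumes the chart is an individuals (XmR) chart, which the post does not state explicitly, and the poll leads in it are invented for illustration, not taken from the real data. The function names are hypothetical.

```python
from statistics import mean


def natural_process_limits(values):
    """Return (lower, centre, upper) limits for an individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    centre = mean(values)
    # Moving range: absolute difference between each pair of consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is the usual XmR scaling constant (3 / 1.128).
    half_width = 2.66 * mr_bar
    return centre - half_width, centre, centre + half_width


def signals(values, lower, upper):
    """Return (index, value) for every point outside the limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical baseline of Conservative leads (percentage points), NOT real polls.
baseline_leads = [16, 18, 15, 17, 14, 16, 19, 15, 16, 14]
lower, centre, upper = natural_process_limits(baseline_leads)

# Hypothetical later polls checked against the baseline limits.
later_leads = [17, 23, 21, 25]
out_of_limits = signals(later_leads, lower, upper)
print(f"centre={centre:.1f}, NPLs=({lower:.1f}, {upper:.1f})")
print("points outside the NPLs:", out_of_limits)
```

The limits are computed from the baseline period only and then extended forward, so later points are judged against the behaviour the process had already established.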
I suspect that the “fugue state” around Easter was reflected in the respective parties’ private polling. It is possible that public reaction to the election announcement somehow locked in the phenomenon for a short while.
Things then seem to settle down to the 16% lead level again. However, the local election results at the bottom of the range of polls ought to have sounded some alarm bells. Local election results are not a reliable predictor of general elections but this data should not have felt very comforting.
Then the slide in lead begins. But when exactly? A lot of commentators have assumed that it was the badly received Conservative Party manifesto that started the decline. It is not possible to be definitive from the chart but it is certainly arguable that it was the leak of the Labour Party manifesto that started to shift voting intention.
Then the swing from Conservative to Labour continued unabated to polling day.
Polling performance
How did the individual pollsters fare? I have, somewhat arbitrarily, summarised all polls conducted in the 10 days before the election (29 May to 7 June). Here is the plot along with the actual popular vote result, which gave the Conservatives a 2.5% margin over Labour. That is the number that everybody was trying to predict.
The red points are the surveys from the 5 days before the election (3 to 7 June). Visually, they seem to be no closer, in general, than the other points (6 to 10 days before). The vertical lines are just an aid for the eye in grouping the points. The absence of “closing in” is confirmed by looking at the mean squared error (MSE) (in %²) for the points over 10 days (31.1) and 5 days (34.8). There is no evidence of polls closing in on the final result. The overall Shewhart chart certainly doesn’t suggest that.
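As a rough sketch of the MSE calculation, assuming each poll is reduced to its Conservative lead in percentage points and compared with the 2.5 point result: the pollster names and leads below are hypothetical, chosen only to show the arithmetic.

```python
ACTUAL_LEAD = 2.5  # Conservative margin over Labour, in percentage points


def mean_squared_error(predicted_leads, actual_lead=ACTUAL_LEAD):
    """Average squared gap between polled leads and the actual lead (in %²)."""
    if not predicted_leads:
        raise ValueError("need at least one poll")
    return sum((p - actual_lead) ** 2 for p in predicted_leads) / len(predicted_leads)


# Hypothetical polled leads per pollster, NOT the real 2017 figures.
polled_leads = {
    "Pollster A": [1, 4],
    "Pollster B": [7, 8],
    "Pollster C": [13],
}

mse_by_pollster = {name: mean_squared_error(leads) for name, leads in polled_leads.items()}

# Rank from best (lowest MSE) to worst.
ranking = sorted(mse_by_pollster, key=mse_by_pollster.get)
for name in ranking:
    print(f"{name} | {mse_by_pollster[name]:.2f}")
```

Because the errors are squared, a single poll 10.5 points adrift contributes 110.25, which is why a few wide misses dominate the score.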
Taking the polls over the 10 day period, then, here is the performance of the pollsters in terms of MSE. Lower MSE is better.
Pollster | MSE (%²) |
Norstat | 2.25 |
Survation | 2.31 |
Kantar Public | 6.25 |
Survey Monkey | 8.25 |
YouGov | 9.03 |
Opinium | 16.50 |
Qriously | 20.25 |
Ipsos MORI | 20.50 |
Panelbase | 30.25 |
ORB | 42.25 |
ComRes | 74.25 |
ICM | 78.36 |
BMG | 110.25 |
Norstat and Survation pollsters will have been enjoying bonuses on the morning after the election. There are a few other commendable performances.
YouGov model
I should also mention the YouGov model (the green line on the Shewhart chart) that has an MSE of 2.25. YouGov conduct web-based surveys against a huge database of around 50,000 registered participants. They also collect, with permission, deep demographic data on those individuals concerning income, profession, education and other factors. There is enough published demographic data from the national census to judge whether that is a representative frame from which to sample.
YouGov did not poll and publish the raw, or even adjusted, voting intention. They used their poll to construct a model, perhaps a logistic regression or an artificial neural network, they don’t say, to predict voting intention from demographic factors. They then input into that model, not their own demographic data but data from the national census. That then gave their published forecast. I have to say that this looks about the best possible method for eliminating sampling frame effects.
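The idea of feeding census data into a model built from the panel can be sketched as follows. This is not YouGov's actual method, which the post notes is not disclosed: the fitted model is replaced here by a simple lookup table of per-group estimates, and every group, share and count is invented for illustration. The re-weighting step itself is usually called poststratification.

```python
def poststratify(cell_estimates, population_counts):
    """Weight per-group vote-share estimates by population counts for each group."""
    missing = set(population_counts) - set(cell_estimates)
    if missing:
        raise KeyError(f"no estimate for groups: {sorted(missing)}")
    total = sum(population_counts.values())
    weighted = sum(cell_estimates[g] * n for g, n in population_counts.items())
    return weighted / total


# Hypothetical model output: estimated share voting for one party in each age group.
estimated_share = {"18-34": 0.30, "35-64": 0.40, "65+": 0.60}

# Hypothetical make-up of the survey panel (skewed young) and of the census.
panel_counts = {"18-34": 500, "35-64": 300, "65+": 200}
census_counts = {"18-34": 300, "35-64": 400, "65+": 300}

raw_panel_share = poststratify(estimated_share, panel_counts)
census_weighted_share = poststratify(estimated_share, census_counts)
print(f"weighted by panel make-up:  {raw_panel_share:.2f}")
print(f"weighted by census make-up: {census_weighted_share:.2f}")
```

The gap between the two figures is the sampling frame effect: the same per-group estimates give a different headline number once the groups are weighted by the population rather than by who happens to be on the panel.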
It remains to be seen how widely this approach is adopted next time.
Very interesting. I like the analysis and comment which would not have been easy. Are any pollsters reading this blog and willing to comment??
Good work, that’s the insightful way to view opinion poll results!
Is it just me, or is there a pretty consistent downward trend in the data, starting right from “GE called”? If you did one more split at about the end of April, would we end up with 3 legs of the graphs with quite different limits?
It is true that the downward trend seems to have accelerated after Easter. The hypothesis I’m just about to form is that the trend was already there. The fact that it picked up (at the time you flagged up in your graph) could be explained by a common social dynamic, i.e. when an opinion, fashion etc. reaches a critical mass in a group or community, the trend amplifies (like a snowball effect). A sociologist would have something to say about that.
Luca, Many thanks. I agree that the data will bear the interpretation of a trend starting from “GE called”. Interesting stuff.
The chart is going to be updated and tweeted (almost) daily from now on so please do share it with anyone who might be interested. Anthony
Hi Anthony
When you say that “the chart is going to be updated”, do you mean you will be adding new points to it as results of further opinion polls become available? That would make a very interesting and useful project. Applying process behaviour charts to political or economic dynamics that are high on the public agenda is a brilliant idea, as it could make the charts better known to the general public. The practice of interpreting data out of context (and therefore interpreting it wrongly) will not be an easy one to root out, but we must do what we can. I know a community who will definitely be interested in a project like this and willing to support it. Do you know much about the Deming Alliance? If we write a little piece about your project and publish it on our website with a link to your blog, that would be a good way to inform our members and enlist their support. If you like the idea, please send me an email and I’ll explain what I have in mind.
Luca, Every time there is a new poll I will add it to the chart and tweet it. Should be at least a couple of times a week at the current rate. I have this set up to “scrape” the data off Wikipedia so it is almost automatic. I have been running this privately since before the 2015 election but have never really tried to publicise it before. At least part of the motivation is to get the charts in the minds of the public.
I also thought about occasionally adding a survey “Will we see the effect of next week’s budget in the polling data?”
I am very happy that you share this with the Alliance. Let me know what you need from me. Anthony