The risks of lead in the environment – social choice and individual values

 

Almost one in five deaths in the US can be linked to lead pollution, with even low levels of exposure potentially fatal, researchers have said.

That, in any event, was the headline in the Times (London) (£paywall) last week.

Gas pump lead warning

Historical environmental lead

The item turned out to be based on academic research by Professor Bruce Lanphear of Simon Fraser University, and others. You can find their published paper here in The Lancet: Public Health.1 It is publicly available at no charge, a practice very much to be encouraged. You know that I bristle at publicly funded research not being made available to the public.

As it was, no specific thing in either news report or the academic research struck me as wholly wrong. However, it made me wonder about the implied message of the news item and broader issues about communicating risk. I have some criticisms of the academic work, or at least how it is presented, but I will come to those below. I don’t have major doubts about the conclusions.

The pot odds of a jaywalker

Lanphear’s principal result concerned hazard rates so it is worth talking a little about what they are. Suppose I stand still in the middle of the carriageway at Hyde Park Corner (London) or Times Square (New York) or … . Suppose the pedestrian lights are showing “Don’t walk”. The probability that I get hit by a motor car is fairly high. A good 70 to 80% in my judgment, if I stand there long enough.

Now, suppose I sprint across under the same conditions. My chances of emerging unscathed still aren’t great but I think they are better. A big difference is what engineers call the Time at Risk (TAR). In general, the longer I expose myself to a hazardous situation, the greater the probability that I encounter my nemesis.

Now, there might be other differences between the risks in the two situations. A moving target might be harder to hit or less easy to avoid. However, it feels difficult to make a fair comparison of the risk because of the different TARs. Hazard rates provide a common basis for comparing what actuaries call the force of mortality without the confounding effect of exposure time. Hazard rates, effectively, offer a probability per unit time. They are measured in units like “percent per hour”. The math is actually quite complicated but hazard rates translate into probabilities when you multiply them by TAR. Roughly.
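That “roughly” can be made a little more precise. Here is a minimal sketch, assuming a constant hazard rate (my simplifying assumption, not something the jaywalking example guarantees). Under that assumption the probability of an event within a time at risk t is 1 − exp(−ht), which is close to h × t only when the product is small. The rate and the exposure times below are invented purely for illustration.

```python
import math

def prob_of_event(hazard_per_hour, hours_at_risk):
    """Probability of at least one event, assuming a constant hazard rate."""
    return 1.0 - math.exp(-hazard_per_hour * hours_at_risk)

# Hypothetical figures, chosen only to illustrate the arithmetic.
hazard = 2.0                 # "events per hour" while in the carriageway
standing = 0.5               # half an hour standing still
sprinting = 10 / 3600        # a ten-second dash

for label, tar in [("standing", standing), ("sprinting", sprinting)]:
    exact = prob_of_event(hazard, tar)
    rough = hazard * tar     # the "multiply by TAR" shortcut
    print(f"{label}: exact {exact:.4f}, rate x TAR {rough:.4f}")
```

With these invented figures the shortcut gives an impossible “probability” of 1.0 for standing still, against roughly 0.63 from the exponential formula. For the ten-second dash the two agree closely.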

I was recently reading of the British Army’s mission to Helmand Province in Afghanistan.2 In Operation Oqab Tsuka, military planners had to analyse the ground transport of a turbine to a hydroelectric plant. Terrain made the transport painfully slow along a route beset with insurgents and hostile militias. The highway had been seeded with IEDs (“Improvised Explosive Devices”) which slowed progress still more. The analysis predicted in the region of 50 British service deaths to get the turbine to its destination. The extended time to traverse the route escalated the TAR and hence the hazard, literally the force of mortality. That analysis led to a different transport route being explored and adopted.

So hazard rates provide a baseline of risk disregarding exposure time.

Lanphear’s results

Lanphear was working with a well established sampling frame of 18,825 adults in the USA whose lead levels had been measured some time in 1988 to 1994 when they were recruited to the panel. The cohort had been followed up in a longitudinal study so that data was to hand as to their subsequent morbidity and mortality.

What Lanphear actually looked at was a ratio of hazard rates. For the avoidance of doubt, the hazard that he was looking at was death from heart disease. There was already evidence of a link with lead exposure. He looked at, among other things, how much the hazard rate changed between the cohort members with the lowest measured blood-lead levels and with the highest. That is, as measured back in the period 1988 to 1994. He found, this is his headline result, that an increase in historical blood-lead from 1.0 μg/dL (microgram per decilitre) to 6.7 μg/dL was associated with an estimated 37% increase in hazard rate for heart disease.

Moreover, 1.0 and 6.7 μg/dL represented the lower and upper limits of the middle 80% of the sample. These were not wildly atypical levels. So in going from the blood-lead level that marks the 10% least exposed to the level of the 10% most exposed we get a 37% increase in instantaneous risk from heart disease.

Now there are a few things to note. Firstly, it is fairly obvious that historical lead in blood would be associated with other things that influence the onset of heart disease: location in an industrial zone, income, exercise regime etc. Lanphear took those into account, as far as is possible, in his statistical modelling. These are the known unknowns. It is also obvious that some things have an impact on heart disease that we don’t know about yet or which are simply too difficult, or too costly or too unethical, to measure. These are the unknown unknowns. Variation in these factors causes variation in morbidity and mortality. But we can’t assign the variation to an individual cause. Further, that variation causes uncertainty in all the estimates. It’s not exactly 37%. However, bearing all that in mind rather tentatively, this is all we have got.

Despite those other sources of variation, I happen to know my personal baseline risk of suffering cardiovascular disease. As I explored here, it is 5% over 10 years. Well, that was 4 years ago so it’s 3% over the next 6. Now, I was brought up in the industrial West Midlands of the UK, Rowley Regis to be exact, in the 1960s. Our nineteenth-century-built house had water supplied through lead pipes and there was no diligent running-off of drinking water before use. Who knew? Our house was beside a busy highway.3 I would guess that, on any determination of historical exposure to environmental lead, I would rate in the top 10%.

That gives me a personal probability over the next 6 years of 1.37 × 3% = 4%. Or so. Am I bothered?

Well, no. Neither should you be.
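For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes proportional hazards, so that survival under a hazard ratio HR is the baseline survival raised to the power HR. The function name is mine. For a baseline risk this small, the result is almost the same as simply multiplying.

```python
def scaled_risk(baseline_risk, hazard_ratio):
    """Risk over the same period after applying a hazard ratio,
    assuming proportional hazards: S1 = S0 ** HR."""
    return 1.0 - (1.0 - baseline_risk) ** hazard_ratio

baseline = 0.03      # 3% over the next 6 years, from the text
ratio = 1.37         # the 37% increase in hazard rate

print(f"simple multiplication: {baseline * ratio:.4f}")
print(f"proportional hazards:  {scaled_risk(baseline, ratio):.4f}")
```

Both routes come out at about 4.1%, which is the “4%. Or so.”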

But …

Social Choice and Individual Values

That was the title of a seminal 1951 book by Nobel laureate economist Kenneth Arrow.4 Arrow applied his mind to the question of how society as a whole should respond when individuals in the society had differing views as to the right and the good, or even the true and the just.

The distinction between individual choice and social policy lies, I think, at the heart of the confusion of tone of the Times piece. The marginal risk to an individual, myself in particular, from historical lead is de minimis. I have taken a liberty in multiplying my hazard rate for morbidity by a hazard ratio for mortality but I think you get my point. There is no reason at all why I, or you, should be bothered in the slightest as to our personal health. Even with an egregious historical exposure. However, those minimal effects, aggregated across a national scale, add up to a real impact on the economy. Loss of productive hours, resources diverted to healthcare, developing professional expertise terminated early by disease. All these things have an impact on national wealth. A little elementary statistics, and a few not unreasonable assumptions, allows an estimate of the excess number of deaths that would not have occurred “but for” the environmental lead exposure. That number turns out to be 441,000 US deaths each year with an estimated annual impact on the economy of over $100 billion. If you are skeptical, perhaps it is one tenth of that.

Now, nobody is suggesting that environmental lead has precipitated some crisis in public health that ought to make us fear for our lives. That is where the Times article was badly framed. Lanphear and his colleagues are at pains to point out just how deaths from heart disease have declined over the past 50 years, how much healthier and long-lived we now are.

The analysis kicks in when policy makers come to consider choices between various taxation schemes, trade deals, international political actions, or infrastructure investment strategies. There, the impact of policy choices on environmental lead can be mapped directly into economic consequences. Here the figures matter a great deal. But to me? Not so much.

What is to be done?

How do we manage economy level policy when an individual might not perceive much of a stake? Arrow found that neither the ballot box nor markets offered a tremendously helpful solution. That leaves us with dependence on the bureaucratic professions, or the liberal elite as we are told we have to call them in these politically correct times. That in turn leads us back to Robert Michels’ Iron Law of Oligarchy. Historically, those elites have proved resistant to popular sentiments and democratic control. The modern solution is democratic governance. However, that is exactly what Michels viewed as doomed to fail. The account of the British Army in Afghanistan that I referred to above is a further anecdote of failure.5

But I am going to remain an optimist that bureaucrats can be controlled. Much of the difficulty arises from governance functions’ statistical naivety and lack of data smarts. Politicians aren’t usually the most data critical people around. The Times piece does not help. One of the things everyone can do is to be clearer that there are individual impacts and economy-wide impacts, and that they are different things. Just because you can discount a personal hazard does not mean there is not something that governments should be working to improve.

It’s not all about me.

Some remarks on the academic work

As I keep on saying, the most (sic) important part of any, at least conventional, regression modelling is residuals analysis and regression diagnostics.6 However, Lanphear and his colleagues were doing something a lot more complicated than the simple linear case. They were using proportional hazards modelling. Now, I know that there are really serious difficulties in residuals analysis for such models and in giving a neat summary figure of how much of the variation in the data is “explained” by the factors being investigated. However, there are diagnostic tools for proportional hazards and I would like to have seen something reported. Perhaps the analysis was done but my trenchant view is that it is vital that it is shared. For all the difficulties in this, progress will only be made by domain experts trying to develop practice collaboratively.

My mind is always haunted by the question Was the regression worth it? And please remember that p-values in no way answer that question.

References and notes

  1. Lanphear, BP (2018) Low-level lead exposure and mortality in US adults: a population-based cohort study, The Lancet: Public Health. Published online.
  2. Farrell, T (2017) Unwinnable: Britain’s War in Afghanistan 2001-2014, London: The Bodley Head, pp239-244
  3. During the industrial revolution, this had been the important Oldbury to Halesowen turnpike-road. Even in the 1960s it carried a lot of traffic. My Black Country grandfather always referred to it as the ‘oss road, a road so significant that one might find horses on it. Keep out o’ the ‘oss road, m’ mon. He knew about risk.
  4. Arrow, KJ [1951] (2012) Social Choice and Individual Values, Martino Fine Books
  5. Farrell Op. cit.
  6. Draper, NR & Smith, H (1998) Applied Regression Analysis, 3rd ed., New York:  Wiley, Chapters 2 and 8

Shewhart chart basics 1 – The environment sufficiently stable to be predictable

Everybody wants to be able to predict the future. Here is the forecaster’s catechism.

  • We can do no more than attach a probability to future events.
  • Where we have data from an environment that is sufficiently stable to be predictable we can project historical patterns into the future.
  • Otherwise, prediction is largely subjective;
  • … but there are tactics that can help.
  • The Shewhart chart is the tool that helps us know whether we are working with an environment that is sufficiently stable to be predictable.

Now let’s get to work.

What does a stable/ predictable environment look like?

Every trial lawyer knows the importance of constructing a narrative out of evidence, an internally consistent and compelling arrangement of the facts that asserts itself above competing explanations. Time is central to how a narrative evolves. It is time that suggests causes and effects, motivations, barriers and enablers, states of knowledge, external influences, sensitisers and cofactors. That’s why exploration of data always starts with plotting it in time order. Always.

Let’s start off by looking at something we know to be predictable. Imagine a bucket of thousands of spherical beads. Of the beads, 80% are white and 20%, red. You are given a paddle that will hold 50 beads. Use the paddle to stir the beads then draw out 50 with the paddle. Count the red beads. Now you may, at this stage, object. Surely, this is just random and inherently unpredictable. But I want to persuade you that this is the most predictable data you have ever seen. Let’s look at some data from 20 sequential draws. In time order, of course, in Fig. 1.

Shew Chrt 1

Just to look at the data from another angle, always a good idea, I have added up how many times a particular value, 9, 10, 11, … , turns up and tallied them on the right hand side. For example, here is the tally for 12 beads in Fig. 2.

Shew Chrt 2

We get this in Fig. 3.

Shew Chrt 3

Here are the important features of the data.

  • We can’t predict what the exact value will be on any particular draw.
  • The numbers vary irregularly from draw to draw, as far as we can see.
  • We can say that draws will vary somewhere between 2 (say) and 19 (say).
  • Most of the draws are fairly near 10.
  • Draws near 2 and 19 are much rarer.

I would be happy to predict that the 21st draw will be between 2 and 19, probably not too far from 10. I have tried to capture that in Fig. 4. There are limits to variation suggested by the experience base. As predictions go, let me promise you, that is as good as it gets.

Even statistical theory would point to an outcome not so very different from that. That theoretical support adds to my confidence.
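Here is a minimal sketch of that theoretical support. It assumes the bucket is large enough for each paddle to behave like 50 independent draws with a 20% chance of red (a binomial model), and it uses three standard deviations either side of the mean as the range, which is a conventional choice on my part. The simulated draws use an arbitrary seed and are not the data in the figures.

```python
import math
import random

N_BEADS = 50      # beads per paddle
P_RED = 0.2       # proportion of red beads in the bucket

mean = N_BEADS * P_RED
sd = math.sqrt(N_BEADS * P_RED * (1 - P_RED))
lower, upper = mean - 3 * sd, mean + 3 * sd
print(f"mean {mean:.1f}, sd {sd:.2f}, 3-sd range {lower:.1f} to {upper:.1f}")

def draw_paddle(rng):
    """Count red beads in one paddle, treating the bucket as effectively infinite."""
    return sum(rng.random() < P_RED for _ in range(N_BEADS))

rng = random.Random(1)   # arbitrary seed, so the run is repeatable
draws = [draw_paddle(rng) for _ in range(20)]
print(draws)
```

The three-standard-deviation range comes out at about 1.5 to 18.5 around a mean of 10, which sits comfortably with a prediction of between 2 and 19.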

Shew Chrt 4

But there’s something else. Something profound.

A philosopher, an engineer and a statistician walk into a bar …

… and agree.

I got my last three bullet points above from just looking at the tally on the right hand side. What about the time order I was so insistent on preserving? As Daniel Kahneman put it “A random event does not … lend itself to explanation, but collections of random events do behave in a highly regular fashion.” What is this “regularity” when we can see how irregularly the draws vary? This is where time and narrative make their appearance.

If we take the draw data above, the exact same data, and “shuffle” it into a fresh order, we get this, Fig. 5.

Shew Chrt 5

Now the bullet points still apply to the new arrangement. The story, the narrative, has not changed. We still see the “irregular” variation. That is its “regularity”, that it tells the same story when we shuffle it. The picture and its inferences are the same. We cannot predict an exact value on any future draw yet it is all but sure to be between 2 and 19 and probably quite close to 10.

In 1924, British philosopher W E Johnson and US engineer Walter Shewhart, independently, realised that this was the key to describing a predictable process. It shows the same “regular irregularity”, or shall we say stable irregularity, when you shuffle it. Italian statistician Bruno de Finetti went on to derive the rigorous mathematics a few years later with his famous representation theorem. The most important theorem in the whole of statistics.

This is the exact characterisation of noise. If you shuffle it, it makes no difference to what you see or the conclusions you draw. It makes no difference to the narrative you construct (sic). Paradoxically, it is noise that is predictable.

To understand this, let’s look at some data that isn’t just noise.

Events, dear boy, events.

That was the alleged response of British Prime Minister Harold Macmillan when asked what had been the most difficult aspect of governing Britain.

Suppose our data looks like this in Fig. 6.

Shew Chrt 6

Let’s make it more interesting. Suppose we are looking at the net approval rating of a politician (Fig. 7).

Shew Chrt 7

What this looks like is noise plus a material step change between the 10th and 11th observation. Now, this is a surprise. The regularity, and the predictability, is broken. In fact, my first reaction is to ask What happened? I research political events and find at that same time there was an announcement of universal tax cuts (Fig. 8). This is just fiction of course. That then correlates with the shift in the data I observe. The shift is a signal, a flag from the data telling me that something happened, that the stable irregularity has become an unstable irregularity. I use the time context to identify possible explanations. I come up with the tentative idea about tax cuts as an explanation of the sudden increase in popularity.

The bullet points above no longer apply. The most important feature of the data now is the shift, I say, caused by the Prime Minister’s intervention.

Shew Chrt 8

What happens when I shuffle the data into a random order though (Fig. 9)?

Shew Chrt 9

Now, the signal is distorted, hard to see and impossible to localise in time. I cannot tie it to a context. The message in the data is entirely different. The information in the chart is not preserved. The shuffled data does not bear the same narrative as the time ordered data. It does not tell the same story. It does not look the same. That is how I know there is a signal. The data changes its story when shuffled. The time order is crucial.

Of course, if I repeated the tally exercise that I did on Fig. 4, the tally would look the same, just as it did in the noise case in Fig. 5.
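The contrast between the tally and the time order can be sketched in a few lines. The data here is invented, ten values around 0 followed by ten around 10, and the “gap” between the two halves is just a crude stand-in of my own for what the eye sees on the chart. Shuffling leaves the tally untouched but scrambles the step.

```python
import random
from collections import Counter

def half_gap(series):
    """Mean of the second half minus mean of the first half."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return sum(second) / len(second) - sum(first) / len(first)

# Invented data: ten values around 0, then ten values around 10.
ordered = [1, -2, 0, 2, -1, 1, 0, -1, 2, -2,
           11, 9, 10, 12, 8, 10, 11, 9, 10, 12]

shuffled = ordered[:]
random.Random(7).shuffle(shuffled)   # arbitrary seed

print("tally unchanged:", Counter(ordered) == Counter(shuffled))
print("gap in time order:", half_gap(ordered))
print("gap after shuffle:", half_gap(shuffled))
```

The tally cannot tell the two arrangements apart. Only the time-ordered series carries the step.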

Is data with signals predictable?

The Prime Minister will say that they predicted that their tax cuts would be popular and they probably did so. My response to that would be to ask how big an improvement they predicted. While a response in the polls may have been foreseeable, specifying its magnitude is much more difficult and unlikely to be exact.

We might say that the approval data following the announcement has returned to stability. Can we not now predict the future polls? Perhaps tentatively in the short term but we know that “events” will continue to happen. Not all these will be planned by the government. Some government initiatives, triumphs and embarrassments will not register with the public. The public has other things to be interested in. Here is some UK data.

poll20180302

You can follow regular updates here if you are interested.

Shewhart’s ingenious chart

While Johnson and de Finetti were content with theory, Shewhart, working in the manufacture of telegraphy equipment, wanted a practical tool for his colleagues that would help them answer the question of predictability. A tool that would help users decide whether they were working with an environment sufficiently stable to be predictable. Moreover, he wanted a tool that would be easy to use by people who were short of time for analysing data and had minds occupied by the usual distractions of the work place. He didn’t want people to have to run off to a statistician whenever they were perplexed by events.

In Part 2 I shall start to discuss how to construct Shewhart’s chart. In subsequent parts, I shall show you how to use it.

Get rich predicting the next recession – just watch the fertility statistics

… we are told. Or perhaps not. This was the research reported last week, with varying degrees of credulity, by the BBC here and The (London) Times here (£paywall). This turned out to be a press release about some academic research by Kasey Buckles of Notre Dame University and others. You have to pay USD 5 to get the academic paper. I shall come back to that.

The paper’s abstract claims as follows.

Many papers show that aggregate fertility is pro-cyclical over the business cycle. In this paper we do something else: using data on more than 100 million births and focusing on within-year changes in fertility, we show that for recent recessions in the United States, the growth rate for conceptions begins to fall several quarters prior to economic decline. Our findings suggest that fertility behavior is more forward-looking and sensitive to changes in short-run expectations about the economy than previously thought.

Now, here is a chart shared by the BBC.

Pregnancy and recession

The first thing to notice here is that we have exactly three observations. Three recession events with which to learn about any relationship between human sexual activity and macroeconomics. If you are the sort of person obsessed with “sample size”, and I know some of you are, ignore the misleading “100 million births” hold-out. Focus on the fact that n=3.

We are looking for a leading indicator, something capable of predicting a future event or outcome that we are bothered about. We need it to go up/ down before the up/ down event that we anticipate/ fear. Further it needs consistently to go up/ down in the right direction, by the right amount and in sufficient time for us to take action to correct, mitigate or exploit.

There is a similarity here to the hard and sustained thinking we have to do when we are looking for a causal relationship, though there is no claim to cause and effect here (cf. the Bradford Hill guidelines). One of the most important factors in both is temporality. A leading indicator really needs to lead, and to lead in a regular way. Making predictions like, “There will be a recession some time in the next five years,” would be a shameless attempt to re-imagine the unsurprising as a signal novelty.

Having recognised the paucity of the data and the subtlety of identifying a usefully predictive effect, we move on to the chart. The chart above is pretty useless for the job at hand. Run charts with multiple variables are very weak tools for assessing association between factors, except in the most unambiguous cases. The chart broadly suggests some “association” between fertility and economic growth. It is possible to identify “big falls” both in fertility and growth and to persuade ourselves that the collapses in pregnancy statistics prefigure financial contraction. But the chart is not compelling evidence that one variable tracks the other reliably, even with a time lag. There appears to be no evident global relationship between the variation in the two factors. There are big swings in each to which no corresponding event stands out in the other variable.

We have to go back and learn the elementary but universal lessons of simple linear regression. Remember that I told you that simple linear regression is the prototype of all successful statistical modelling and prediction work. We have to know whether we have a system that is sufficiently stable to be predictable. We have to know whether it is worth the effort. We have to understand the uncertainties in any prediction we make.

We do not have to go far to realise that the chart above cannot give a cogent answer to any of those. The exercise would, in any event, be a challenge with three observations. I am slightly resistant to spending GBP 3.63 to see the authors’ analysis. So I will reserve my judgment as to what the authors have actually done. I will stick to commenting on data journalism standards. However, I sense that the authors don’t claim to be able to predict economic growth simpliciter, just some discrete events. Certainly looking at the chart, it is not clear which of the many falls in fertility foreshadow financial and political crisis. With the myriad of factors available to define an “event”, it should not be too difficult, retrospectively, to define some fertility “signal” in the near term of the bull market and fit it astutely to the three data points.

As The Times, but not the BBC, reported:

However … the correlation between conception and recession is far from perfect. The study identified several periods when conceptions fell but the economy did not.

“It might be difficult in practice to determine whether a one-quarter drop in conceptions is really signalling a future downturn. However, this is also an issue with many commonly used economic indicators,” Professor Buckles told the Financial Times.

Think of it this way. There are, at most, three independent data points on your scatter plot. Really. And even then the “correlation … is far from perfect”.

And you have had the opportunity to optimise the time lag to maximise the “correlation”.
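To see why that matters, here is a minimal sketch using two series of pure, unrelated noise. Nothing in it comes from the fertility paper; the series lengths, the range of lags and the seed are all my own arbitrary choices. Searching over lags and keeping the one with the largest correlation can only ever flatter the apparent relationship.

```python
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def lagged_corr(leader, follower, lag):
    """Correlation of leader[t] with follower[t + lag]."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])

rng = random.Random(2018)          # arbitrary seed
n_quarters, max_lag = 40, 8
a = [rng.gauss(0, 1) for _ in range(n_quarters)]   # unrelated noise
b = [rng.gauss(0, 1) for _ in range(n_quarters)]   # unrelated noise

corrs = {lag: lagged_corr(a, b, lag) for lag in range(max_lag + 1)}
best = max(corrs, key=lambda k: abs(corrs[k]))
print(f"lag 0: {corrs[0]:+.2f}; best lag {best}: {corrs[best]:+.2f}")
```

By construction the “best” lag shows a correlation at least as strong as any lag fixed in advance, even though there is nothing there to find.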

This is all probably what we suspected. What we really want is to see the authors put their money where their mouth is on this by wagering on the next recession, a point well made by Nassim Taleb’s new book Skin in the Game. What distinguishes a useful prediction is that the holder can use it to get the better of the crowd. And thinks the risks worth it.

As for the criticisms of economic forecasting generally, we get it. I would have thought though that the objective was to improve forecasting, not to satirise it.

UK railway suicides – 2017 update

The latest UK rail safety statistics were published on 23 November 2017, again absent much of the press fanfare we had seen in the past. Regular readers of this blog will know that I have followed the suicide data series, and the press response, closely in 2016, 2015, 2014, 2013 and 2012. Again I have re-plotted the data myself on a Shewhart chart.

RailwaySuicides20171

Readers should note the following about the chart.

  • Many thanks to Tom Leveson Gower at the Office of Rail and Road who confirmed that the figures are for the year up to the end of March.
  • Some of the numbers for earlier years have been updated by the statistical authority.
  • I have recalculated natural process limits (NPLs) as there are still no more than 20 annual observations, and because the historical data has been updated. The NPLs have therefore changed but, this year, not by much.
  • Again, the pattern of signals, with respect to the NPLs, is similar to last year.

The current chart again shows two signals, an observation above the upper NPL in 2015 and a run of 8 below the centre line from 2002 to 2009. As I always remark, the Terry Weight rule says that a signal gives us license to interpret the ups and downs on the chart. So I shall have a go at doing that.
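For readers who want to try this on their own data, here is a minimal sketch of the two checks just mentioned. It assumes the usual individuals (XmR) chart convention of a centre line at the mean with limits at 2.66 times the mean moving range either side. The function names are mine and the counts are invented. They are not the railway figures.

```python
def natural_process_limits(values):
    """Centre line and limits for an individuals (XmR) chart."""
    centre = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_mr = sum(moving_ranges) / len(moving_ranges)
    return centre - 2.66 * mean_mr, centre, centre + 2.66 * mean_mr

def longest_run_below(values, centre):
    """Length of the longest run of consecutive points below the centre line."""
    longest = current = 0
    for v in values:
        current = current + 1 if v < centre else 0
        longest = max(longest, current)
    return longest

# Invented counts for illustration only; NOT the railway data.
counts = [210, 195, 205, 190, 200, 215, 198, 202, 207, 193]

lower, centre, upper = natural_process_limits(counts)
outside = [v for v in counts if v < lower or v > upper]
print(f"NPLs {lower:.1f} to {upper:.1f}, centre {centre:.1f}")
print("points outside limits:", outside)
print("longest run below centre:", longest_run_below(counts, centre))
```

On this invented series neither rule fires: no point falls outside the limits and the longest run below the centre line is only 2.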

It will not escape anybody’s attention that this is now the second year in which there has been a fall in the number of fatalities.

I haven’t yet seen any real contemporaneous comment on the numbers from the press. This item appeared on the BBC, a weak performer in the field of data journalism but clearly with privileged access to the numbers, on 30 June 2017, confidently attributing the fall to past initiatives.

Sky News clearly also had advance sight of the numbers and made the bold claim that:

… for every death, six more lives were saved through interventions.

That item goes on to highlight a campaign to encourage fellow train users to engage with anybody whose behaviour attracted attention.

But what conclusions can we really draw?

In 2015 I was coming to the conclusion that the data increasingly looked like a gradual upward trend. The 2016 data offered a challenge to that but my view was still that it was too soon to say that the trend had reversed. There was nothing in the data incompatible with a continuing trend. This year, 2017, has seen 2016’s fall repeated. A welcome development but does it really show conclusively that the upward trending pattern is broken? Regular readers of this blog will know that Langian statistics like “lowest for six years” carry no probative weight here.

Signal or noise?

Has there been a change to the underlying cause system that drives the suicide numbers? Last year, I fitted a trend line through the data and asked which narrative best fitted what I observed, a continuing increasing trend or a trend that had plateaued or even reversed. You can review my analysis from last year here.

Here is the data and fitted trend updated with this year’s numbers, along with NPLs around the fitted line, the same as I did last year.

RailwaySuicides20172

Let’s think a little deeper about how to analyse the data. The first step of any statistical investigation ought to be the cause and effect diagram.

SuicideCne

The difficulty with the suicide data is that there is very little reproducible and verifiable knowledge as to its causes. I have seen claims, of whose provenance I am uncertain, that railway suicide is virtually unknown in the USA. There is a lot of useful thinking from common human experience and from more general theories in psychology. But the uncertainty is great. It is not possible to come up with a definitive cause and effect diagram on which all will agree, other than from the point of view of identifying candidate factors.

The earlier evidence of a trend, however, suggests that there might be some causes that are developing over time. It is not difficult to imagine that economic trends and the cumulative awareness of other fatalities might have an impact. We are talking about a number of things that might appear on the cause and effect diagram and some that do not, the “unknown unknowns”. When I identified “time” as a factor, I was taking sundry “lurking” factors and suspected causes from the cause and effect diagram that might have a secular impact. I aggregated them under the proxy factor “time” for want of a more exact analysis.

What I have tried to do is to split the data into two parts:

  • A trend (linear simply for the sake of exploratory data analysis (EDA)); and
  • The residual variation about the trend.

The question I want to ask is whether the residual variation is stable, just plain noise, or whether there is a signal there that might give me a clue that a linear trend does not hold.
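The split can be sketched as follows. This is a minimal illustration using ordinary least squares for the trend and, as an assumption on my part, the XmR convention of 2.66 times the mean moving range for limits on the residuals. The yearly figures are invented. They are not the railway data.

```python
def fit_line(xs, ys):
    """Ordinary least squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Invented data for illustration only; NOT the railway data.
years = list(range(2008, 2018))
deaths = [205, 212, 208, 221, 218, 230, 226, 241, 233, 229]

slope, intercept = fit_line(years, deaths)
residuals = [y - (intercept + slope * x) for x, y in zip(years, deaths)]

# Limits for the residuals from their mean moving range (XmR convention).
moving_ranges = [abs(b - a) for a, b in zip(residuals, residuals[1:])]
half_width = 2.66 * sum(moving_ranges) / len(moving_ranges)
signals = [(x, round(r, 1)) for x, r in zip(years, residuals)
           if abs(r) > half_width]

print(f"trend: {slope:.2f} per year")
print("residuals outside limits:", signals)
```

An empty list of signals would mean the residuals look like plain noise about the line. Any entry would be a prompt to ask what happened that year.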

There is no signal in the detrended data, no signal that the trend has reversed. The tough truth of the data is that it supports either narrative.

  • The upward trend is continuing and is stable. There has been no reversal of trend yet.
  • The data is not stable. True there is evidence of an upward trend in the past but there is now evidence that deaths are decreasing.

Of course, there is no particular reason, absent the data, to believe in an increasing trend and the initiative to mitigate the situation might well be expected to result in an improvement.

Sometimes, with data, we have to be honest and say that we do not have the conclusive answer. That is the case here. All that can be done is to continue the existing initiatives and look to the future. Nobody ever likes that as a conclusion but it is no good pretending things are unambiguous when that is not the case.

Next steps

Previously I noted proposals to repeat a strategy from Japan of bathing railway platforms with blue light. In the UK, I understand that such lights were installed at Gatwick in summer 2014. In fact my wife and I were on the platform at Gatwick just this week and I had the opportunity to observe them. I also noted, on my way back from court the other day, blue strip lights along the platform edge at East Croydon. I think they are recently installed. However, I have not seen any data or heard of any analysis.

A huge amount of sincere endeavour has gone into this issue but further efforts have to be against the background that there is still no conclusive evidence of improvement.

Suggestions for alternative analyses are always welcomed here.

UK Election of June 2017 – Polling review

[Figure: Shewhart chart of published opinion polls, June 2017 UK general election]

Here are all the published opinion polls for the June 2017 UK general election, plotted as a Shewhart chart.

The Conservative lead over Labour had been pretty constant at 16% from February 2017, after May’s Lancaster House speech. The initial Natural Process Limits (“NPLs”) on the chart extend back to that date. Then something odd happened in the polls around Easter. There were several polls above the upper NPL. That does not seem to fit with any surrounding event. Article 50 had been triggered two weeks before and had had no real immediate impact.

I suspect that the “fugue state” around Easter was reflected in the respective parties’ private polling. It is possible that public reaction to the election announcement somehow locked in the phenomenon for a short while.

Things then seem to settle down to the 16% lead level again. However, the local election results at the bottom of the range of polls ought to have sounded some alarm bells. Local election results are not a reliable predictor of general elections but this data should not have felt very comforting.

Then the slide in lead begins. But when exactly? A lot of commentators have assumed that it was the badly received Conservative Party manifesto that started the decline. It is not possible to be definitive from the chart but it is certainly arguable that it was the leak of the Labour Party manifesto that started to shift voting intention.

Then the swing from Conservative to Labour continued unabated to polling day.

Polling performance

How did the individual pollsters fare? I have, somewhat arbitrarily, summarised all polls conducted in the 10 days before the election (29 May to 7 June). Here is the plot along with the actual popular vote result which gave a 2.5% margin of Conservative over Labour. That is the number that everybody was trying to predict.

[Figure: pollster performance over the final 10 days, with the actual result]

The red points are the surveys from the 5 days before the election (3 to 7 June). Visually, they seem to be no closer, in general, than the other points (6 to 10 days before). The vertical lines are just an aid for the eye in grouping the points. The absence of “closing in” is confirmed by looking at the mean squared error (MSE) (in %², squared percentage points) for the points over 10 days (31.1) and 5 days (34.8). There is no evidence of polls closing in on the final result. The overall Shewhart chart certainly doesn’t suggest that.
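For readers who want to reproduce this kind of comparison, here is a minimal sketch of the MSE calculation. The poll leads below are hypothetical placeholders, not the published polls, and the function and variable names are illustrative only. The only figure taken from the text is the actual lead of 2.5 percentage points.

```python
def mean_squared_error(predicted_leads, actual_lead):
    """MSE, in squared percentage points, of poll leads against the actual lead."""
    errors = [(lead - actual_lead) ** 2 for lead in predicted_leads]
    return sum(errors) / len(errors)


ACTUAL_LEAD = 2.5  # Conservative lead over Labour, as stated in the text

# Hypothetical leads in date order, oldest first (NOT the published polls).
polls_10_days = [1.0, 4.0, 7.0, 12.0, 6.0, 10.0, 3.0, 8.0]
polls_5_days = polls_10_days[4:]  # pretend the last four fall in the final 5 days

mse_10 = mean_squared_error(polls_10_days, ACTUAL_LEAD)
mse_5 = mean_squared_error(polls_5_days, ACTUAL_LEAD)
print("10-day MSE:", mse_10, "5-day MSE:", mse_5)
```

If the polls were closing in on the result, the 5-day figure would be noticeably smaller than the 10-day figure. Because MSE is in squared units, taking the square root gives a typical miss in percentage points: an MSE of 2.25 corresponds to a miss of 1.5 points.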

Taking the polls over the 10 day period, then, here is the performance of the pollsters in terms of MSE. Lower MSE is better.

Pollster         MSE (%²)
Norstat          2.25
Survation        2.31
Kantar Public    6.25
Survey Monkey    8.25
YouGov           9.03
Opinium          16.50
Qriously         20.25
Ipsos MORI       20.50
Panelbase        30.25
ORB              42.25
ComRes           74.25
ICM              78.36
BMG              110.25

Norstat and Survation pollsters will have been enjoying bonuses on the morning after the election. There are a few other commendable performances.

YouGov model

I should also mention the YouGov model (the green line on the Shewhart chart) that has an MSE of 2.25. YouGov conduct web-based surveys against a huge database of around 50,000 registered participants. They also collect, with permission, deep demographic data on those individuals concerning income, profession, education and other factors. There is enough published demographic data from the national census to judge whether that is a representative frame from which to sample.

YouGov did not poll and publish the raw, or even adjusted, voting intention. They used their poll to construct a model, perhaps a logistic regression or an artificial neural network, they don’t say, to predict voting intention from demographic factors. They then input into that model not their own demographic data but data from the national census. That then gave their published forecast. I have to say that this looks about the best possible method for eliminating sampling frame effects.
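The general idea, fitting a model on the panel and then applying it to census demographics, can be illustrated with a deliberately simple stand-in. The sketch below is not YouGov’s method, which is not disclosed. It swaps the regression or neural network for plain per-cell proportions, and every name and number in it is hypothetical.

```python
from collections import defaultdict


def cell_support(panel):
    """Share of panel respondents backing the party within each demographic cell."""
    backing = defaultdict(int)
    total = defaultdict(int)
    for cell, backs_party in panel:
        total[cell] += 1
        if backs_party:
            backing[cell] += 1
    return {cell: backing[cell] / total[cell] for cell in total}


def poststratify(support, census_counts):
    """Weight each cell's support by its census population, not its panel size."""
    population = sum(census_counts.values())
    return sum(support[cell] * n for cell, n in census_counts.items()) / population


# Hypothetical panel: (demographic cell, backs the party?). Younger people are
# over-represented here relative to the made-up census below.
panel = [("young", True), ("young", True), ("young", True), ("young", False),
         ("old", True), ("old", False), ("old", False), ("old", False)]
census_counts = {"young": 25, "old": 75}  # hypothetical census cell counts

raw_share = sum(backs for _, backs in panel) / len(panel)
adjusted_share = poststratify(cell_support(panel), census_counts)
print("raw panel share:", raw_share, "census-weighted share:", adjusted_share)
```

The raw panel share reflects who happens to be on the panel. The census-weighted share reflects the population the forecast is meant to describe, which is the sense in which the approach removes sampling frame effects.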

It remains to be seen how widely this approach is adopted next time.

Grenfell Tower – Elites on trial – Trust in bureaucracy revisited

Grenfell Tower fire1

Nobody can react to the Grenfell Tower fire with anything other than horror, sadness, anger and resolve.

Much of that anger is, legitimately, directed at the elite professions who make the decisions on which individual safety turns. I am proud to have been a member of two elite professions during my lifetime: engineering and law. I wanted to say something about the nature of practice, of responsibility and of blame.

It is too early to be confident of causes, remedies or punishments. Those will have to await full investigation but professionals of all disciplines will need little encouragement to spend the coming weeks searching their own souls over their wider obligations to society. For, that is what membership of a profession entails.

The need for bureaucracy

“Bureaucracy” is a word most often used pejoratively, as a rebuke to a turgid rigidity that frustrates spontaneity, creativity, efficiency and expedition. It is that. But once society starts to enjoy systems of reasonable complexity, civil aviation, networked electricity supply, international transport of goods etc., much decision making is going to be reserved to a cadre of experts. Diane Vaughan’s analysis of the Space Shuttle Challenger disaster2 is a relevant and salutary account of engineering as a bureaucratic profession.

Of course, you can embed some of your bureaucracy in software but don’t expect that to improve spontaneity, creativity or flexibility. Efficiency and expedition, perhaps. Even putting a bunch of flowers on your dining table requires this.

I have often, on this blog, cited Robert Michels’ iron law of oligarchy. Michels contended that any team of bureaucrats soon realised the power they held in controlling the levers of policy. A willingness to pull those levers in the direction of their own self-interest, and a jealous protection of their professional status and expertise, soon followed. That sometimes put them at odds with the objectives they were supposed to be implementing on behalf of their principals. As one political scientist put it:3

Many governance dysfunctions arise because the agents have different agendas from the principals, and the problem of institutional design is related to incentivising the agents to do the principal’s bidding.

Max Weber had realised all this earlier in the nineteenth century. Weber was a child of that mother of all bureaucracies, the Prussian civil service. The self interest of managing elites bothered him and he sought to inculcate an ethic of responsibility whereby professionals thought hard about the wider consequences of their decisions.4 That is the ethic that modern professions seek to foster among their members. Engagement in a profession carries responsibilities.

Faith in regulation

So much marginally informed debate among journalists has been about “the building regulations”. I guess that they mean the Building Regulations 2010. You can examine the relevant parts here, for what they are worth. Scroll down to Part B. Of course the Regulations themselves are supported by the statutory guidance. Here that is. The guidance refers to a legion of British Standards. As I learned during my time in the railway industry, the scientific basis of the guidance is not always easy to trace. Expect to hear more of that as inquiries progress.

The fundamental truth of such regulations is that it is the building professionals themselves who write them. Who else? That does not mean that the professionals, even in conclave, are infallible. Nobel laureate psychologist Daniel Kahneman has written extensively about the bounded rationality that limits everybody’s individual, or group, vision beyond a limited range of experiences, values and prejudices. Experts are just as prone as anybody. You and me too.5 It is unlikely that software will do a better job. Expert systems will work, here I go again, in “an environment that is sufficiently regular to be predictable”. I heard Daniel Dennett speak in London recently. Software will provide us with tools not colleagues.

All this feeds into the collateral phenomenon whereby businesses actively use their expert involvement in setting regulations as a strategy to capture market share, promote their own products and erect barriers against entry for would-be competitors. The extreme consequence here is regulatory capture, where the regulator becomes so dependent upon the expertise of the regulated that she is glad to let them define the regulatory regime.

To some extent, the self-validating nature of expertise is reinforced, at least in the UK, by the approach of the courts. In assessing the negligence of a professional, an individual is judged against the standards of his profession. Only where no reasonable member of his profession would have acted as he did is he negligent.6, 7 But the courts have warned that, in some circumstances, they might call into question the standards of a whole profession if there were a failure of logic. That is something that the courts would never do lightly.8 It would be a spectacle indeed.

When it comes to professional responsibility, the courts refuse to be dazzled by statutory regulations or industry standards. Professionals are expected to exercise their judgment and not hide behind mere compliance.9 In 2003, giving judgment against a firm of architects for inadequate fire precautions in a food factory refurbishment, Judge Bowsher QC observed:10

I should add that I was not the slightest impressed by the submission that since the defendants had complied with their statutory requirements … they had fully performed their duties.

This is what a judge said in a different case concerning the safety of a flight of stairs.11

Looking at a photograph of the stairs, I myself would form the view that they are reasonably safe … But it is the fact that the stairs did not comply with the Building Regulations, or the relevant British Standard. That is evidence which we must certainly take into account. It represents the current professional opinion as to what is desirable in order that accidents should be avoided. But it is one thing to lay down regulations and standards, with that objective, and another to define what is reasonably safe in the circumstances of a particular case [emphasis added].

In any event, trying to manage a risk by statutory regulation is not so efficient a means as you might think. Regulations do not always ensure the best outcome for society.12

Trust in elites

That all leaves the elite professions with grave responsibilities. Let none of us deny that another salient feature of the professions is that they are businesses run to make a profit for the professionals. Members get the further reward of status in society. I know that we have all constructed narratives of our own expertise and that challenges, particularly from clients, are not always welcome. We think we know best and we don’t always want to waste the client’s time explaining what to us seems so obvious.

And when things go wrong, and they will, all that is thrown back at us. Quite justifiably. Trust in bureaucracy has been a recurrent theme on this blog. It is a complex matter. When it leads to herd immunity from disease it is good. When it leads to complicity in torture it is bad. The public trust we aspire to is not blind faith. It is collaboration. Blind faith leads to bad consequences, collaboration to an environment where professionals are able to explain and reassure. Reflecting on that, I think there are some things we all can do to improve that relationship of trust.

Listen. The most useful person on a project is often the person who knows nothing about it. She can ask the dumb question. Physicists told Guglielmo Marconi he would not be able to transmit a radio signal across the Atlantic. But he did, not because he knew something the physicists didn’t but because sometimes it takes an unashamed maverick to test an orthodoxy.13 There are sundry examples of rumours and folk tales that have sparked scientific curiosity and discovery. Sometimes data is the plural of anecdote. It’s not even all about testing scientific theories. People sometimes need confidence and reassurance in unfamiliar situations. They need to be told, in language they understand, what is happening and why you think this is a good idea. Their questions and reservations need to be taken seriously.

A signal is a signal. One of the key skills for any professional is being able to distinguish signal from noise. Where there is a signal, a surprise, that suggests an established orthodoxy has stopped working then you must immediately take action to protect those at risk. The “regular environment” you relied on is blown. Don’t wait to see if it happens again. Don’t dismiss it as a “one off” or, heaven forbid, the most useless word in the English language, an “outlier”. It is the signals that contain all the information. Don’t relax when the signal isn’t repeated immediately. That is just regression to the mean. It’s what signals do. Something that you didn’t expect has happened. Pierce the veil of bounded rationality. Protect the client, investigate and look to update your practice.

Noise is noise. The corollary to taking signals seriously is not mistaking noise for signal. When that happens we start looking for causes specific to an individual outcome when the true causes were generic to all outcomes. Professionals also need to know when they are embedded in a “stable system of trouble”. That brings its own challenges, not least of which is the cost and effort of perpetually protecting the client.

Humility. Professionals don’t always get it right. There are individual errors. There are systemic failures of practice. If you have to start hiding behind the shield that you are beyond challenge and that dissenting views are outlawed then you are probably dismissing the best hope you have for avoiding problems.

Continual improvement. We have to keep listening to counsels of despair from politicians about productivity. It is down to us. Continual improvement is not just for our individual domain expertise, it’s also about getting better at listening, distinguishing signal from noise and practising humility. It’s about getting better at improving too.

Professionals have bodies to kick and souls to damn

Over two hundred years ago, English judge Edward Thurlow famously observed that corporations have neither bodies to kick nor souls to damn. I am always baffled by calls, in cases like that of Grenfell Tower, for prosecutions for corporate manslaughter. The calls seem to reflect a mistaken sentiment that corporate manslaughter is some sort of aggravated form of manslaughter. This isn’t just manslaughter, it’s corporate manslaughter. But why would anybody want to relieve individuals of responsibility and impose it on a faceless abstraction?

Part of the deal when seeking certification as a professional is that you assume a responsibility to society. That’s where you get your status from. When you fail you will be held to account. There are always voices calling for an end to blame culture. But have no doubt, it is a professional’s duty to act within the standards she has adopted. If she falls below those standards then reparation is expected, to the extent that it remains possible. Anybody who causes death when they fall sufficiently far below standard can expect to be indicted for manslaughter and, on conviction, punished and shamed. There is a principle in criminal law called fair labelling. The name of a crime must reflect the offence. Manslaughter is a fair label in such cases.

There has been an increasing tendency in the UK for legislators to take power to order reparation away from the civil courts and to attempt to regulate with criminal sanctions. I am not persuaded that is always the right approach.

Trust in elites, bureaucrats, experts, call them what you will, is important. South African statesman Jan Smuts once remarked:

When I look at history I’m a pessimist. When I look at pre-history I’m an optimist.

If you live in the UK then, on any measure you can dream up, life is getting safer and better. That is the triumph of elite engineers, planners, security professionals, physicians … I could go on. If people at large lose faith in professionals then it will be to our common ruin. Only the professionals can work on building the trust we need. Politicians won’t do it.

What did you do today?

References

  1. Wikimedia Commons contributors, “File:Grenfell Tower fire (wider view).jpg,” Wikimedia Commons, the free media repository, https://commons.wikimedia.org/w/index.php?title=File:Grenfell_Tower_fire_(wider_view).jpg&oldid=248417865 (accessed June 25, 2017)
  2. Vaughan, D (1996) The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA, University of Chicago Press
  3. Fukuyama, F (2012) The Origins of Political Order: From Prehuman Times to the French Revolution, Profile Books, p207
  4. Kim, Sung Ho, “Max Weber”, The Stanford Encyclopedia of Philosophy (Fall 2012 Edition), Edward N. Zalta (ed.)
  5. Kahneman, D (2011) Thinking, Fast and Slow, Allen Lane, pp199-254
  6. Bolam v Friern Hospital Management Committee [1957] 1 WLR 582
  7. Pantelli Associates Ltd v Corporate City Developments Number Two Ltd [2010] EWHC 3189 (TCC)
  8. Bolitho v City and Hackney Health Authority [1996] 4 All ER 771
  9. Charlesworth & Percy on Negligence, 12th ed., 2010 and supplements, 7-46
  10. Sahib Foods Ltd & Ors v Paskin Kyriakides Sands (A Firm) [2003] EWHC 142 (TCC) at [43]
  11. Green v Building Scene Limited [1994] PIQR P259, CA at 269
  12. Coase, R H (1960) “The problem of social cost” Journal of Law and Economics 3, 1-44
  13. Raboy, M (2016) Marconi: The Man Who Networked the World, Oxford, p176

Building targets, constructing behaviour

Recently, the press reported that UK construction company Bovis Homes Group PLC had run into trouble for encouraging new homeowners to move into unfinished homes and had, as a result, faced a barrage of complaints about construction defects. It turns out that these practices were motivated by a desire to hit ambitious growth targets. It has all had a substantial impact on the company’s trading position and led to markdowns in Bovis shares.1

I have blogged about targets before. It is worth repeating what I said there about the thoughts of John Pullinger, head of the UK Statistics Authority. He gave a trenchant warning about the “unsophisticated” use of targets. He cautioned:2

Anywhere we have had targets, there is a danger that they become an end in themselves and people lose sight of what they’re trying to achieve. We have numbers everywhere but haven’t been well enough schooled on how to use them and that’s where problems occur.

He went on.

The whole point of all these things is to change behaviour. The trick is to have a sophisticated understanding of what will happen when you put these things out.

That message was clearly one that Bovis didn’t get. They legitimately adopted an ambitious growth target but they forgot a couple of things. They forgot that targets, if not properly risk assessed, can create perverse incentives to distort the system. They forgot to think about how manager behaviour might be influenced. Leaders need to be able to harness insights from behavioural economics. Further, a mature system of goal deployment imposes a range of metrics across a business, each of which has to contribute to the global organisational plan. It is no use only measuring sales if measures of customer satisfaction and input measures about quality are neglected or even deliberately subverted. An organisation needs a rich dashboard and needs to know how to use it.

Critically, it is a matter of discipline. Employees must be left in no doubt that lack of care in maintaining the integrity of the organisational system and pursuing customer excellence will not be excused by mere adherence to a target, no matter how heroic. Bovis clearly had a culture in which attention to customer requirements was not thought important by the staff. That is inevitably a failure of leadership.

Compare and contrast

Bovis are an interesting contrast with supermarket chain Sainsbury’s who featured in a law report in the same issue of The Times.3 Bovis and Sainsbury’s clearly have very different approaches as to how they communicate to their managers what is important.

Sainsbury’s operated a rigorous system of surveying staff engagement which aimed to embrace all employees. It was “deeply engrained in Sainsbury’s culture and was a critical part of Sainsbury’s strategy”. An HR manager sent an email to five store managers suggesting that the rigour could be relaxed. Not all employees needed to be engaged, he said, and participation could be restricted to the most enthusiastic. That would have been a clear distortion of the process.

Mr Colin Adesokan was a senior manager who subsequently learned of the email. He asked the HR manager to explain what he had meant but received no response and the email was recirculated. Adesokan did nothing. When his inaction came to the attention of the chief executive, Adesokan was dismissed summarily for gross misconduct.

He sued his employer and the matter ended up in the Court of Appeal, Adesokan arguing that such mere inaction over a colleague’s behaviour was incapable of constituting gross misconduct. The Court of Appeal did not agree. They found that, given the significance placed by Sainsbury’s on the engagement process, the trial judge had been entitled to find that Adesokan had been seriously in dereliction of his duty. That failing constituted gross misconduct because it had the effect of undermining the trust and confidence in the employment relationship. Adesokan seemed to have been indifferent to what, in Sainsbury’s eyes, was a very serious breach of an important procedure. Sainsbury’s had been entitled to dismiss him summarily for gross misconduct.

That is process discipline. That is how to manage it.

Display constancy of purpose in communicating what is important. Do not turn a blind eye to breaches. Do not tolerate those who would turn the blind eye. When you combine that with mature goal deployment and sophistication as to how to interpret variation in metrics then you are beginning to master, at least some parts of, how to run a business.

References

  1. “Share price plunges as Bovis tries to rebuild customers’ trust” (paywall), The Times (London), 20 February 2017
  2. “Targets could be skewing the truth, statistics chief warns” (paywall), The Times (London), 26 May 2014
  3. Adesokan v Sainsbury’s Supermarkets Ltd [2017] EWCA Civ 22, The Times, 21 February 2017 (paywall)