Data versus modelling

Life can only be understood backwards; but it must be lived forwards.

Søren Kierkegaard

Journalist James Forsyth was brave enough to write the following in The Spectator, 4 July 2020, in the context of reform of the UK civil service.

The new emphasis on data must properly distinguish between data and modelling. Data has real analytical value – it enables robust discussion of what has worked and what has not. Modelling is a far less exact science. In this [Covid-19] crisis, as in the 2008 financial crisis, models have been more of a hindrance than a help.

Now, this glosses a number of issues that I have gone on about a lot on this blog. It’s a good opportunity for me to challenge again what I think I have learned from a career in data, modelling and evidence.

Data basics

Pick up your undergraduate statistics text. Turn to page one. You will find this diagram.


The population, and be assured I honestly hate that term but I am stuck with it, is the collection of all things or events, individuals, that I passionately want to know about. All that I am willing to pay money to find out about. Many practical facets of life prevent me from measuring every single individual. Sometimes it’s worth the effort and that’s called a census. Then I know everything, subject to the performance of my measurement process. And if you haven’t characterised that beforehand you will be in trouble. #MSA

In many practical situations, we take a sample. Even then, not every single individual in the population will be available for sampling within my budget. Suppose I want to market soccer merchandise to all the people who support West Bromwich Albion. I have no means to identify who all those people are. I might start with season ticket holders, or individuals who have bought a ticket online from the club in the past year, or paid for multiple West Brom games on subscription TV. I will not even have access to all those. Some may have opted to protect their details from marketing activities under the UK GDPR. What is left, no matter how I choose to define it, is called the sampling frame. That is the collection of individuals that I have access to and can interrogate, in principle. The sampling frame is all those items I can put on a list from one to whatever. I can interrogate any of them. I will probably, just because of cost, take a subset of the frame as my sample. As a matter of pure statistical theory, I can analyse and quantify the uncertainty in my conclusions that arises from the limited extent of my sampling within the frame, at least if I have adopted one of the canonical statistical sampling plans.

However, statistical theory tells me nothing about the uncertainty that arises in extrapolating (yes it is!) from frame to population. Many supporters will not show up in my frame, those who follow from the sports bar for example. Some in the frame may not even be supporters but parents who buy tickets for offspring who have rebelled against family tradition. In this illustration, I have a suspicion that the differences between frame and population are not so great. Nearly all the people in my frame will be supporters and neglecting those outside it may not be so great a matter. The overlap between frame and population is large, even though it may not be perfect. However, in general, extrapolation from frame to population is a matter for my subjective subject matter insight, market and product knowledge. Statistical theory is the trivial bit. Using domain knowledge to go from frame to population is the hard work. Not only is it hard work, it bears the greater part of the risk.
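A toy simulation (all the rates below are invented) makes the division of labour concrete: statistics can bound the sample-to-frame error, but it is silent on the frame-to-population gap.

```python
import random

random.seed(42)

# Hypothetical numbers: 30% of the whole population of supporters would buy
# merchandise, but the frame (season-ticket holders etc.) is keener at 40%.
population_rate = 0.30
frame_rate = 0.40
frame = [1 if random.random() < frame_rate else 0 for _ in range(20_000)]

# A random sample of 500 from the frame estimates the FRAME rate well...
sample = random.sample(frame, 500)
estimate = sum(sample) / len(sample)
print(f"sample estimate of frame rate: {estimate:.2f}")

# ...and classical theory bounds that error (approximate 95% interval):
half_width = 1.96 * (estimate * (1 - estimate) / len(sample)) ** 0.5
print(f"margin of error within the frame: +/- {half_width:.2f}")

# But no amount of sampling shrinks the frame-to-population gap:
print(f"bias invisible to the statistics: {frame_rate - population_rate:.2f}")
```

Taking a sample ten times larger would shrink the margin of error but leave the 0.10 bias untouched. Only domain knowledge tells you whether the frame resembles the population.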

Enumerative and analytic statistics

W Edwards Deming was certainly the most famous statistician of the twentieth century. So long ago now. He made a famous distinction between two types of statistical study.

Enumerative study: A statistical study in which action will be taken on the material in the frame being studied.

Analytic study: A statistical study in which action will be taken on the process or cause-system that produced the frame being studied. The aim being to improve practice in the future.

Suppose that a company manufactures 1000 overcoats for sale on-line. An inspector checks each overcoat of the 1000 to make sure it has all three buttons. All is well. The 1000 overcoats are released for sale. No way to run a business, I know, but an example of an enumerative study. The 1000 overcoats are the frame. The inspector has sampled 100% of them. Action has been taken on the 1000 overcoats, the 1000 overcoats that were, themselves, the sampling frame. Sadly, this is what so many people think statistics is all about. There is no ambiguity here in extrapolating from frame to population as the frame is the population.

Deming’s definition of an analytic study is a bit more obscure with its reference to cause systems. But let’s take a case that is, at once, extreme and routine.

When we are governing or running a commercial enterprise or a charity, we are in the business of predicting the future. The past has happened and we are stuck with it. This is what our world looks like.


The frame available for sampling is the historical past. The data that you have is a sample from that past frame. The population you want to know about is the future. There is no area of overlap between past and future, between frame and population. All that stuff in statistics books about enumerative studies, that is most of the contents, will not help you. Issues of extrapolating from sample to frame, the tame statistical matters in the text books, are dwarfed by the audacity of projecting the frame onto an ineffable future.

And, as an aside, just think about what that means when we are drawing conclusions about future human health from past experiments on mice.

What Deming pointed towards, with his definition of analytic study, is that, in many cases, we have enough faith to believe that both the past and future are determined by a common system of factors, drivers, mechanisms, phenomena and causes, physiochemical and economic, likely interacting in a complicated but regular way. This is what Deming meant by the cause system.

Managing and governing are both about pulling levers to effect change. Dwelling on the past will only yield beneficial future change if exploited, mercilessly, to understand the cause system. To characterise what are the levers that will deliver future beneficial outcomes. That was Deming’s big challenge.

The inexact science of modelling

And to predict, we need a model of the cause system. This is unavoidable. Sometimes we are able to use the simplest model of all: that the stream of data we are bothered about is exchangeable, or, if you prefer, stable and predictable. As I have stressed so many times before on this blog, to do that we need:

  • Trenchant criticism of the experience base that shows an historical record of exchangeability; and
  • Enough subject matter insight into the cause system to believe that such exchangeability will be maintained, at least into an immediate future where foresight would be valuable.
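One rough, practical way to mount that criticism of the experience base is a Shewhart individuals (XmR) chart. A minimal sketch, with invented data (2.66 is the standard XmR chart constant):

```python
# Shewhart individuals (XmR) chart as a crude test of exchangeability.
# The data are invented weekly measurements; 2.66 is the standard constant.
data = [52, 49, 51, 50, 53, 48, 50, 52, 49, 51, 50, 47, 53, 50, 49]

mean = sum(data) / len(data)
moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Natural process limits: mean +/- 2.66 x average moving range.
upper = mean + 2.66 * mr_bar
lower = mean - 2.66 * mr_bar

signals = [x for x in data if not (lower <= x <= upper)]
print(f"limits: {lower:.1f} to {upper:.1f}, signals: {signals}")
```

No points outside the limits is an absence of evidence against exchangeability, which is not the same as positive evidence for it. The second bullet, subject matter insight, has to do the rest of the work.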

Here, there is no need quantitatively to map out the cause system in detail. We are simply relying on its presumed persistence into the future. It’s still a model. Of course, the price of extrapolation is eternal vigilance. Philip Tetlock drew similar conclusions in Superforecasting.

But often we know that critical influences on the past are prey to change and variation. Climates, councils, governments … populations, tastes, technologies, creeds and resources never stand still. As visible change takes place we need to be able to map its influence onto those outcomes that bother us. We need to be able to do that in advance. Predicting sales of diesel motor vehicles based on historical data will have little prospect of success unless we know that they are being regulated out of existence, in the UK at least. And we have to account for that effect. Quantitatively. This requires more sophisticated modelling. But it remains essential to any form of prediction.

I looked at some of the critical ideas in modelling here, here and here.

Data v models

The purpose of models is not to fit the data but to sharpen the questions.

Samuel Karlin

Nothing is more useless than the endless collection of data without a will to action. Action takes place in the present with the intention of changing the future. To use historical data to inform our actions we need models. Forsyth wants to know what has worked in the past and what has not. That was then, this is now. And it is not even now we are bothered about but the future. Uncritical extrapolation is not robust analysis. We need models.

If we don’t understand these fundamental issues then models will seem more a hindrance than a help.

But … eternal vigilance.

Social distancing and the Theory of Constraints


An organised queue or line1

I was listening to the BBC News the other evening. There was discussion of return to work in the construction industry. A site foreman was interviewed and he was clear in his view that work could be resumed, social distancing observed, safety protected and valuable work done.

Workplace considerations are quite different from those in my recent post in which I was speculating how an “invisible hand” might co-ordinate independently acting and relatively isolated agents who were aspiring to socially isolate. The foreman in the interview had the power to define and enforce a business process, repeatable, measurable, improvable and adaptable.

Of course, the restrictions imposed by Covid-19 will be a nuisance. But how much? To understand the real impact they may have on efficiency requires a deeper analysis of the business process. I’m sure that the foreman and his colleagues had done it.

There won’t be anyone reading this blog who hasn’t read Eliyahu Goldratt’s book, The Goal.2 The big “takeaway” of Goldratt’s book is that some of the most critical outcomes of a business process are fundamentally limited by, perhaps, a single constraint in the value chain. The constraint imposes a ceiling on sales, throughput, cash flow and profit. It has secondary effects on quality, price, fixed costs and delivery. In many manufacturing processes it will be easy to identify the constraint. It will be the machine with the big pile of work-in-progress in front of it. In more service-oriented industries, finding the constraint may require some more subtle investigation. The rate at which the constraint works determines how fast material moves through the process towards the customer.

The simple fact is that much management energy expended in “improving efficiency” has nil (positive) effect on effectiveness, efficiency or flexibility (the “3Fs”). Working furiously will not, of itself, promote the throughput of the constraint. Measures such as Overall Equipment Effectiveness (OEE) are useless and costly if applied to business steps that, themselves, are limited by the performance of a constraint that lies elsewhere.

That is the point about the construction industry, and much else. The proximity of the manual workers is not necessarily the constraint. The same must be true in many other businesses and processes.

I did a quick internet search on the Theory of Constraints and the current Covid-19 pandemic. I found only this, rather general, post by Domenico Lepore. There really wasn’t anything else on the internet that I could find. Lepore is the author of, probably, the most systematic and practical explanation of how to implement Goldratt’s theories.3 Once the constraint is identified:

  • Prioritise the constraint. Make sure it is never short of staff, inputs or consumables. Eliminate unplanned downtime by rationalising maintenance. Plan maintenance when it will cause least disruption but work on the maintenance process too. Measure OEE if you like. On the constraint.
  • Make the constraint’s performance “sufficiently regular to be predictable”.4 You can now forecast and plan. At last.
  • Improve the throughput of the constraint until it is no longer the constraint. Now there is a new constraint to attack.
  • Don’t forget to keep up the good work on the old constraint.
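Goldratt’s central claim is easy to see in a toy model (the process steps and rates below are invented): the line’s throughput is the constraint’s throughput, and improving anything else changes nothing.

```python
# Toy three-step line: each step has a capacity in units/hour (invented).
# Whole-line throughput is capped by the slowest step -- the constraint.
capacities = {"cutting": 120, "sewing": 45, "packing": 200}

constraint = min(capacities, key=capacities.get)
throughput = capacities[constraint]
print(f"constraint: {constraint}, line throughput: {throughput}/hour")

# Improving a non-constraint step changes nothing:
capacities["packing"] = 400
print(f"after doubling packing: {min(capacities.values())}/hour")

# Improving the constraint lifts throughput until another step binds:
capacities["sewing"] = 150
new_constraint = min(capacities, key=capacities.get)
print(f"new constraint: {new_constraint}, "
      f"throughput: {capacities[new_constraint]}/hour")
```

Hence the pointlessness of measuring OEE on “cutting” while “sewing” is the bottleneck, and hence the final bullet: once the constraint moves, the improvement effort must move with it.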

This is, I think, a useful approach to some Covid-19 problems. Where is the constraint? Is it physical proximity? If so, work to manage it. Is it something else? Then you are already stuck with the throughput of the constraint. Serve it in a socially-distanced way.

The court system of England and Wales

Here is a potential example that I was thinking about. Throughput in the court system of England and Wales has, since the onset of Covid-19, collapsed. Certainly in the civil courts, personal injury cases, debt recovery, commercial cases, property disputes, professional negligence claims. There has been more action in criminal and family courts, as far as I can see. Some hearings have taken place by telephone or by video but throughput has been miserable. Most civil courts remain closed other than for matters that need the urgent attention of a judge.

And that is the point of it. The judge, judicial time, is the constraint in the court system. Judgment, or at least the prospect thereof, is the principal way the courts add value. Much of civil procedure is aimed at getting the issues in a proper state for the judge to assess them efficiently and justly. The byproduct of that is that, once the parties have each clarified the issues in dispute, there may then be a window for settlement.

What has horrified the court service is the prospect of the sort of scrum of lawyers and litigants that is common in the inadequate waiting and conference facilities of most courts. That scrum is seen as important. It gives trial counsel an opportunity to review the evidence with their witnesses. It provides an opportunity for negotiation and settlement. Trial counsel will be there face to face with their clients. Offer and counter offer can pass quickly and intuitively between seasoned professionals. Into the mix are added the ushers and clerks who manage the parties securely into the court room. It is a concentrated mass of individuals, beset with frequently inadequate washing facilities.

Court rooms themselves present little problem. Most civil courts in England and Wales are embarrassingly expansive for the few people that generally attend hearings. Very commonly just the judge and two advocates. I cannot think of that many occasions when there will have been any real difficulty in keeping two metres apart.

With the judge as the constraint and the court room not, what remains is the issue of getting people into court. Why is that mass of people routinely in the waiting room? Well, to some extent it serves, in the language of Lean Production, as a “supermarket”,5 a reservoir of inputs that guarantees the judicial constraint does not run dry of work.6 Effective but not necessarily efficient. This is needed because hearing lengths are difficult to predict. Moreover, some matters settle at court, as set out above. Some the afternoon before. For some matters, nobody turns up. The parties have moved on and not felt it important to inform the court.

As to providing the opportunity for taking instructions and negotiation that is, surely, a matter that the parties can be compelled to address, by telephone or video, on the previous afternoon. The courts here can borrow ideas from Single-Minute Exchange of Dies. This, in any event, seems a good idea. The parties would then be attending court ready to go. The waiting facilities would not be needed for their benefit. The court door settlements would have been dealt with.

The only people who need waiting accommodation are the participants in the next hearing. In most cases they can be accommodated, distanced and will have sufficient, even if sparse, washing facilities. These ideas are not foreign to the court system. It has been many years since a litigant or lawyer could just turn up at the court counter without first telephoning for an appointment, even on an urgent matter.

That probably involves some less ambitious listing of hearings. It may well have moved the constraint away from the judge to the queuing of parties into court. However, once the system is established, and recognised as the constraint, it is there to be improved. Worked on constantly. Thought about in the bath. Worried at on a daily basis.

Generate data. Analyse it. Act on it. Work. Use an improvement process. DMAIC is great but other improvement processes are available.

I’m sure all this thinking is going on. I can say no more.


  1. Image courtesy of Wikipedia and subject to Creative Commons license – for details see here
  2. Goldratt, E M & Cox, J (1984) The Goal: A Process of Ongoing Improvement, Gower
  3. Lepore, D & Cohen, O (1999) Deming and Goldratt: The Theory of Constraints and the System of Profound Knowledge, North River Press
  4. Kahneman, D (2011) Thinking, Fast and Slow, Allen Lane, p240
  5. “What a good supermarket looks like”, Planet Lean, 4 April 2019, retrieved 24/5/20
  6. Rother, M & Shook, J (2003) Learning to See: Value-stream Mapping to Create Value and Eliminate Muda, Lean Enterprise Institute, p46


Social distancing and the El Farol Bar problem

Oh, that place. It’s so crowded nobody goes there anymore.

Yogi Berra

If 2020 has given the world a phrase then that phrase is social distancing. It put me in mind of a classic analysis in economics/complexity theory, the El Farol Bar problem.

I have long enjoyed running in Hyde Park. With social distancing I am aware that I need to time and route my runs to avoid crowds. The park is, legitimately, popular and a lot of people live within reasonable walking distance. Private gardens are at a premium in this part of West London. The pleasing thing is that people in general seem to have spread out their visits and the park tends not to get too busy, weather depending. It is almost as though the populace had some means of co-ordinating their visits. That said, I can assure you that I don’t phone up the several hundred thousand people who must live in the park’s catchment area.

The same applies to supermarket visits. Things seem to have stabilised. This put me in mind of W B Arthur’s 1994 speculative analysis of attendances at his local El Farol bar.1 The bar was popular but generally seemed to be attended by a comfortable number of people, neither unpleasantly over crowded nor un-atmospherically quiet. This seems odd. Individual attendees had no obvious way of communicating or coordinating. If people, in general, believed that it would be over crowded then, pace Yogi Berra, nobody would go, thwarting their own expectations. But if there was a general belief it would be empty then everybody would go, again guaranteeing that their own individual forecasts were refuted.

Arthur asked himself how, given this analysis, people seemed to be so good at turning up in the right numbers. Individuals must have some way of predicting the attendance even though that barely seemed possible with the great number of independently acting people.

The model that Arthur came up with was to endow every individual with an ecology of prediction formulas or rules, each applying a simple recipe to the experience base to make a prediction of attendance the following week. Some of Arthur’s examples, along with some others, were:

  • Predict the same as last week’s attendance.
  • Predict the average of the last 4 weeks’ attendances.
  • Predict the same as the attendance 2 weeks ago.
  • Add 5 to last week’s attendance.

Now, every time an individual gets another week’s data he assesses the accuracy of the respective rules. He then adopts the currently most accurate rule to predict next week’s attendance.

Arthur ran a computer simulation. He set the optimal attendance at El Farol as 60. An individual predicting over 60 attendees would stay away. An individual predicting fewer would attend. He found that the time sequence of weekly attendances soon stabilised around 60.
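Arthur’s set-up can be sketched in a few lines of code. The rule ecology, seed attendances and parameters below are invented simplifications of his paper, not his actual simulation, and how neatly the output settles near the threshold depends on the rules chosen.

```python
import random

random.seed(1)
THRESHOLD = 60        # attend only if you predict fewer than 60 others
N_AGENTS = 100
WEEKS = 100

# A small ecology of predictors (illustrative rules in Arthur's spirit),
# each mapping the attendance history to a forecast for next week.
rules = [
    lambda h: h[-1],                            # same as last week
    lambda h: sum(h[-4:]) / len(h[-4:]),        # average of last 4 weeks
    lambda h: h[-2] if len(h) > 1 else h[-1],   # same as two weeks ago
    lambda h: min(100, h[-1] + 5),              # last week plus 5
    lambda h: 100 - h[-1],                      # mirror of last week
]

# Each agent holds a random subset of the rules and an arbitrary first pick.
agent_rules = [random.sample(range(len(rules)), 3) for _ in range(N_AGENTS)]
current = [random.choice(rs) for rs in agent_rules]
history = [44, 78, 56, 15, 23, 67, 84, 34]      # seed attendances (invented)

def score(rule, h):
    """Total absolute error of a rule over the last few observed weeks."""
    return sum(abs(rules[rule](h[:t]) - h[t]) for t in range(len(h) - 4, len(h)))

for week in range(WEEKS):
    predictions = [rules[current[a]](history) for a in range(N_AGENTS)]
    attendance = sum(1 for p in predictions if p < THRESHOLD)
    history.append(attendance)
    # After seeing the outcome, each agent re-scores their own rules and
    # adopts whichever has been most accurate lately.
    for a in range(N_AGENTS):
        current[a] = min(agent_rules[a], key=lambda r: score(r, history))

mean_recent = sum(history[-20:]) / 20
print(f"mean attendance over the last 20 weeks: {mean_recent:.1f}")
```

The interesting design feature is that no agent models any other agent: co-ordination, such as it is, emerges purely from each agent’s data-adaptive switching between rules.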

Fig 1

There are a few points to pull out of that about human learning in general. What Arthur showed is that individuals, and communities thereof, have the ability to learn in an ill-defined environment in an unstructured way. Arthur was not suggesting that individuals co-ordinate by self-consciously articulating their theories and systematically updating on new data. He was suggesting the sort of unconscious and implicit decision mechanism that may inhabit the windmills of our respective minds. Mathematician and philosopher Alfred North Whitehead believed that much of society’s “knowledge” was tied up in such culturally embedded and unarticulated algorithms.2

It is a profoundly erroneous truism, repeated by all copy-books and by eminent people when they are making speeches, that we should cultivate the habit of thinking of what we are doing. The precise opposite is the case. Civilization advances by extending the number of important operations which we can perform without thinking about them. Operations of thought are like cavalry charges in a battle — they are strictly limited in number, they require fresh horses, and must only be made at decisive moments.

The regularity trap

Psychologists Gary Klein and Daniel Kahneman investigated how firefighters were able to perform so successfully in assessing a fire scene and making rapid, safety critical decisions. Lives of the public and of other firefighters were at stake. Together, Klein and Kahneman set out to describe how the brain could build up reliable memories that would be activated in the future, even in the agony of the moment. They came to the conclusion that there are two fundamental conditions for a human to acquire a predictive skill.3

  • An environment that is sufficiently regular to be predictable.
  • An opportunity to learn these regularities through prolonged practice.

Arthur’s Fig.1, after the initial transient, looks impressively regular, stable and predictable. Some “invisible hand” has moved over the potential attendees and coordinated their actions. So it seems.

Though there is some fluctuation it is of a regular sort, what statisticians call exchangeable variation.

The power of a regular and predictable process is that it does enable us to keep Whitehead’s cavalry in reserve for what Kahneman called System 2 thinking, the reflective analytical dissection of a problem. It is the regularity that allows System 1 thinking where we can rely on heuristics, habits and inherited prejudices, the experience base.

The fascinating thing about the El Farol problem is that the regularity arises, not from anything consistent, but from data-adaptive selection from the ecology of rules. It is not obvious in advance that such can give rise to any, even apparent, stability. But there is a stability, and an individual can rely upon it to some extent. Certainly as far as a decision to spend a sociable evening is concerned. However, therein lies its trap.

Tastes in venue, rival attractions, new illnesses flooding the human race (pace Gottfried Leibniz), economic crises, … . Sundry matters can upset the regular and predictable system of attendance. And they will not be signalled in advance in the experience base.

Predicting on the basis of a robustly measured, regular and stable experience base will always be a hostage to emerging events. Agility in the face of emerging data-signals is essential. But understanding the vulnerabilities of current data patterns is important too. In risk analysis, understanding which historically stable processes are sensitive to foreseeable crises is essential.

Folk sociology

Folk physics is the name given to the patterns of behaviour that we all exhibit that enable us to catch projectiles, score “double tops” on the dart board, and which enabled Michel Platini to defy the wall with his free kicks. It is not the academic physics of Sir Isaac Newton which we learn in courses on theoretical mechanics and which enables the engineering of our most ambitious monumental structures. However, it works for us in everyday life, lifting boxes and pushing buggies.4

Apes, despite their apparently impressive ability to use tools, it turns out, have no internal dynamic models or physical theories at all. They are unable to predict in novel situations. They have no System 2 thinking. They rely on simple granular rules and heuristics, learned by observation and embedded by successful repetition. It seems more than likely that, in many circumstances, as advanced by Whitehead, that is true of humans too.5 Much of our superficially sophisticated behaviour is more habit than calculation, though habit in which is embedded genuine knowledge about our environment and successful strategies of value creation.6 Kahneman’s System 1 thinking.

The lesson of that is to respect what works. But where the experience base looks like the result of a pragmatic adjustment to external circumstances, indulge in trenchant criticism of historical data. And remain agile.

Next time I go out for a run, I’m going to check the weather.


  1. Arthur, W B (1994) “Inductive reasoning and bounded rationality”, The American Economic Review, 84 (2), Papers and Proceedings of the Hundred and Sixth Annual Meeting of the American Economic Association, 406-411
  2. Whitehead, A N (1911) An Introduction to Mathematics, Ch.5
  3. Kahneman, D (2011) Thinking, Fast and Slow, Allen Lane, p240
  4. McCloskey, M (1983) “Intuitive physics”, Scientific American, 248(4), 122-130
  5. Povinelli, D J (2000) Folk Physics for Apes: The Chimpanzee’s Theory of How the World Works, Oxford
  6. Hayek, F A (1945) “The use of knowledge in society”, The American Economic Review, 35(4), 519-530

The audit of pestilence – How will we know how many Covid-19 killed?

In the words of the late, great Kenny Rogers, “There’ll be time enough for countin’/ When the dealin’s done”, but sometime, at the end of the Covid-19 crisis, somebody will ask “How many died?” and, more speculatively, “How many deaths were avoidable?”

There always seems an odd and uncomfortable premise at the base of that question, that somehow there is a natural or neutral, unmarked, control, null, default or proper, legitimate number who “should have” died, absent the virus. Yet that idea is challenged by our certain collective knowledge that, in living memory, there has been a persistent rise in life expectancy and longevity. Longevity has not stood still.

And I want to focus on that as it relates to a problem that has been bothering me for a while. It was brought into focus a few weeks ago by a headline in the Daily Mail, the UK’s house journal for health scares and faux consumer outrage.1

Life expectancy in England has ground to a halt for the first time in a century, according to a landmark report.

For context, I should say that this appeared 8 days after the UK government’s first Covid-19 press conference. Obviously, somebody had an idea about how much life expectancy should be increasing. There was some felt entitlement to an historic pattern of improvement that had been, they said, interrupted. It seems that the newspaper headline was based on a report by Sir Michael Marmot, Professor of Epidemiology and Public Health at University College London.2 This was Marmot’s headline chart.

Marmot headline

Well, not quite the Daily Mail‘s “halt” but I think that there is no arguing with the chart. Despite there obviously having been some reprographic problem that has resulted in everything coming out in various shades of green and some gratuitous straight lines, it is clear that there was a break point around 2011. Following that, life expectancy has grown at a slower rate than before.

The chart did make me wonder though. The straight lines are almost too good to be true, like something from a freshman statistics service course. What happened in 2011? And what happened before 1980? Further, how is life expectancy for a baby born in 2018 being measured? I decided to go to the Office for National Statistics (UK) (“the ONS”) website and managed to find data back to 1841.


I have added some context but other narratives are available. Here is a different one.3


As Philip Tetlock4 and Daniel Kahneman5 have both pointed out, it is easy to find a narrative that fits both the data and our sympathies, and to use it as a basis for drawing conclusions about cause and effect. On the other hand, building a narrative is one of the most important components in understanding data. The way that data evolves over time and its linkage into an ecology of surrounding events is the very thing that drives our understanding of the cause system. Data has no meaning apart from its context. Knowledge of the cause system is critical to forecasting. But please use with care and continual trenchant criticism.

The first thing to notice from the chart is that there has been a relentless improvement in life expectancy over almost two centuries. However, it has not been uniform. There have been periods of relatively slow and relatively rapid growth. I could break the rate of improvement down into chunks as follows.

Narrative                           From    To      Annual increase in      Error (yr)
                                                    life expectancy (yr)
1841 to opening of London Sewer     1841    1865        -0.016                0.034
London Sewer to Salvarsan           1866    1907         0.192                0.013
Salvarsan to penicillin             1908    1928         0.458                0.085
Penicillin to creation of NHS       1929    1948         0.380                0.047
NHS to Thatcher election            1949    1979         0.132                0.007
Thatcher to financial crisis        1980    2008         0.251                0.004
Financial crisis to 2018            2009    2018         0.122                0.022

Here I have rather crudely fitted a straight line to the period measurement (I am going to come back to what this means) for men over the various epochs to get a feel for the pace of growth. It is a very rough and ready approach. However, it does reveal that the real periods of improvement of life expectancy were from 1908 to 1948, notoriously the period of two World Wars and an unmitigated worldwide depression.
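For what it’s worth, the rough-and-ready fit is just an ordinary least-squares line per epoch. A sketch with an invented illustrative series (the real figures are on the ONS website):

```python
# Invented illustrative series: year vs male life expectancy at birth,
# roughly in the spirit of the 1949-1979 epoch. Not real ONS figures.
years = [1949, 1955, 1961, 1967, 1973, 1979]
life_exp = [66.3, 67.1, 67.9, 68.7, 69.4, 70.2]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(life_exp) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, life_exp))

# Ordinary least-squares slope = annual increase in life expectancy.
slope = s_xy / s_xx
residuals = [y - (y_bar + slope * (x - x_bar)) for x, y in zip(years, life_exp)]
std_err = (sum(r ** 2 for r in residuals) / (n - 2) / s_xx) ** 0.5
print(f"annual increase: {slope:.3f} yr/yr, standard error: {std_err:.3f}")
```

The standard error quantifies only the scatter about the fitted line within the epoch; the choice of break points between epochs, a narrative decision, is where the real uncertainty lives.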

Other narratives are available.

It does certainly look as though improvement has slowed since the financial crisis of 2008. However, it has only gone back to the typical rate between 1948 and 1979, a golden age for some people I think, and nowhere near the triumphal march of the first half of the twentieth century. We may as well ask why the years 1980 to 2008 failed to match the heroic era.

There are some real difficulties in trying to come to any conclusions about cause and effect from this data.

Understanding the ONS numbers

In statistics, life expectancy is fully characterised by the survivor function. Once we know that, we can calculate everything we need, in particular life expectancy (mean life). Any decent textbook on survival analysis tells you how to do this.6 The survivor function tells us the probability that an individual will survive beyond time t and die at some later unspecified date. Survivor functions look like this, in general.

Survivor curve

It goes from t=0 until the chances of survival have vanished, steadily decreasing with moral certainty. In fact, you can extract life expectancy (mean life) from this by measuring the area under the curve, perhaps with a planimeter.
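The area-under-the-curve claim is easy to verify numerically on a toy survivor function whose mean life is known exactly, here an exponential with mean 80 years (a convenient choice, not a realistic human survivor curve):

```python
import math

# Toy survivor function: S(t) = exp(-t/80), whose mean life is exactly 80.
def survivor(t):
    return math.exp(-t / 80.0)

# Life expectancy (mean life) = area under S(t), here by the trapezoidal
# rule over a horizon long enough that S(t) has effectively vanished.
t_max, n = 2000.0, 200_000
dt = t_max / n
area = sum(0.5 * (survivor(i * dt) + survivor((i + 1) * dt)) * dt
           for i in range(n))
print(f"numerical mean life: {area:.2f} years")   # close to 80
```

A numerical quadrature is the modern planimeter: any survivor function estimated from data can be integrated the same way to yield the implied life expectancy.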

However, what we are talking about is a survivor function that changes over time. Not in the sense that a survivor function falls away as an individual ages. A man born in 1841 will have a particular survivor function. A man born in 2018 will have a better one. We have a sequence of generally improving survivor functions over the years.

Now you will see the difficulty in estimating the survivor function for a man born in 1980. Most of them are still alive and we have nil data on that cohort’s specific fatalities after 40 years of age. Now, there are statistical techniques for handling that but none that I am going to urge upon you. Techniques not without their important limitations but useful in the right context. The difficulty in establishing a survivor function for a newborn in 2020 is all the more problematic. We can look at the age of everyone who dies in a particular year, but that sample will be a mixture of men born in each and every year over the preceding century or so. The individual years’ survivor functions will be “smeared” out by the instantaneous age distribution of the UK, what in mathematical terms is called convolution. That helps us understand why the trends seem, in general, well behaved. What we are looking at is the aggregate of effects over the preceding decades. There will, importantly in the current context, be some “instantaneous” effects from epidemics and wars but those are isolated events within the general smooth trend of improvement.

There is no perfect solution to these problems. The ONS takes two approaches, both of which it publishes.7 The first is simply to regard the current distribution of ages at death as though it represented the survivor function for a person born this year. This, of course, is a pessimistic outlook for a newborn’s prospects. This year’s data is a mixture of the survivor functions for births over the last century or so, along with instantaneous effects. For much of those earlier decades, life expectancy was signally worse than it is now. However, the figure does give a conservative view and it does enable a year-on-year comparison of how we are doing. It captures instantaneous effects well. The ONS actually takes the recorded deaths over the last three consecutive years. This is what they refer to as the period measurement and it is what I have used in this post.

The other method is slightly more speculative in that it attempts to reconstruct a “true” survivor function but is forced into making that through assuming an overall secular improvement in longevity. This is called the cohort measurement. The ONS use historical life data then assume that the annual rate of increase in life expectancy will be 1.2% from 2043 onwards. Rates between 2018 and 2043 are interpolated. The cohort measurement yields a substantially higher life expectancy than the period measurement, 87.8 years as against 79.5 years for 2018 male births.
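The interpolation step can be sketched as follows. The 1.2% figure from 2043 is the ONS assumption quoted above; the 2018 starting rate in the sketch is a placeholder of my own, not an ONS number.

```python
def assumed_improvement(year, rate_2018=0.015, rate_2043=0.012):
    """Assumed annual rate of improvement in longevity: linear
    interpolation between 2018 and 2043, constant thereafter.
    rate_2018 (1.5%) is a made-up placeholder; the 1.2% rate
    from 2043 onwards is the ONS assumption."""
    if year <= 2018:
        return rate_2018
    if year >= 2043:
        return rate_2043
    frac = (year - 2018) / (2043 - 2018)
    return rate_2018 + frac * (rate_2043 - rate_2018)

print(assumed_improvement(2050))  # 0.012
```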

Endogenous and exogenous improvement

Well, I really hesitated before I used those two economists’ terms but they are probably the most scholarly. I shall try to put it more cogently.

There is improvement contrived by endeavour. We identify some problem, conceive a plausible solution, implement, then measure the results against the pre-solution experience base. There are many established processes for this, DMAIC is a good one, but there is no reason to be dogmatic as to approach.

However, some improvement occurs because there is an environment of appropriate market conditions and financial incentives. It is the environment that is important in turning random, and possibly unmotivated, good ideas into improvement. As German sociologist Max Weber famously observed, “Ideas occur to us when they please, not when it pleases us.”8

For example, in 1858, engineer Joseph Bazalgette proposed an enclosed, underground sewer system for much of London. A causative association between faecal-contaminated water and cholera had been current since the work of John Snow in 1854. That’s another story. Bazalgette’s engineering was instrumental in relieving the city from cholera. That is an improvement procured by endeavour, endogenous if you like.

In 1928, Sir Alexander Fleming noticed how mould, accidentally contaminating his biological samples, seemed to inhibit bacterial growth. Fleming pursued this random observation and ended up isolating penicillin. However, it took a broader environment of value, demand and capital to launch penicillin as a pharmaceutical product, some time in the 1940s. There were critical stages of clinical trials and industrial engineering demanding significant capital investment and constancy of purpose. Howard Florey, Baron Florey, was instrumental and, in many ways, his contribution is greater than Fleming’s. However, penicillin would not have reached the public had the market conditions not been there. Incremental improvements arising from accidents of discovery, nurtured by favourable external economic and political forces, are the exogenous improvements. All the partisans will claim success for their party.

Of course, it is, to some extent, a fuzzy characterisation. Penicillin required Florey’s (endogenous) endeavour. All endeavour takes place within some broader (exogenous) culture of improvement. Paul Ehrlich hypothesised that screening an array of compounds could identify drugs with anti-bacterial properties. Salvarsan’s effectiveness against syphilis was discovered as part of such a programme and then developed and marketed as a product by Hoechst in 1910. An interaction of endogenous and exogenous forces.

It is, for business people who know their analytics, relatively straightforward to identify improvements from endogenous endeavour. But where the dynamics are exogenous, economists can debate and politicians celebrate or dispute. Improvements can variously be claimed on behalf of labour law, state aid, nationalisation, privatisation or market deregulation. Then again, is the whole question of cause and effect slightly less obvious than we think? Moderns carol the innovation of penicillin. We shudder noting that, in 1924, a US President’s son died simply because of an infection originating in an ill-fitting tennis shoe.9 However, looking at the charts of life expectancy, there is no signal from the introduction of penicillin, I think. What caused that improvement in the first half of the twentieth century?

Cause and effect

It was philosopher-scientist-lawyer Francis Bacon who famously observed:

It were infinite for the law to judge the causes of causes and the impression one on another.

We lawyers are constantly involved in disputes over cause and effect. We start off by accepting that nearly everything that happens is as a result of many causes. Everyday causation is inevitably a multifactorial matter. That is why the cause and effect diagram is essential to any analysis, in law, commerce or engineering. However, lawyers are usually concerned with proving that a particular factor caused an outcome. Other factors there may be and that may well be a matter of contribution from other parties but liability turns on establishing that a particular action was part of the causative nexus.

The common law has some rather blunt ways of dealing with the matter. Pre-eminent is the “but for” test. We say that A caused B if B would not have happened but for A. There may well have been other causes of B, even ones that were more important, but it is A that is under examination. That though leaves us with, at least, a couple of problems. Lord Hoffmann pointed out the first problem in South Australia Asset Management Corporation Respondents v York Montague Ltd.10

A mountaineer about to take a difficult climb is concerned about the fitness of his knee. He goes to the doctor who makes a superficial examination and pronounces the knee fit. The climber goes on the expedition, which he would not have undertaken if the doctor told him the true state of his knee. He suffers an injury which is an entirely foreseeable consequence of mountaineering but has nothing to do with his knee.

The law deals with this by various devices: operative cause, remoteness, reasonable foreseeability, reasonable contemplation of the parties, breaks in the chain of causation, boundaries on the duty of care … . The law has to draw a line and avoid “opening the floodgates” of liability.11, 12 How the line can be drawn objectively in social science is a different matter.

The second issue was illustrated in a fine analysis by Prof. David Spiegelhalter as to headlines of 40,000 annual UK deaths because of air pollution.13 Daily Mail again! That number had been based on historical longitudinal observational studies showing greater force of mortality among those with greater exposure to particular pollutants. I presume, though Spiegelhalter does not go into this in terms, that there is some plausible physico-chemical cause system that can describe the mechanism of action of inhaled chemicals on metabolism and the risk of early death.

Thus we can expose a population to a risk with a moral certainty that more will die than absent the risk. That does not, of itself, enable us to attribute any particular death to the exposure. There may, in any event, be substantial uncertainty about the exact level of risk.

The law is emphatic. Mere exposure to a risk is insufficient to establish causation and liability.14 There are a few exceptions. I will not go into them here. The law is willing to find causation in situations that fall short of “but for” where it finds that there was a material contribution to a loss.15 However, a claimant must, in general, show a physical route to the individual loss or injury.16

A question of attribution

Thus, even for those with Covid-19 on their death certificate, the cause will typically be multi-factorial. Some would have died in the instant year in any event. And some others will die because medical resources have been diverted from the quotidian treatment of the systemic perils of life. The local disruption, isolation, avoidance and confinement may well turn out to result in further deaths. Domestic violence is a salient repercussion of this pandemic.

But there is something beyond that. One of Marmot’s principal conclusions was that the recent pause in improvement of life expectancy was the result of poverty. In general, the richer a community becomes, the longer it lives. Poverty is a real factor in early death. On 14 April 2020, the UK Office of Budget Responsibility opined that the UK economy could shrink by 35% by June.17 There was likely to be a long-lasting impact on public finances. Such a contraction would dwarf even the financial crisis of 2008. If the 2008 crisis diminished longevity, what will a Covid-19 depression do? How will deaths then be attributed to the virus?

The audit of Covid-19 deaths is destined to be controversial, ideological, partisan and likely bitter. The data, once it has been argued over, will bear many narratives. There is no “right” answer to this. An honest analysis will embrace multiple accounts and diverse perspectives. We live in hope.

I think it was Jack Welch who said that anybody could manage the short term and anybody could manage the long term. What he needed were people who could manage the short term and the long term at the same time.


  1. “Life expectancy grinds to a halt in England for the first time in 100 YEARS”, Daily Mail, 25/2/20, retrieved 13/4/20
  2. Marmot, M et al. (2020) Health Equity in England: The Marmot Review 10 Years On, Institute of Health Equity
  3. NHS expenditure data from, “How funding for the NHS in the UK has changed over a rolling ten year period”, The Health Foundation, 31/10/15, retrieved 14/4/20
  4. Tetlock, P & Gardner, D (2015) Superforecasting: The Art and Science of Prediction, Random House
  5. Kahneman, D (2011) Thinking, fast and slow, Allen Lane
  6. Mann, N R et al. (1974) Methods for Statistical Analysis of Reliability and Life Data, Wiley
  7. “Period and cohort life expectancy explained”, ONS, December 2019, retrieved 13/4/20
  8. Weber, M (1922) “Science as a vocation”, in Gesammelte Aufsätze zur Wissenschaftslehre, Tübingen, JCB Mohr 1922, 524-555
  9. “The medical context of Calvin Jr’s untimely death”, Coolidge Foundation, accessed 13/4/20
  10. [1997] AC 191 at 213
  11. Charlesworth & Percy on Negligence, 14th ed., 2-03
  12. Lamb v Camden LBC [1981] QB 625, per Lord Denning at 636
  13. “Does air pollution kill 40,000 people each year in the UK?”, D Spiegelhalter, Medium, 20/2/17, retrieved 13/4/20
  14. Wilsher v Essex Area Health Authority [1988] AC 1075, HL
  15. Bailey v Ministry of Defence [2008] EWCA Civ 883, [2009] 1 WLR 1052
  16. Pickford v ICI [1998] 1 WLR 1189, HL
  17. “Coronavirus: UK economy ‘could shrink by record 35%’ by June”, BBC News 14/4/20, retrieved 14/4/20

Just says, “in mice”; just says, “in boys”

If anybody doubts that twitter has a valuable role in the world they should turn their attention to the twitter sensation that is @justsaysinmice.

The twitter feed exposes bad science journalism where extravagant claims are advanced with a penumbra of implication that something relevant to human life or happiness has been certified by peer reviewed science. It often turns out that, when the original research is interrogated, and in fairness at the very bottom of the journalistic puff piece, it just says, “in mice”. “Cauliflower, cabbage, broccoli harbour prostate cancer inhibiting compound” was a recent subeditor’s attention-grabbing headline. But the body of the article just says, “in mice”. Most days the author finds at least one item to tweet.

Population – Frame – Sample

The big point here is one of the really big points in understanding statistics.
We start generating data and doing statistics because there is something out there we are interested in. Some things or events. We call the things and events we are bothered about the population. The problem is that, in the real world, it is often difficult to get hold of all those things or events. In an opinion poll, we don’t know who will vote at the next election, or even who will still be alive. We don’t know all the people who follow a particular sports club. We can’t find everyone who’s ever tasted Marmite and expressed an opinion. Sometimes the events or things we are interested in don’t even exist yet and lie wholly in the future. That’s called prediction and forecasting.

In order to do the sort of statistical sampling that text books tell us about, we need to identify some relevant material that is available to us to measure or interrogate. For the opinion poll it would be everyone on the electoral register, perhaps. Or everyone who can be reached by dialling random numbers in the region of interest. Or everyone who signs up to an online database (seriously). Those won’t be the exact people who will be doing the voting at the next election. Some of them likely will be. But we have to make a judgment that they are, somehow, representative.

Similarly, if we want to survey sports club supporters we could use the club’s supporter database. Or the people who buy tickets online. Or who tweet. Not perfect but, hey! And, perhaps, in some way representative.

The collection of things we are going to do the sampling on is called the sampling frame. We don’t need to look at the whole of the frame. We can sample. And statistical theory assures us about how much the sample can tell us about the frame, usually quite a lot if done properly. But as to the differences between population and frame, that is another question.
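A sketch of that assurance, with invented numbers: sample 1,000 supporters from a hypothetical frame of 100,000 and the usual interval quantifies our uncertainty about the frame, and only the frame.

```python
import random
import math

random.seed(42)
# Invented frame: 100,000 people on a club database, of whom an
# unknown-to-us 62% would answer "yes" to our survey question.
frame = [1 if random.random() < 0.62 else 0 for _ in range(100_000)]

sample = random.sample(frame, 1_000)             # simple random sample
p_hat = sum(sample) / len(sample)                # sample proportion
se = math.sqrt(p_hat * (1 - p_hat) / len(sample))
low, high = p_hat - 1.96 * se, p_hat + 1.96 * se

# The interval quantifies sampling uncertainty within the frame.
# The gap between frame and population is a separate judgment.
print(f"estimate {p_hat:.3f}, 95% interval ({low:.3f}, {high:.3f})")
```

Note what the theory does and does not buy us: the arithmetic above says nothing about whether the database was a sensible stand-in for the supporters we actually care about.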

Enumerative and analytic statistics

These real-world situations lie in contrast to the sort of simplified situations found in statistics textbooks. An inspector randomly samples 5 widgets from a batch of 100 and decides whether to accept or reject the batch (though why anyone would do this still defies rational explanation). Here the frame and population are identical. No need to worry.
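For what it is worth, the arithmetic of that textbook scheme is a hypergeometric calculation, and it hints at why the plan defies rational explanation: with 10% of the batch defective, a 5-widget sample that accepts on zero defectives still passes the batch most of the time.

```python
from math import comb

def p_accept(lot=100, defective=10, n=5):
    """Probability that a simple random sample of n widgets from
    the lot contains no defectives (hypergeometric, accept on zero)."""
    return comb(lot - defective, n) / comb(lot, n)

print(round(p_accept(), 3))  # 0.584
```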

W Edwards Deming was a statistician who, among his other achievements, developed the sampling techniques used in the 1940 US census. Deming thought deeply about sampling and continually emphasised the distinction between the sort of problems where population and frame were identical, what he called enumerative statistics, and the sundry real world situations where they were not, analytic statistics.1

The key to Deming’s thinking is that, where we are doing analytic statistics, we are not trying to learn about the frame, that is not what interests us, we are trying to learn something useful about the population of concern. That means that we have to use the frame data to learn about the cause system that is common to frame and population. By cause system, Deming meant the aggregate of competing, interacting and evolving factors, inherent and environmental, that influence the outcomes both in frame and population. As Donald Rumsfeld put it, the known knowns, the known unknowns and the unknown unknowns.

The task of understanding how any particular frame and population depend on a common cause-system requires deep subject matter knowledge. As does knowing the scope for reading across conclusions.

Just says, “in mice”

Experimenting on people is not straightforward. That’s why we do experiments on mice.

But here the frame and population are wildly disjoint.

Mice frame

So why? Well, apparently, their genetic, biological and behavioural characteristics closely resemble those of humans, and many symptoms of human conditions can be replicated in mice.2 That is, their cause systems have something in common. Not everything, but things useful to researchers and subject matter experts.

Mice cause

Now, that means that experimental results in mice can’t just be read across as though we had done the experiment on humans. But they help subject matter experts learn more about those parts of the cause-system that are common. That might then lead to tentative theories about human welfare that can then be tested in the inevitably more ethically stringent regime of human trials.

So, not only is bad, often sensationalist, data journalism exposed, but we learn a little more about how science is done.

Just says, “in boys”

If the importance of this point needed emphasising then Caroline Criado Perez makes the case compellingly in her recent book Invisible Women.3

It turns out, that much medical research, much development of treatments and even assessment of motor vehicle safety have historically been performed on frames dominated by men, but with results then read across as though representative of men and women. Perez goes on to show how this has made women’s lives less safe and less healthy than they need have been.

It seems that it is not only journalists who are addicted to bad science.

Anyone doing statistics needs aggressively to scrutinise their sampling frame and how it matches the population of interest. Contrasts in respective cause systems need to be interrogated and distinguished with domain knowledge, background information and contextual data. Involvement in statistics carries responsibilities.


  1. Deming, W E (1975) “On probability as a basis for action”, American Statistician, 29, 146
  2. Melina, R (2010) “Why Do Medical Researchers Use Mice?”, Live Science, retrieved 18:32 UTC 2/6/19
  3. Perez, C C (2019) Invisible Women: Exposing Data Bias in a World Designed for Men, Chatto & Windus