Soccer management – signal, noise and contract negotiation

Some poor data journalism here from the BBC on 28 May 2015, concerning turnover in professional soccer managers in England. “Managerial sackings reach highest level for 13 years” says the headline. A classic executive time series. What is the significance of the 13 years? Other than it being the last year with more sackings than the present.

The data was purportedly from the League Managers’ Association (LMA) and their Richard Bevan thought the matter “very concerning”. The BBC provided a chart (fair use claimed).

MgrSackingsto201503

Now, I had a couple of thoughts as soon as I saw this. Firstly, why chart only back to 2005/6? More importantly, this looked to me like a stable system of trouble (for football managers) with the possible exception of this (2014/15) season’s Championship coach turnover. Personally, I detest multiple time series on a common chart unless there is a good reason for doing so. I do not think it the best way of showing variation and/ or association.

Signal and noise

The first task of any analyst looking at data is to seek to separate signal from noise. Nate Silver made this point powerfully in his book The Signal and the Noise: The Art and Science of Prediction. As Don Wheeler put it: all data has noise; some data has signal.

Noise is typically the irregular aggregate of many causes. It is predictable in the same way as a roulette wheel. A signal is a sign of some underlying factor that has had so large an effect that it stands out from the noise. Signals can herald a fundamental unpredictability of future behaviour.

If we find a signal we look for a special cause. If we start assigning special causes to observations that are simply noise then, at best, we spend money and effort to no effect and, at worst, we aggravate the situation.

The Championship data

In any event, I wanted to look at the data for myself. I was most interested in the Championship data as that was where the BBC and LMA had been quick to find a signal. I looked on the LMA’s website and this is the latest data I found. The data only records dismissals up to 31 March of the 2014/15 season. There were 16. The data in the report gives the total number of dismissals for each preceding season back to 2005/6. The report separates out “dismissals” from “resignations” but does not say exactly how the classification was made. It can be ambiguous. A manager may well resign because he feels his club have themselves repudiated his contract, a situation known in England as constructive dismissal.

The BBC’s analysis included dismissals right up to the end of each season including 2014/15. Reading from the chart they had 20. The BBC have added some data for 2014/15 that isn’t in the LMA report and not given the source. I regard that as poor data journalism.

I found one source of further data at website The Sack Race. That told me that since the end of March there had been four terminations.

Manager Club Termination Date
Malky Mackay Wigan Athletic Sacked 6 April
Lee Clark Blackpool Resigned 9 May
Neil Redfearn Leeds United Contract expired 20 May
Steve McClaren Derby County Sacked 25 May

As far as I can tell, “dismissals” include contract non-renewals and terminations by mutual consent. There are then a further three dismissals, not four. However, Clark left Blackpool amid some corporate chaos. That is certainly a termination that is classifiable either way. In any event, I have taken the BBC figure at face value though I am alerted as to some possible data quality issues here.

Signal and noise

Looking at the Championship data, this was the process behaviour chart, plotted as an individuals chart.

MgrSackingsto201503

There is a clear signal for the 2014/15 season with an observation, 20 dismissals,, above the upper natural process limit of 19.18 dismissals. Where there is a signal we should seek a special cause. There is no guarantee that we will find a special cause. Data limitations and bounded rationality are always constraints. In fact, there is no guarantee that there was a special cause. The signal could be a false positive. Such effects cannot be eliminated. However, signals efficiently direct our limited energy for, what Daniel Kahneman calls, System 2 thinking towards the most promising enquiries.

Analysis

The BBC reports one narrative woven round the data.

Bevan said the current tenure of those employed in the second tier was about eight months. And the demand to reach the top flight, where a new record £5.14bn TV deal is set to begin in 2016, had led to clubs hitting the “panic button” too quickly.

It is certainly a plausible view. I compiled a list of the dismissals and non-renewals, not the resignations, with data from Wikipedia and The Sack Race. I only identified 17 which again suggests some data quality issue around classification. I have then charted a scatter plot of date of dismissal against the club’s then league position.

MgrSackings201415

It certainly looks as though risk of relegation is the major driver for dismissal. Aside from that, Watford dismissed Billy McKinlay after only two games when they were third in the league, equal on points with the top two. McKinlay had been an emergency appointment after Oscar Garcia had been compelled to resign through ill health. Watford thought they had quickly found a better manager in Slavisa Jokanovic. Watford ended the season in second place and were promoted to the Premiership.

There were two dismissals after the final game on 2 May by disappointed mid-table teams. Beyond that, the only evidence for impulsive managerial changes in pursuit of promotion is the three mid-season, mid-table dismissals.

Club league position
Manager Club On dismissal At end of season
Nigel Adkins Reading 16 19
Bob Peeters Charlton Athletic 14 12
Stuart Pearce Nottingham Forrest 12 14

A table that speaks for itself. I am not impressed by the argument that there has been the sort of increase in panic sackings that Bevan fears. Both Blackpool and Leeds experienced chaotic executive management which will have resulted in an enhanced force of mortality on their respective coaches. That along with the data quality issues and the technical matter I have described below lead me to feel that there was no great enhanced threat to the typical Championship manager in 2014/15.

Next season I would expect some regression to the mean with a lower number of dismissals. Not much of a prediction really but that’s what the data tells me. If Bevan tries to attribute that to the LMA’s activism them I fear that he will be indulging in Langian statistical analysis. Will he be able to resist?

Techie bit

I have a preference for individuals charts but I did also try plotting the data on an np-chart where I found no signal. It is trite service-course statistics that a Poisson distribution with mean λ has standard deviation √λ so an upper 3-sigma limit for a (homogeneous) Poisson process with mean 11.1 dismissals would be 21.1 dismissals. Kahneman has cogently highlighted how people tend to see patterns in data as signals even where they are typical of mere noise. In this case I am aware that the data is not atypical of a Poisson process so I am unsurprised that I failed to identify a special cause.

A Poisson process with mean 11.1 dismissals is a pretty good model going forwards and that is the basis I would press on any managers in contract negotiations.

Of course, the clubs should remember that when they look for a replacement manager they will then take a random sample from the pool of job seekers. Really!

Advertisement

Bad Statistics I – the phantom line

I came across this chart on the web recently.

BadScatter01

This really is one of my pet hates: a perfectly informative scatter chart with a meaningless straight line drawn on it.

The scatter chart is interesting. Each individual blot represents a nation state. Its vertical position represents national average life expectancy. I take that to be mean life expectancy at birth, though it is not explained in terms. The horizontal axis represents annual per capita health spending, though there is no indication as to whether that is adjusted for purchasing power. The whole thing is a snapshot from 2011. The message I take from the chart is that Hungary and Mexico, and I think two smaller blots, represent special causes, they are outside the experience base represented by the balance of the nations. As to the other nations the chart suggests that average life expectancy doesn’t depend very strongly on health spending.

Of course, there is much more to a thorough investigation of the impact of health spending on outcomes. The chart doesn’t reveal differential performance as to morbidity, or lost hours, or a host of important economic indicators. But it does put forward that one, slightly surprising, message that longevity is not enhanced by health spending. Or at least it wasn’t in 2011 and there is no explanation as to why that year was isolated.

The question is then as to why the author decided to put the straight line through it. As the chart “helpfully” tells me it is a “Linear Trend line”. I guess (sic) that this is a linear regression through the blots, possibly with some weighting as to national population. I originally thought that the size of the blot was related to population but there doesn’t seem to be enough variation in the blot sizes. It looks like there are only two sizes of blot and the USA (population 318.5 million) is the same size as Norway (5.1 million).

The difficulty here is that I can see that the two special cause nations, Hungary and Mexico, have very high leverage. That means that they have a large impact on where the straight lines goes, because they are so unusual as observations. The impact of those two atypical countries drags the straight line down to the left and exaggerates the impact that spending appears to have on longevity. It really is an unhelpful straight line.

These lines seem to appear a lot. I think that is because of the ease with which they can be generated in Excel. They are an example of what statistician Edward Tufte called chartjunk. They simply clutter the message of the data.

Of course, the chart here is a snapshot, not a video. If you do want to know how to use scatter charts to explain life expectancy then you need to learn here from the master, Hans Rosling.

There are no lines in nature, only areas of colour, one against another.

Edouard Manet

Deconstructing Deming III – Cease reliance on inspection

3. Cease dependence on inspection to achieve quality. Eliminate the need for massive inspection by building quality into the product in the first place.

W Edwards Deming Point 3 of Deming’s 14 Points. This at least cannot be controversial. For me it goes to the heart of Deming’s thinking.

The point is that every defective item produced (or defective service delivered) has taken cash from the pockets of customers or shareholders. They should be more angry. One day they will be. Inputs have been purchased with their cash, their resources have been deployed to transform the inputs and they will get nothing back in return. They will even face the costs of disposing of the scrap, especially if it is environmentally noxious.

That you have an efficient system for segregating non-conforming from conforming is unimpressive. That you spend even more of other people’s money reworking the product ought to be a matter of shame. Lean Six Sigma practitioners often talk of the hidden factory where the rework takes place. A factory hidden out of embarrassment. The costs remain whether you recognise them or not. Segregation is still more problematic in service industries.

The insight is not unique to Deming. This is a common theme in Lean, Six Sigma, Theory of Constraints and other approaches to operational excellence. However, Deming elucidated the profound statistical truths that belie the superficial effectiveness of inspection.

Inspection is inefficient

When I used to work in the railway industry I was once asked to look at what percentage of signalling scheme designs needed to be rechecked to defend against the danger of a logical error creeping through. The problem requires a simple application of Bayes’ theorem. I was rather taken aback at the result. There were only two strategies that made sense: recheck everything or recheck nothing. I didn’t at that point realise that this is a standard statistical result in inspection theory. For a wide class of real world situations, where the objective is to segregate non-conforming from conforming, the only sensible sampling schemes are 100% or 0%.

Where the inspection technique is destructive, such as a weld strength test, there really is only one option.

Inspection is ineffective

All inspection methods are imperfect. There will be false-positives and false-negatives. You will spend some money scrapping product you could have sold for cash. Some defective product will escape onto the market. Can you think of any examples in your own experience? Further, some of the conforming product will be only marginally conforming. It won’t delight the customer.

So build quality into the product

… and the process for producing the product (or delivering the service). Deming was a champion of the engineering philosophy of Genechi Taguchi who put forward a three-stage approach for achieving, what he called, off-line quality control.

  1. System design – in developing a product (or process) concept think about how variation in inputs and environment will affect performance. Choose concepts that are robust against sources of variation that are difficult or costly to control.
  2. Parameter design – choose product dimensions and process settings that minimise the sensitivity of performance to variation.
  3. Tolerance design – work out the residual sources of variation to which performance remains sensitive. Develop control plans for measuring, managing and continually reducing such variation.

Is there now no need to measure?

Conventional inspection aimed at approving or condemning a completed batch of output. The only thing of interest was the product and whether it conformed. Action would be taken on the batch. Deming called the application of statistics to such problems an enumerative study.

But the thing managers really need to know about is future outcomes and how they will be influenced by present decisions. There is no way of sampling the future. So sampling of the past has to go beyond mere characterisation and quantification of the outcomes. You are stuck with those and will have to take the consequences one way or another. Sampling (of the past) has to aim principally at understanding the causes of those historic outcomes. Only that enables managers to take a view on whether those causes will persist in the future, in what way they might change and how they might be adjusted. This is what Deming called an analytic study.

Essential to the ability to project data into the future is the recognition of common and special causes of variation. Only when managers are confident in thinking and speaking in those terms will their organisations have a sound basis for action. Then it becomes apparent that the results of inspection represent the occult interaction of inherent variation with threshold effects. Inspection obscures the distinction between common and special causes. It seduces the unwary into misguided action that exacerbates quality problems and reputational damage. It obscures the sad truth that, as Terry Weight put it, a disappointment is not necessarily a surprise.

The programme

  1. Drive out sensitivity to variation at the design stage.
  2. Routinely measure the inputs whose variation threatens product performance.
  3. Measure product performance too. Your bounded rationality may have led you to get (2) wrong.
  4. No need to measure every unit. We are trying to understand the cause system not segregate items.
  5. Plot data on a process behaviour chart.
  6. Stabilise the system.
  7. Establish capability.
  8. Keep on measuring to maintain stability and improve capability.

Some people think they have absorbed Deming’s thinking, mastered it even. Yet the test is the extent to which they are able to analyse problems in terms of common and special causes of variation. Is that the language that their organisation uses to communicate exceptions and business performance, and to share analytics, plans, successes and failures?

There has always been some distaste for Deming’s thinking among those who consider it cold, statistically driven and paralysed by data. But the data are only a means to getting beyond the emotional reaction to those two impostors: triumph and disaster. The language of common and special causes is a profound tool for building engagement, fostering communication and sharing understanding. Above that, it is the only sound approach to business measurement.