FIFA and the Iron Law of Oligarchy

In 1911, Robert Michels embarked on one of the earliest investigations into organisational culture. Michels was a pioneering sociologist, a student of Max Weber. In his book Political Parties he aggregated evidence about a range of trade unions and political groups, in particular the German Social Democratic Party.

He concluded that, as organisations become larger and more complex, a bureaucracy inevitably forms to take, co-ordinate and optimise decisions. It is the most straightforward way of creating alignment in decision making and unified direction of purpose and policy. Decision taking power ends up in the hands of a few bureaucrats and they increasingly use such power to further their own interests, isolating themselves from the rest of the organisation to protect their privilege. Michels called this the Iron Law of Oligarchy.

These are very difficult matters to capture quantitatively and Michels’ limited evidential sampling frame has more of the feel of anecdote than data. “Iron Law” surely takes the matter too far. However, when we look at the allegations concerning misconduct within FIFA it is tempting to feel that Michels’ theory is validated, or at least has gathered another anecdote to take the evidence base closer to data.

But beyond that, what Michels surely identifies is a danger that a bureaucracy, a management cadre, can successfully isolate itself from superior and inferior strata in an organisation, limiting the mobility of business data and fostering their own ease. The legitimate objectives of the organisation suffer.

Michels failed to identify a realistic solution, being seduced by the easy, but misguided, certainties of fascism. However, I think that a rigorous approach to the use of data can guard against some abuses without compromising human rights.

Oligarchs love traffic lights

I remember hearing the story of a CEO newly installed in a mature organisation. His direct reports had instituted a “traffic light” system to report status to the weekly management meeting. A green light meant all was well. An amber light meant that some intervention was needed. A red light signalled that threats to the company’s goals had emerged. At his first meeting, the CEO found that nearly all “lights” were green, with a few amber. The new CEO perceived an opportunity to assert his authority and show his analytical skills. He insisted that could not be so. There must be more problems and he demanded that the next meeting be an opportunity for honesty and confronting reality.

At the next meeting there was a kaleidoscope of red, amber and green “lights”. Of course, it turned out that the managers had flagged as red the things that were either actually fine or could be remedied quickly. They could then report green at the following meeting. Real career limiting problems were hidden behind green lights. The direct reports certainly didn’t want those exposed.

Openness and accountability

I’ve quoted Nobel laureate economist Kenneth Arrow before.

… a manager is an information channel of decidedly limited capacity.

Essays in the Theory of Risk-Bearing

Perhaps the fundamental problem of organisational design is how to enable communication of information so that:

  • Individual managers are not overloaded.
  • Confidence in the reliable satisfaction of process and organisational goals is shared.
  • Systemic shortfalls in process capability are transparent to the managers responsible, and their managers.
  • Leading indicators yield early warnings of threats to the system.
  • Agile responses to market opportunities are catalysed.
  • Governance functions can exploit the borrowing strength of diverse data sources to identify misreporting and misconduct.

All that requires using analytics to distinguish between signal and noise. Traffic lights offer a lousy system of intra-organisational analytics. Traffic light systems leave it up to the individual manager to decide what is “signal” and what “noise”. Nobel laureate psychologist Daniel Kahneman has studied how easily managers are confused and misled in subjective attempts to separate signal and noise. It is dangerous to think that What you see is all there is. Traffic lights offer a motley cloak to an oligarch wishing to shield his sphere of responsibility from scrutiny.

The answer is trenchant and candid criticism of historical data. That’s the only data you have. A rigorous system of goal deployment and mature use of process behaviour charts delivers a potent stimulus to reluctant data sharers. Process behaviour charts capture the development of process performance over time, for better or for worse. They challenge the current reality of performance through the Voice of the Customer. They capture a shared heuristic for characterising variation as signal or noise.

Individual managers may well prefer to interpret the chart with various competing narratives. The message of the data, the Voice of the Process, will not always be unambiguous. But collaborative sharing of data compels an organisation to address its structural and people issues. Shared data generation and investigation encourage an organisation to find practical ways of fostering team work, enabling problem solving and motivating participation. It is the data that can support the organic emergence of a shared organisational narrative that adds further value to the data and how it is used and developed. None of these organisational and people matters have generalised solutions but a proper focus on data drives an organisation to find practical strategies that work within their own context. And to test the effectiveness of those strategies.

Every week the press discloses allegations of hidden or fabricated assets, repudiated valuations, fraud, misfeasance, regulators blindsided, creative reporting, anti-competitive behaviour, abused human rights and freedoms.

Where a proper system of intra-organisational analytics is absent, you constantly have to ask yourself whether you have another FIFA on your hands. The FIFA allegations may be true or false but that they can be made surely betrays an absence of effective governance.

#oligarchslovetrafficlights

Deconstructing Deming XI B – Eliminate numerical goals for management

11. Part B. Eliminate numerical goals for management.

A supposed corollary to the elimination of numerical quotas for the workforce.

This topic seems to form a very large part of what passes for exploration and development of Deming’s ideas in the present day. It gets tied in to criticisms of remuneration practices and annual appraisal, and target-setting in general (management by objectives). It seems to me that interest flows principally from a community who have some passionately held emotional attitudes to these issues. Advocates are enthusiastic to advance the views of theorists like Alfie Kohn who deny, in terms, the effectiveness of traditional incentives. It is sad that those attitudes stifle analytical debate. I fear that the problem started with Deming himself.

Deming’s detailed arguments are set out in Out of the Crisis (at pp75-76). There are two principal reasoned objections.

  1. Managers will seek empty justification from the most convenient executive time series to hand.
  2. Surely, if we can improve now, we would have done so previously, so managers will fall back on (1).

The executive time series

I’ve used the time series below in some other blogs (here in 2013 and here in 2012). It represents the annual number of suicides on UK railways. This is just the data up to 2013.
[Chart: annual number of suicides on UK railways, data to 2013]

The process behaviour chart shows a stable system of trouble. There is variation from year to year but no significant (sic) pattern. There is noise but no signal. There is an average of just over 200 fatalities, varying irregularly between around 175 and 250. Sadly, as I have discussed in earlier blogs, simply selecting a pair of observations enables a polemicist to advance any theory they choose.
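For readers who have not built a process behaviour chart, here is a minimal sketch of how the natural process limits of an individuals (XmR) chart are conventionally calculated: the mean, plus or minus 2.66 times the average moving range. The function names and the example numbers are my own illustrations and are not the railway data.

```python
from statistics import mean


def natural_process_limits(values):
    """Return (centre, lower, upper) for an individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    centre = mean(values)
    # Moving ranges: absolute differences between successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is the conventional scaling constant for individuals charts.
    spread = 2.66 * mean(moving_ranges)
    return centre, centre - spread, centre + spread


def signals(values):
    """Indices of observations falling outside the natural process limits."""
    _, lower, upper = natural_process_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical, illustrative counts only (NOT the railway data).
example = [205, 190, 212, 198, 221, 203, 195, 210]
centre, lower, upper = natural_process_limits(example)
print(round(centre, 1), round(lower, 1), round(upper, 1), signals(example))
```

An empty list from signals means the chart shows only noise. This sketch applies only the simplest detection rule, a point outside the limits, and ignores run rules.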

In Railway Suicides in the UK: risk factors and prevention strategies, Kamaldeep Bhui and Jason Chalangary of the Wolfson Institute of Preventive Medicine, and Edgar Jones of the Institute of Psychiatry, King’s College, London quoted the Rail Safety and Standards Board (RSSB) in the following two assertions.

  • Suicides rose from 192 in 2001-02 to a peak 233 in 2009-10; and
  • The total fell from 233 to 208 in 2010-11 because of actions taken.

Each of these points is what Don Wheeler calls an executive time series. Selective attention, or inattention, on just two numbers from a sequence of irregular variation can be used to justify any theory. Deming feared such behaviour could be perverted to justify satisfaction of any goal. Of course, the process behaviour chart, nowhere more strongly advocated than by Deming himself in Out of the Crisis, is the robust defence against such deceptions. Diligent criticism of historical data by means of process behaviour charts is exactly what is needed to improve the business and exactly what guards against success-oriented interpretations.
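To see how little a pair of observations proves, here is a minimal sketch that searches a series for the two-point comparison giving the most dramatic rise and the one giving the most dramatic fall. The function name and the numbers are hypothetical illustrations of my own, not the RSSB figures.

```python
from itertools import combinations


def executive_time_series(values):
    """Pick the two-point comparisons that tell the most dramatic stories.

    Returns ((i, j), (k, l)): index pairs, earlier index first, giving the
    largest rise and the largest fall between any two observations.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    pairs = list(combinations(range(len(values)), 2))

    def change(pair):
        return values[pair[1]] - values[pair[0]]

    return max(pairs, key=change), min(pairs, key=change)


# Hypothetical, illustrative counts from a stable process (NOT real data).
counts = [205, 190, 212, 198, 221, 203, 195, 210]
(rise_from, rise_to), (fall_from, fall_to) = executive_time_series(counts)
print("Rising story: ", counts[rise_from], "->", counts[rise_to])
print("Falling story:", counts[fall_from], "->", counts[fall_to])
```

The same series yields both a story of alarming increase and a story of successful intervention, which is exactly the danger.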

Wishful thinking, and the more subtle cognitive biases studied by Daniel Kahneman and others, will always assist us in finding support for our position somewhere in the data. Process behaviour charts keep us objective.

If not now, when?

If I am not for myself, then who will be for me?
And when I am for myself, then what am “I”?
And if not now, when?

Hillel the Elder

Deming criticises managerial targets on the grounds that, were the means of achieving the target known, it would already have been achieved and, further, that without having the means efforts are futile at best. It’s important to remember that Deming is not here, I think, talking about efforts to stabilise a business process. Deming is talking about working to improve an already stable, but incapable, process.

There are trite reasons why a target might legitimately be mandated where it has not been historically realised. External market conditions change. A manager might unremarkably be instructed to “Make 20% more of product X and 40% less of product Y”. That plays into the broader picture of targets’ role in co-ordinating the parts of a system, internal to the organisation or more widely. It may be a straightforward matter to change the output of a well-understood, stable system by an adjustment of the inputs.

Deming says:

If you have a stable system, then there is no use to specify a goal. You will get whatever the system will deliver.

But it is the manager’s job to work on a stable system to improve its capability (Out of the Crisis at pp321-322). That requires capital and a plan. It involves a target because the target captures the consensus of the whole system as to what is required, how much to spend, what the new system looks like to its customer. Simply settling for the existing process, being managed through systematic productivity to do its best, is exactly what Deming criticises at his Point 1 (Constancy of purpose for improvement).

Numerical goals are essential

… a manager is an information channel of decidedly limited capacity.

Kenneth Arrow
Essays in the Theory of Risk-Bearing

Deming’s followers have, to some extent, conceded those criticisms. They say that it is only arbitrary targets that are deprecated and not the legitimate Voice of the Customer/ Voice of the Business. But I think they make a distinction without a difference through the weasel words “arbitrary” and “legitimate”. Deming himself was content to allow managerial targets relating to two categories of existential risk.

However, those two examples are not of any qualitatively different type from the “Increase sales by 10%” that he condemns. Certainly back when Deming was writing Out of the Crisis most occupational exposure limits (OELs) were based on LD50 studies, a methodology that I am sure Deming would have been the first to criticise.

Properly defined targets are essential to business survival as they are one of the principal means by which the integrated function of the whole system is communicated. If my factory is producing more than I can sell, I will not work on increasing capacity until somebody promises me that there is a plan to improve sales. And I need to know the target of the sales plan to know where to aim with plant capacity. It is no good just to say “Make as much as you can. Sell as much as you can.” That is to guarantee discoordination and inefficiency. It is unsurprising that Deming’s thinking has found so little real world implementation when he seeks to deprive managers of one of the principal tools of managing.

Targets are dangerous

I have previously blogged about what is needed to implement effective targets. An ill judged target can induce perverse incentives. These can be catastrophic for an organisation, particularly one where the rigorous criticism of historical data is absent.

The art of managing footballers

… or is it a science? Robin van Persie’s penalty miss against West Bromwich Albion on 2 May 2015 was certainly welcome news to my ears. It eased the relegation pressures on West Brom and allowed us to advance to 40 points for the season. Relegation fears are only “mathematical” now. However, the miss also resulted in van Persie being relieved of penalty taking duties, by Manchester United manager Louis van Gaal, until further notice.

He is now at the end of the road. It is always [like that]. Wayne [Rooney] has missed also so when you miss you are at the bottom again.

The Daily Mail report linked above goes on to say that van Persie had converted his previous 6 penalties.

Van Gaal was, of course, referring to Rooney’s shot over the crossbar against West Ham in February 2013, when Rooney had himself invited then manager Sir Alex Ferguson to retire him as designated penalty taker. Rooney’s record had apparently been 9 misses from 27 penalties. I have all this from this Daily Telegraph report.

I wonder if statistics can offer any insight into soccer management?

The benchmark

It was very difficult to find, very quickly, any exhaustive statistics on penalty conversion rates on the web. However, I would like to start by establishing what constituted “good” performance for a penalty taker. As a starting point I have looked at Table 2 on this Premier League website. The data is from February 2014 and shows, at that date, data on the players with the best conversion rates in the League’s history. Players who took fewer than 10 penalties were excluded. It shows that of the ten top converting players, who must rank as the very good if not the ten best, in the aggregate they converted 155 of 166 penalties. That is a conversion rate of 93.4%. At first sight that suggests a useful baseline against which to assess any individual penalty taker.

Several questions come to mind. The aggregate statistics do not tell us how individual players have developed over time, whether improving or losing their nerve. That said, it is difficult to perform that sort of analysis on these comparatively low volumes of data when collected in this way. There is however data (Table 4) on the overall conversion rate in the Premier League since its inception.

[Chart: Premier League penalty conversion rate by season]

That looks to me like a fairly stable system. That would be expected as players come and go and this is the aggregate of many effects. Perhaps there is latterly reduced season-to-season variation, which would be odd, but I am not really interested in that and have not pursued it. I am aware that during this period there has been a rule change allowing goalkeepers to move before the kick is taken but I have just spent 30 minutes on the web and failed to establish the date when that happened. The total aggregate statistics up to 2014 are 1,438 penalties converted out of 1,888. That is a conversion rate of 76.2%.

I did wonder if there was any evidence that some of the top ten players were better than others or whether the data was consistent with a common elite conversion rate of 93.4%. In that case the table positions would reflect nothing more than sampling variation. Somewhat reluctantly I calculated the chi-squared statistic for the table of successes and failures (I know! But what else to do?). The statistic came out as 2.02 which, with 9 degrees of freedom, has a p-value (I know!) of around 99%. A statistic so far below its degrees of freedom offers no evidence of a genuine ranking among the elite penalty takers. The data is consistent with a common elite conversion rate.
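The tail areas for a chi-squared statistic can be checked with nothing more than the Python standard library. This is a minimal sketch that takes the statistic of 2.02 and the 9 degrees of freedom as given. The function name is my own and the calculation uses the standard power series for the regularised lower incomplete gamma function, which is adequate for a small statistic like this one but slow for large ones.

```python
import math


def chi2_lower_tail(x, df):
    """P(X <= x) for a chi-squared variable with df degrees of freedom.

    Uses the power series for the regularised lower incomplete gamma
    function P(a, z) with a = df / 2 and z = x / 2.
    """
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    # Add terms z^n / (a (a+1) ... (a+n)) until they stop mattering.
    while term > total * 1e-15:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


lower = chi2_lower_tail(2.02, 9)
upper = 1.0 - lower
print(f"lower tail: {lower:.4f}, upper tail: {upper:.4f}")
```

On these inputs the lower tail comes out a little under 1% and the upper tail a little over 99%. It is the upper tail that measures evidence of differences between players.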

It inevitably follows that the elite are doing better than the overall success rate of 76.2%. Considering all that together I am happy to proceed with 93.4% as the sort of benchmark for a penalty taker that a team like Manchester United would aspire to.

Van Persie

This website, dated 6 Sept 2012, told me that van Persie had converted 18 penalties with a 77% success rate. That does not quite fit either 18/23 or 18/24 but let us take it at face value. If that is accurate then that is, more or less, the data on which Ferguson gave van Persie the job in February 2013. It is a surprising appointment given the Premier League average of 76.2% and the elite benchmark but perhaps it was the best that could be mustered from the squad.

Rooney’s 9 misses out of 27 yields a success rate of 67%. Not so much lower than van Persie’s historical performance but, in all the circumstances, it was not good enough.

The dismissal

What is fascinating is that, no matter what van Persie’s historical record on which he was appointed penalty taker, before his 2 May miss he had scored 6 out of 6. The miss made it 6 out of 7, 85.7%. That was his recent record of performance, even if selected to some extent to show him in a good light.

Selection of that run is a danger. It is often “convenient” to select a subset of data that favours a cherished hypothesis. Though there might be that selectivity, where was the real signal that van Persie had deteriorated or that the club would perform better were he replaced?

The process

Of course, a manager has more information than the straightforward success/ fail ratio. A coach may have observed goalkeepers increasingly guessing a penalty taker’s shot direction. There may have been many near-saves, a hesitancy on the part of the player, trepidation in training. Those are all factors that a manager must take into account. That may lead to the rotation of even the most impressive performer. Perhaps.

But that is not the process that van Gaal advocates. Keep scoring until you miss then go to the bottom of the list. The bottom! Even scorers in the elite-10 miss sometimes. Is it rational to then replace them with an alternative that will most likely be more average (i.e. worse)? And then make them wait until everyone else has missed.
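A rough calculation shows how often even an elite taker would fall foul of that rule. This is a minimal sketch assuming, for illustration only, that every penalty is an independent trial with a fixed conversion rate; the function name is my own and the rates are the aggregate figures quoted earlier.

```python
def prob_at_least_one_miss(conversion_rate, attempts):
    """Chance of one or more misses in a run of independent penalties."""
    if not 0.0 <= conversion_rate <= 1.0:
        raise ValueError("conversion_rate must be between 0 and 1")
    return 1.0 - conversion_rate ** attempts


elite_rate = 155 / 166      # top-ten aggregate quoted in the benchmark table
league_rate = 1438 / 1888   # all Premier League penalties up to 2014

# Assumption: independence between penalties, which real football may violate.
print(f"Elite taker, 7 penalties:   {prob_at_least_one_miss(elite_rate, 7):.2f}")
print(f"Average taker, 7 penalties: {prob_at_least_one_miss(league_rate, 7):.2f}")
```

On those assumptions an elite-rate taker has roughly a 38% chance of missing at least once in seven penalties, so a single miss in a run of seven says very little about deterioration.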

With an average success rate of 76.2% it is more likely than not that van Persie’s replacement will score their first penalty. Van Gaal will be vindicated. That is the phenomenon called regression to the mean. An extreme event (a miss) is most likely followed by something more average (a goal). Psychologist Daniel Kahneman explores this at length in his book Thinking, Fast and Slow.

It is an odd strategy to adopt. Keep the able until they fail. Then replace them with somebody less able. But different.


UK railway suicides – 2014 update

It’s taken me a while to sit down and blog about this news item from October 2014: Sharp Rise in Railway Suicides Say Network Rail. Regular readers of this blog will know that I have followed this data series closely in 2013 and 2012.

The headline was based on the latest UK government data. However, I baulk at the way these things are reported by the press. The news item states as follows.

The number of people who have committed suicide on Britain’s railways in the last year has almost reached 300, Network Rail and the Samaritans have warned. Official figures for 2013-14 show there have already been 279 suicides on the UK’s rail network – the highest number on record and up from 246 in the previous year.

I don’t think it’s helpful to characterise 279 deaths as “almost … 300”, where there is, in any event, no particular significance in the number 300. It arbitrarily conveys the impression that some pivotal threshold is threatened. Further, there is no especial significance in an increase from 246 to 279 deaths. Another executive time series. Every one of the 279 is a tragedy as is every one of the 246. The experience base has varied from year to year and there is no surprise that it has varied again. To assess the tone of the news report I have replotted the data myself.

[Chart: annual number of suicides on UK railways, updated to 2013-14, with recalculated natural process limits]

Readers should note the following about the chart.

  • Some of the numbers for earlier years have been updated by the statistical authority.
  • I have recalculated natural process limits as there are still no more than 20 annual observations.
  • There is now a signal (in red) of an observation above the upper natural process limit.

The news report is justified, unlike the earlier ones. There is a signal in the chart and an objective basis for concluding that there is more than just a stable system of trouble. There is a signal and not just noise.

As my colleague Terry Weight always taught me, a signal gives us licence to interpret the ups and downs on the chart. There are two possible narratives that immediately suggest themselves from the chart.

  • A sudden increase in deaths in 2013/14; or
  • A gradual increasing trend from around 200 in 2001/02.

The chart supports either story. To distinguish would require other sources of information, possibly historical data that can provide some borrowing strength, or a plan for future data collection. Once there is a signal, it makes sense to ask what was its cause. Building a narrative around the data is a critical part of that enquiry. A manager needs to seek the cause of the signal so that he or she can take action to improve system outcomes. Reliably identifying a cause requires trenchant criticism of historical data.

My first thought here was to wonder whether the railway data simply reflected an increasing trend in suicide in general. Certainly a very quick look at the data here suggests that the broader trend of suicides has been downwards and certainly not increasing. It appears that there is some factor localised to railways at work.

I have seen proposals to repeat a strategy from Japan of bathing railway platforms with blue light. I have not scrutinised the Japanese data but the claims made in this paper and this are impressive in terms of purported incident reduction. If these modifications are implemented at British stations we can look at the chart to see whether there is a signal of fewer suicides. That is the only real evidence that counts.

Those who were advocating a narrative of increasing railway suicides in earlier years may feel vindicated. However, until this latest evidence there was no signal on the chart. There is always competition for resources and directing effort on false assumptions leads to misallocation. Intervening in a stable system of trouble, a system featuring only noise, on the false belief that there is a signal will usually make the situation worse. Failing to listen to the voice of the process on the chart risks diverting vital resources and using them to make outcomes worse.

Of course, data in terms of time between incidents is much more powerful in spotting an early signal. I have not had the opportunity to look at such data but it would have provided more, better and earlier evidence.

Where there is a perception of a trend there will always be an instinctive temptation to fit a straight line through the data. I always ask myself why this should help in identifying the causes of the signal. In terms of analysis at this stage I cannot see how it would help. However, when we come to look for a signal of improvement in future years it may well be a helpful step.

Deconstructing Deming XI A – Eliminate numerical quotas for the workforce

11. Part A. Eliminate numerical quotas for the workforce.

I find this probably the most confused part of Deming’s thinking. On a careful reading of Out of the Crisis (at pp70-75), Deming’s attack is not on standardised work, which is advocated as central to his message, but on specifications for the volume of work: calls answered per hour, finished parts per day.

Deming recognises management’s need to predict costs and revenues but condemns quotas as destructive of achieving productivity.

Deming also deprecates such quotas as corroding workplace pride. I shall return to that in Point 12.

Deming’s criticism of work quotas goes as follows.

  • Some individuals may achieve them easily and their productive capacity will then stand idle.
  • Some individuals may struggle and suffer poor morale.
  • Some individuals may compromise quality so as to make a quota or so as to make it sooner.
  • Achievement of quotas may be frustrated by faults in “the system” which are outside the individual worker’s control.

Deming gives the following example of how he would advise financial planning in a call centre of 500 people (at pp73-74).

  1. Set a preliminary budget.
  2. Make it clear to every one of the 500 that their aim is to give satisfaction to the customer, to take pride in their work.
  3. Everybody will keep a record of calls made.
  4. Customers with special problems will be referred to the supervisor.
  5. At the end of each week, sample 100 individuals’ records and summarise the data.
  6. Repeat steps 2 to 5 for several weeks.
  7. Analyse the data.
  8. Establish a continuing study following the above steps but on a reducing basis.
  9. Use the data to predict costs.

Now there is much merit in forecasting costs based on actual data. Further, improving performance based on the relentless criticism of historical data is essential. However, I think Deming’s prescription naïve and idealistic. The trick is to extract the ideals and industrialise them.

Planning

The simple matter is that any new enterprise has to be established on the basis of a robust business plan. There is competition for resources: people, capital, infrastructure … and everyone has to make their case. It is impossible to do that without judgment. No matter how much historical data or even qualitative experience is to hand we cannot simply project it into the future without establishing further conditions (RearView). It is unlikely this can ever be done exactly in a new establishment.

That competition for resources then prevents us from taking an overly conservative view of what can be achieved. Setting the bar too low for call centre operators starts off from an uncompetitive position. Further, the modest answering rate in the plan has to be resourced with infrastructure. Intentions to improve the answering rate post-launch are all very well but what will happen to the personnel and materiel that we bought in to accommodate the unambitious start-up?

Sometimes work needs to be set at a rate that is recognised by a team of co-workers and other parts of the organisation. Excess production is as contrary to the philosophy of lean operations as is shortage. The idea of takt time allows production lines to be balanced, receipts and deliveries co-ordinated, stock turns to be minimised and cash flows improved. In many situations that is sufficient to answer Deming’s fears about individuals distorting production to bank an accomplished target.
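Takt time is conventionally the available working time divided by customer demand over the same period. Here is a minimal sketch of that arithmetic; the function name and the shift and demand figures are hypothetical and chosen only to make the sum easy to follow.

```python
def takt_time(available_seconds, units_demanded):
    """Seconds available per unit if output is to match demand exactly."""
    if units_demanded <= 0:
        raise ValueError("units_demanded must be positive")
    return available_seconds / units_demanded


# Hypothetical figures: a 7.5 hour shift and demand for 450 units per shift.
shift_seconds = 7.5 * 60 * 60
print(takt_time(shift_seconds, 450), "seconds per unit")
```

Working faster than the takt time builds stock that nobody has asked for; working slower leaves the customer short. That is the sense in which a rate, rather than a quota to be beaten, co-ordinates the line.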

Stretch

What is now proved was once but imagined.

William Blake

Is it so wrong to set a target that nobody involved has seen achieved before? Deming would say that it was fine so long as there was a plan defining the means by which this could be achieved. There are many compelling stories from sports science telling how records have been broken by incremental improvement (e.g. Dave Brailsford and the GB cycling team).

But what about setting an ambitious stretch target without a plan for achieving it? That would be brave indeed. It would be based on no more than an exhortation to the call centre operators to work more furiously, more furiously than anyone had ever done before. I cannot say that would never work. In my athletics days I ran some of my best times when team mates were urging me on from the sidelines. However, as a business strategy it faces the social realities of employees’ collective ability to resist quietly that to which they do not assent. With a carefully recruited and motivated team it could work. It would certainly require a high degree of collective problem solving and improvement by the operators. But of all strategies for operational excellence it looks the most limited and the most risky. There is no obvious Plan B.

The Ringelmann effect

There is a tension between unrealistic stretch targets and a further problem that Deming ignores entirely, the Ringelmann effect. It may sadden the hearts of those who believe in the inherent fulfilling joy of work and best intentions of workers to do a good job but evidence is overwhelming that there are situations where individuals exert less effort in a group environment than they would if acting individually.

In 1913, Max Ringelmann conducted experiments that showed that individuals pulled less strenuously on a rope when pulling in a group than when pulling alone.

A realistically set and communicated takt time can assist in concentrating effort and communicating common work standards and the expectations of peers.

The poor supervisor

If Deming was so pessimistic as to believe that workers would sacrifice quality to hit targets then they would surely be more than happy to shunt enquiries off to their supervisor in order to post commendable performance. All that Deming’s proposal does is to divert the whole problem of difficult calls to the supervisor who, presumably, is either beset with his own performance problems or operates outside business measurement.