The flaw of averages



  1. The Flaw of Averages. Dave Coulson, 2013
  2. Henry Kissinger used to say (and I’m paraphrasing quite badly from memory) that “we must guard ourselves from ever falling into the trap of collectivist thinking, lest we become the very people we seek to punish”. He was referring to a natural urge in all of us to see members of a class of people as being ‘all the same’ and to judge them equally for the crimes of the most extreme. The example he offered to illustrate this was the anger felt by Allied forces towards the Nazis at the end of World War Two, particularly with regard to how they had massacred Jewish families in their thousands, simply because they were who they were. It would have been easy, he suggested, for the prosecutors at Nuremberg to see the Nazis as all the same, and to apportion their hatred towards them in equal measure, in which case they would have been using the same kind of narrow-mindedness as they sought to eradicate. They would have become Nazis by prosecuting Nazis for being Nazis.
  3. Another (mis)quote I have come across makes a similar point: “The true measure of man’s character is how he deals with those who lack it.” It is easy to become tribal when we are offended. We are perhaps programmed by evolution to be so. In that case, justice is a battle between the human cortex and the mammalian brain more so than a battle between two ideologies. Any time we resort to dividing the world into two parts, ‘us and them’, we are wrong, as a bit of statistics shows:
  4. When I was a teacher in Singapore, a long time ago, I used to take the feeder service bus from my home to the train station, a very short journey made economical so that people would be encouraged to travel this way rather than use cars. The cost of one of these journeys (in those days) was 25 cents. People would enter through the front door and drop a rattle of coins into the box and take their seats. The observant driver would listen for the sound of the coins, and if he felt inclined, might even look at the coins as they fell into the box to count them, and occasionally he might have to decide whether it was worth the effort to get upset if someone underpaid by five or ten cents once in a while.
  5. Like most Singaporeans, I would have been ashamed to pay anything less than the proper amount, and so I was inclined to overpay if I didn’t have the exact coinage. After all, who cares if an extra five cents goes into the Singapore economy once in a while? It’s an infinitesimal loss compared to the rest of my spending through the day.
  6. But here’s the thing: how much does the average Singaporean pay when (s)he gets on the bus to the train station? Obviously it’s supposed to be 25 cents, but more likely it will be a bit more. So what does this say about the average Singaporean? That they’re overly generous? That they’re lazy about counting their coins? Or that they’re being manipulated by an evil government to pay more than they should? Probably, none of the above.
  7. I used to do the math on this as part of an exercise in statistics with my students. I don’t recall offhand how much the average Singaporean passenger put into the tin, but one thing I can be sure of is that the average Singaporean doesn’t exist.
  8. The data collected in this exercise would have us believe that there is a person out there in Singapore who, when asked to put 25 cents into a container, actually puts (say) 27 cents into the box. Furthermore, if we are simplistic enough to think that we can judge an entire population by its average, then Singapore is full of people who fish around in their pockets for an extra two cents to donate to the Singapore Bus Service. That’s quite a remarkable achievement for a society that phased out one-cent pieces even before the data were collected.
  9. Perhaps we need to use a different kind of average. These days it’s fashionable to use the median instead of the numerical average because (as we were told at school) “the outcome is more robust against peculiarities in the data”. In that case we may find that the average Singaporean puts 30 cents into the tin instead of 27. Now he is fishing around in his pocket for an extra five-cent piece, a coin that actually exists instead of two that don’t. Are we any the wiser? And is the conclusion any more accurate?
  10. Perhaps instead of using the mean and the median, we should be using the modal value. You’ll remember this from high school too: the third way of measuring central tendency is to pick out the most commonly occurring value because that of course is what the majority is doing. Before we even go to look for this value, we are philosophically and morally on the back foot, because now we are preparing to judge a society by what its majority is doing. Nazis bad, Singaporeans good. Why? Because of what the majority is doing. Apply this kind of thinking to ourselves in our own homeland cultures and we would be appalled at how unfairly we could be judged by what our majority has done on our behalf. Hence, the ‘flaw’ of averages.
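
A minimal sketch of the three averages discussed above, in Python. The payment list is invented for illustration (it is not the classroom data, which the text does not give) and was chosen so that the mean and median land on the 27 and 30 cents used as examples above. The point it tries to show is that the mean can be an amount nobody paid.

```python
from statistics import mean, median, mode

FARE = 25  # cents, the fare quoted in the text

# Hypothetical payments in cents (an assumption, not real data).
# Every value is a multiple of 5 because one-cent coins were not in use.
payments = [20] * 3 + [25] * 6 + [30] * 11

avg = mean(payments)      # arithmetic mean
mid = median(payments)    # middle value of the sorted list
common = mode(payments)   # most frequently occurring value

# How many passengers actually paid the "average" amount?
paid_the_mean = sum(1 for p in payments if p == avg)

print(f"mean={avg}, median={mid}, mode={common}")
print(f"passengers who paid exactly the mean: {paid_the_mean}")
print(f"passengers who paid exactly the fare: {payments.count(FARE)}")
```

With this made-up list, six passengers paid the proper fare and none paid the mean, which is one way of saying that the average passenger is not on the bus.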
  11. Using averages alone to judge a society, or to judge the success of a new public policy, is wrong, even when the sample size is large and even when the data are normally distributed. The very fact that there is a spread of data is enough to ensure that we are being unfair to a wide swath of people above and below the average every time we draw a conclusion exclusively from that average. Consider the sort of thing you might hear or read about in the news occasionally: implementing Program A on a group of people improved their performance on Test B by (shall we say) ten percent. Is this good news? Perhaps it is, but good news for whom, and to what extent, and why? Let’s look at a couple of sketches:
  12. In most investigations like this, the participants are scattered in a bell curve around some midpoint. For the sake of discussion, I’ve chosen a dataset where the midpoint is at 50 (on some scale; think school exam if it helps you) and the data are scattered so that one percent of the population lies at each of 0 and 100. This I think will illustrate my point nicely.
  13. The investigation tells us only that the midpoint shifted to the right by ten percent, but doesn’t tell us whether the entire group moved to the right by that amount or whether the group simply stretched out. Let’s suppose the group stretched out, as shown in my graph. Clearly not everyone benefited to the same extent. The people who benefited the most were the people who were already doing well. They shifted by as much as twenty percent. The people on the hard left barely moved at all.
  14. Somewhere in this chart will be a magical line that separates winners from losers. If the line is over on the left, then only a small percentage of participants are changing from losers to winners. Interestingly, if the magical line is over on the far right, again only a small percentage of participants are changing from losers to winners. Either way, the percentage has nothing to do with the stated ten percent average improvement.
  15. The most striking benefit would come about if the magical line were exactly in the middle, where most of the participants lie. Here, 18 percent of losers become winners. Should we celebrate? Maybe, if we’re testing a medical procedure that saves lives. But if we’re trialling a new teaching procedure on a bunch of kids in a classroom, then we might be justified in saying that the principal outcome of the method was to make successful kids more successful.
  16. While it’s true that 18 percent of losers became winners, all we were doing was making their performance 8 percent better, so that if they were given (shall we say) a math problem to solve tomorrow, they would have a 50 percent chance of getting it right instead of a 42 percent chance. Wow.
  17. Let’s consider the case where the entire group shifted to the right. Now everybody got ten points more than they would have without the new procedure, whether they were originally on the right side of the curve or not. Now (in the best case) 23 percent have crossed sides from losers to winners. Should we celebrate? Again, it depends. 23 percent crossed the midline, 77 percent did not.
  18. The result would be better if 90 percent of the participants crossed the midline, which would mean that the population would have to have been more tightly packed together in the first place. Whether the data are closely packed together or not is not revealed to us. And that’s my point: if you don’t know how widely spread the data are, you don’t know whether you’re looking at a significant step forwards or not. Perhaps it’s all just good-looking fluff.
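
To make the dependence on spread concrete, here is a small sketch, again assuming normally distributed scores and a uniform 10-point shift. It computes the best-case share of people a shift can carry across a single pass line for a given standard deviation, and the standard deviation that would be needed for 90 percent to cross. The function names and the sample values of sigma are assumptions made for illustration.

```python
from statistics import NormalDist

STANDARD = NormalDist()  # mean 0, standard deviation 1


def best_case_crossing(sigma, shift=10.0):
    """Largest share of a normal population that a uniform shift can
    carry across one pass line. The best line sits shift/2 above the
    old mean, so the crossing window is centred on that mean."""
    half_width = shift / (2.0 * sigma)
    return STANDARD.cdf(half_width) - STANDARD.cdf(-half_width)


def sigma_needed(target_share, shift=10.0):
    """Standard deviation at which `target_share` of the population
    crosses the best-placed line under a uniform shift."""
    z = STANDARD.inv_cdf(0.5 + target_share / 2.0)
    return shift / (2.0 * z)


for sigma in (3, 10, 20):
    print(f"sigma={sigma}: at most {best_case_crossing(sigma):.1%} cross")

print(f"sigma needed for 90% to cross: {sigma_needed(0.90):.2f}")
```

The same 10-point improvement in the average carries most of a tightly packed group across the line and only a small share of a widely spread one, which is why the average alone says so little.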
  19. Some time ago I came across a report about a school that had embarked on a radical plan to improve the grades of its students, which until then had been notoriously low. The procedure was hailed as a dramatic success; class averages improved. Should we celebrate? It all depends on whether we see students as a homogeneous material to be processed or as individuals who arrive at school with a universe of different backgrounds, and respond to new procedures (or don’t) as individually as they respond to breakfast.
  20. The radical new procedure that was trialled on these kids was to keep them in school six days a week instead of five. Amazingly (they say), kids’ grades improved. Well, well. Was it the extra tuition that helped their grades, or something else? Maybe simply keeping them off the streets and away from trouble was what made the difference, in which case enforced ping pong would have been as useful, and maybe more fun. But who knows what made the difference, and so I’ll not say anything more about that.
  21. What I will pursue, though, is the question of whether that shift in the average percentage was representative of all or most of the kids in the class, and not just the kids who were going to pass the exams anyway. Furthermore, was the shift to the right enough to represent a defining shift in their lifestyles? I look back on my school days and can’t remember most of my grades and doubt whether a 70-percent score on my maths paper rather than an 80-percent score actually decided who I would be later in life. Maybe it would have if I had earned a scholarship from it, or been allowed to go to university at all. But what percentage of the population actually falls into this narrow range of people who cross the line from loser to winner? In a society where scholarships go to the best N percent irrespective of what they actually score in their school assessments, the exercise may actually have been pointless.
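
The remark about scholarships going to the best N percent can be checked with a toy example. The student names, scores and the 25 percent cut-off below are all invented; the sketch only shows that when awards depend on rank, raising everyone’s score by the same amount (or by the same factor) leaves the list of winners unchanged.

```python
def scholarship_winners(scores, top_fraction):
    """Return the set of students in the top `top_fraction` by score.
    `scores` maps a student name to a score; ties are not handled."""
    count = max(1, round(len(scores) * top_fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:count])


# Hypothetical class of eight students (names and scores are made up).
before = {
    "Aisha": 82, "Ben": 45, "Chen": 67, "Devi": 91,
    "Emil": 58, "Farah": 73, "Gopal": 39, "Hana": 64,
}

# Everyone gains ten points, or everyone's score is stretched by 20%.
after_shift = {name: score + 10 for name, score in before.items()}
after_stretch = {name: score * 1.2 for name, score in before.items()}

TOP = 0.25  # assumed cut-off: scholarships for the best quarter

print(sorted(scholarship_winners(before, TOP)))
print(sorted(scholarship_winners(after_shift, TOP)))
print(sorted(scholarship_winners(after_stretch, TOP)))
```

All three lines print the same two names, because neither change alters the order in which the students are ranked.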
  22. Ultimately, whether their grades improved, whether they got to university or not, whether they ended up with good jobs with stable lifestyles, whether any of these outcomes actually had anything to do with what they did on those classroomed Saturdays, I have to wonder how many of those kids will actually be thankful for what was achieved. There are so many more questions that could be asked about a study like this than can be answered simply by looking at how the average pass rate shifts. And that, ultimately, is my point: averages are not enough.
  23. Perhaps the most common mistakes we make, though, are the simple prejudices we absorb whenever we hear that one part of our society is better at something than its counterpart. Women are better at multitasking than men, we are told, and men are better at spatial reasoning. Though the evidence, I’m sure, is overwhelming, I wonder if it is actually helpful?
  24. What happens to a woman who discovers that the man she married is actually better organised than she is and can do six things at once compared to her two? Could it ever happen? If so, is she somehow letting her team down? And should a man be ashamed if his wife is better at fixing the lawnmower than he is? If he is standing on the sidelines when his wife catches the ball on the rugby field, does he cheer her on or does he call out advice? He is, after all, supposed to be better at the game than she is.
  25. We are reducing people to averages again, and an average is nothing short of a stereotype. Yes, there are differences between the sexes, but so what? Do they define who we are, as individuals? Maybe the differences in skill at this or that are real differences, but so minor that they don’t really matter. On a scale of 1 to 10 where 10 is exceptional and 1 is exceptionally pathetic, maybe the difference between the scores for the two sexes is only 0.5. I couldn’t care less. If you can’t see two distinct hilltops when one bell curve is plotted over the top of another, the differences are not important.
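
The "two hilltops" test can be sketched with two bell curves whose means differ by 0.5 on the 1-to-10 scale. The means of 5.0 and 5.5 and the common standard deviation of 1.5 are assumptions chosen for illustration; the text gives only the 0.5 gap. The sketch relies on a standard result: an equal mixture of two normal curves with the same standard deviation shows two peaks only when the means are more than two standard deviations apart.

```python
from math import sqrt
from statistics import NormalDist

SIGMA = 1.5                          # assumed spread within each group
group_low = NormalDist(5.0, SIGMA)   # assumed mean for one group
group_high = NormalDist(5.5, SIGMA)  # mean 0.5 higher, as in the text


def has_two_hilltops(a, b):
    """True if an equal mix of two same-sigma normal curves is bimodal,
    which requires the means to differ by more than 2 * sigma."""
    return abs(a.mean - b.mean) > 2 * a.stdev


def chance_low_beats_high(a, b):
    """Chance that a random member of `a` outscores a random member of
    `b`. The difference of two independent normals is itself normal."""
    diff = NormalDist(a.mean - b.mean, sqrt(a.variance + b.variance))
    return 1.0 - diff.cdf(0.0)


print("two hilltops visible:", has_two_hilltops(group_low, group_high))
print(f"shared area of the two curves: {group_low.overlap(group_high):.0%}")
print(
    "chance a member of the 'worse' group outscores a member of the "
    f"'better' one: {chance_low_beats_high(group_low, group_high):.0%}"
)
```

Under these assumptions the combined curve has a single hilltop, the two curves share most of their area, and a randomly chosen member of the lower-scoring group outscores a randomly chosen member of the other group roughly four times in ten.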
  26. [END]