Introduce self. Lay out the session: we will pull apart a visualisation from a blog post in The Economist, dig down to the data underneath, then build it up again, showing how different stories are achieved from the same raw data. Before we begin, we are going to split into discussion groups. Divide up into quadrants? Nominate someone to speak for the group.
This is a visualisation taken from a blog article in The Economist back in October 2017. I’ve picked it today because it was also the subject of a data viz challenge on the “Storytelling with Data” blog. The baseline data was given out in the challenge, and people were invited to submit alternate visualisations and critique the original. We can’t go building our own visualisations today, so we’ll instead break down this one and see what others built. But imagine that you get an inquiry from a student about this visualisation. Today, let’s look at this with a critical eye. Actually, a quick aside… I’m a linguist by training who loves variation in language. Just out of interest, a quick show of hands: who says H-UH-ri-cAY-ne and who says H-UH-ri-can?
What is the take away “story” of this visualisation? Is it effective in expressing that story? What is the larger context in which such a story is being pitched? Take a few minutes to discuss this in your groups. Story?: That hurricanes are becoming less frequent over time, but that major hurricanes are becoming slightly more frequent. Effective?: I would argue yes, the story is clear from the visualisation. Context?: The larger context is climate change. This is supposed to challenge our assumptions about climate change.
Identify the components of the diagram (e.g., “an x-axis showing years in 10 year bands”). Take a few minutes this time. Components: a title, and further description relating to the y-axis; a key showing the colours coding categories 1-5; the y-axis: number of hurricanes making landfall; a linear trendline for major hurricanes and one for all hurricanes; the definition of “major hurricanes”; the source of the data, NOAA (National Oceanic and Atmospheric Administration); the source of the visualisation. So there’s definitely a lot of material packed into this visualisation.
What groupings of numbers are shown in the visualisation? (e.g., “each bar represents a 10 year aggregate”) Or “in what shorthand way are numbers summarised in the graph”? Are any units of measurement themselves a grouping of numbers? Take a few minutes again. All hurricanes are grouped into bands of 10 years, stacked from most to least severe (effectively making two graphs: severe and all). They are also grouped by colour: major and minor hurricanes are shown with contrastive hues. The categories themselves are groupings: the “Saffir-Simpson Hurricane Wind Scale” is based on a hurricane’s sustained wind speed, and this measure is supposed to be a proxy for damage and lives lost in the storm. So for instance, a category 1 hurricane has a sustained wind speed between 74 and 95 mph, a category 3 hurricane is between 111 and 129 mph, and a category 5 is 157 mph and above. That gives bands of uneven width: category 1 spans 22 mph, category 2 spans 15, category 3 spans 19 and category 4 spans 27. Part of the reason for the precise measures is to do with rounding when converting between mph, kts, and km/h! A change in 2012 adjusted the boundaries to make this easier, but the adjustments were chosen so that they wouldn’t alter historical counts. Identifying the groupings, summarisations or abstractions is important for understanding the total information content of the visualisation, and also for thinking about what kinds of source data might be used to recreate it. For instance, data on sustained wind speeds could be used.
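If someone asks how a raw wind speed becomes a coloured block in the chart, a minimal sketch of the grouping might help. The function names are my own, the category 2 and 4 boundaries (96 and 130 mph) are the standard ones rather than figures quoted in these notes, and I am taking “major” to mean category 3 and above, the usual NOAA definition.

```python
# Lower bounds (mph) of each Saffir-Simpson category, highest first.
# 74, 111 and 157 are quoted in the notes; 96 and 130 are the standard
# category 2 and 4 boundaries (an addition, not from the notes).
CATEGORY_FLOORS_MPH = [(5, 157), (4, 130), (3, 111), (2, 96), (1, 74)]


def saffir_simpson_category(sustained_mph):
    """Return the category (1-5), or 0 if below hurricane strength."""
    for category, floor in CATEGORY_FLOORS_MPH:
        if sustained_mph >= floor:
            return category
    return 0


def is_major(sustained_mph):
    """Assumes 'major' means category 3 or above."""
    return saffir_simpson_category(sustained_mph) >= 3


def band_width(category):
    """Width of a category's band in whole mph; category 5 is open-ended."""
    floors = dict(CATEGORY_FLOORS_MPH)
    if category == 5:
        return None
    return floors[category + 1] - floors[category]
```

The uneven band widths fall straight out of the boundaries, which is a useful reminder that the categories are themselves a lossy summary of the wind speed.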
This time from the audience: What data is recoverable in this graph? The source data! Because the data source is cited, a reader can go to NOAA and download these statistics… but a direct data citation would be even better! The count of hurricanes in each ten year period, broken down by category, is recoverable. That is because the information is expressed iconically. If the scale had been between 0 and 1000 then the counts would not have been recoverable, only estimable. What data is lost? By grouping them into ten year bands, we have no year-on-year picture of what we’re seeing. This is on top of the category grouping. Also lost: all of the decisions that went into making the graph. Bonus question… what data is irrelevant? The distinction between major and minor storms is the primary story. Arguably the finer grained detail about categories could be dropped. The most obvious detail lost would be that there have only been 3 category 5 hurricanes. (Don’t mention if it doesn’t come up: the 2011-16 data is irrelevant.)
We’ve really unpacked the graph now, so: What is misleading, or what should we be skeptical about? I guess I’ve asked a leading question, but I’ll try anyway. A quick show of hands: Is this visualisation convincing? Or is there something a little fishy here? Back in your groups, take a few minutes to discuss. The linear trend lines… always be wary of these! A trend might reflect many things: e.g., a change in the quality of data capture over time. And be wary that others will be wary! The categories, as discussed before. The final bar to the right: it’s not a 10 year period.
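To make the point about the final bar concrete, here is a small sketch with made-up numbers (NOT the NOAA figures). It fits an ordinary least-squares line through a flat series of bin counts where only the last bin is short. The function name and the toy data are my own.

```python
def least_squares_slope(ys):
    """Slope of the least-squares line through (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Toy counts per bin, invented for illustration only. Every bin has the
# same rate of 2 storms a year, but the last bin covers 6 years, not 10.
toy_counts = [20, 20, 20, 20, 12]

slope_with_partial_bin = least_squares_slope(toy_counts)
slope_without_partial_bin = least_squares_slope(toy_counts[:-1])
```

In this toy series the underlying rate never changes, yet including the short bin gives a downward slope while dropping it gives a flat line. That is the kind of artefact to look for before trusting a trendline.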
How could this visualisation be better? From the audience? Improvements: a more sophisticated trendline; ten year windows working backwards from 2016, skipping the first 4 years; or better, no ten year groupings (tricky to weigh up); data from 2017, or just drop the most recent figures.
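The “work backwards from 2016” idea can be sketched as follows. This is only an illustration: the function name is hypothetical, and the first year of the record is passed in as a parameter rather than assumed.

```python
def backward_bins(first_year, last_year, width=10):
    """Split first_year..last_year into full windows counted back from last_year.

    Returns (bins, leftover). bins is a chronological list of inclusive
    (start, end) pairs, each exactly `width` years long. leftover is the
    partial span at the start of the record that gets skipped, or None.
    """
    bins = []
    end = last_year
    while end - width + 1 >= first_year:
        bins.append((end - width + 1, end))
        end -= width
    bins.reverse()
    leftover = (first_year, end) if end >= first_year else None
    return bins, leftover
```

For example, `backward_bins(1993, 2016)` (1993 is an arbitrary start year chosen for the demonstration) yields two full decades and a skipped four-year stub at the start. The partial period ends up at the old end of the record, where it can be dropped, instead of at the recent end, where it drags the trendline.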
As I mentioned at the start, there’s a collection of alternate graphs drawn from the same data. Before we cycle through a few graphs, let me say straight up that there is no one perfect visualisation here. There is a lot of information being represented, and so some detail will be lost somewhere.
In this visualisation, major and minor hurricanes are grouped, and the two trends are clearer because of this. Note that, unsurprisingly, the decrease in minor hurricanes is steeper than the overall downward trend shown in the previous visualisation. A lot of information is removed, making the point clearer in my opinion. But a hand annotation in grey indicates a possible upward trend.
Here there is no binning of years, and so each bar represents the total number of storms in a single year. Instead of a linear trendline, there is a ten year moving average.
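For anyone unsure what a ten year moving average does, here is a minimal sketch. I am assuming a trailing window (each value averages the current year and the nine before it); the graph’s creator may have used a centred window instead, and the function name is my own.

```python
def trailing_moving_average(counts, window=10):
    """Average of each value and the window-1 values before it.

    `counts` must have one entry per year, with 0 for years without
    storms. Positions without a full window behind them are None.
    """
    averages = []
    running_total = 0
    for i, count in enumerate(counts):
        running_total += count
        if i >= window:
            # Drop the value that has just slid out of the window.
            running_total -= counts[i - window]
        averages.append(running_total / window if i >= window - 1 else None)
    return averages
```

Unlike the ten year bins, every year gets its own smoothed value, so the result does not depend on where the bin edges happen to fall.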
Does anything leap out about this visualisation compared to the previous one? Note that there are gaps. The cause of these gaps is unclear, but given that they occur both in recent years and further in the past, it is unlikely to be just that there are gaps in the record.
Similarly, this graph avoids storm categories too but shows a completely different quality, by measuring mean wind speed within a year. Now we lose detail on how many hurricanes there were in a year. A weakness in this approach appears to be that there are years with no hurricanes, but these years would still have measurable wind speed! So this might be more than the data can show.
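A small sketch of how a yearly mean wind speed might be computed shows where the gaps come from. The input format, a list of (year, sustained mph) pairs for landfalling hurricanes, and the function name are assumptions of mine, not the format of the challenge data.

```python
from collections import defaultdict


def mean_wind_by_year(storms, first_year, last_year):
    """Mean sustained wind speed per year; None for years with no storms.

    `storms` is an iterable of (year, sustained_mph) pairs (assumed format).
    """
    speeds_by_year = defaultdict(list)
    for year, mph in storms:
        speeds_by_year[year].append(mph)
    result = {}
    for year in range(first_year, last_year + 1):
        speeds = speeds_by_year.get(year)
        result[year] = sum(speeds) / len(speeds) if speeds else None
    return result
```

A year with no landfalling hurricane has nothing to average, so the honest value is “no data” and not zero. Note too that the mean hides the count: one storm at 100 mph and five storms at 100 mph give the same bar.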
Nevertheless, does anyone see any patterns in this graph?
Perhaps that the trendline shows a cyclical increase and decrease in the mean wind speed over time. Perhaps there’s a fall off since 1991. I’m no hurricane expert, but I gather this cyclical nature is actually a feature of weather systems.
Finally, this visualisation attempts to cram in a lot of information. Years are once again binned into 10 year groups along the x-axis, but this time the creator works backwards from 2016. Each dot represents a hurricane within a ten year band, grouped by category. The size of each dot corresponds to the sustained wind speed of the hurricane. I gather the position within the box is random. What this graph picks out is that perhaps category 4 and 5 hurricanes are becoming less frequent, but that category 3 hurricanes are roughly uniform over time!
A quick show of hands… who still believes the story of the original graph? Is there a clear trend in this data? Before I wrap up, are there any comments from the crowd about this process of pulling apart a graph?
Check out the BOM site for detail on cyclone trends. Overall it is a lot less hyperbolic, and packed with useful context… and the data!
Dr Tom Honeyman,
NSW Outreach Officer,