
Transcript - Data Visualisation - Design and Principles

Published on 22 March 2018

Video & slides available from ANDS website: http://www.ands.org.au/news-and-events/presentations/2018

Published in: Education

[Unclear] words are denoted in brackets
Webinar: Data Visualisation – Design and Principles
22 March 2018
Video & slides available from ANDS website
START OF TRANSCRIPT

Gerry Ryder: So good afternoon, everyone, and welcome to the webinar today. My name is Gerry Ryder and it's my pleasure today to host this webinar about data visualisation. It's my pleasure to introduce Martin Schweitzer. Martin's currently working with ANDS as a data technologist. He has a background in computer science and a particular interest in data visualisation, data science and user interface design. He has a varied professional background which includes photography, working on large IT systems, lecturing, as well as running workshops and training courses. Martin is currently seconded to ANDS from the Bureau of Meteorology where he's largely responsible for the climate record of Australia. Today Martin is presenting for us the first in a series of two webinars focused on data visualisation. This first webinar will focus on visualisation design and principles while the second will focus on tools and techniques. So having covered off on those introductions it's my pleasure now to hand over to Martin for our presentation today. Thank you, Martin.

Martin Schweitzer: Thanks very much, Gerry, and hello, everybody. I'll just jump straight in. So when asked to present a series on visualisation the first question, I guess, that everybody will have asked is what is a visualisation. I wanted it to be slightly broader than just presenting graphic data so my definition of the
visualisation is that it's a visual explanation. It's anything that helps us understand something by looking at it. A typical example is something that should be familiar to most people, a map of the underground. One of the things that makes this a good visualisation is it helps show the relationships between the different objects inside this and how people in this case understand how to get from one point to another. If you're trying to imagine looking at a text description of how to get, for example, from Edgware Road to Blackfriars, it would be particularly complex, particularly if, for example, somebody told you that Tottenham Court Road is closed. One of the things that makes this visualisation famous was that the designer discovered that when you're underground it's really only just the relationships that mattered. The actual exact geographical location is a lot less interesting, and that can be seen in this visualisation.

So we'll just have a look at this. This shows the actual place on the map and then it morphs to what it looks like on the underground map. It just cycles through what the locations really look like and the underground map. So once again a beautiful visualisation of how the underground map actually maps to the real locations in London.

Yet another example, often for people who may be a bit 3D challenged would be familiar - many people would be familiar with this IKEA visualisation that shows us the correct way to construct a bookcase.

Why are visualisations important? Why don't we just have text descriptions? We have a lot of descriptive statistics. Well, one of my favourite examples and something that really made my hair stand on end the first time I saw it was this thing called Anscombe's quartet. Many people may be familiar with this. It's a famous example. What we have here are four data sets: one, two, three and four in Roman numerals. Each one is a series of X and Y values.
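For readers who want to check the quartet numerically, here is a minimal sketch in plain Python. The values are the published Anscombe (1973) data; the function and variable names are only illustrative, and the statistics are computed by hand rather than with any particular library.

```python
from math import sqrt

# Anscombe's quartet: sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summary(xs, ys):
    """Return means, sample variances, correlation and least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summary(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimal places the four rows print the same figures, which is why the plots, not the summaries, are what separate the data sets.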
Just looking at them it's very hard to read much into them, but we can look at their summary statistics and - sorry - for example, they all have the same average value for the X. They all have the same average value for the Y. The sample variance of both the X and Y is the same in all four of them. The correlation between the X and Y is almost
identical in all four of them. The linear regression is exactly the same. So a statistician may be tempted to just say, well, these numbers are pretty much the same. However, as soon as we look at the visualisation - in other words see the values plotted - we see something quite different. So just one example of how seeing a visualisation is very different to looking at the raw data.

Another example that I've taken is - we'll just go to some text and we will have a look at a file. This is the contents of a file. As you can see, it's probably not easy to interpret what's in the file. Most files when they're stored on a disc are just bunches of numbers. If I told you that these numbers represent RGB values arranged according to an XY grid, once again it may not be obvious what the numbers represent. However, if I do this and present them as an image - excuse me - suddenly we see, okay, we have an image. So as numbers the numbers meant - or as data the numbers meant absolutely nothing. However, as soon as we visualise it as an image it will make sense.

So as Gerry mentioned, I've been interested in visualisation for a very long time. In fact, over 20 years. One of the first books I came across was by Edward Tufte and was one of the seminal works. At that time I think it wasn't really realised that it would become a seminal work. He wrote a book called The Visual Display of Quantitative Information. In it he says, excellence in statistical graphics consists of complex ideas communicated with clarity, precision and efficiency. For the rest of the presentation I'm going to try and expand some of these ideas.

So he came up with a few principles. The first one is graphical displays should show the data. We'll go through these principles first and then we'll look at examples.
It should induce the viewer to think about the substance rather than about methodology, avoid distorting what the data has to say, present many numbers in a small space, make large data sets coherent, and reveal the data at several levels of detail from a broad overview to the fine structure.
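Going back to the file of numbers shown earlier: a minimal sketch of what "RGB values arranged according to an XY grid" means in practice. The four-pixel data, the function names and the choice of the plain-text PPM format are all assumptions made for illustration; the file used in the talk is not specified.

```python
def to_pixel_grid(values, width):
    """Group a flat run of numbers into rows of (R, G, B) pixels."""
    if len(values) % (3 * width) != 0:
        raise ValueError("length must be a multiple of 3 * width")
    pixels = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    return [pixels[r:r + width] for r in range(0, len(pixels), width)]


def to_ppm(grid):
    """Serialise the grid as a plain-text PPM (P3) image."""
    height, width = len(grid), len(grid[0])
    lines = ["P3", f"{width} {height}", "255"]
    for row in grid:
        lines.append(" ".join(str(channel) for pixel in row for channel in pixel))
    return "\n".join(lines) + "\n"


# Made-up 2 x 2 image: red, green, blue and white pixels.
RAW = [255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255]
GRID = to_pixel_grid(RAW, width=2)
PPM_TEXT = to_ppm(GRID)
print(PPM_TEXT)
```

Saved to a file with a .ppm extension, the same twelve numbers open as a picture in most image viewers, which is the point being made: the data did not change, only its presentation.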
In this book he's got many fine examples; however, I've tried to find more modern examples and I've taken some of the examples from the work that I do.

So the first one, show the data. What we're looking at here is a rainfall map of Australia and the government instituted a plan where they said they would give farmers concessionary loans if they were in a region that had suffered a one in 10 year rainfall deficiency or one in 20 year rainfall deficiency. So the map we're seeing here is a map where users can typically zoom in and out, but what we've done is to show only those areas where - that are affected or covered by this concessionary loan. So I guess one of the things is we could have shown a typical rainfall map, but ideally make this as simple as possible and show only the data, so the pink and red areas are the areas that had been affected by either a one in 10 or one in 20 year rainfall deficiency.

Next, induce the viewer to think about the substance rather than about the methodology. So what we're looking at here is in Kyoto, Japan cherry blossoms are a big thing. In Kyoto they've been recording the peak of the cherry blossom season since the year 800. So they have over 1000 years of data. What somebody's done is to plot all this data. What we see is that for about a century they pretty much peak between 10 and 20 April. However, since the early twentieth century they start blossoming earlier and earlier and a lot of people would say, well, this is a signal of climate change. However, what we wanted to show about this graph is that the person has plotted the actual data points using a little image of cherry blossoms which is quite cute. But they also noted in an article they wrote about it that initially they had plotted it with a cherry blossom with six petals until somebody pointed out that cherry blossoms only have five petals.
The point about that is if people are thinking about how many petals the cherry blossoms have rather than about what the graph is saying maybe they should have thought more about the substance than the methodology. But nonetheless, I think with any of these rules often it's a good thing to
break a rule now and again because in this case, for example, I certainly remembered this graph long after I'd seen it because I remembered the issue with the cherry blossoms.

The next one was avoid distorting the data and here we're going to do something exciting and that's do it live. So what I've done is we're now seeing what's known as Jupyter Notebook. I imagine a lot of people would be familiar with Jupyter Notebook. Jupyter Notebook allows us to run Python code and in the next webinar the whole webinar will be based around looking at our work in Jupyter Notebook; however, this is a small demo that I've got in this presentation.

What we're looking at here is storage levels in the dams that are around Melbourne. So the first graph I'll pull up I'll just - so this is fantastic at work. What we see in this graph is it looks like the Thomson, Cardinia and Upper Yarra dams are really low and all the rest of them are almost full. So we may worry a bit about that. However, when we look at this graph we see that we started - the base of it was 60 per cent full. So Cardinia, for example, is - well, let's take Thomson. It's actually almost 65 per cent full so it's really not that bad. When we look at the graph plotted against - starting at zero we note as well it doesn't look that bad.

We may also look at this and say, well, the other dams are all over 80 per cent so we've got nothing to worry about. However, not all these dams are the same size, so looking at only the percentage can be a bit misleading. So let's run this one. What we see here is that the amount of space in the Thomson Dam, there's probably not enough water in all of these smaller dams to even fill that gap that's in the Thomson Dam. So that's what we mean when we say avoid distorting the data. Try and make sure that we're telling a story with integrity.

The next principle was to present many numbers in a small space.
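The two distortions in the dam example can be put into numbers. This is a minimal sketch only: the dam names, capacities and fill levels below are invented for illustration and are not the Melbourne figures from the notebook; only the 60 per cent axis baseline and the roughly 65 per cent level come from the talk.

```python
# Hypothetical (name, capacity in megalitres, per cent full) rows.
DAMS = [
    ("Large dam", 1_000_000, 65.0),
    ("Small dam A", 60_000, 90.0),
    ("Small dam B", 30_000, 85.0),
]


def drawn_height(percent_full, baseline):
    """Bar length on a chart whose axis starts at `baseline` per cent."""
    return max(percent_full - baseline, 0.0)


def stored(capacity_ml, percent_full):
    """Volume of water currently held, in megalitres."""
    return capacity_ml * percent_full / 100.0


def empty_space(capacity_ml, percent_full):
    """Unfilled volume, in megalitres."""
    return capacity_ml * (100.0 - percent_full) / 100.0


# Distortion 1: an axis starting at 60 per cent exaggerates the difference.
RATIO_TRUE = 90.0 / 65.0
RATIO_TRUNCATED = drawn_height(90.0, 60.0) / drawn_height(65.0, 60.0)

# Distortion 2: percentages hide how different the dams are in size.
GAP_IN_LARGE = empty_space(DAMS[0][1], DAMS[0][2])
WATER_IN_SMALL = sum(stored(cap, pct) for _, cap, pct in DAMS[1:])

print(f"bar ratio from zero: {RATIO_TRUE:.2f}, from 60 per cent: {RATIO_TRUNCATED:.2f}")
print(f"gap in large dam: {GAP_IN_LARGE:.0f} ML, water in small dams: {WATER_IN_SMALL:.0f} ML")
```

With these made-up numbers a dam that holds about 1.4 times the percentage is drawn six times as tall on the truncated axis, and all the water in the smaller dams would not fill the empty space in the large one.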
The map that we're looking at here is Australian rainfall deciles. So this is that - the areas that are in this bright red have received the least rainfall this December, they're in the lowest one per cent of December rainfalls.
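As a rough illustration of how one grid cell ends up in a decile, here is a minimal sketch that ranks a single value against that cell's historical record. The ranking rule and the made-up history are assumptions; the exact method used to produce the map is not described in the talk.

```python
def percentile_rank(history, value):
    """Share of historical values strictly below `value`, in per cent."""
    below = sum(1 for past in history if past < value)
    return 100.0 * below / len(history)


def decile(history, value):
    """Decile 1 (driest tenth of the record) to 10 (wettest tenth)."""
    return min(int(percentile_rank(history, value) // 10) + 1, 10)


# Made-up record for one grid cell: 100 Decembers with 1 mm to 100 mm of rain.
HISTORY = list(range(1, 101))

print(decile(HISTORY, 3))    # a very dry December
print(decile(HISTORY, 55))   # a middling December
print(decile(HISTORY, 100))  # the wettest December in the record
```

Repeating this for every grid cell and colouring each cell by its rank is what turns a century of records into a single map.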
These tiny dark blue patches are in the highest one per cent of rainfall that - this record goes back to 1910 so they take every year from 1910. We say present many numbers in a small space. So what we're looking at here is a grid, and they're roughly 640 by 800 grid cells. So each one is calculated and for each one there's 117 years of data. So what we're looking at is almost 36 million data points; however, we've condensed those 36 million data points into one, well, simple map. So I think this is a fantastic example of presenting many numbers in a small space.

Sometimes, as I said, we want to break the rules and get something where we break the rules. This was the recent tropical cyclone. We've got a visualisation that shows the current position of the cyclone. This arguably is just one data point; however, it's a really important data point, particularly if you're living in the north of Western Australia and you want to know how close the cyclone is or whether it's got a chance. Also we can - by clicking on that one point we see a far more detailed image which then takes us into seeing the data at different levels.

The next one was around making large data sets coherent. This is something that at the bureau we're very interested in. How do you communicate things like probability? When people hear almost certainly do they think that an event is more probable or less probable than if they hear highly likely or if they hear very good chance? So what they've done here is taken all these terms and presented them using a technique known as KDE on one graph. So we can very easily compare that, for example, if somebody says, chances are slight, that people think that there's actually slightly more chance of an event happening than if we, for example, say, it's highly unlikely, or if we say, there's almost no chance.

So that covers off on Tufte.
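KDE (kernel density estimation) smooths a set of individual answers into a curve. The sketch below is a minimal Gaussian-kernel version in plain Python; the survey responses, the bandwidth of 5 and all names are invented for illustration and are not the data or settings behind the graph shown.

```python
from math import exp, pi, sqrt


def gaussian_kde(samples, bandwidth):
    """Return a density function built from one Gaussian bump per sample."""
    norm = 1.0 / (len(samples) * bandwidth * sqrt(2.0 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


# Made-up answers: the probability (per cent) respondents attach to a phrase.
RESPONSES = {
    "almost certainly": [95, 90, 97, 92, 99, 85],
    "chances are slight": [10, 15, 5, 20, 12, 8],
}

GRID = list(range(101))  # evaluate each curve at 0, 1, ..., 100 per cent
CURVES = {
    phrase: [gaussian_kde(values, 5.0)(x) for x in GRID]
    for phrase, values in RESPONSES.items()
}
PEAKS = {phrase: max(GRID, key=lambda x: curve[x]) for phrase, curve in CURVES.items()}

for phrase, peak in PEAKS.items():
    print(f"{phrase}: density peaks near {peak} per cent")
```

Drawing every phrase's curve on the same axis is what makes the comparison between terms immediate.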
The next few slides are some of my ideas and some of my experience in developing visualisations and some things that I feel are important. One of the most important things in any visualisation is that you actually have something interesting to say about the data. Whenever I see somebody saying, we've got this data, it looks pretty boring. Can we just create a
visualisation, well, that's when the hairs on my neck prickle a bit. So this is a famous video. It started off as a TED Talk by the Swedish Hans Rosling.

[Video playing]

Martin Schweitzer: Okay. I think people get the idea. Now, one of the things that strikes me about that video is talking about inequality, et cetera, and gave this TED Talk. At a similar time, Thomas Piketty, who was famous for his book on capitalism, also gave a TED Talk. I watched both talks. Both were equally impressive. I thought Piketty's was the more impressive. However, Rosling's - the one you've just seen - got 10 times as many views roughly as Piketty's, and I think the real reason it got so many views was because it had such a story here. It had such remarkable visualisation and graphics. So it certainly says that it's important. Obviously Rosling is a very - or was a very impressive storyteller and was just a very impressive presenter and so did it really well. Of course, not all of us have his talents; however, we can all do good or great visualisations.

So here's a simpler graphic and this one shows the trend in maximum temperatures from 1970 to 2016. So wherever the graph is red the average maximum temperature has been increasing and wherever the graph is blue the maximum temperature has been decreasing over the years. I think this one tells quite an alarming story.

Here's another visualisation and this one I've got three slides which show a progression of how we're trying to convey something. So in the first slide the person has just taken the data and they've put it - this is rainfall data. They've started at 1900 and showed how much rainfall up to years 2010. Now, there are two large influences on rainfall. One is the ENSO which is - often we hear that in a La Niña system or an El Niño system. The other one is what is marked as IOD which is Indian Ocean Dipole. Once again, these can be either positive or negative.
So we've got two, four, six, seven different colours in the graph showing that when this rainfall fell what kind of system we were in. However, this doesn't
really tell a good story. If we look at it having been rearranged we see that the blue lines on the right when - all the years where we had a lot of rainfall all tended to be where we had a La Niña and a negative Indian Ocean Dipole. The red and brown on the left were during generally El Niño years. However, we can improve this as well because we've got seven different things. We have to keep looking at the colours, move forwards and backwards. So here's a graph where what we've done is we've plotted the IOD along the bottom going from negative to positive. We've plotted the ENSO along the left-hand side. So these numbers in the top right we can see had a strong ENSO signal, strong La Niña, and a positive IOD, while these numbers to the left had a - sorry. These are the La Niña and the negative IOD. We can see as it gets stronger how it affects the rainfall.

Here's another graph which also tells quite an alarming story. This is the water supply in Cape Town and in 2013/14 we can see they typically get their rainfall in winter. So around about - from October onwards the dam levels start falling. Because for about the last five years they haven't been - there hasn't been good rain, they've continually been falling each year progressively. That's 2013, 2014, 2015, 2016, up till this year which is 2017/18. We see when I pulled up this graph it was between January and February and we were over there and they were projecting that around April/May/June Cape Town could run out of water. There were a few projections. One is if people use 600 megalitres a day of water, one with 500. One is if they were using 600 megalitres and they've started up desal plants so what would happen. All of them show pretty dire consequences. A visualisation like this really does tell a story.

So the next principle is keep your graph as simple as possible. I've made a very quick 3D graph.
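The Cape Town projections described above boil down to dividing what is left in the dams by what is drawn each day. A minimal sketch, assuming a hypothetical 60,000 megalitres of usable storage and a hypothetical 50 megalitres a day from desalination; only the 600 and 500 megalitre usage scenarios come from the talk, and the real projections will have used a more detailed model.

```python
def days_of_supply(usable_storage_ml, daily_use_ml, daily_inflow_ml=0.0):
    """Days until storage is exhausted at a constant net draw."""
    net_draw = daily_use_ml - daily_inflow_ml
    if net_draw <= 0:
        return float("inf")  # supply keeps pace with demand
    return usable_storage_ml / net_draw


USABLE_STORAGE_ML = 60_000  # hypothetical figure, for illustration only
DESAL_ML_PER_DAY = 50       # hypothetical figure, for illustration only

SCENARIOS = {
    "600 ML/day": days_of_supply(USABLE_STORAGE_ML, 600),
    "500 ML/day": days_of_supply(USABLE_STORAGE_ML, 500),
    "600 ML/day with desal": days_of_supply(USABLE_STORAGE_ML, 600, DESAL_ML_PER_DAY),
}

for label, days in SCENARIOS.items():
    print(f"{label}: about {days:.0f} days")
```

Each scenario becomes one projected line on the chart, and the date where a line reaches empty is the story the graph tells.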
I've just made a fictitious one, which is how many morning teas each person attended and maybe the person that attends the most morning teas at the end of the year gets a prize and the person who has attended the least gets a wooden spoon. So this was my first graph and I felt, well, this can always be improved. Whenever I see
a 3D graph if it's not displaying 3D data I'm a little bit disturbed. So I modified it so we've now got a 2D graph. However, the numbers are in the [box]. We probably don't need those grids and as many of them. We certainly don't need our dotted and solid line grids, so I cleaned that up a bit. So there's a simpler graph. However, when looking at that graph - and often I see graphs like this - the first question I ask is, what do those colours mean? Why are there different colours? Well, in this case the colours mean absolutely nothing, so I've got rid of the colours.

The next thing is getting back to this idea maybe of telling a story. What am I trying to say? Well, really what I'm trying to do is find out who attended the most and least morning teas. So maybe by improving the graph, well, I've now put the least - I've ordered them from least to most and now it's quite obvious who's attended the least and who's attended the most.

So is there anything else we can do to make this presentation simpler or to remove any unnecessary data et cetera? This is a trick question, but of course there is. Well, in this particular case I think we can just remove the graph altogether. I don't think that that visualisation has given us any more information than simply looking at a table of numbers. The table remains ordered. I get exactly that same information. So it's probably important to ask that question occasionally. Do we really need a graph for this data, or do we really need a visualisation for this data? I think Antoine de Saint-Exupéry said it best when he said, perfection is achieved not when there's nothing more to add but when there's nothing left to take away. However - this was a however - Einstein was apparently famous for saying, make it as simple as possible but no simpler.

So here's another example of a visualisation. This is called a skew-T log-P graph, and this is used by meteorologists every single day. Temperature is on these diagonals.
The pressure is going along this way. The reason it's called log-P is because at the bottom you see the gap between 100 - 900 and 1000 is much smaller than the gap between 200 and
300. So even the scale appears to be changing. There are two different colour lines. Each of those lines has a meaning. The red line is what was recorded today and the blue yesterday. In case - well, I imagine most people aren't familiar with these graphs. So what this is actually plotting is at a lot of locations around the world they send up weather balloons or sondes. So this is plotting the temperature as the balloon is moving up through the atmosphere. So we can see that it's getting cooler, et cetera. The second line is the dew point. So we can see, for example, if the dew point crosses the temperature we're going to get precipitation or rainfall and so on.

On the right-hand side we've got another particularly interesting thing being visualised here and these are called wind barbs. The direction of the wind barb shows the direction of the wind. So these ones pointing upwards show northerly wind. The number of feathers shows the speed of the wind. So the short ones are five knots. The long one is 10 knots. A long and a short is 15 knots and so on. I won't go too much into this. But the fact is that for a meteorologist, this is a really important graph. It's not as simple as a bar chart or a line graph, et cetera, but it's serving its purpose. That's the most important thing. A visualisation has to be fit for purpose.

The next thing we'll look at is colour. I'm not going to go into colour in a lot of detail, the main reason being because you can spend hours talking about colour to really understand it thoroughly. I've got a few suggestions, but the most important one, I think, is for colour if it's important please try and find somebody who's an expert. There are lots of different factors to consider, things like colour blindness, common conventions, cultural differences and so on. This is just a very simple example. These are from images of blood travelling through an artery.
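The wind barbs mentioned above follow a simple additive code. A minimal sketch, assuming the usual meteorological convention of 5 knots for a short (half) feather, 10 knots for a long (full) feather and 50 knots for a filled pennant; the pennant is not mentioned in the talk, and the function names are only illustrative.

```python
def barb_speed(pennants=0, full=0, half=0):
    """Wind speed in knots encoded by a barb's feathers."""
    return 50 * pennants + 10 * full + 5 * half


def barbs_for(speed_knots):
    """Feathers needed to draw a speed, rounded to the nearest 5 knots."""
    rounded = 5 * round(speed_knots / 5)
    pennants, rest = divmod(rounded, 50)
    full, rest = divmod(rest, 10)
    half = rest // 5
    return pennants, full, half


print(barb_speed(full=1, half=1))  # one long and one short feather
print(barbs_for(65))               # (pennants, full, half) for 65 knots
```

The encoding is compact enough that a forecaster can read speed and direction at every level of the sounding at a glance, which is what makes it fit for purpose.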
This one - well, basically they showed these different images to a lot of doctors and asked which one they preferred. Most doctors came up with this A. However, when they asked people to diagnose the issues with these things they were - then I think the best one was F or G where they were able to identify the most issues or see the most problems
with a patient. So even though they thought that this one was the easiest one to read - the colourful one - admittedly they were used to those colours, et cetera. It's not always the case. The reason I'm saying this is it really does say that colour can be a tricky issue and that really it does need some expertise and, in this case, it was actually through some research. In the slides there's a reference to this paper that talks about this. It's quite an interesting paper.

Just on the topic of colour, here are some examples from the bureau once again. This one is showing rainfall. It's using a gradated scale, so darker means more rainfall. They've used the colour blue which makes sense because the more saturated blue tends to show areas that have more saturation in terms of rainfall.

This map is not showing how much rainfall but it's showing how variable the rainfall is, in other words, how much it differs from year to year. So it wouldn't have made sense to use blue here because some areas can be very dry but at the same time have a lot of variability of - or have very little variability. Areas that may be very wet may have a very small variability because they're wet all year round, just as areas that are wet all year around have low variability. So this one is showing - that's chosen a different colour for this one.

This one is showing how much rainfall in this case fell in the week of 23 January. This is using a scale that people who are looking at this type of map are familiar with. The white areas have had no rainfall or not been able to record it, and these dark colours are the areas of the highest rainfall. Once again we see that this scale is not linear, so there's a colour for between one and five millimetres. There's a different colour for between 300 and 400 millimetres.

I think that's useful when looking at visualisations also to see examples of maybe things that we can try and avoid.
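A non-linear colour scale like the weekly rainfall one is just a lookup against unevenly spaced class boundaries. A minimal sketch using the standard `bisect` module; the boundary list is hypothetical apart from the 1 to 5 and 300 to 400 millimetre classes mentioned in the talk.

```python
from bisect import bisect_right

# Class boundaries in millimetres (hypothetical except 1-5 and 300-400).
EDGES = [1, 5, 10, 15, 25, 50, 100, 150, 200, 300, 400]
LABELS = (
    [f"<{EDGES[0]}"]
    + [f"{low}-{high}" for low, high in zip(EDGES, EDGES[1:])]
    + [f">={EDGES[-1]}"]
)


def rainfall_class(mm):
    """Label of the colour class that a rainfall total falls into."""
    return LABELS[bisect_right(EDGES, mm)]


for total in (0, 3, 350, 400):
    print(total, "mm ->", rainfall_class(total))
```

The lowest named class spans 4 millimetres and the highest spans 100, so light falls stay distinguishable without the extremes running off the scale.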
This is always the part of this presentation that I feel uneasy about, but I think it's just worth having a look at an example so we'll have a quick look at this one. So what this is talking
about is average household debt in America by this person who is a financial data journalist. It's how much debt you have. It's an infographic. So the first thing I looked at when I saw this is we've got some sort of thing that looks like a visualisation, and I tried to work out what it's telling us. I looked at it and I thought, well, why are some people green and some people - is it the green ones have less debt? No. All different sizes. They've - I realised that it probably doesn't mean anything. It's just decoration, so we can move on.

So the next thing is the total owed by the average. We see credit cards are 16,000, mortgages are almost 10 times that amount, but the mortgages aren't actually 10 times as long in the visualisation. 28,000 is a lot longer than 16,000. So there's clearly no clear scale - well, I should just say there's no clear scale on this. Once again, we've got different colours but, once again, they seem just for decoration. The other thing is I couldn't understand why any type of debt is 134,000 while mortgages are 176,000. So it wasn't quite clear what any type of debt meant. Also credit cards and auto loans were lumped together with mortgages which are more of an asset and some people differentiate between things like mortgages, which they classify as good debt and things like auto loans which are classified as bad debt.

The next one is how much does debt cost you. This is probably one of the better ones, but there's no - given that she's used comparative scales in the previous ones I was surprised that there wasn't any comparative scale. I think one thing I did notice here was that this figure from memory didn't really add up.

This was an interesting one, medical debt on the rise. There were a few issues with this but one of the things we notice is if that's 63 per cent then that one is about 37 per cent and yet that 37 per cent segment actually looks a bit bigger than the 42 per cent segment.
Considering that halfway across would be 50 per cent I don't think that that 42 per cent is accurately reflected in the pie chart. I won't go into the colours that have been chosen or talk much more about pie charts. A lot of
people have very strong opinions about how useful pie charts are.

We now come to debt broken down by age. In this one it actually looks as though the colours may be meaningful because there are two red bars, two orange bars, and two green bars, but once again it just seems that the colours were arbitrarily chosen. That's all I'll say about that, but - except to say I do think - have a look at examples and always look critically. Look critically at your own work at things that can be improved. But also when looking at other things think about, okay, is this a good visualisation? Is it a bad one? When you see something that looks good what makes it look good? When you see something that looks okay maybe think, how could it be improved? What could this person have done to make the story clearer?

So what are some techniques that you can use when doing a visualisation that will make it better for the people looking at it? One of the first ones I talk about is natural mappings. What we're looking at here is what's called a wind rose. What this is showing is wind in eight - not quadrants but eight sectors - and how windy it is. So this is Melbourne Airport that we're seeing here and we see that most of the winds at Melbourne Airport are northerly. These are the averages taken over a particular period. As we go out in this telescope it shows us stronger and stronger winds. So, for example, we hardly ever have, let's say, gale force winds in this south-westerly direction. There's very few easterlies at Melbourne Airport. But the natural mapping is if it's facing upwards then we can see straightaway it's a northerly wind.

We've seen this graph before, but the important thing is to highlight relevant information. So if all five of these lines were the same colour it wouldn’t be quite clear what the story's telling us, but it - given that this one is highlighted and the others are muted we can see straightaway it - our focus shifts to this one.
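The counting behind a wind rose like the one described above can be sketched in a few lines: each observation is dropped into one of eight 45-degree sectors and a speed band. The observations and the 10, 20 and 30 knot band boundaries below are invented for illustration; they are not the Melbourne Airport data.

```python
from bisect import bisect_right

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
SPEED_EDGES = [10, 20, 30]  # knots; hypothetical band boundaries


def sector(direction_deg):
    """Map the bearing the wind blows from to one of eight sectors."""
    return SECTORS[int(((direction_deg % 360) + 22.5) // 45) % 8]


def wind_rose(observations):
    """Count observations per sector and speed band."""
    table = {name: [0] * (len(SPEED_EDGES) + 1) for name in SECTORS}
    for direction_deg, speed_knots in observations:
        band = bisect_right(SPEED_EDGES, speed_knots)
        table[sector(direction_deg)][band] += 1
    return table


# Made-up (direction in degrees, speed in knots) observations.
OBSERVATIONS = [(350, 12), (10, 25), (5, 8), (180, 15), (225, 5), (0, 32)]
ROSE = wind_rose(OBSERVATIONS)

for name in SECTORS:
    print(name, ROSE[name])
```

Plotting each sector's counts as a spoke pointing in that sector's compass direction gives the natural mapping: a long spoke pointing up means frequent northerlies.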
The next thing, make comparisons clear. So what this is comparing is Arctic ice. This is going back to 1879 and it's comparing the - as we're progressing into the present. One of the things we see is it seems pretty clear that
there's less and less Arctic ice as we're coming into the present. By overlaying those plots one on top of the other it makes it a lot clearer. Going back to this graph we see once again by plotting all these different attributes on the same set of vertical axes it makes those comparisons much clearer. So, for example, when we're comparing highly likely to very good chance we can see quite clearly how they compare.

The next thing is in this case it's probably exaggerated but make the scale clear. This is showing the stations in Australia that record - it's showing basically the largest difference between two days - so between the maximum temperature on day 1 and day 2. So at these stations there was a 25 degree or 27 degree difference. So one day the maximum temperature was 10 degrees and the next day 37 degrees, for example. As we went further north there's less difference between successive days in temperature in terms of their records.

Yet another visualisation, this time of space, and we've got a very different scale here. It's probably hard to read on the slide, but that distance there is 100 million light years across. So a light year is pretty big. 100 million light years is 100 million times as big.

Finally, colour should add meaning and not detract. We come back to this slide, which is how much Australia has - or the warming trend in Australia since 1970. Here clearly colour is enhancing the meaning of what we're trying to say here.

Use conventions. If we look at this time series of temperature, at first look it may seem that temperature is actually declining. This is just a dummy slide I created for this presentation. What I've done here is these temperatures are actually - if we look carefully at these numbers we see the numbers are actually decreasing as we go from left to right. Normally when we read from left to right we expect time to increase - in other words, get either closer to the present or further into the future.
By turning it around we've defied that convention and then obviously made this a whole lot harder to read. There's a lot of ways to display different dimensions, and I'll just - sorry. I'll just go there and I'll just skip this for the moment. We'll go back to it if
we've got a bit of time. So here's another slide showing how we can plot dimensions very differently. In this graph or in this visualisation what we've done is this is temperature in Africa but across a range of latitudes going from 30 south to 30 north. So the Y axis is latitude. The X axis is the month of the year. The actual colours depict the rainfall during those months. So what we see here is in the southern latitudes we get rainfall around December/January/February. As we go north of 20 degrees north it's very dry and around about 10 degrees north they get mostly a winter rainfall. This way of plotting data is known as a Hovmöller plot.

These are called Chernoff faces and what this does is allows us to plot multidimensional data by using faces. So Chernoff said people, their brains are hardwired to really recognise faces quickly. So what we can do is we've got about seven or eight different attributes we can change. We can change the smile on their mouth. We can change the length of their nose, the distance between their eyes, the amount by which eyebrows are raised and so on. So we've taken a dummy data set here comparing different universities, different people across the universities, and then we've said, okay, we'll use, for example, the eye colour to show how - where they are, [of data for sharing] and maybe the length of the nose to show awareness of data licensing, et cetera. So basically, it's a novel way of displaying data with a high number of dimensions.

As I keep saying, it's always good to break the rules. Some people may be familiar with this image. It's called Pale Blue Dot. If you're not familiar it's a visualisation - well, I guess any image can be. But what it's showing is over there there's a pale blue dot. This photograph was taken by Voyager 1 from out of space - well, from space. That pale blue dot there, almost single pixel down there, is Earth.
So often we're told to make the data we're displaying significant and obvious. In this case the strength of this visualisation comes from how insignificant that tiny little dot on that photograph is, how insignificant this huge planet that we live on is. Da Vinci has said, simplicity is the ultimate sophistication.
I've got a few things in my slides. I'll just go back to the slide that I was trying to find earlier. Which one did we - for some reason - what I'll do is I'll rewind that. So - okay. So what we're going to see is how Australia's temperatures changed for the 12 months ending December 1910. I'll just maximise this. This is an animation. The colour shows the year and we see as we're coming more and more to the present the colours spiralling outward, representing warming. So I guess what makes this visualisation effective is not only the animation but also the fact that we were able to draw a line which shows about 100 years of data which typically would have been a very long line but in this case by wrapping it around the inner circle we were able to show it all in one compact way.
So finally, all visualisations are wrong. What do I mean? There's a famous quote from George Box, the statistician, that said, all models are wrong. He said, all models are wrong. The only question of interest is is the model illuminating and useful. I've changed that to, all visualisations are wrong. The question is is the visualisation illuminating, useful, and does it have integrity? Thank you.
Gerry Ryder: Thank you so much, Martin, for that really valuable presentation that I'm sure has given us all a lot of ideas and some things to look forward to in the next webinar where we'll actually see some of the tools that you've used to create these examples. We do have time for questions if we have anyone in the audience that would like to ask Martin a question about anything he's presented on today. Please do put it into the question pod and I'd happily relay that and put Martin on the spot. So we've got a number of people thanking you, Martin, for a really interesting talk. We have got one question, Martin, from [Mark Mackay] who's asked if you could suggest any textbooks or papers that he could share with students.
Martin Schweitzer: Yes, I do, quite a few.
I've actually put them in the slides. So at the end of the slides there's some references. I believe the slides are going to be made available, Gerry.
Gerry Ryder: Yes. That's correct. We'll have both the slides up as well as the recording up. So you can have a look at the slides separately to the recording.
Gerry Ryder: Another question, Martin, can you provide the name of the visualisation with the faces. Somebody's obviously liked that one.
Martin Schweitzer: Chernoff faces, C-H-E-R-N-O, either V or F-F.
Gerry Ryder: So perhaps we might put them - Susannah, we might be able to pop that in the question box for people to see, C-H-E-R-N-O-V or F-F. Someone's - Richard's asked, Martin, you've used Jupyter Notebooks. He's pre-empting the next webinar. What sort of other technologies do you normally use to build visualisations? Another question related about open source software for visualisations. So I know we'll cover that in the next webinar, but perhaps a teaser today, Martin.
Martin Schweitzer: So definitely Jupyter Notebooks and Python. So the next webinar will focus largely on Python. I also do a lot of work with web front-ends and JavaScript. So if somebody's working with JavaScript there's a huge array of visualisation tools but probably if one - if you don't mind a steep learning curve and want to be able to do absolutely everything, D3.js is the go-to one and it's open source.
Gerry Ryder: Thank you, Martin. Someone wants you to - [Jacinta] wants you to look in a crystal ball and asks, what do you see as the future direction of data visualisation?
Martin Schweitzer: Wow. I think the - what's happening is we're getting to things with higher and higher resolution. We're going to more dimensions so we've got the three dimensional static flatwork. We move to two dimensional animation with the web. One of the things that's becoming popular is virtual reality, so people can put on some glasses and maybe see storms being - the data for the storm being visualised but in their own surroundings.
So what does it feel if a rain - and that actually gets us on to the next one which is augmented reality.
So I can look around at Monash University or let's say I could go down to St Kilda Beach and see what it's going to look like maybe in 100 years with the sea level rising two feet or 10 feet or something like that. So both exciting and scary.
Gerry Ryder: As technology changes tend to be. We do have a couple more minutes if there is any other final questions for Martin. So Lisa is interested in the relationship between storytelling and data and the idea of integrity and worries about collecting data to suit a story and there being a lack of rigour and accountability. I guess that's a comment more than a question, but you might like to respond to that, Martin.
Martin Schweitzer: I think it's a - integrity is always in the mind of the beholder, so that you can't - data cannot have integrity. The people using and presenting the data need to have integrity. They need to present the data with integrity. I would say any tool that can be used for good can also be used for evil. So, yes, people can create visualisations that try and push an agenda or push a point, et cetera. Hopefully by being more critical of visualisations we can actually see those ones where somebody is trying to push something which isn't true. That's why I also push for integrity in data that as soon as we show a visualisation that, let's say, only shows 30 years of data where maybe, let's say, temperatures have been decreasing immediately it puts a cloud over everything that person is saying because why have they picked that one 30 year period where the temperature was dropping? So I think in the long run it pays to be as honest as one can about data.
Gerry Ryder: A final question today, thanks, Martin. Is there a common standard for colour coding for general use in data visualisation?
Martin Schweitzer: A very simple and short answer, no, absolutely not.
However, there is a website called ColorBrewer - actually, it's called ColorBrewer 2, so colour is spelt the American way, and brewer like somebody who brews. I would recommend anybody looking for a good set of colours to go there first.
[There are tools] for visualisation. [We'll] actually use the - so it was written by a researcher called - her last name is Brewer and she's done a lot of research into colour and how to use it well.
Gerry Ryder: Great. I'd like to thank now Martin for his presentation today and also acknowledge Susannah who's been quietly sitting in the background responding to your questions and making sure the webinar runs smoothly. So thank you all today and have a great afternoon.
END OF TRANSCRIPT
