Make sure you are clear about the difference between a basic bar chart and a histogram, as they look similar. In a bar chart the horizontal scale represents a number of separate categories and the vertical scale the values they represent. The width of the bars is not important and a gap is often left between them. In a histogram the horizontal axis represents a continuous series of class intervals, e.g. pebble size, so the bars are drawn touching one another.
Give 4 features of a well presented and appropriately used bar graph.
Give an example of an occasion when you would use a histogram rather than a bar graph, and say how you would construct it.
State 2 advantages and 2 disadvantages of using bar graphs in a geographical investigation.
Suggest an alternative method of presenting the data shown above.
http://geographyfieldwork.com/DataPresentationPieCharts.htm
The pie chart is useful for showing how a total is divided into proportions. It often has good visual impact, but it can be difficult to read the data accurately, particularly if there are several categories. The segments should be drawn from the largest first to the smallest last, unless there is an "others" category, in which case that should be last regardless of its size. Segments should be shaded in different colours and a suitable key or labels added. The raw data and percentage figures can be added to the key if appropriate.
Proportional pies use the concepts of pie graphs and proportional symbols together. The diameter of each pie is proportional to the total. This method integrates data and involves a spatial element when plotted on a suitable base map. With some thought, "death by pie chart" can be avoided by using this more interesting alternative technique to present data. Notice the need for two keys, one explaining the size of the circles and one explaining their division.
[Figures: Pie Chart; Proportional Pie Chart; Proportional Pie Graphs (located on a base map)]
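The segment ordering rule and the percentage figures described above can be worked out before drawing. The following is only a minimal sketch: the function name `pie_segments` and the land-use counts are hypothetical and are not taken from the source.

```python
def pie_segments(counts, other_label="Others"):
    """Return (label, value, percent, angle) tuples in drawing order."""
    total = sum(counts.values())
    # Named categories go largest first, smallest last.
    ordered = sorted(
        ((label, value) for label, value in counts.items() if label != other_label),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # An "others" category is always drawn last, whatever its size.
    if other_label in counts:
        ordered.append((other_label, counts[other_label]))
    # Each segment's share of the total, as a percentage and as an angle.
    return [
        (label, value, 100 * value / total, 360 * value / total)
        for label, value in ordered
    ]


# Hypothetical land-use survey counts (illustrative values only).
land_use = {"Retail": 12, "Others": 9, "Residential": 24, "Offices": 15}

for label, value, percent, angle in pie_segments(land_use):
    print(f"{label:12s} {value:3d} {percent:5.1f}% {angle:6.1f} degrees")
```

The angles always sum to 360 degrees, which is a quick check that no category has been missed.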
Line graph
Line graphs show changes over time. All the points are joined up and the axes should normally begin at zero. Rates of change are shown well, although the choice of scale needs careful thought. Line graphs are unsuitable if there are only a few data points.
Scatter graph
Scatter plots are used to show a relationship between two data sets. The independent data should be placed on the horizontal (x) axis and the dependent data on the vertical (y) axis. The points should not be joined up, but a line of best fit showing the general trend is useful where there is an obvious correlation. http://geographyfieldwork.com/DataPresentationScatterGraphs.htm
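A line of best fit is often drawn by eye, but it can also be calculated. The sketch below uses the least-squares method, which is one common choice and is not named in the source. The function name `best_fit_line` and the river data are hypothetical; `xs` holds the horizontal-axis values and `ys` the vertical-axis values.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical fieldwork data: distance downstream (km) and mean pebble size (mm).
distance_km = [1, 2, 3, 4, 5]
pebble_mm = [52, 47, 40, 36, 30]

slope, intercept = best_fit_line(distance_km, pebble_mm)
print(f"pebble size = {slope:.1f} x distance + {intercept:.1f}")
```

A negative slope, as in this made-up example, corresponds to a trend line falling from left to right on the scatter graph.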
1. Measures of Central Tendency
When there is a lot of data it can be useful to find an average to summarise it, particularly when comparisons between data sets are desirable.

Measure: Mean
Method: All the data values are added together and then the total is divided by the number of values in the data set.
Evaluation:
(+) It takes into consideration all the data.
(+) The result can be used for further mathematical processing.
(-) It can be misleading if there are a small number of very high or very low values, which may distort the mean.
The mean is best quoted with reference to the standard deviation.

Measure: Median
Method: The central value in a series of ranked values. If there is an even number of values, the median is the mid-point between the two centrally placed values.
Evaluation:
(+) It is not affected by extreme values.
(-) It cannot be used for further mathematical processing.
The median is best quoted with reference to the interquartile range.

Measure: Mode
Method: The most frequently occurring number in a set of data values.
Evaluation:
(+) It is very quick to calculate.
(+) It is not affected by extreme values.
(-) It can only be identified if the individual values are known.
(-) The result cannot be used for further mathematical processing.
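The three averages described above can be compared on a single data set. This is a minimal sketch using Python's standard `statistics` module. The pebble sizes are hypothetical and include one very large value to show how an extreme value affects the mean but not the median or mode.

```python
import statistics

# Hypothetical pebble sizes in mm; 95 is a deliberately extreme value.
pebble_mm = [12, 15, 15, 18, 20, 22, 95]

mean_value = statistics.mean(pebble_mm)      # total divided by number of values
median_value = statistics.median(pebble_mm)  # central value once ranked
mode_value = statistics.mode(pebble_mm)      # most frequently occurring value

print(f"mean   = {mean_value:.1f} mm")
print(f"median = {median_value} mm")
print(f"mode   = {mode_value} mm")
```

With these made-up figures the mean is pulled well above most of the values by the single large pebble, while the median and mode stay close to the bulk of the data.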
2. The Spread of the Data
The mean, median and mode give a useful summary value for a set of data but give no information about the spread of values around the "average" figure. As such, the summary value can be misleading and give an untrue picture of reality. The spread, or deviation, from a central value can be measured, giving a fairer picture of the data set.
Measure: Range
Method: The difference between the highest and lowest value. Regularly used when describing climate figures.
Evaluation:
(+) Quick and easy to calculate.
(-) A crude measure, as it only considers the extreme values and makes no reference to any other values.

Measure: Interquartile Range
Method: The interquartile range is the difference between the 25th and 75th percentiles. The higher the interquartile range, the greater the spread of values around the median.
Evaluation:
(+) Although it is more complicated than the range, it is still quite simple to calculate.
(+) The result represents the spread of the middle 50% of values and is therefore more representative of the entire data set.
(+) Extreme values are not considered, so the result is unlikely to be skewed.
(-) Not all the data is considered.

Measure: Standard Deviation
Method: The standard deviation indicates the degree of clustering of the data values about the mean. It is calculated by measuring the difference (deviation) of each value from the mean; these results are squared and then added together. This total is divided by the number of values in the data set, and finally the square root of this result is taken. A low SD value indicates that the data is clustered around the mean, whereas a high value indicates that the data is widely spread, with some figures much higher and lower than the mean.
Evaluation:
(+) The best way to measure the spread of data around the central value, as it involves all the data.
(+) Allows useful comparisons of the distribution of values in a data set to be made.
(+) Gives results that can be used in further mathematical calculations and analysis.
(-) Reasonably complicated to calculate, although calculators and spreadsheets can help.
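The three measures of spread can be worked through on a small data set. The sketch below follows the standard deviation steps as described above (dividing by the number of values). The readings are hypothetical. Note that there are several conventions for finding quartiles; this example assumes the "inclusive" method of Python's `statistics.quantiles`, and other methods can give slightly different results.

```python
import math
import statistics

# Hypothetical set of eight readings (illustrative values only).
readings = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest value minus lowest value.
data_range = max(readings) - min(readings)

# Interquartile range: 75th percentile minus 25th percentile.
# The "inclusive" quartile method is an assumption made for this sketch.
q1, _median, q3 = statistics.quantiles(readings, n=4, method="inclusive")
iqr = q3 - q1

# Standard deviation, step by step.
mean_value = sum(readings) / len(readings)
squared_deviations = [(value - mean_value) ** 2 for value in readings]
variance = sum(squared_deviations) / len(readings)  # divide by number of values
sd = math.sqrt(variance)                            # finally take the square root

print(f"range = {data_range}")
print(f"interquartile range = {iqr}")
print(f"standard deviation = {sd}")
```

The step-by-step result matches `statistics.pstdev`, the standard library function that also divides by the number of values.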