Module 1
(Extracts from Chapter 1 of Statistics for Geography and Environmental Science)
DATA, STATISTICS AND GEOGRAPHY

Module overview
To convince you that studying statistics is a good idea!
Our argument is that data collection and analysis are central to the functioning of contemporary society, so knowledge of quantitative methods is a necessary skill for contributing to social and scientific debate.

About statistics
Statistics are a reflective practice: a way of approaching research that requires a clear and manageable research question to be formulated, a means to answer that question, knowledge of the assumptions of each test used, an understanding of the consequences of violating those assumptions, and awareness of the researcher's own prejudices when doing the research.

Some reasons to study statistics
Reasons for human geographers
– Data collection and analysis are central to the functioning of society, to systems of governance and science.
– Knowledge of statistics is an entry into debate, informed critique and the possibility of creating change.

Some reasons to study statistics
Reasons for GI scientists
– To address the uncertainties and ambiguities of using data analytically.
– Because of the increased integration of mapping capabilities, data visualizations and (geo-)statistical analysis.

Some reasons to study statistics
Reasons for all students
– They provide a transferable skill set used in other areas of research, study and employment.
– There is a recognised shortage of students with skills in quantitative methods, especially within the social sciences.

Types of statistic
Descriptive
– Used to provide a summary of a set of measurements, e.g. the average.
Inferential
– Use the data at hand to convey information about the population ('the greater something') from which the data are drawn.
Relational
– Consider whether greater or lesser values in one set of data are related to greater or lesser values in another.
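As an illustration of a relational statistic, the sketch below computes a Pearson correlation coefficient, one common way (not named on the slide) of asking whether greater values in one variable go with greater or lesser values in another. The function name `pearson_r` and the elevation and temperature values are made up for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative (invented) values: site elevation in metres and
# mean temperature in degrees C at the same five sites.
elevation = [100.0, 200.0, 300.0, 400.0, 500.0]
temperature = [15.0, 14.0, 12.5, 12.0, 10.0]

r = pearson_r(elevation, temperature)
print(f"r = {r:.2f}")  # negative: higher sites are cooler in this toy data
```

A value near +1 or -1 indicates a strong linear association and a value near 0 a weak one. An association of this kind does not by itself show that one variable causes the other.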

Geographical data
These are records of what has happened at some location on the Earth's surface and where.
For many statistical tests the where is largely ignored.
However, it is central to geostatistics and to spatial statistics (as their names suggest).

Some problems when analysing geographical data
Standard statistical tests assume that each 'bit' of data (each observation) has a value that is not influenced by any other.
However, we may often expect there to be geographical patterns in the data.
– Spatial autocorrelation: geographical patterns in the measurements
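One widely used way to quantify spatial autocorrelation is Moran's I. The slide does not name a specific measure, so treat the following as a minimal sketch under that assumption. The four locations, their neighbour weights and the function name `morans_i` are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square matrix of spatial weights.

    weights[i][j] is assumed to be 1 if locations i and j are neighbours
    and 0 otherwise (with zeros on the diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products of deviations between neighbours.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    sum_sq = sum(d * d for d in dev)
    return (n / total_weight) * (cross / sum_sq)

# Four hypothetical locations in a row; each is a neighbour of the next.
neighbours = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

trend = [1.0, 2.0, 3.0, 4.0]        # similar values sit next to each other
alternating = [1.0, 4.0, 1.0, 4.0]  # neighbours are as different as possible

print(morans_i(trend, neighbours))        # positive
print(morans_i(alternating, neighbours))  # negative
```

Positive values suggest that neighbouring observations tend to be alike, which is the situation that violates the independence assumption described above. Negative values suggest that neighbours tend to differ.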

Some problems when analysing geographical data
Determining what causes what in a complex and dynamic natural or social system is extremely tricky.
Two things may be associated (e.g. greater income inequality and more non-recycled waste) without the one directly causing the other.

Some problems when analysing geographical data
Data and structured forms of enquiry can only tell us so much and may not be appropriate to some types of research, for which a more qualitative, participatory or less representational approach may be better.

Further reading
Chapter 1 of Statistics for Geography and Environmental Science by Richard Harris and Claire Jarvis (Prentice Hall / Pearson, 2011).
Includes a review of the following key concepts: types of statistics; why error is unavoidable; geographical data analysis; and spatial autocorrelation and the first law of geography.

Module 2
(Extracts from Chapter 2 of Statistics for Geography and Environmental Science)
DESCRIPTIVE STATISTICS

Module overview
This module is about "everyday statistics", the sort that summarise data and describe them in simple ways.
They include the number of home runs this season, average male earnings, numbers unemployed, outside temperature, average cost of a barrel of oil, regional variations in crime rates, pollution statistics, measures of the economy and other "facts and figures".
These are the sorts of descriptive information that come about by observing and measuring something, then by summarising the data in clear and straightforward ways.

Data and variables
Data
– A collection of observations: measurements made of something.
A variable
– Another name for a collection of data. Variable because it is unlikely that the data are all the same.
Data types
– These include discrete, continuous, and categorical data.

Simple ways of presenting data
Discrete data          Continuous data
Frequency table        Summary table
Bar chart (below)      Histogram (below, with a rug plot)
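A frequency table for discrete data can be sketched in a few lines. The survey values and the function name `frequency_table` below are invented for illustration. The row of `#` characters is only a rough text stand-in for a bar chart.

```python
from collections import Counter

# Hypothetical discrete data: number of cars per household in a small survey.
cars_per_household = [0, 1, 1, 2, 1, 0, 3, 2, 1, 1]

def frequency_table(values):
    """Return (value, count) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())

table = frequency_table(cars_per_household)

print("value count")
for value, count in table:
    # Each '#' represents one observation with this value.
    print(f"{value:>5} {count:>5} {'#' * count}")
```

The counts in the table always add up to the number of observations, which is a quick check that no value has been missed.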

Information to include in a summary table
Measures of central tendency ("averages")
– The mean and/or median
  • The "centre" of the data
Measures of spread and variation
– The range (minimum to maximum)
– The interquartile range (the 'mid-spread' of the data)
– The standard deviation, s
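The measures listed above can be gathered into a summary table using Python's standard `statistics` module. This is a sketch only: the temperature values and the function name `summary_table` are invented. Quartiles can be calculated in several slightly different ways, and the "inclusive" method used here is one assumption among them.

```python
import statistics

# Hypothetical continuous data: daily maximum temperature (degrees C).
temperatures = [12.0, 14.0, 15.0, 15.0, 16.0, 18.0, 19.0, 21.0, 23.0]

def summary_table(values):
    """Return common measures of central tendency and spread as a dict."""
    # Quartiles: the three cut points that divide the data into four parts.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "iqr": q3 - q1,                   # interquartile range
        "sd": statistics.stdev(values),   # sample standard deviation, s
    }

summary = summary_table(temperatures)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```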

More about the standard deviation
– Essentially a measure of average variation around the mean.
– It is also the square root of the variance.
– The variance is the sum of squares divided by the degrees of freedom.
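The chain from sum of squares to variance to standard deviation can be written out step by step. The sketch assumes the data are a sample, so the degrees of freedom are taken as n - 1. The function names and data values are illustrative.

```python
import math

def sum_of_squares(values):
    """Sum of squared deviations of each value from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

def sample_variance(values):
    """Sum of squares divided by the degrees of freedom (assumed n - 1)."""
    degrees_of_freedom = len(values) - 1
    return sum_of_squares(values) / degrees_of_freedom

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(sample_variance(values))

# Illustrative values only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(sum_of_squares(data))      # 32.0
print(sample_variance(data))     # 32 / 7
print(standard_deviation(data))  # square root of 32 / 7
```

Taking the square root returns the measure to the original units of the data, which is why the standard deviation is usually reported rather than the variance.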

Boxplots
Are useful for showing the median, interquartile range and range of a set of data, for identifying outliers and also for comparing variables.
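The numbers behind a boxplot can be computed directly. The slide does not say how outliers are identified, so the sketch below assumes a common convention: values more than 1.5 times the interquartile range beyond the quartiles are flagged. The rainfall values and the function name `boxplot_summary` are hypothetical.

```python
import statistics

def boxplot_summary(values, k=1.5):
    """Median, quartiles, range and flagged outliers for a set of values.

    k is the multiple of the interquartile range used to set the fences;
    1.5 is a common convention, assumed here.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": iqr,
        "outliers": outliers,
    }

# Hypothetical daily rainfall totals (mm) with one unusually wet day.
rainfall = [3.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 12.0, 40.0]

box = boxplot_summary(rainfall)
print(box)
```

Running the same function on two variables and comparing the results side by side is the numerical counterpart of placing two boxplots next to each other.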

Other ways of classifying numeric data
– Nominal, ordinal, interval and ratio
– Counts and rates
– Proportions and percentages
– Parametric and non-parametric
– Arithmetic and geometric
– Primary and secondary

Further reading
Chapter 2 of Statistics for Geography and Environmental Science by Richard Harris and Claire Jarvis (Prentice Hall / Pearson, 2011).
Includes a review of the following key concepts: data and variables; discrete and continuous data; the range; histograms, rug plots, and stem and leaf plots; measures of central tendency; why averages can be misleading; quantiles; the sum of squares; degrees of freedom; the standard deviation and the variance; box plots; and five and six number summaries.