Measurement and Representation of Hydrological Quantities
Slides not covered by the transcript below (titles only):
  • Other Statistics: Covariance ...
  • Other Statistics: Correlation ...
  • Other Statistics: Autocorrelation ...
  • Random Sampling ...
  • Sample Variability ...
  • Thank you for your a...
How hydrological measurements appear and how to treat them (an introduction)

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
1,131
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
77
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

6 measurement&representation

  1. Measurement and Representation of Hydrological Quantities. Riccardo Rigon, Sunday, September 12, 2010. Cover image: Leonardo da Vinci, Vitruvian Man, ca. 1487 (photo by Luc Viatour, www.lucnix.be).
  2. Objectives:
     • In these pages the spatio-temporal variability of measurements of hydrological quantities is discussed by means of examples.
     • The examples show that statistical tools must be used to describe these quantities.
  3. Frickenhausen, on the River Main: hydrometric height.
  5. Hydrological Data have Complex Trends 1/2. The hydrological cycle is controlled by innumerable factors; hence it depends on innumerable degrees of freedom. Only a small portion of these factors can be taken into consideration, while the rest must be modelled as boundary conditions or as “background noise” (this noise is either modelled or eliminated with statistical tools). The dynamics of the hydrological cycle are non-linear. Both the hydrodynamics and the thermodynamics of the processes, which involve numerous phase changes, are non-linear. Another non-linear characteristic is that many of these processes are activated when some regulating quantity exceeds a threshold value. For example, the condensation of water vapour into raindrops is triggered when air humidity exceeds saturation; landslides are triggered when the internal friction forces of the material are overcome by the pressure of the water within the pores of the soil; the channels of a hydrographic network begin to form when the force per unit area exerted by running water reaches a certain value.
  6. Hydrological Data have Complex Trends 2/2. The dynamics include processes which are linearly unstable, for example the baroclinic instability that drives meteorological processes at the middle latitudes. The dynamics of climate and hydrology are dissipative; that is to say, they transfer and transform mechanical energy into thermal energy. The hydrodynamic process of turbulence transports energy from the larger spatial scales to the smaller ones, where the energy is dissipated through friction. Wave phenomena of various kinds (e.g. gravity waves) transport the energy contained in water and in air.
  7. Some Typical Problems: precipitation.
  8. Some Typical Problems: incident solar radiation.
  9. Some Typical Problems: flow of the River Adige at San Lorenzo Bridge. [Figure: discharge (m^3/s) versus year, 1990 to 2005.]
  10. Some Typical Problems: distribution of monthly river flows in Trento.
  11. Some Typical Problems: annual water budget for the Lake of Serraia catchment. [Figure: annual budget of the catchment (2000); P: precipitation, ET: evapotranspiration, Inv: stored volume (accumulation), R: release; ordinate: value (m^3/s); abscissa: time (month-year), January to December 2000.]
  12. Some Typical Problems: water content of the soil in the Little Washita catchment (Oklahoma).
  14. Some Typical Problems: spatial distribution of precipitation.
  15. Some Typical Problems: spatial pattern of the hydrographic network.
  16. Statistical Inference and Descriptive Statistics. Cover image: Lucio Fontana, Expectations (MoMA), 1959.
  17. Objectives:
     • In these pages the fundamental elements of statistical analysis will be recalled.
     • Population, sample and various elementary statistics, such as mean, variance and covariance, will be defined.
     • The existence of statistics and their value will be argued.
     • The concept of random sampling will be introduced.
  18. Population and Sample. Statistical inference assumes that a dataset is a subset of cases, called the sample, drawn from among all the possible cases. All the possible cases constitute the population from which the dataset has been extracted. While the sample is known, generally the population is not. Hypotheses are implicitly made about the population.
  19. Exploratory Data Analysis: temporal representation and histogram. A set of n data constitutes, therefore, a sample of data. These data can be represented in various forms; each form of representation emphasises certain characteristics. [Figure: (a) Bergen, September temperature (°C) versus time, 1860 to 2000; (b) Bergen, September temperature distribution (1861-1997), frequency versus temperature (°C).]
  20. Sample Means. Given a sample, various statistics can be calculated. For example:
     • Temporal mean: $\bar{x} := \frac{1}{n} \sum_{t=1}^{n} x_{,t}$
     • Spatial mean: $\bar{x} := \frac{1}{n} \sum_{i=1}^{n} x_i$
     The mean is an indicator of position.
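
The two means can be sketched in a few lines of Python. The small table of values below is hypothetical (rows stand for measurement stations, columns for time steps); it is only meant to show which index each mean runs over.

```python
from statistics import fmean

# Hypothetical measurements: rows are stations (index i), columns are time steps (index t).
x = [
    [2.0, 4.0, 6.0],  # station 0
    [1.0, 3.0, 5.0],  # station 1
]

# Temporal mean: average over time, one value per station.
temporal_means = [fmean(row) for row in x]

# Spatial mean: average over stations, one value per time step.
spatial_means = [fmean(col) for col in zip(*x)]

print(temporal_means)  # [4.0, 3.0]
print(spatial_means)   # [1.5, 3.5, 5.5]
```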
  24. Statistical Inference (Corrado Caudek)
     • Statistical inference is the process that allows one to draw conclusions about a population on the basis of a sample of observations drawn at random from the population.
     • Central to classical statistical inference is the notion of sampling distribution, that is to say how the statistics of the samples vary if random samples of the same size n are repeatedly drawn from the population.
     • Even though, in each practical application of statistical inference, the researcher has only one random sample of size n, the possibility that the sampling could be repeated provides the conceptual foundation for deciding how informative the observed sample is about the population as a whole.
  25. Exploratory Data Analysis. The mean is not the only indicator of position. [Figure label: mode.]
  26. Median and Mode. The mode represents the most frequent value. If the histogram distinctly presents several maxima, the dataset is said to be multimodal, though the matter risks being controversial. The median represents the value for which 50% of the data have a smaller value and (obviously!) the other 50% have a greater value.
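
As a minimal illustration, the standard-library `statistics` module computes both indicators. The data values are invented for the example; note that for an even number of data `median` averages the two central values, which is one common convention among several.

```python
from statistics import median, multimode

h = [12, 15, 15, 18, 20, 22, 22, 30]  # hypothetical data, two values tie as most frequent

med = median(h)       # mean of the two central values, 18 and 20
modes = multimode(h)  # every value that attains the highest frequency

print(med)    # 19.0
print(modes)  # [15, 22]
```

A result with more than one mode, as here, is the numerical counterpart of a histogram with several maxima.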
  27. Empirical Distribution Function. Given the dataset $h_i = \{h_1, \dots, h_n\}$, and having derived from it the set ordered in ascending order, $\hat{h}_j = (\hat{h}_1, \dots, \hat{h}_n)$ with $\hat{h}_1 \leq \hat{h}_2 \leq \dots \leq \hat{h}_n$, the empirical cumulative distribution function is defined as $\mathrm{ECDF}_i(\hat{h}) := \frac{1}{n} \sum_{j=1}^{i} 1 = \frac{i}{n}$.
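
The definition translates directly into code: sort the data, then assign the frequency i/n to the i-th ordered value. The function name `ecdf` and the four data values are assumptions made for this sketch.

```python
def ecdf(data):
    """Return the data in ascending order and the ECDF value i/n of each ordered point."""
    ordered = sorted(data)
    n = len(ordered)
    freqs = [i / n for i in range(1, n + 1)]
    return ordered, freqs

h = [35.0, 12.0, 60.0, 20.0]  # hypothetical rainfall heights in mm
h_sorted, F = ecdf(h)

print(h_sorted)  # [12.0, 20.0, 35.0, 60.0]
print(F)         # [0.25, 0.5, 0.75, 1.0]
```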
  28. ECDF. The empirical cumulative distribution function can be represented as illustrated. The ordinate value identified by the curve is called the frequency of non-exceedance; the abscissa value at which the curve reaches a given frequency q is the q quantile. [Figure: ECDF; ordinate: frequency of non-exceedance, $P[H < h]$; abscissa: $h$ [mm].]
  29. ECDF. The 0.5 quantile divides the data distribution in half with respect to the ordinate. [Figure: ECDF with the 0.5 quantile marked; ordinate: frequency of non-exceedance, $P[H < h]$; abscissa: $h$ [mm].]
  31. ECDF. And so the median is identified. [Figure: ECDF with the 0.5 quantile marked on the ordinate and the median marked on the abscissa; ordinate: frequency of non-exceedance, $P[H < h]$; abscissa: $h$ [mm].]
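
Reading the median off the ECDF can be sketched as follows. The helper `quantile_from_ecdf` is a hypothetical name, and it uses one particular convention (the smallest ordered value whose frequency of non-exceedance is at least q); other conventions interpolate between neighbouring points and can give slightly different values.

```python
import math

def quantile_from_ecdf(data, q):
    """Smallest ordered value whose frequency of non-exceedance i/n is at least q."""
    ordered = sorted(data)
    n = len(ordered)
    rank = max(1, math.ceil(q * n))  # 1-based position in the ordered sample
    return ordered[rank - 1]

h = [35.0, 12.0, 60.0, 20.0, 48.0]  # hypothetical rainfall heights in mm
median_h = quantile_from_ecdf(h, 0.5)

print(median_h)  # 35.0
```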
  32. Box and Whisker Diagrams. The procedure can be generalised and represented with a box and whisker diagram, which is another way of representing the data distribution. [Figure: ECDF with the 0.25, 0.5 and 0.75 quantiles marked, together with the corresponding box and “whisker”; ordinate: frequency of non-exceedance, $P[H < h]$; abscissa: $h$ [mm].]
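
The numbers behind such a diagram can be sketched with `statistics.quantiles`. The data are invented, and two choices here are assumptions: the `"inclusive"` interpolation method, and whiskers that extend to the minimum and maximum (many plotting tools instead cut the whiskers at 1.5 times the interquartile range).

```python
from statistics import quantiles

h = [10, 20, 30, 40, 50, 60, 70, 80, 90]  # hypothetical data

# Quartiles: the 0.25, 0.5 and 0.75 quantiles delimit the box and its central line.
q1, q2, q3 = quantiles(h, n=4, method="inclusive")

# Five-number summary, with the whiskers taken at the extremes of the data.
summary = (min(h), q1, q2, q3, max(h))

print(summary)  # (10, 30.0, 50.0, 70.0, 90)
```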
  33. Parameters and Statistics (Corrado Caudek). A parameter is a number that describes a certain aspect of the population.
     • For example, the (real) mean annual precipitation at a weather station is a parameter. Let us suppose that this mean is $\mu_h = 980$ mm.
     • In any concrete situation the parameters are unknown.
  34. Parameters and Statistics (Corrado Caudek). A statistic is a number that can be calculated on the basis of data given by a sample, without any knowledge of the parameters of the population.
     • Let us suppose, for example, that the random sample of precipitation data covers 30 years of measurement and that the mean annual precipitation, on the basis of the sample, is $\bar{h} = 1002$ mm.
     • This mean is a statistic.
  35. Other Statistics: the Range. $R_x := \max(x) - \min(x)$. The range is the simplest indicator of the spread of the data. It is an indicator of the scale of the data. However, it only considers two data points and ignores the other n-2 data points that make up the sample.
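
A short sketch shows why relying on two data points is a limitation: a single extreme value changes the range completely. The precipitation values are invented for the example.

```python
def sample_range(x):
    """Range of a sample: largest value minus smallest value."""
    return max(x) - min(x)

annual_precip = [980, 1002, 1015, 990]  # hypothetical annual totals in mm

print(sample_range(annual_precip))           # 35
print(sample_range(annual_precip + [1500]))  # 520, driven entirely by the one extreme year
```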
  36. Other Statistics: Variance and Standard Deviation.
     $\mathrm{Var}(x) := \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
     $\sigma_x := \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
     The variance is an indicator of “scale” that considers all the data of the sample.
  37. Other Statistics: Variance and Standard Deviation, “corrected” version (unbiased).
     $\mathrm{Var}(x) := \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
     $\sigma_x := \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
     The unbiased version of the variance takes into account that only n-1 data are independent, their mean being fixed.
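
The two versions of the variance differ only in the divisor, as this sketch makes explicit. The function name, the `unbiased` flag and the data are assumptions made for illustration.

```python
from math import sqrt

def variance(x, unbiased=False):
    """Variance of a sample: divide by n, or by n - 1 for the corrected version."""
    n = len(x)
    mean = sum(x) / n
    sum_sq = sum((xi - mean) ** 2 for xi in x)
    return sum_sq / (n - 1) if unbiased else sum_sq / n

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data with mean 5

var_n = variance(x)                   # 32 / 8
var_n1 = variance(x, unbiased=True)   # 32 / 7
print(var_n, sqrt(var_n))             # 4.0 2.0
print(var_n1, sqrt(var_n1))
```

The corrected value is always the larger of the two, and the gap closes as n grows.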
  38. Coefficient of Variation.
     • The coefficient of variation (CV) of a data sample is defined as the ratio between the standard deviation and the mean: $CV_x := \frac{\sigma_x}{\bar{x}}$
     • The greater the coefficient of variation, the less informative and indicative the mean is in relation to the future trends of the population.
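
The following sketch compares two invented series with the same mean but different spread. It assumes the standard deviation with divisor n (`statistics.pstdev`); using the corrected version would change the numbers slightly but not the comparison.

```python
from statistics import fmean, pstdev

def coefficient_of_variation(x):
    """Ratio between the standard deviation (divisor n) and the mean."""
    return pstdev(x) / fmean(x)

steady = [95.0, 100.0, 105.0]  # hypothetical flows that stay close to the mean
flashy = [10.0, 100.0, 190.0]  # hypothetical flows with the same mean but large swings

cv_steady = coefficient_of_variation(steady)
cv_flashy = coefficient_of_variation(flashy)
print(round(cv_steady, 3), round(cv_flashy, 3))  # 0.041 0.735
```

Both series have mean 100, yet the mean says far less about a typical value of the second one.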
  39. Other Statistics: Skewness and Kurtosis.
     $sk_x := \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma_x} \right)^3$
     Skewness is a measure of the asymmetry of the data distribution.
     $k_x := -3 + \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma_x} \right)^4$
     Kurtosis is a measure of the “peakedness” of the data distribution; with the constant $-3$ a normal distribution has $k_x = 0$.
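
A minimal sketch of both statistics, assuming the standard deviation with divisor n and the excess form of the kurtosis (fourth standardised moment minus 3). The function name and the two datasets are invented for illustration.

```python
from math import sqrt

def skewness_and_kurtosis(x):
    """Third standardised moment, and fourth standardised moment minus 3."""
    n = len(x)
    mean = sum(x) / n
    sigma = sqrt(sum((xi - mean) ** 2 for xi in x) / n)
    z = [(xi - mean) / sigma for xi in x]
    sk = sum(zi ** 3 for zi in z) / n
    k = -3 + sum(zi ** 4 for zi in z) / n
    return sk, k

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical symmetric data
right_tailed = [1.0, 1.0, 2.0, 2.0, 10.0]  # hypothetical data with one large value

sk_sym, k_sym = skewness_and_kurtosis(symmetric)
sk_right, k_right = skewness_and_kurtosis(right_tailed)
print(sk_sym, k_sym)
print(sk_right, k_right)
```

The symmetric sample has skewness (numerically) zero, while the long right tail of the second sample gives a positive skewness.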
  40. Estimation and Hypothesis Testing. Usually, we are not interested in the statistics for themselves, but in what the statistics tell us about the population of interest.
     • We could, for example, use the mean annual precipitation, measured at all hydro-meteorological stations, to estimate the mean annual precipitation for the Italian Peninsula.
     • Or, we could use the mean of the sample to establish whether the mean annual precipitation has changed over the period covered by the sample.
  41. Estimation and Hypothesis Testing. These two questions belong to the two main branches of classical statistical inference:
     • The estimation of parameters
     • Statistical hypothesis testing
  42. Sampling Variability. A fundamental aspect of sample statistics is that they vary from one sample to the next. In the case of annual precipitation, it is very improbable that the mean of the sample, 1002 mm, will coincide with the mean of the population.
     • The variability of a sample statistic from sample to sample is called sampling variability.
       – When the sampling variability is very high, the sample is uninformative about the population parameter.
       – When the sampling variability is small, the statistic is informative, even though it is practically impossible for the statistic of a sample to be exactly the same as the population parameter.
  43. Sampling Variability: Simulation (Corrado Caudek, Techniques in Psychological Research and Data Analysis, 8). Sampling variability will be illustrated as follows:
     1. we will consider a discrete variable that can only assume a small number of possible values (N = 4);
     2. all possible samples of size n = 2 will be listed;
     3. the mean will be calculated for each possible sample of size n = 2;
     4. the distribution of the means of the samples of size n = 2 will be examined.
     The mean $\mu$ and the variance $\sigma^2$ of the population will be calculated. It must be noted that $\mu$ and $\sigma^2$ are parameters, while the mean $\bar{x}_i$ and the variance $s_i^2$ of each sample are statistics.
  44. Sampling Variability (Corrado Caudek).
     • The experiment in this example consists of n=2 draws, with replacement, of a marble $x_i$ from an urn that contains N=4 marbles.
     • The marbles are numbered as follows: {2, 3, 5, 9}
     • Drawing the marble with replacement corresponds to a population of infinite size (it is in fact always possible to draw a marble from the urn).
  45. Sampling Variability (Corrado Caudek).
     • For each sample of size n=2 the mean of the values of the marbles drawn is calculated: $\bar{x} = \frac{\sum_{i=1}^{2} x_i}{2}$
     • For example, if the marbles drawn are $x_1 = 2$ and $x_2 = 3$, then: $\bar{x} = \frac{2 + 3}{2} = \frac{5}{2} = 2.5$
  46. Sampling Variability: Three Distributions (Corrado Caudek). We must distinguish between three distributions:
     1. the population distribution
     2. the distribution of a sample
     3. the sampling distribution of the means of all possible samples
  47. Sampling Variability: 1. The Population Distribution (Corrado Caudek). The population distribution is the distribution of X (the value of the marble drawn) in the population. In this specific case the population is of infinite size and has the following probability distribution:

       x_i      p_i
       2        1/4
       3        1/4
       5        1/4
       9        1/4
       Total    1
  48. Sampling Variability (Corrado Caudek).
     • The mean of the population is: $\mu = \sum_i x_i p_i = 4.75$
     • The variance of the population is: $\sigma^2 = \sum_i (x_i - \mu)^2 p_i = 7.1875$
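
These two values can be checked with exact arithmetic. The sketch below uses Python's `fractions` module so that no rounding is involved; the variable names are chosen for this example.

```python
from fractions import Fraction

values = [2, 3, 5, 9]  # the numbers on the four marbles
p = Fraction(1, 4)     # each marble is equally likely

mu = sum(x * p for x in values)                  # population mean
sigma2 = sum((x - mu) ** 2 * p for x in values)  # population variance

print(float(mu), float(sigma2))  # 4.75 7.1875
```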
  49. Sampling Variability: 2. The Distribution of a Sample (Corrado Caudek). The distribution of a sample is the distribution of X in a specific sample.
     • If, for example, $x_1 = 2$ and $x_2 = 3$, then the mean of this sample is $\bar{x} = 2.5$ and the variance is $s^2 = 0.5$.
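
The value $s^2 = 0.5$ corresponds to the corrected variance (divisor n - 1), as a quick check with the standard library suggests; dividing by n would give 0.25 instead.

```python
from statistics import mean, pvariance, variance

sample = [2.0, 3.0]  # the two marbles drawn in the example

x_bar = mean(sample)
s2 = variance(sample)          # divides by n - 1
s2_biased = pvariance(sample)  # divides by n

print(x_bar, s2, s2_biased)  # 2.5 0.5 0.25
```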
  50. Sampling Variability: 3. The Sampling Distribution of the Means (Corrado Caudek). The sampling distribution of the means is the distribution of the means of all the possible samples.
     • If the size of the samples is n=2, then there are 4 × 4 = 16 possible samples. We can therefore list their means.

       sample    mean      sample    mean
       {3, 2}    2.5       {2, 3}    2.5
       {5, 2}    3.5       {2, 5}    3.5
       {9, 2}    5.5       {2, 9}    5.5
       {5, 3}    4.0       {3, 5}    4.0
       {9, 3}    6.0       {3, 9}    6.0
       {9, 5}    7.0       {5, 9}    7.0
       {2, 2}    2.0       {3, 3}    3.0
       {5, 5}    5.0       {9, 9}    9.0
  51. Sampling Variability (Corrado Caudek).
     • The sampling distribution of the means has the following probability distribution:

       mean     p_i
       2.0      1/16
       2.5      2/16
       3.0      1/16
       3.5      2/16
       4.0      2/16
       5.0      1/16
       5.5      2/16
       6.0      2/16
       7.0      2/16
       9.0      1/16
       Total    1
  52. Sampling Variability (Corrado Caudek).
     • The mean of the sampling distribution of the means is: $\mu_{\bar{x}} = \sum_i \bar{x}_i p_i = 4.75$
     • The variance of the sampling distribution of the means is: $\sigma_{\bar{x}}^2 = \sum_i (\bar{x}_i - \mu_{\bar{x}})^2 p_i = 3.59375$
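
The whole enumeration can be reproduced in a few lines. This is a sketch using exact fractions; the names `samples`, `dist`, `mu_means` and `var_means` are chosen for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

values = [2, 3, 5, 9]

# All 4 x 4 = 16 ordered samples of size 2, drawn with replacement.
samples = list(product(values, repeat=2))
means = [Fraction(a + b, 2) for a, b in samples]

# Probability of each distinct mean: (number of samples that give it) / 16.
dist = {m: Fraction(c, len(means)) for m, c in sorted(Counter(means).items())}

mu_means = sum(m * p for m, p in dist.items())
var_means = sum((m - mu_means) ** 2 * p for m, p in dist.items())

for m, p in dist.items():
    print(float(m), p)  # fractions print in lowest terms, so 2/16 appears as 1/8
print(float(mu_means), float(var_means))  # 4.75 3.59375
```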
  53. Sampling Variability (Corrado Caudek). The example we have seen is very particular in that the population is known. In practice the population distribution is never known. However, we can take note of two important properties of the sampling distribution of the means:
     • The mean of the sampling distribution of the means, $\mu_{\bar{x}}$, is the same as the population mean $\mu$.
     • The variance of the sampling distribution of the means, $\sigma_{\bar{x}}^2$, is equal to the ratio of the variance of the population $\sigma^2$ to the size $n$ of the sample: $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{7.1875}{2} = 3.59375$
  54. Sampling Variability (Corrado Caudek). The two things to note can be summarised as follows:
     • The mean and variance of the sampling distribution of the means are determined by the mean and variance of the population: $\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$
     • The variance of the sampling distribution of the means is smaller than the variance of the population (for n > 1).
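
The two properties are not specific to n = 2. As a sketch, the same urn can be enumerated exhaustively for a few sample sizes (the sizes 1 to 4 are an arbitrary choice) and the relations checked with exact fractions.

```python
from fractions import Fraction
from itertools import product

values = [2, 3, 5, 9]  # the marbles in the urn

def mean_and_variance(xs):
    """Exact mean and variance of a list of equally likely outcomes."""
    xs = list(xs)
    mu = sum(xs, Fraction(0)) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var

mu, sigma2 = mean_and_variance(Fraction(v) for v in values)

results = {}
for n in (1, 2, 3, 4):
    # Mean of every possible ordered sample of size n drawn with replacement.
    means = [Fraction(sum(s), n) for s in product(values, repeat=n)]
    results[n] = mean_and_variance(means)
    print(n, float(results[n][0]), float(results[n][1]))
```

For every n the first printed value stays at 4.75, while the second shrinks in proportion to 1/n.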
  55. Sampling Variability (Corrado Caudek). In what follows, we will use the properties of the sampling distribution to make inferences about the parameters of the population even when the population distribution is not known.
  56. Sampling Variability: Three Distributions (Corrado Caudek). Therefore, we have distinguished between three distributions:
     1. the population distribution: $\Omega = \{2, 3, 5, 9\}$, $\mu = 4.75$, $\sigma^2 = 7.1875$
     2. the distribution of a sample: $\Omega_i = \{2, 3\}$, $\bar{x} = 2.5$, $s^2 = 0.5$
     3. the sampling distribution of the means of all possible samples: $\Omega_{\bar{x}} = \{2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 5.5, 6.0, 7.0, 9.0\}$, $\mu_{\bar{x}} = 4.75$, $\sigma_{\bar{x}}^2 = 3.59375$
  57. Sampling Variability (Corrado Caudek).
     1. The population distribution: this is the distribution that contains all possible observations. The mean and variance of this distribution are indicated by $\mu$ and $\sigma^2$.
     2. The distribution of a sample: this is the distribution of the values of the population that make up a particular random sample of size n. The single values are indicated by $x_1, \dots, x_n$, and the mean and variance are indicated by $\bar{x}$ and $s^2$.
     3. The sampling distribution of the means of the samples: this is the distribution of the $\bar{x}_i$ for all the possible samples of size n that can be drawn from the population being considered. The mean and variance of the sampling distribution of the means are indicated by $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}^2$.
  58. Sample Variability
      The distribution that is the basis of statistical inference is the sample distribution (more commonly called the sampling distribution).
      Definition: the sample distribution of a statistic is the distribution of the values that the statistic assumes over all samples of size $n$ that can be extracted from the population.
      It must be noted that if a simulation considers fewer samples than all those theoretically possible, then the resulting distribution will only be an approximation of the true sample distribution.
  59. Estimation and Hypothesis Testing
      Having created different statistics, we can now make some hypotheses. For example:
      • Do the samples all have the same mean and the same variance?
      • Does the mean depend on the size of the sample?
      • Does the variance depend on the size of the sample?
  60. Estimation and Hypothesis Testing
      If the samples do not all have the same mean, a trend may be present.
  61. Estimation and Hypothesis Testing
      The variance can vary with the size of the sample! If it does not stabilise as the sample size increases, then the data are said to have "Infinite Variance Syndrome".
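One way to see what "does not stabilise" means is to compute the variance of progressively larger samples. The sketch below compares normally distributed data with Cauchy-distributed data; the Cauchy distribution is an assumption chosen here as a textbook heavy-tailed case and does not come from the slides, and `running_var` is a hypothetical helper name.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Variance of the first k values of x, for each k in sizes
running_var <- function(x, sizes) {
  sapply(sizes, function(k) var(x[1:k]))
}

sizes <- c(10, 100, 1000, 10000, 100000)
x_norm <- rnorm(max(sizes))      # finite variance (equal to 1)
x_cauchy <- rcauchy(max(sizes))  # variance is not defined

result <- data.frame(
  size = sizes,
  var_normal = running_var(x_norm, sizes),
  var_cauchy = running_var(x_cauchy, sizes)
)
print(result)
```

For the normal data the sample variance should settle close to 1 as the size grows, while for the Cauchy data it typically keeps jumping whenever a new extreme value enters the sample.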
  62. Null Hypothesis
      We will have a chance to look at hypothesis testing in detail in future lectures. However, it is well to remember the following:
      • Generally, it is not possible to definitively prove anything. One can only attempt to prove that a hypothesis is not true.
      • Let H0 be the (null) hypothesis to be tested. If H0 cannot be rejected, then one can affirm that "it is true" with a certain degree of confidence.
  63. Other Statistics: Covariance
      Given two datasets, for example $h_i = \{h_1, \dots, h_n\}$ and $l_i = \{l_1, \dots, l_n\}$, the covariance between these two datasets is defined as:
      $\mathrm{Cov}(h_i, l_i) := \frac{1}{N-1}\sum_{i=1}^{n}(l_i - \bar{l}_i)(h_i - \bar{h}_i)$
      where $\bar{l}_i$ and $\bar{h}_i$ are the means of the two datasets and $N$ is the number of data pairs (here $N = n$).
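The definition can be checked against R's built-in `cov()`, which also divides by the number of pairs minus one. This is a small sketch; the two vectors are hypothetical values invented for illustration and are not data from the slides.

```r
# Hypothetical paired data (not from the slides)
h <- c(1.2, 1.5, 1.1, 1.9, 2.3)
l <- c(10, 12, 9, 15, 18)
N <- length(h)  # number of data pairs

# Covariance written out as in the definition
cov_manual <- sum((l - mean(l)) * (h - mean(h))) / (N - 1)

# Built-in estimator for comparison
cov_builtin <- cov(h, l)

print(c(manual = cov_manual, builtin = cov_builtin))
```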
  64. Other Statistics: Correlation
      Given two datasets, for example $h_i = \{h_1, \dots, h_n\}$ and $l_i = \{l_1, \dots, l_n\}$, the correlation between these two datasets is defined as:
      $\rho_{lh} := \frac{\mathrm{Cov}(l, h)}{\sqrt{\sigma_h^2\,\sigma_l^2}} = \frac{\mathrm{Cov}(l, h)}{\sigma_h\,\sigma_l}$
      where $\sigma_h^2$ and $\sigma_l^2$ are the variances of the two datasets.
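As a quick check of the definition, the correlation can be computed by hand and compared with R's built-in `cor()`. The sketch below uses hypothetical values invented for illustration, not data from the slides.

```r
# Hypothetical paired data (not from the slides)
h <- c(1.2, 1.5, 1.1, 1.9, 2.3)
l <- c(10, 12, 9, 15, 18)

# Covariance divided by the product of the standard deviations
rho_manual <- cov(l, h) / (sd(l) * sd(h))

# Equivalent form: square root of the product of the variances
rho_sqrt <- cov(l, h) / sqrt(var(l) * var(h))

# Built-in Pearson correlation for comparison
rho_builtin <- cor(l, h)

print(c(rho_manual, rho_sqrt, rho_builtin))
```

All three values should coincide, and the result always lies between -1 and 1.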
  65. Other Statistics: Correlation
      Please observe that one can consider the correlation between two sample series of equal length:
      $h_i = \{h_1, \dots, h_{n-1}\}$ and $h_{i+1} = \{h_2, \dots, h_n\}$
      Resulting in:
      $\mathrm{Cov}(h_i, h_{i+1}) := \frac{1}{N-1}\sum_{i=1}^{n-1}(h_i - \bar{h}_i)(h_{i+1} - \bar{h}_{i+1})$
      where $\bar{h}_i$ and $\bar{h}_{i+1}$ are the means of the two series.
  66. Other Statistics: Correlation
      Repeating this operation for series that are progressively reduced in length and separated by $r$ instants, the resulting series are:
      $h_i^r = \{h_1, \dots, h_{n-r}\}$ and $h_{i+r} = \{h_{r+1}, \dots, h_n\}$
      From which:
      $\mathrm{Cov}(h_i^r, h_{i+r}) := \frac{1}{N-1}\sum_{i=1}^{n-r}(h_i^r - \bar{h}_i^r)(h_{i+r} - \bar{h}_{i+r})$
      $\rho(h_i, h_{i+r}) := \frac{\mathrm{Cov}(h_i^r, h_{i+r})}{\sigma_i^r\,\sigma_{i+r}^r}$
      where $\sigma_i^r$ and $\sigma_{i+r}^r$ are the standard deviations of the two series.
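The lag-r correlation defined above can be sketched in R by correlating the series with a shifted copy of itself. The function name `lag_correlation` and the test series (a synthetic first-order autoregressive series with coefficient 0.8) are assumptions made for illustration only.

```r
# Lag-r correlation following the definition on the slide:
# correlate {h_1, ..., h_(n-r)} with {h_(r+1), ..., h_n}
lag_correlation <- function(h, r) {
  n <- length(h)
  stopifnot(r >= 1, r < n - 1)
  head_part <- h[1:(n - r)]
  tail_part <- h[(r + 1):n]
  cor(head_part, tail_part)
}

set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical series: each value depends on the previous one plus noise
n <- 500
eps <- rnorm(n)
h <- numeric(n)
h[1] <- eps[1]
for (t in 2:n) {
  h[t] <- 0.8 * h[t - 1] + eps[t]
}

rho <- sapply(1:5, function(r) lag_correlation(h, r))
print(round(rho, 3))
```

R's built-in `acf()` estimates the same quantity but uses the mean and variance of the whole series for every lag, so its values can differ slightly from this pairwise definition.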
  67. Other Statistics: Autocorrelation
  68. Random Sampling
      Within the strategy of creating and analysing data samples, the selection (or, sometimes, the generation) of random samples plays an important role.
      A sample of n events, selected from a population, is random if the probability of that sample being selected is the same as that of any other sample of the same size.
      If the data are generated, then one is carrying out a random experiment. Some examples of this are:
      • tossing a coin;
      • counting the rainy days in a year; and
      • counting the days when the river flow at the Bridge of San Lorenzo, Trento, is greater than a predetermined value.
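Both ideas, generating data with a random experiment and selecting a random sample, can be illustrated with R's `sample()` function. This is only a sketch; the number of tosses, the sample size and the seed are arbitrary choices made here.

```r
set.seed(2010)  # arbitrary seed, only for reproducibility

# Random experiment: tossing a fair coin 20 times
tosses <- sample(c("H", "T"), size = 20, replace = TRUE)
print(table(tosses))

# Random sample of 5 days out of a year, without replacement:
# every subset of 5 days has the same probability of being selected
days <- sample(1:365, size = 5)
print(sort(days))
```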
  69. Sample Variability: Simulation 2
      Let us consider another example where sample variability is illustrated as follows:
      1. the same population as in the previous example shall be used (N = 4);
      2. by means of the computer programme R, 50,000 samples of size n = 2 will be extracted, with replacement, from the population;
      3. the mean will be calculated for each of these samples of size n = 2;
      4. the mean and variance of the distribution of the means of the 50,000 samples of size n = 2 will be calculated.
  73. Sample Variability: Simulation 2 (R code)
          N <- 4                         # population size
          n <- 2                         # sample size
          nSamples <- 50000              # number of samples to draw
          X <- c(2, 3, 5, 9)             # the population
          # Mean and variance of the population (variance divided by N, not N - 1)
          Mean <- mean(X)
          Var <- var(X) * (N - 1) / N
          SampDistr <- rep(0, nSamples)
          # 50,000 samples are extracted
          for (i in 1:nSamples) {
            samp <- sample(X, n, replace = TRUE)
            SampDistr[i] <- mean(samp)
          }
          # Mean and variance of the simulated distribution of the means
          MeanSampDistr <- mean(SampDistr)
          VarSampDistr <- var(SampDistr) * (nSamples - 1) / nSamples
  74. Sample Variability: Simulation 2 (results)
      Results of the analysis with R:
          > Mean
          [1] 4.75
          > Var
          [1] 7.1875
          > MeanSampDistr
          [1] 4.73943
          > VarSampDistr
          [1] 3.578548
          > Var/n
          [1] 3.59375
  75. Sample Variability
      • Population: $\mu = 4.75$, $\sigma^2 = 7.1875$
      • Sample distribution of the means: $\mu_{\bar{x}} = 4.75$, $\sigma^2_{\bar{x}} = 3.59375$
      • Results of the R simulation: $\hat{\mu}_{\bar{x}} = 4.73943$, $\hat{\sigma}^2_{\bar{x}} = 3.578548$
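The small gap between the simulated and the exact mean is of the size one would expect from using a finite number of samples. As a rough check, assuming the 50,000 simulated means are independent, the standard error of their average is:

```latex
\mathrm{SE}(\hat{\mu}_{\bar{x}})
  = \sqrt{\frac{\sigma^2_{\bar{x}}}{n_{\text{samples}}}}
  = \sqrt{\frac{3.59375}{50000}}
  \approx 0.0085 ,
\qquad
\left|\hat{\mu}_{\bar{x}} - \mu_{\bar{x}}\right|
  = \left|4.73943 - 4.75\right|
  = 0.01057
  \approx 1.25\ \mathrm{SE} .
```

A deviation of about 1.25 standard errors is unremarkable, so the simulation is consistent with the exact values.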
  76. Measurement and Representation of Hydrological Quantities
      Thank you for your attention!
      G. Ulrici, Uomo dopo aver lavorato alle slides (Man after working on the slides), 2000?