6 measurement&representation


How hydrological measurements appear and how to treat them (an introduction)


Transcript

  • 1. Measurement and Representation of Hydrological Quantities. Riccardo Rigon, Sunday, September 12, 2010. [Image: Leonardo da Vinci, Vitruvian Man, ca. 1487; photo by Luc Viatour, www.lucnix.be]
  • 2. Objectives:
      • These pages use examples to discuss the spatio-temporal variability of measurements of hydrological quantities.
      • The examples show that statistical tools are needed to describe these quantities.
  • 3. Frickenhausen, on the River Main: hydrometric height. [Figure]
  • 4. Frickenhausen, on the River Main: hydrometric height. [Figure]
  • 5. Hydrological Data have Complex Trends (1/2)
      The hydrological cycle is controlled by innumerable factors, so it depends on innumerable degrees of freedom. Only a small portion of these factors can be taken into consideration; the rest must be modelled as boundary conditions or as “background noise” (noise that is either modelled or removed with statistical tools).
      The dynamics of the hydrological cycle are non-linear. Both the hydrodynamics and the thermodynamics of the processes, which involve numerous phase changes, are non-linear. A further source of non-linearity is that many of these processes are activated when some regulating quantity exceeds a threshold value. For example, the condensation of water vapour into raindrops is triggered when air humidity exceeds saturation; landslides are triggered when the internal friction forces of the material are overcome by the thrust of water within the capillary pores of the soil; the channels of a hydrographic network begin to form when running water exerts a sufficient force per unit area.
  • 6. Hydrological Data have Complex Trends (2/2)
      The dynamics include processes that are linearly unstable: for example, the baroclinic instability that drives meteorological processes at the mid-latitudes.
      The dynamics of climate and hydrology are dissipative, that is to say they transfer and transform mechanical energy into thermal energy. The hydrodynamic process of turbulence transports energy from the larger spatial scales to the smaller ones, where the energy is dissipated through friction. Wave phenomena of various kinds (e.g. gravity waves) transport the energy contained in water and in air.
  • 7. Some Typical Problems: precipitation. [Figure]
  • 8. Some Typical Problems: incident solar radiation. [Figure]
  • 9. Some Typical Problems: flow of the River Adige at San Lorenzo Bridge. [Figure: discharge (m^3/s), from 0 to 1400, against year, from 1990 to 2005.]
  • 10. Some Typical Problems: distribution of monthly river flows in Trento. [Figure]
  • 11. Some Typical Problems: annual water budget for the Lake of Serraia catchment. [Figure: annual budget of the catchment (2000). Series: P, precipitation; ET, evapotranspiration; Inv, stored volume (storage); R, release. Value (m^3/s), from -0.3 to 1, against time (month-year), from January to December 2000. Labelled values: 0.8675, 0.797, 0.343, -0.184.]
  • 12. Some Typical Problems: water content of the soil in the Little Washita catchment (Oklahoma). [Figure]
  • 13. Some Typical Problems: water content of the soil in the Little Washita catchment (Oklahoma). [Figure]
  • 14. Some Typical Problems: spatial distribution of precipitation. [Figure]
  • 15. Some Typical Problems: spatial pattern of the hydrographic network. [Figure]
  • 16. Statistical Inference and Descriptive Statistics. [Image: Lucio Fontana, Expectations (MoMA), 1959]
  • 17. Objectives:
      • These pages recall the fundamental elements of statistical analysis.
      • Population, sample and various elementary statistics, such as mean, variance and covariance, are defined.
      • The existence of statistics and their value are discussed.
      • The concept of random sampling is introduced.
  • 18. Population and Sample
      Statistical inference assumes that a dataset is a subset of cases, called the sample, taken from among all the possible cases. All the possible cases together form the population from which the dataset has been extracted. While the sample is known, the population generally is not. Hypotheses about the population are made implicitly.
  • 19. Exploratory Data Analysis: temporal representation and histogram
      A set of n data therefore constitutes a sample of data. These data can be represented in various forms, and each form of representation emphasises certain characteristics.
      [Figure: a) Bergen, September temperature: temperature (°C, 8 to 15) against time (1860 to 2000). b) Bergen, September temperature distribution (1861-1997): frequency (0 to 30) against temperature (°C, 5 to 15).]
  • 20. Sample Means
      Given a sample, various statistics can be calculated. For example:
      Temporal mean: $\bar{x} := \frac{1}{n} \sum_{t=1}^{n} x_t$
      Spatial mean: $\bar{x} := \frac{1}{n} \sum_{i=1}^{n} x_i$
      The mean is an indicator of position.
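The two means can be sketched in R, the language these slides use later for the sampling simulation. This is only an illustration: the matrix `P` and its values are invented, with rows standing for measurement stations (space) and columns for days (time).

```r
# Hypothetical rainfall matrix: rows are stations (space), columns are days (time)
P <- matrix(c(0, 12, 5, 3,
              2,  9, 7, 0,
              1, 15, 4, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("station", 1:3), paste0("day", 1:4)))

temporal_mean <- rowMeans(P)   # one mean over time for each station
spatial_mean  <- colMeans(P)   # one mean over stations for each day

print(temporal_mean)
print(spatial_mean)
```

Both statistics use the same formula; what changes is the index over which the sum runs.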
  • 21-24. Statistical Inference
      • Statistical inference is the process that allows one to draw conclusions about a population on the basis of a sample of observations drawn at random from that population.
      • Central to classical statistical inference is the notion of the sampling distribution, that is, how the statistics of the samples vary when random samples of the same size n are repeatedly drawn from the population.
      • Even though, in each practical application of statistical inference, the researcher has only one random sample of size n, the possibility that the sampling could be repeated provides the conceptual foundation for deciding how informative the observed sample is about the population as a whole.
      (Corrado Caudek)
  • 25. Exploratory Data Analysis: the mean is not the only indicator of position. [Figure: mode]
  • 26. Median and Mode
      The mode is the most frequent value. If the histogram shows several distinct maxima, the dataset is said to be multimodal, although such a judgement risks being controversial.
      The median is the value for which 50% of the data have a lower value and (obviously!) the other 50% have a greater value.
  • 27. Empirical Distribution Function
      Given the dataset $h_i = \{h_1, \cdots, h_n\}$, and having derived from it the set ordered in ascending order, $\hat{h}_j = (\hat{h}_1, \cdots, \hat{h}_n)$ with $\hat{h}_1 \le \hat{h}_2 \le \cdots \le \hat{h}_n$, the empirical cumulative distribution function is defined as
      $ECDF_i(\hat{h}) := \frac{1}{n} \sum_{j=1}^{i} 1 = \frac{i}{n}$
  • 28. ECDF
      The empirical cumulative distribution function can be represented as illustrated. The ordinate value identified by the curve is called the frequency of non-exceedance; the abscissa value at which a given frequency is reached is the corresponding quantile.
      [Figure: frequency of non-exceedance P[H<h], from 0 to 1, against h [mm], from 20 to 80.]
  • 29-30. ECDF
      The 0.5 quantile separates the data distribution in half in relation to the ordinate. [Figure: the same ECDF with the 0.5 quantile marked.]
  • 31. ECDF
      And so the median is identified. [Figure: the same ECDF with the 0.5 quantile and the median marked.]
  • 32. Box and Whisker Diagrams
      The procedure can be generalised and represented with a box and whisker diagram, which is another way of representing the data distribution.
      [Figure: the same ECDF with the 0.25, 0.5 and 0.75 quantiles marked, and the corresponding box and “whisker” drawn along the h [mm] axis.]
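The construction of the empirical distribution, the median and the quartiles can be sketched in R, the language these slides use later for the sampling simulation. The depths in `h` are invented for illustration and are not the data behind the plotted curve; `ecdf`, `median` and `quantile` are base R functions, and `quantile` is used here with its default interpolation rule, which is only one of several conventions.

```r
# Hypothetical sample of rainfall depths in mm (illustrative values only)
h <- c(35, 62, 18, 47, 80, 26, 54, 41, 71)
n <- length(h)

# Ordered sample and empirical frequency of non-exceedance i/n
h_sorted <- sort(h)
freq <- seq_len(n) / n

# Base R builds the same step function
Fn <- ecdf(h)

# Median = 0.5 quantile; the quartiles delimit the box of a box and whisker plot
med <- median(h)
q <- quantile(h, probs = c(0.25, 0.5, 0.75))

print(data.frame(h = h_sorted, ecdf = freq))
print(med)
print(q)
```

Calling `boxplot(h, horizontal = TRUE)` on the same vector would draw the corresponding box and whisker diagram.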
  • 33. Parameters and Statistics
      A parameter is a number that describes a certain aspect of the population.
      • For example, the (true) mean annual precipitation at a weather station is a parameter. Let us suppose that this mean is $\mu_h = 980$ mm.
      • In any concrete situation the parameters are unknown.
      (Corrado Caudek)
  • 34. Parameters and Statistics
      A statistic is a number that can be calculated from the data of a sample, without any knowledge of the parameters of the population.
      • Let us suppose, for example, that the random sample of precipitation data covers 30 years of measurement and that the mean annual precipitation, on the basis of the sample, is $\bar{h} = 1002$ mm.
      • This mean is a statistic.
      (Corrado Caudek)
  • 35. Other Statistics: the Range
      $R_x := \max(x) - \min(x)$
      The range is the simplest indicator of the spread of the data; it is an indicator of the scale of the data. However, it considers only two data points and ignores the other n-2 that make up the sample.
  • 36. Other Statistics: Variance and Standard Deviation
      $Var(x) := \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
      $\sigma_x := \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
      The variance is an indicator of “scale” that considers all the data of the sample.
  • 37. Other Statistics: Variance and Standard Deviation, “corrected” (unbiased) version
      $Var(x) := \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
      $\sigma_x := \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
      The unbiased version of the variance takes into account that only n-1 of the deviations are independent once their mean is fixed.
  • 38. Coefficient of Variation
      • The coefficient of variation (CV) of a data sample is defined as the ratio between the standard deviation and the mean: $CV_x := \frac{\sigma_x}{\bar{x}}$
      • The greater the coefficient of variation, the less informative the mean is about the future behaviour of the population.
  • 39. Other Statistics: Skewness and Kurtosis
      $sk_x := \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma_x} \right)^3$
      Skewness is a measure of the asymmetry of the data distribution.
      $k_x := -3 + \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma_x} \right)^4$
      Kurtosis is a measure of the “peakedness” of the data distribution; with the offset of -3 it is zero for a normal distribution.
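The statistics defined on the last few slides can be computed directly from their formulas. The sketch below is in R, the language these slides use later for the sampling simulation. The precipitation totals in `x` are invented for illustration, and the kurtosis is taken in its excess form (fourth standardised moment minus 3), which is an assumption about the convention intended.

```r
# Hypothetical annual precipitation totals in mm (illustrative values only)
x <- c(850, 1020, 940, 1180, 990, 1070, 910, 1040)
n <- length(x)
x_bar <- mean(x)

range_x <- max(x) - min(x)

# Variance with divisor n, and the corrected version with divisor n - 1
var_n <- sum((x - x_bar)^2) / n
var_unbiased <- sum((x - x_bar)^2) / (n - 1)   # what R's var(x) returns

sd_n <- sqrt(var_n)
cv <- sd_n / x_bar

# Standardised third and fourth moments
z <- (x - x_bar) / sd_n
skew <- mean(z^3)
kurt_excess <- mean(z^4) - 3   # assumed convention: excess kurtosis

print(c(range = range_x, var_n = var_n, var_unbiased = var_unbiased,
        cv = cv, skew = skew, kurt_excess = kurt_excess))
```

Note that R's built-in `var` and `sd` use the n-1 divisor, so the divisor-n versions have to be written out or rescaled by (n-1)/n.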
  • 40. Estimation and Hypothesis Testing
      Usually, we are not interested in the statistics for their own sake, but in what the statistics tell us about the population of interest.
      • We could, for example, use the mean annual precipitation measured at all hydro-meteorological stations to estimate the mean annual precipitation for the Italian Peninsula.
      • Or, we could use the mean of the sample to establish whether the mean annual precipitation has changed over the period covered by the sample.
  • 41. Estimation and Hypothesis Testing
      These two questions belong to the two main branches of classical statistical inference:
      • the estimation of parameters;
      • statistical hypothesis testing.
  • 42. Sample Variability
      A fundamental aspect of sample statistics is that they vary from one sample to the next. In the case of annual precipitation, it is very improbable that the sample mean of 1002 mm will coincide with the population mean.
      • The variability of a sample statistic from sample to sample is called sample variability.
          • When the sample variability is very high, the sample is uninformative about the population parameter.
          • When the sample variability is small, the statistic is informative, even though it is practically impossible for the statistic of a sample to be exactly equal to the population parameter.
  • 43. Sample Variability: Simulation
      Sample variability will be illustrated as follows:
      1. we will consider a discrete variable that can assume only a small number of possible values (N = 4);
      2. all possible samples of size n = 2 will be listed;
      3. the mean will be calculated for each possible sample of size n = 2;
      4. the distribution of the means of the samples of size n = 2 will be examined.
      The mean $\mu$ and the variance $\sigma^2$ of the population will be calculated. Note that $\mu$ and $\sigma^2$ are parameters, while the mean $\bar{x}_i$ and the variance $s_i^2$ of each sample are statistics.
      (Corrado Caudek, Techniques in Psychological Research and Data Analysis)
  • 44. Sample Variability
      • The experiment in this example consists of n = 2 draws with replacement of a marble $x_i$ from an urn that contains N = 4 marbles.
      • The marbles are numbered as follows: {2, 3, 5, 9}
      • Drawing with replacement corresponds to a population of infinite size (it is in fact always possible to draw a marble from the urn).
      (Corrado Caudek)
  • 45. Sample Variability
      • For each sample of size n = 2, the mean of the values of the marbles drawn is calculated: $\bar{x} = \frac{\sum_{i=1}^{2} x_i}{2}$
      • For example, if the marbles drawn are $x_1 = 2$ and $x_2 = 3$, then: $\bar{x} = \frac{2 + 3}{2} = \frac{5}{2} = 2.5$
      (Corrado Caudek)
  • 46. Sample Variability: Three Distributions
      We must distinguish between three distributions:
      1. the population distribution;
      2. the distribution of a sample;
      3. the sampling distribution of the means of all possible samples.
      (Corrado Caudek)
  • 47. Sample Variability: 1. The Population Distribution
      The population distribution is the distribution of X (the value of the marble drawn) in the population. In this specific case the population is of infinite size and has the following probability distribution:
      x_i      p_i
      2        1/4
      3        1/4
      5        1/4
      9        1/4
      Total    1
      (Corrado Caudek)
  • 48. Sample Variability
      • The mean of the population is: $\mu = \sum_i x_i p_i = 4.75$
      • The variance of the population is: $\sigma^2 = \sum_i (x_i - \mu)^2 p_i = 7.1875$
      (Corrado Caudek)
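For readers who want to check the two population values by hand, the arithmetic can be written out as follows, using the four marble values {2, 3, 5, 9}, each with probability 1/4:

```latex
\begin{align*}
\mu &= \sum_i x_i p_i = \tfrac{1}{4}(2 + 3 + 5 + 9) = \tfrac{19}{4} = 4.75 \\
\sigma^2 &= \sum_i (x_i - \mu)^2 p_i
          = \tfrac{1}{4}\left(2.75^2 + 1.75^2 + 0.25^2 + 4.25^2\right) \\
         &= \tfrac{1}{4}(7.5625 + 3.0625 + 0.0625 + 18.0625)
          = \tfrac{28.75}{4} = 7.1875
\end{align*}
```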
  • 49. Sample Variability: 2. The Distribution of a Sample
      The distribution of a sample is the distribution of X in a specific sample.
      • If, for example, $x_1 = 2$ and $x_2 = 3$, then the mean of this sample is $\bar{x} = 2.5$ and the variance is $s^2 = 0.5$.
      (Corrado Caudek)
  • 50. Sample Variability: 3. The Sampling Distribution of the Means
      The sampling distribution of the means is the distribution of the means of all the possible samples.
      • If the size of the samples is n = 2, then there are 4 x 4 = 16 possible samples. We can therefore list their means.
      sample    mean      sample    mean
      {3, 2}    2.5       {2, 3}    2.5
      {5, 2}    3.5       {2, 5}    3.5
      {9, 2}    5.5       {2, 9}    5.5
      {5, 3}    4.0       {3, 5}    4.0
      {9, 3}    6.0       {3, 9}    6.0
      {9, 5}    7.0       {5, 9}    7.0
      {2, 2}    2.0       {3, 3}    3.0
      {5, 5}    5.0       {9, 9}    9.0
      (Corrado Caudek)
  • 51. Sample Variability
      • The sampling distribution of the means has the following probability distribution:
      mean     p_i
      2.0      1/16
      2.5      2/16
      3.0      1/16
      3.5      2/16
      4.0      2/16
      5.0      1/16
      5.5      2/16
      6.0      2/16
      7.0      2/16
      9.0      1/16
      Total    1
      (Corrado Caudek)
  • 52. Sample Variability
      • The mean of the sampling distribution of the means is: $\mu_{\bar{x}} = \sum_i \bar{x}_i p_i = 4.75$
      • The variance of the sampling distribution of the means is: $\sigma^2_{\bar{x}} = \sum_i (\bar{x}_i - \mu_{\bar{x}})^2 p_i = 3.59375$
      (Corrado Caudek)
  • 53. Sample Variability
      The example we have seen is very particular in that the population is known. In practice the population distribution is never known. However, we can take note of two important properties of the sampling distribution of the means:
      • The mean of the sampling distribution of the means, $\mu_{\bar{x}}$, is the same as the population mean $\mu$.
      • The variance of the sampling distribution of the means, $\sigma^2_{\bar{x}}$, is equal to the ratio of the variance of the population $\sigma^2$ to the size n of the sample: $\sigma^2_{\bar{x}} = \frac{\sigma^2}{n} = \frac{7.1875}{2} = 3.59375$
      (Corrado Caudek)
  • 54. Sample Variability
      The two things to note can be summarised as follows:
      • The mean and variance of the sampling distribution of the means are determined by the mean and variance of the population: $\mu_{\bar{x}} = \mu$, $\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$
      • The variance of the sampling distribution of the means is smaller than the variance of the population.
      (Corrado Caudek)
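The enumeration of the 16 samples and the two properties summarised above can be reproduced with a few lines of R, the language these slides use later for the sampling simulation. This is only a sketch: the variable names are illustrative, and `expand.grid` is used here simply as a convenient way to list every ordered pair.

```r
X <- c(2, 3, 5, 9)          # values on the marbles
n <- 2                      # sample size

# Population parameters (each value has probability 1/4)
mu <- mean(X)
sigma2 <- mean((X - mu)^2)

# All 4 x 4 = 16 ordered samples drawn with replacement
samples <- expand.grid(x1 = X, x2 = X)
means <- rowMeans(samples)

# Sampling distribution of the mean: each sample has probability 1/16
p <- table(means) / nrow(samples)
mu_xbar <- mean(means)
sigma2_xbar <- mean((means - mu_xbar)^2)

print(p)
print(c(mu = mu, mu_xbar = mu_xbar))
print(c(sigma2_over_n = sigma2 / n, sigma2_xbar = sigma2_xbar))
```

Because every one of the 16 samples is listed, the result is exact rather than an approximation obtained by simulation.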
  • 55. Sample Variability
      In what follows, we will use the properties of the sampling distribution to make inferences about the parameters of the population even when the population distribution is not known.
      (Corrado Caudek)
  • 56. Sample Variability: Three Distributions
      We have therefore distinguished between three distributions:
      1. the population distribution: $\Omega = \{2, 3, 5, 9\}$, $\mu = 4.75$, $\sigma^2 = 7.1875$
      2. the distribution of a sample: $\Omega_i = \{2, 3\}$, $\bar{x} = 2.5$, $s^2 = 0.5$
      3. the sampling distribution of the means of all possible samples: $\Omega_{\bar{x}} = \{2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 5.5, 6.0, 7.0, 9.0\}$, $\mu_{\bar{x}} = 4.75$, $\sigma^2_{\bar{x}} = 3.59375$
      (Corrado Caudek)
  • 57. Sample Variability
      1. The population distribution: this is the distribution that contains all possible observations. The mean and variance of this distribution are indicated with $\mu$ and $\sigma^2$.
      2. The distribution of a sample: this is the distribution of the values of the population that make up a particular random sample of size n. The single values are indicated $x_1, \ldots, x_n$, and the mean and variance are indicated $\bar{x}$ and $s^2$.
      3. The sampling distribution of the means of the samples: this is the distribution of the $\bar{x}_i$ for all the possible samples of size n that can be drawn from the population being considered. The mean and variance of the sampling distribution of the means are indicated by $\mu_{\bar{x}}$ and $\sigma^2_{\bar{x}}$.
      (Corrado Caudek)
  • 58. Sample Variability
      The distribution that is the basis of statistical inference is the sampling distribution.
      Definition: the sampling distribution of a statistic is the distribution of the values that the statistic assumes over all samples of size n that can be drawn from the population.
      Note that if a simulation considers fewer samples than all those theoretically possible, then the resulting distribution will only be an approximation of the true sampling distribution.
      (Corrado Caudek)
  • 59. Estimation and Hypothesis Testing
      Having created different statistics, we can now make some hypotheses. For example:
      • Do the samples all have the same mean and the same variance?
      • Does the mean depend on the size of the sample?
      • Does the variance depend on the size of the sample?
  • 60. Estimation and Hypothesis Testing
      If the samples do not have the same mean, a trend can present itself. [Figure]
  • 61. Estimation and Hypothesis Testing
      The variance can vary with the size of the sample. If it does not stabilise as the number of data in the sample increases, then the data are said to have “Infinite Variance Syndrome”. [Figure]
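One way to see what a variance that fails to stabilise looks like is to compare synthetic data with and without a finite variance. The sketch below is in R, the language these slides use later for the sampling simulation. The choice of a normal and a Cauchy distribution is an assumption made purely for illustration and does not come from the slides, which show hydrological data.

```r
set.seed(42)                       # fixed seed so the run is reproducible
sizes <- c(10, 100, 1000, 10000, 100000)

# Assumed example distributions (not taken from the slides)
x_normal <- rnorm(max(sizes))      # finite variance (equal to 1)
x_cauchy <- rcauchy(max(sizes))    # heavy tails, no finite variance

# Sample variance computed on the first k values, for growing k
var_normal <- sapply(sizes, function(k) var(x_normal[1:k]))
var_cauchy <- sapply(sizes, function(k) var(x_cauchy[1:k]))

print(data.frame(n = sizes, normal = var_normal, cauchy = var_cauchy))
```

For the normal draws the sample variance settles close to 1 as n grows, whereas for the Cauchy draws it typically keeps jumping whenever a new extreme value enters the sample.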
  • 62. Null Hypothesis
      We will have a chance to look at hypothesis testing in detail in future lectures. However, it is well to remember the following:
      • Generally, it is not possible to definitively prove anything. One can only attempt to prove that a hypothesis is not true.
      • Let H0 be the (null) hypothesis to be tested. If H0 cannot be rejected, then one can affirm that “it is true” with a certain degree of confidence.
  • 63. Other Statistics: Covariance
      Given two datasets, for example $h_i = \{h_1, \cdots, h_n\}$ and $l_i = \{l_1, \cdots, l_n\}$, the covariance between these two datasets is defined as:
      $Cov(h_i, l_i) := \frac{1}{N-1} \sum_{i=1}^{n} (l_i - \bar{l}_i)(h_i - \bar{h}_i)$
  • 64. Other Statistics: Correlation
      Given two datasets, for example $h_i = \{h_1, \cdots, h_n\}$ and $l_i = \{l_1, \cdots, l_n\}$, the correlation between these two datasets is defined as:
      $\rho_{lh} := \frac{Cov(l, h)}{\sqrt{Var(h)\, Var(l)}} = \frac{Cov(l, h)}{\sigma_h \sigma_l}$
  • 65. Other Statistics: Correlation
      Note that one can also consider the correlation between two series of equal length taken from the same sample, $h_i = \{h_1, \cdots, h_{n-1}\}$ and $h_{i+1} = \{h_2, \cdots, h_n\}$, resulting in:
      $Cov(h_i, h_{i+1}) := \frac{1}{N-1} \sum_{i=1}^{n-1} (h_i - \bar{h}_i)(h_{i+1} - \bar{h}_{i+1})$
  • 66. Other Statistics: Correlation
      Repeating this operation for series that are gradually reduced in length and separated by r instants, the resulting series are $h^r_i = \{h_1, \cdots, h_{n-r}\}$ and $h^r_{i+r} = \{h_{r+1}, \cdots, h_n\}$, from which:
      $Cov(h^r_i, h^r_{i+r}) := \frac{1}{N-1} \sum_{i=1}^{n-r} (h^r_i - \bar{h}^r_i)(h^r_{i+r} - \bar{h}^r_{i+r})$
      $\rho(h^r_i, h^r_{i+r}) := \frac{Cov(h^r_i, h^r_{i+r})}{\sigma^r_i \, \sigma^r_{i+r}}$
  • 67. Other Statistics: Autocorrelation. [Figure]
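The lagged correlation built up on the preceding slides can be sketched in R, the language these slides use later for the sampling simulation. The helper `lag_correlation` and the discharge values in `h` are hypothetical. The normalisation is the one used by R's `cov` and `sd` (number of pairs minus one), which is an assumption; it cancels in the ratio in any case.

```r
# Lag-r correlation of a series with itself: pair h_1..h_{n-r} with h_{r+1}..h_n
lag_correlation <- function(h, r) {
  n <- length(h)
  stopifnot(r >= 1, r < n - 1)
  head_part <- h[1:(n - r)]        # h_1, ..., h_{n-r}
  tail_part <- h[(r + 1):n]        # h_{r+1}, ..., h_n
  cov(head_part, tail_part) / (sd(head_part) * sd(tail_part))
}

# Hypothetical daily discharge values in m^3/s (illustrative only)
h <- c(120, 135, 150, 142, 130, 125, 160, 190, 175, 150, 140, 132)

rho <- sapply(1:4, function(r) lag_correlation(h, r))
print(round(rho, 3))
```

R's built-in `acf` uses a slightly different estimator, with a single overall mean and variance for the whole series, so its values need not match these exactly.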
  • 68. Random Sampling
      Within the strategy of creating and analysing data samples, the selection (or, sometimes, the generation) of random samples plays an important role. A sample of n events selected from a population is random if it has the same probability of being selected as any other sample of the same size.
      If the data are generated, then one is carrying out a random experiment. Some examples of this are:
      • tossing a coin;
      • counting the rainy days in a year; and
      • counting the days when the river flow at the Bridge of San Lorenzo, Trento, is greater than a predetermined value.
  • 69. Sample Variability: Simulation 2
      Let us consider another example where sample variability is illustrated as follows:
      1. the same population as in the previous example will be used (N = 4);
      2. using the computer programme R, 50,000 samples of size n = 2 will be drawn, with replacement, from the population;
      3. the mean will be calculated for each of these samples of size n = 2;
      4. the mean and variance of the distribution of the means of the 50,000 samples of size n = 2 will be calculated.
      (Corrado Caudek)
  • 70-73. Sample Variability: Simulation 2
      N <- 4
      n <- 2
      nSamples <- 50000
      X <- c(2, 3, 5, 9)
      # Mean and variance of the population
      # (var() divides by N - 1, hence the rescaling by (N - 1)/N)
      Mean <- mean(X)
      Var <- var(X) * (N - 1) / N
      SampDistr <- rep(0, nSamples)
      # 50,000 samples are extracted
      for (i in 1:nSamples) {
        samp <- sample(X, n, replace = TRUE)
        SampDistr[i] <- mean(samp)
      }
      MeanSampDistr <- mean(SampDistr)
      VarSampDistr <- var(SampDistr) * (nSamples - 1) / nSamples
      (Corrado Caudek)
  • 74. Sample Variability: Simulation 2, results of the analysis with R
      Mean
      [1] 4.75
      Var
      [1] 7.1875
      MeanSampDistr
      [1] 4.73943
      VarSampDistr
      [1] 3.578548
      Var/n
      [1] 3.59375
      (Corrado Caudek)
  • 75. Sample Variability
      • Population: $\mu = 4.75$, $\sigma^2 = 7.1875$
      • Sampling distribution of the means: $\mu_{\bar{x}} = 4.75$, $\sigma^2_{\bar{x}} = 3.59375$
      • Results of the R simulation: $\hat{\mu}_{\bar{x}} = 4.73943$, $\hat{\sigma}^2_{\bar{x}} = 3.578548$
      (Corrado Caudek)
  • 76. Thank you for your attention! [Image: G. Ulrici, Uomo dopo aver lavorato alle slides (Man after working on the slides), 2000?]
