Fernandos Statistics


Published in: Technology

  1. Descriptive statistics <ul><li>Definition: Descriptive statistics refers to statistical techniques used to summarise and describe a data set, and also to the statistics (measures) used in such summaries. Measures of central tendency, such as the mean and median, and of dispersion, such as the range and standard deviation, are the main descriptive statistics. Displays of data, such as histograms and box plots, are also considered techniques of descriptive statistics.</li></ul>
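As a minimal sketch of the measures named above, the following Python snippet computes the mean, median, range, and sample standard deviation with the standard-library `statistics` module. The `scores` data and the `describe` helper are hypothetical, chosen only for illustration.

```python
import statistics

# Hypothetical sample: exam scores for ten students (illustrative data only).
scores = [62, 75, 75, 81, 68, 90, 77, 84, 59, 79]


def describe(data):
    """Return common descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(data),      # central tendency
        "median": statistics.median(data),  # central tendency
        "range": max(data) - min(data),     # dispersion
        "stdev": statistics.stdev(data),    # dispersion (sample, n - 1)
    }


summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Four numbers stand in for the ten raw values, which is the point of a descriptive summary.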
  2. Inferential statistics <ul><li>Definition: Inferential statistics, or statistical induction, means the use of statistics to make inferences about some unknown aspect of a population from a sample of that population. A common method used in inferential statistics is estimation. In estimation, the sample is used to estimate a parameter, and a confidence interval about the estimate is constructed. Other examples of inferential statistics methods include hypothesis testing, linear regression, and principal components analysis.</li></ul>
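The estimation step described above can be sketched as follows: a sample mean estimates the population mean, and a confidence interval is built around it. This is only an illustration under stated assumptions. The data are simulated, the function name `mean_confidence_interval` is hypothetical, and the interval uses a normal approximation, which is reasonable for larger samples; a small sample would normally call for a t critical value instead.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Estimate the population mean with a normal-approximation interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


# Simulated sample of 50 measurements (assumed data, fixed seed for repeatability).
rng = random.Random(0)
sample = [rng.gauss(170.0, 8.0) for _ in range(50)]

low, high = mean_confidence_interval(sample)
print(f"point estimate: {statistics.mean(sample):.1f}")
print(f"95% interval:   ({low:.1f}, {high:.1f})")
```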
  3. The difference between them <ul><li>There are two fundamental purposes to analyzing data: the first is to describe a large number of data points in a concise way by means of one or more summary statistics; the second is to draw inferences about the characteristics of a population based on the characteristics of a sample.</li><li>Descriptive statistics characterize the distribution of a set of observations on a specific variable or variables. By conveying the essential properties of the aggregation of many different observations, these summary measures make it possible to understand the phenomenon under study better and more quickly than would be possible by studying a multitude of unprocessed individual values.</li><li>Inferential statistics allow one to draw conclusions about the unknown parameters of a population based on statistics which describe a sample from that population. Very often, mere description of a set of observations in a sample is not the goal of research. The data on hand are usually only a sample of the actual population of interest, possibly a minute sample of the population. For example, most presidential election polls only sample about 1,000 individuals, and yet the goal is to describe the expected voting behavior of 100 million or more potential voters.</li></ul>
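To see why a poll of about 1,000 people can say something about millions of voters, here is a rough sketch of the usual margin-of-error calculation for a sample proportion. It assumes a simple random sample and the worst-case proportion of 0.5; the function name `margin_of_error` is hypothetical, and real polls apply further adjustments that are not modeled here.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Normal-approximation margin of error for a sample proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Worst case (p_hat = 0.5) for a poll of 1,000 respondents.
moe = margin_of_error(0.5, 1000)
print(f"margin of error: {moe:.3f}")  # prints "margin of error: 0.031"
```

Under these assumptions the estimate is within about 3 percentage points of the population value at 95% confidence, and the population size does not enter the formula.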
  4. Regression <ul><li>Definition: The idea behind regression is that when there is significant linear correlation, you can use a line to estimate the value of the dependent variable for certain values of the independent variable. </li></ul><ul><li>The regression equation should only be used: </li></ul><ul><ul><li>When there is significant linear correlation, that is, when you reject the null hypothesis that rho = 0 in a correlation hypothesis test. </li></ul></ul><ul><ul><li>When the value of the independent variable used in the estimation is close to the original values. For example, you should not use a regression equation obtained from x values between 10 and 20 to estimate y when x is 200. </li></ul></ul><ul><ul><li>With the same population the data came from. For example, if x is the height of a male and y is the weight of a male, you shouldn't use the regression equation to estimate the weight of a female. </li></ul></ul><ul><ul><li>Within the time frame of the data. If the data are from the 1960s, the equation probably isn't valid in the 1990s. </li></ul></ul>
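The first two conditions above can be checked mechanically. The sketch below computes Pearson's r, converts it to the t statistic commonly used to test the null hypothesis rho = 0, and guards against extrapolating outside the observed x values. The data and all function names are hypothetical, and the critical value 2.306 is the two-tailed t table value for alpha = 0.05 with 8 degrees of freedom, which fits only this sample size.

```python
import math


def pearson_r(xs, ys):
    """Sample linear correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def within_observed_range(x_new, xs):
    """True if x_new lies inside the range of x values used to fit the line."""
    return min(xs) <= x_new <= max(xs)


# Illustrative data with x values between 10 and 20 (assumed, not from the slides).
xs = [10, 11, 12, 13, 14, 15, 16, 17, 18, 20]
ys = [25, 27, 32, 31, 36, 38, 41, 40, 46, 49]

r = pearson_r(xs, ys)
t = correlation_t_statistic(r, len(xs))
T_CRITICAL = 2.306  # two-tailed, alpha = 0.05, df = 8 (from a t table)
significant = abs(t) > T_CRITICAL

print(f"r = {r:.3f}, t = {t:.2f}, significant: {significant}")
print("x = 15 usable: ", within_observed_range(15, xs))
print("x = 200 usable:", within_observed_range(200, xs))
```

The population and time-frame conditions are judgment calls about where the data came from, so they are not something code like this can verify.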
  5. Regression formula <ul><li>a is the slope of the regression line: $a = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$ </li></ul><ul><li>b is the y-intercept of the regression line: $b = \bar{y} - a\bar{x}$ </li></ul><ul><li>The regression line is sometimes called "the line of best fit" or "the best fit line". </li></ul><ul><li>Since it "best fits" the data, it makes sense that the line passes through the means. </li></ul><ul><li>The regression equation is the line with slope a passing through the point $(\bar{x}, \bar{y})$. </li></ul><ul><li>Another way to write the equation would be: $\hat{y} - \bar{y} = a(x - \bar{x})$ </li></ul>
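A minimal sketch of fitting the line, using the standard least-squares formulas for the slope a and the y-intercept b. The function name `fit_line` and the data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: equivalent to mean(y) - a * mean(x)
    b = (sum_y - a * sum_x) / n
    return a, b


# Illustrative data (assumed): the means are x = 3 and y = 4.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

print(f"slope a = {a:.2f}, intercept b = {b:.2f}")
# The fitted line passes through the point of means.
print(f"line at mean x: {a * mean_x + b:.2f}, mean y: {mean_y:.2f}")
```

For this data the sums are Σx = 15, Σy = 20, Σxy = 66, and Σx² = 55, giving a = 30/50 = 0.6 and b = (20 - 9)/5 = 2.2, and the line evaluates to 4 at x = 3, the mean of y.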