Introduction to statistics


Published on

Published in: Technology, Education

Introduction to statistics

  1. 1. Introduction to statistics By Dr. Amira Talic
  2. 2. What is “Statistics”? • • • • • • •Statistics is the science of data that involves: •Collecting •Classifying •Summarizing •Organizing and •Interpretation • • • • • • • Of numerical information. •Examples: •Cricket batting averages •Stock price •Climatology data such as rainfall amounts, average temperatures •Marketing information •Gambling?
  3. 3. Key Terms • What is Data? facts or information that is relevant or appropriate to a decision maker • Population? •the totality of objects under consideration • Sample? •a portion of the population that is selected for analysis
  4. 4. Key Terms • Parameter? a summary measure (e.g., mean) that is computed to describe a characteristic of the population • Statistic? a summary measure (e.g., mean) that is computed to describe a characteristic of the sample
  5. 5. Variables • Traits or characteristics that can change values from case to case. • A variable is what is measured or manipulated in an experiment •Examples: •Age •Gender •Income •Social class
  6. 6. Types Of Variables • In causal relationships: • CAUSE =>EFFECT independent variable & dependent variable •Independent variable: is a variable that can be controlled or manipulated. An independent variable is the variable you have control over (dose of drug) •Dependent variable: is a variable that cannot be controlled or manipulated. Its values are predicted from the independent variable ( effect on the condition)
  7. 7. Types Of Variables •Discrete variables are measured in units that cannot be subdivided. Example: Number of children •Continuous variables are measured in a unit that can be subdivided infinitely. Example: Height
  8. 8. Statistical analysis • Descriptive Statistics • Inferential statistics • Predictive modeling
  9. 9. Descriptive Statistics •Gives us the overall picture about data •Presents data in the form of tables, charts and graphs •Includes summary data •Avoids inferences Examples: •Measures of central location Mean, median, mode and midrange •Measures of Variation •Variance, Standard Deviation, z-scores
  10. 10. Inferential Statistics •Take decision on overall population using a sample • “Sampled” data are incomplete but can still be representative of the population •Permits the making of generalizations (inferences) about the data • Probability theory is a major tool used to analyze sampled data
  11. 11. Predictive Modeling • The science of predicting future outcomes based on historical events. • Model Building: “Developing set of equations or mathematical formulation to forecast future behaviors based on current or historical data.” • Regression, logistic Regression, time series analysis etc.,
  12. 12. Calculation of the probability • Based on the characteristics of the population for the observed parameter • (e.g. . Duration of the pregnancy, duration of the first labor stage, height, et cetera) • To describe the population, “distribution” will be used
  13. 13. Distribution • A statistical distribution describes the numbers of times each possible outcome occurs in a sample • Distributions for continuous variables are called continuous distributions ( e.g. height) • They also carry the fancier name probability density
  14. 14. Distribution • Some probability densities have particular importance in statistics. A very important one is shaped like a bell, and called the normal ( Gaussian) distribution. • Many naturally-occurring phenomena can be approximated surprisingly well by this distribution. It will serve to illustrate some features of all continuous distributions.
  15. 15. Gaussian distribution
  16. 16. What are the Components of A Distribution? • Measures of central tendency • Suppose we have a sample with 4 observations: 4, 1, 4, 3 • Mean = the sum of a set of numbers divided by the number of observations (4+1+4+3=12:4=3) Median - the middle point of a set of numbers(3.5)
  17. 17. Components of distribution • Mode - the most frequently occurring number. Mode=4 • Median - the middle point of a set of numbers(3.5)
  18. 18. Components of distribution Measures of variation Range - the maximum value minus the minimum value in a set of numbers. Range = 4-1 = 3 Standard Deviation - the average distance a data point is away from the mean. [ (4- 3)+( 1 -3)+ (4- 3)+ (3- 3)]: 4=1 standard deviation= 1
  19. 19. Standard deviation
  20. 20. Why to know about it ? • Mean, Median, Mode, Range, and Standard Deviations are measurements in a sample (statistics) and can also be used to make inferences on a population.
  21. 21. What do we expect from the statistical analysis? • To find out whether there is a statistically significant difference between our sample (e.g. pregnancy loss in Al Ain Hospital Patient) and general population
  22. 22. How to perform the statistical analysis? • Statistics can take us to a beautiful journey of understanding ,but
  23. 23. Festina lente! make haste slowly