Descriptive Statistics
Presented by
Hiba Armouche
Outline
• Statistics versus Parameters
• Types of Numerical Data
• Types of Scores
• Techniques for Summarizing Quantitative Data
Statistics Versus Parameters
• A parameter is a characteristic of a
population. It is a numerical or graphic
way to summarize data obtained from
the population
• A statistic is a characteristic of a
sample. It is a numerical or graphic way
to summarize data obtained from a
sample
Types of Numerical Data
• There are two types of data:
1- Quantitative data are obtained by determining
placement on a scale that indicates amount or degree.
Ex: The temperatures recorded each day during the
months of September through December in Lebanon in
a given year (the variable is temperature).
2- Categorical data are obtained by determining the
frequency of occurrences in each of several categories.
Ex: The number of male and female students in a
chemistry class (the variable is gender).
Types of Scores
A raw score is the initial score obtained.
Ex: The number of items an individual gets correct
on a test.
A derived score is obtained by taking the raw score and
converting it into a more useful score.
Types of Scores
Age and grade-level equivalents
tell us the age or grade for which an
individual's score is typical.
Types of Scores
A percentile rank refers to the
percentage of individuals scoring at or
below a given raw score.
$$PR = \frac{\text{number of students below score} + \text{all students at score}}{\text{total number in the group}} \times 100$$
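The percentile rank calculation can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_rank` and the list of scores are made up for the example and are not part of the original slides.

```python
def percentile_rank(scores, raw_score):
    """Percentage of the group scoring at or below raw_score."""
    below = sum(1 for s in scores if s < raw_score)   # students below the score
    at = sum(1 for s in scores if s == raw_score)     # students at the score
    return (below + at) * 100 / len(scores)


# Hypothetical test scores for a group of 10 students.
scores = [55, 60, 60, 65, 70, 70, 70, 80, 85, 90]
print(percentile_rank(scores, 70))  # 4 below + 3 at, out of 10 -> 70.0
```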
Types of Scores
Standard scores
indicate how far a given raw score
is from a reference point.
Examples are z scores and T scores.
Techniques for Summarizing
Quantitative Data
• A frequency distribution is a two-column
listing, from high to low, of all the scores
along with their frequencies
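As a rough sketch, a frequency distribution can be built with Python's standard `collections.Counter`. The helper name `frequency_distribution` and the scores are hypothetical, chosen only to illustrate the two-column, high-to-low listing.

```python
from collections import Counter


def frequency_distribution(scores):
    """Return (score, frequency) pairs sorted from the highest score to the lowest."""
    counts = Counter(scores)
    return sorted(counts.items(), reverse=True)


# Hypothetical scores, not taken from the slides.
scores = [55, 60, 60, 65, 70, 70, 70, 80, 85, 90]
print("Score  Frequency")
for score, freq in frequency_distribution(scores):
    print(f"{score:>5}  {freq:>9}")
```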
Techniques for Summarizing
Quantitative Data
• A frequency polygon is a graphic display of a
frequency distribution. It is a graphic way to
summarize quantitative data for one variable
– A graphic distribution of scores in which only a few
individuals receive high scores is called a
positively skewed polygon
– One in which only a few individuals receive low
scores is called a negatively skewed polygon
Techniques for Summarizing
Quantitative Data
• A histogram is a bar graph used to
display quantitative data at the interval
or ratio level of measurement
Techniques for Summarizing
Quantitative Data
• The stem-leaf plot is a display that organizes a set of
data to show both its shape and distribution. Each
data value is split into a stem and a leaf.
The leaf is the last digit of a number.
The other digits to the left of the leaf form the stem.

Example: for the value 159, the stem is 15 and the leaf is 9.
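The stem/leaf split can be illustrated with a short Python sketch. It assumes non-negative whole numbers; the function name `stem_and_leaf` and the data values are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem to the sorted list of its leaves (non-negative integers only)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # leaf = last digit, stem = the digits to its left
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical data values.
values = [159, 152, 147, 163, 155, 148]
for stem, leaves in stem_and_leaf(values).items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row shows one stem followed by its leaves, so longer rows mark where the values cluster.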
Techniques for Summarizing
Quantitative Data
• The normal distribution is a theoretical
distribution that is symmetrical and in which a
large proportion is concentrated in the middle
• The distribution curve of a normal distribution
is called a normal curve. It is bell-shaped,
and its mean, mode, and median are identical
How do you analyze the data?
Conduct descriptive analysis
Descriptive Statistics
• Central Tendency: Mean, Median, Mode
• Variability: Variance, Standard Deviation, Range
• Relative Standing: Z-Score, Percentile Ranks
Averages/Measures of
central tendency
• Mode:
– The most frequently occurring score
– Appropriate for nominal data
Averages/Measures of
central tendency
• Median
– The score above and below which
50% of all scores lie (i.e., the midpoint)
– Characteristics
• Appropriate for ordinal scales
• Doesn’t take into account the value
of each and every score in the data
Averages/Measures of
central tendency
• Mean
– The arithmetic average of all scores
– Characteristics
• Advantageous statistical properties
• Affected by outlying scores
• Most frequently used measure of
central tendency
– Formula: $\bar{X} = \frac{\sum X}{n}$, where $n$ is the number of scores
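The three measures of central tendency can be compared with Python's standard `statistics` module. The scores below are hypothetical; the second list replaces one score with an outlier to illustrate that the mean is affected by outlying scores while the median is not.

```python
import statistics


def central_tendency(scores):
    """Return (mode, median, mean) for a list of scores."""
    return (
        statistics.mode(scores),
        statistics.median(scores),
        statistics.mean(scores),
    )


# Hypothetical scores, not taken from the slides.
scores = [55, 60, 60, 65, 70, 70, 70, 80, 85, 90]
with_outlier = [55, 60, 60, 65, 70, 70, 70, 80, 85, 190]  # 90 replaced by 190

print(central_tendency(scores))        # mode 70, median 70.0, mean 70.5
print(central_tendency(with_outlier))  # mode 70, median 70.0, mean 80.5
```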
Skewed Distributions
• Positive – many low scores and few high scores
• Negative – few low scores and many high scores
• Relationships between the mean, median, and mode
– Positively skewed – mode is lowest, median is in
the middle, and mean is highest
– Negatively skewed – mean is lowest, median is in
the middle, and mode is highest
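The ordering of mode, median, and mean can be checked on small made-up data sets. The two lists below are hypothetical and were chosen to be clearly skewed; the ordering is typical of skewed distributions rather than guaranteed for every data set.

```python
import statistics


def mode_median_mean(scores):
    """Return (mode, median, mean) so their order can be compared."""
    return (
        statistics.mode(scores),
        statistics.median(scores),
        statistics.mean(scores),
    )


# Hypothetical data: many low scores and a few high scores.
positively_skewed = [1, 1, 1, 2, 2, 3, 4, 10]
# Hypothetical data: a few low scores and many high scores.
negatively_skewed = [1, 7, 8, 9, 9, 10, 10, 10]

mode, median, mean = mode_median_mean(positively_skewed)
print(mode < median < mean)   # mode lowest, mean highest

mode, median, mean = mode_median_mean(negatively_skewed)
print(mean < median < mode)   # mean lowest, mode highest
```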
Variability or Spreads
• Purpose – to measure the extent to
which scores are spread apart
Distribution A: 19, 20, 25, 32, 39
Distribution B: 2, 3, 25, 30, 75
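A quick check in Python shows why a measure of spread is needed: Distributions A and B have the same mean and the same median, yet B is far more spread out. The helper name `score_range` is an illustrative choice.

```python
import statistics

distribution_a = [19, 20, 25, 32, 39]
distribution_b = [2, 3, 25, 30, 75]


def score_range(scores):
    """Difference between the highest and lowest score."""
    return max(scores) - min(scores)


for name, scores in (("A", distribution_a), ("B", distribution_b)):
    print(
        name,
        "mean:", statistics.mean(scores),
        "median:", statistics.median(scores),
        "range:", score_range(scores),
    )
# Both have mean 27 and median 25, but the range is 20 for A and 73 for B.
```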
Variability or Spreads
– Range
– Quartile deviation
– Boxplots
– Variance & Standard deviation
Variability or Spreads
• Range
– The difference between the highest and
lowest score in a data set
– Characteristics
• Unstable measure of variability
• Rough, quick estimate
Variability or Spreads
Quartiles and the Five-Number Summary
A percentile in a set of numbers is a value below
which a certain percentage of numbers fall and above
which the rest of the numbers fall.
Example: Your SAT score report reads
“Raw score 630, percentile 84”
This means that your score is 630 and 84% of those
who took the exam scored lower than you.
Variability or Spreads
Quartiles and the Five-Number Summary
NB:
• The median is the 50th percentile
• The first quartile (Q1) is the 25th percentile
• The third quartile (Q3) is the 75th percentile
Variability or Spreads
Five-Number Summary
• The lowest score
• Q1
• The median
• Q3
• The highest score
Interquartile range
IQR = Q3 - Q1
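A five-number summary and the IQR can be sketched with `statistics.quantiles` from the Python standard library, applied here to Distribution A from the earlier slide. Quartile conventions differ between textbooks and software; this sketch assumes the library's default "exclusive" method, so the Q1 and Q3 values may not match a hand calculation done with another rule.

```python
import statistics


def five_number_summary(scores):
    """Return (lowest, Q1, median, Q3, highest) using the default quantile method."""
    q1, median, q3 = statistics.quantiles(scores, n=4)
    return min(scores), q1, median, q3, max(scores)


distribution_a = [19, 20, 25, 32, 39]
lowest, q1, median, q3, highest = five_number_summary(distribution_a)
print(lowest, q1, median, q3, highest)
print("IQR =", q3 - q1)
```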
Variability or Spreads
• Boxplots
Variability or Spreads
• Standard Deviation SD
It is a single number that represents
the spread of a distribution. Every score
in the distribution is used to calculate it.
Variability or Spreads
• How to calculate the Standard Deviation
1- Calculate the mean
2- Subtract the mean from each score
3- Square each of these deviation scores
4- Add all the squared deviation scores
5- Divide the total by the number of scores
The result is called the variance.
6- Take the square root of the variance.
This is the standard deviation
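The six steps can be followed one by one in Python. This sketch divides by the number of scores, as step 5 states (a population standard deviation); many textbooks and tools divide by n - 1 for a sample instead. The function name `standard_deviation` is illustrative, and the data are Distributions A and B from the earlier slide.

```python
import math


def standard_deviation(scores):
    """Standard deviation following the six steps (dividing by the number of scores)."""
    n = len(scores)
    mean = sum(scores) / n                     # 1- calculate the mean
    deviations = [x - mean for x in scores]    # 2- subtract the mean from each score
    squares = [d ** 2 for d in deviations]     # 3- square each deviation score
    total = sum(squares)                       # 4- add all the squares
    variance = total / n                       # 5- divide by the number of scores
    return math.sqrt(variance)                 # 6- take the square root


distribution_a = [19, 20, 25, 32, 39]
distribution_b = [2, 3, 25, 30, 75]
print(round(standard_deviation(distribution_a), 2))  # variance 57.2
print(round(standard_deviation(distribution_b), 2))  # variance 703.6
```

Both distributions have a mean of 27, but B's standard deviation is several times larger than A's, which matches how much more spread out its scores are.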
Variability or Spreads

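$$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$
where $\bar{X}$ is the mean and $n$ is the number of scores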
Variability or Spreads
NB:
The more spread out the scores are, the
greater the deviation scores will be, and
hence the larger the standard deviation
Relative Standing
• Types
– Percentile ranks – the percentage of
scores that fall at or below a given score
– Standard scores – a derived score based
on how far a raw score is from a reference
point in terms of standard deviation units
• z score
• T score
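The two standard scores can be sketched as follows. The slides do not give the formulas, so this assumes the common definitions: z = (raw score - mean) / standard deviation, and T = 50 + 10z. The function names and the data set are hypothetical, and the standard deviation is computed by dividing by the number of scores (`statistics.pstdev`).

```python
import statistics


def z_score(raw_score, scores):
    """Distance of raw_score from the mean, in standard deviation units."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # divides by the number of scores
    return (raw_score - mean) / sd


def t_score(raw_score, scores):
    """T score under the assumed convention: mean of 50, standard deviation of 10."""
    return 50 + 10 * z_score(raw_score, scores)


# Hypothetical scores with mean 5 and standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(9, scores))  # 2 standard deviations above the mean
print(t_score(9, scores))
print(t_score(5, scores))  # a score at the mean gives T = 50
```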
Thank You
hiba.armouche@yahoo.com
www.facebook.com/TrainerHibaArmouche
