This document discusses descriptive statistics and summarizing distributions. It covers measures of central tendency including the mean, median, and mode. It also discusses measures of dispersion such as variance and standard deviation. These measures are used to describe the characteristics of frequency distributions and determine where the center is located and how spread out the data is. The choice between measures depends on whether the distribution is normal or skewed.

Measure of Central Tendency (Mean, Median, Mode and Quantiles)

A measure of central tendency is a summary statistic that represents the center point or typical value of a dataset. These measures indicate where most values in a distribution fall and are also referred to as the central location of a distribution. You can think of it as the tendency of data to cluster around a middle value. In statistics, the three most common measures of central tendency are the mean, median, and mode. Each of these measures calculates the location of the central point using a different method.

Measures of dispersion

The document discusses various measures used to describe the dispersion or variability in a data set. It defines dispersion as the extent to which values in a distribution differ from the average. Several measures of dispersion are described, including range, interquartile range, mean deviation, and standard deviation. The document also discusses measures of relative standing like percentiles and quartiles, and how they can locate the position of observations within a data set. The learning objectives are to understand how to describe variability, compare distributions, describe relative standing, and understand the shape of distributions using these measures.

Basic Descriptive statistics

Descriptive statistics is used to describe and summarize key characteristics of a data set. Commonly used measures include central tendency, such as the mean, median, and mode, and measures of dispersion like range, interquartile range, standard deviation, and variance. The mean is the average value calculated by summing all values and dividing by the number of values. The median is the middle value when data is arranged in order. The mode is the most frequently occurring value. Measures of dispersion describe how spread out the data is, such as the difference between highest and lowest values (range) or how close values are to the average (standard deviation).

Measure of Dispersion in statistics

Measures of dispersion come in two types: absolute measures and graphical measures. Other types are covered as well.
The points discussed in these slides are:
1. Dispersion & its types
2. Definition
3. Use
4. Merits
5. Demerits
6. Formula & math
7. Graph and pictures
8. Real life application.

Standard deviation

The document discusses the conceptual definition of standard deviation. Standard deviation represents the root average of the squared deviations of scores from the mean. It explains that to calculate standard deviation, each score's deviation from the mean is squared, those squared deviations are averaged, and then the square root of the average is taken to determine the standard deviation in the original units of measurement.

Math 102- Statistics

The document provides an introduction to statistics and statistical inference. It discusses key definitions such as variables, parameters, populations, samples, and descriptive and inferential statistics. It also covers common measures of central tendency (mean, median, mode), measures of variability, and levels of measurement (nominal, ordinal, interval, ratio). Examples of descriptive and inferential statistics are given.

Descriptive statistics

What descriptive statistics are, the types of statistics, and their implications.
Graphs, tables, variation, mean, mode, median, central tendency.

Lec. biostatistics introduction

The document discusses key concepts in public health methodologies and biostatistics. It defines data as facts that can be processed by computers. Statistics is described as the study of collecting, summarizing, analyzing and interpreting data. Biostatistics applies statistical techniques to health-related fields like medicine. Descriptive statistics refers to methods used to describe data, while inferential statistics are used to draw conclusions from numeric data. Variables, grouped vs. ungrouped data, and types of variables are also outlined.

Standard deviation

Standard deviation measures how dispersed data values are from the average. It is the most reliable measure of dispersion and shows the average distance of each data point from the mean. While it is more difficult to calculate than other measures, standard deviation provides important information about how concentrated or spread out the data is. The presentation defines standard deviation, lists its merits and demerits, and shows how to calculate it for both populations and samples.

Statistics Class 10 CBSE

Powerpoint presentation on the chapter- Statistics from class 10. Includes examples and formulas directly from the textbook

Variability

This document discusses measures of variability used to describe how spread out data values are from the mean or average. It defines and provides formulas for calculating range, variance, standard deviation, sample variance, sample standard deviation, population variance, population standard deviation, estimated population variance, and estimated population standard deviation. These measures are important in statistical analysis to understand the distribution of data values.

Graphical presentation of data

Variables describe attributes that can vary between entities. They can be qualitative (categorical) or quantitative (numeric). Common types of variables include continuous, discrete, ordinal, and nominal variables. Data can be presented graphically through bar charts, pie charts, histograms, box plots, and scatter plots to better understand patterns and trends. Key measures used to summarize data include measures of central tendency (mean, median, mode) and measures of variability (range, standard deviation, interquartile range).

Standard deviation

The document provides objectives and instructions for calculating standard deviation, variance, and student's t-test. It defines standard deviation as the positive square root of the arithmetic mean of the squared deviations from the mean. Standard deviation is considered the most reliable measure of variability. Variance is defined as the square of the standard deviation. Student's t-test is used to compare means of two samples and determine if they are statistically different. The document provides examples of calculating standard deviation, variance, and performing matched pairs and independent samples t-tests on sets of data.

Measures of dispersions

1. The document discusses various measures of dispersion used to quantify how spread out or variable a data set is. It describes measures such as range, mean deviation, variance, and standard deviation.
2. It also discusses relative measures of dispersion like the coefficient of variation, which allows comparison of variability between data sets with different units or averages. The coefficient of variation expresses variability as a percentage of the mean.
3. Additional concepts covered include skewness, which refers to the asymmetry of a distribution, and kurtosis, which measures the peakedness of a distribution compared to a normal distribution. Positive or negative skewness and leptokurtic, mesokurtic, or platykurtic k

1.2 types of data

The document discusses different types of data that can be collected in statistics including categorical vs. quantitative data, discrete vs. continuous data, and different levels of measurement for data including nominal, ordinal, interval, and ratio scales. It also discusses key concepts such as parameters, statistics, populations, and samples. Potential pitfalls in statistical analysis are outlined such as misleading conclusions, nonresponse bias, and issues with survey question wording and order.

Type of data

This document introduces the concept of data classification and levels of measurement in statistics. It explains that data can be either qualitative or quantitative. Qualitative data consists of attributes and labels while quantitative data involves numerical measurements. The document also outlines the four levels of measurement - nominal, ordinal, interval, and ratio - from lowest to highest. Each level allows for different types of statistical calculations, with the ratio level permitting the most complex calculations like ratios of two values.

Standard Deviation.ppt

The document discusses variance and standard deviation. It defines variance as the average squared deviation from the mean of a data set. It provides the step-by-step process to calculate variance which includes finding the mean, deviations from the mean, squaring the deviations, summing the squares, and dividing by the number of data points. Standard deviation is defined as the square root of the variance and measures how spread out numbers are from the mean. Examples are provided to demonstrate calculating variance and standard deviation.

Statistical Methods

1. Statistics is used to analyze data beyond what can be seen in maps and diagrams by using mathematical manipulation, which can reveal patterns that may otherwise go unnoticed.
2. It is important to justify any statistical techniques used and to only use techniques that are appropriate for the type of data.
3. Common methods for summarizing large data sets include calculating the mean, mode, and median. The mean is the average, the mode is the most frequent value, and the median is the middle value when the data is arranged from lowest to highest.

Measures of dispersion

This document defines and compares several common measures of statistical dispersion, or variation within a data set from the average value. It explains that measures of dispersion include range, interquartile range, variance, standard deviation, and others. Each measure is defined and their advantages and disadvantages for describing how spread out numbers are in a data set are discussed. For example, the range is simple to calculate but influenced by outliers, while the standard deviation takes all values into account but can also be impacted by outliers.

Coefficient of variation

The document defines and provides examples for calculating the coefficient of variation, which is a measure used to compare the dispersion of data sets. It gives the formula for coefficient of variation as the standard deviation divided by the mean, expressed as a percentage. Two examples are shown comparing the stability of prices between two cities and production between two manufacturing plants, with the data set having the lower coefficient of variation considered more consistent or stable.
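
The formula described above can be sketched in a few lines. This is only an illustration: the function name `coefficient_of_variation` and both price lists are hypothetical (the original examples are not reproduced in this summary), and the population standard deviation is assumed because the summary does not say which form the original uses.

```python
import statistics


def coefficient_of_variation(values):
    """Return the standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    # Assumption: population standard deviation (divide by N).
    return statistics.pstdev(values) / mean * 100


# Hypothetical prices, invented for illustration only.
city_a = [20, 22, 19, 21, 18]
city_b = [10, 30, 15, 25, 20]

cv_a = coefficient_of_variation(city_a)
cv_b = coefficient_of_variation(city_b)

# The data set with the lower coefficient of variation is the more stable one.
more_stable = "A" if cv_a < cv_b else "B"
print(f"CV A = {cv_a:.1f}%, CV B = {cv_b:.1f}%, more stable: city {more_stable}")
```

Both lists share the same mean, so the comparison here is driven entirely by the spread.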

Measure of Central Tendency

The document discusses different measures of central tendency including the mean, median, and mode. The mean is the average value calculated by adding all values and dividing by the total number of values. The median is the middle value when values are arranged from lowest to highest. The mode is the most frequently occurring value in the data set. The document provides examples of calculating each measure and discusses their advantages and disadvantages.

best for normal distribution.ppt

- The document discusses key concepts in descriptive statistics including types of distributions, measures of central tendency, and measures of dispersion.
- It covers normal, skewed, and other types of distributions. Measures of central tendency discussed are mean, median, and mode. Measures of dispersion covered are variance and standard deviation.
- The document uses examples and explanations to illustrate how to calculate and interpret these important statistical measures.

statical-data-1 to know how to measure.ppt

- The document discusses key concepts in descriptive statistics including types of distributions, measures of central tendency, and measures of dispersion.
- It covers normal, skewed, and other types of distributions. Measures of central tendency discussed are mean, median, and mode. Measures of dispersion covered are variance and standard deviation.
- The document uses examples and explanations to illustrate how to calculate and interpret these important statistical measures.

Sriram seminar on introduction to statistics

The document provides an introduction to statistics concepts including central tendency, dispersion, probability, and random variables. It discusses different measures of central tendency like mean, median and mode. It also covers dispersion concepts like variance and standard deviation. The document introduces key probability concepts such as experiments, sample spaces, events, and conditional probability. It defines random variables and discusses discrete and continuous random variables.

Describing quantitative data with numbers

1. Quantitative data can be summarized using measures of center (mean, median), spread (range, IQR, standard deviation), and position (quartiles, percentiles, z-scores).
2. The mean is more affected by outliers than the median. The median is more resistant to outliers and a better measure of center for skewed data.
3. Additional summaries like the five-number summary and boxplots provide a graphical view of the distribution and identify potential outliers.

Measures of Dispersion.pptx

The document discusses measures of dispersion, which describe how varied or spread out a data set is around the average value. It defines several measures of dispersion, including range, interquartile range, mean deviation, and standard deviation. The standard deviation is described as the most important measure, as it takes into account all values in the data set and is not overly influenced by outliers. The document provides a detailed example of calculating the standard deviation, which involves finding the differences from the mean, squaring those values, summing them, and taking the square root.

21.StatsLecture.07.ppt

Descriptive statistics are used to organize, simplify and describe data distributions. They involve determining the shape, central tendency (e.g. mean, median, mode), and variability or spread of data. Common measures of central tendency indicate the center of the distribution, while measures of variability like standard deviation quantify how far values are from the mean. Descriptive statistics provide essential information about data and are the first step in statistical analysis before making inferences about populations.

statistics

This document provides an introduction to inferential statistics and statistical significance. It discusses key concepts like standard error of the mean, confidence intervals, and comparing means from two samples using a t-test. The document explains how inferential statistics allow researchers to make inferences about populations based on samples and determine if observed differences are likely due to chance or a real effect.

CABT Math 8 measures of central tendency and dispersion

This document provides an introduction to statistics. It discusses what statistics is, the two main branches of statistics (descriptive and inferential), and the different types of data. It then describes several key measures used in statistics, including measures of central tendency (mean, median, mode) and measures of dispersion (range, mean deviation, standard deviation). The mean is the average value, the median is the middle value, and the mode is the most frequent value. The range is the difference between highest and lowest values, the mean deviation is the average distance from the mean, and the standard deviation measures how spread out values are from the mean. Examples are provided to demonstrate how to calculate each measure.

QT1 - 03 - Measures of Central Tendency

This document discusses measures of central tendency and dispersion used to analyze and summarize data. It defines key terms like mean, median, mode, range, variance, and standard deviation. It explains how to calculate these measures both mathematically and using grouped or sample data, and the importance of understanding the central tendency and dispersion of data distributions.

QT1 - 03 - Measures of Central Tendency

This document discusses measures of central tendency and dispersion used to analyze and summarize data. It defines key terms like mean, median, mode, range, variance, and standard deviation. It explains how to calculate these measures both mathematically and using grouped or sample data, and the importance of understanding the distribution, central tendency and dispersion of data.

Bio statistics

1. The document discusses key concepts in biostatistics including measures of central tendency, dispersion, correlation, regression, and sampling.
2. Measures of central tendency described are the mean, median, and mode. Measures of dispersion include range, standard deviation, and quartile deviation.
3. The importance of statistical analysis for living organisms in areas like medicine, biology and public health is highlighted. Examples are provided to demonstrate calculation of statistical measures.

Answer the questions in one paragraph 4-5 sentences.

Answer the questions in one paragraph 4-5 sentences.
· Why did the class collectively sign a blank check? Was this a wise decision; why or why not? we took a decision all the class without hesitation
· What is something that I said individuals should always do; what is it; why wasn't it done this time? Which mitigation strategies were used; what other strategies could have been used/considered? individuals should always participate in one group and take one decision

polar pojhjgfnbhggnbh hnhghgnhbhnhbjnhhhhhh

This document discusses various measures of central tendency and variability used in statistics. It describes the three main measures of central tendency as the mode, median, and mean. For measures of variability, it defines concepts like range, variance, and standard deviation. The range is described as the highest score minus the lowest score and provides a simple measure of variation. Variance is defined as the mean of the squared deviations from the mean and standard deviation is the square root of the variance, providing a measure of how data points cluster around the mean. Examples are provided to demonstrate calculating each of these statistical measures.

Measures of Central Tendency

Measures of Central Tendency: Mean, Median and Mode
Reference: Statistics in Psychology and Education/ by Henry E. Garrett

Measures of dispersion

This document discusses measures of dispersion, which indicate how spread out or variable a set of data is. There are three main measures: the range, which is the difference between the highest and lowest values; the semi-interquartile range (SIR), which is the difference between the first and third quartiles divided by two; and variance/standard deviation. Variance is the average of the squared deviations from the mean, while standard deviation is the square root of the variance. These measures provide summaries of how concentrated or dispersed the observed values are from the average or expected value.

Descriptive Statistics.pptx

This document defines and explains various measures of central tendency, dispersion, and distribution used in descriptive statistics. It discusses modes, medians, means, percentiles, quartiles, range, interquartile range, standard deviation, z-scores, and other key statistical concepts. These metrics are used to summarize and describe the central position and variability of data in distributions.

Measures of Dispersion .pptx

This document discusses various measures of dispersion used to describe the spread or variability in a data set. It describes absolute measures of dispersion, such as range and mean deviation, which indicate the amount of variation, and relative measures like the coefficient of variation, which indicate the degree of variation accounting for different scales. Common measures discussed include range, variance, standard deviation, coefficient of variation, skewness and kurtosis. Formulas are provided for calculating many of these dispersion statistics.

Lecture. Introduction to Statistics (Measures of Dispersion).pptx

1) The document discusses various measures of dispersion used to quantify how spread out or varied a set of data values are from the average.
2) There are two types of dispersion - absolute dispersion measures how varied data values are in the original units, while relative dispersion compares variability between datasets with different units.
3) Common measures of absolute dispersion include range, variance, and standard deviation. Range is the difference between highest and lowest values, while variance and standard deviation take into account how far all values are from the mean.

SAMPLING MEANDEFINITIONThe term sampling mean is a stati.docx

The document defines key statistical terms and concepts including:
- Sampling mean is an estimate of the population mean based on a sample. It is calculated by adding all values and dividing by the sample size.
- Sample variance measures the variation or spread of values in a sample. It is calculated by finding the mean of squared differences from the sample mean.
- Standard deviation is the square root of the variance, providing a measure of dispersion from the mean.
- Hypothesis testing uses sample data to determine the validity of claims about a population. The null hypothesis is tested against an alternative using statistical significance.
- Decision trees visually represent decision problems by showing possible choices, outcomes, and probabilities to

SAMPLING MEANDEFINITIONThe term sampling mean is a stati.docx

SAMPLING MEAN:
DEFINITION:
The term sampling mean is a statistical term used to describe the properties of statistical distributions. In statistical terms, the sample mean from a group of observations is an estimate of the population mean. Given a sample of size n, consider n independent random variables X1, X2, ..., Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean μ and standard deviation σ. The sample mean is defined to be $\bar{X} = \frac{1}{n}(X_1 + X_2 + \dots + X_n)$.
WHAT IT IS USED FOR:
It is used to measure the central tendency of the numbers in a data set. It can also be described as the balance point between the high numbers and the low numbers.
HOW TO CALCULATE IT:
To calculate this, just add up all the numbers, then divide by how many numbers there are.
Example: what is the mean of 2, 7, and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e., we added 3 numbers): 18 ÷ 3 = 6
So the Mean is 6
SAMPLE VARIANCE:
DEFINITION:
The sample variance, s², is used to calculate how varied a sample is. A sample is a select number of items taken from a population. For example, if you are measuring American people’s weights, it wouldn’t be feasible (from either a time or a monetary standpoint) for you to measure the weights of every person in the population. The solution is to take a sample of the population, say 1000 people, and use that sample to estimate the actual weights of the whole population.
WHAT IT IS USED FOR:
The sample variance helps you to figure out how spread out the data you have collected or are going to analyze is. In statistical terminology, it can be defined as the average of the squared differences from the mean.
HOW TO CALCULATE IT:
Given below are steps of how a sample variance is calculated:
· Determine the mean
· Then for each number: subtract the Mean and square the result
· Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by the number of data points.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use the Greek letter Sigma: Σ
The handy Sigma Notation says to sum up as many terms as we want.
· Next we need to divide by the number of data points, which is simply done by multiplying by "1/N":
Statistically it can be stated by the following:
· $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
· This value is the variance. Dividing by N gives the population form; the sample variance s² conventionally divides by N − 1 instead.
EXAMPLE:
Sam has 20 Rose Bushes.
The number of flowers on each bush is
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
Work out the sample variance
Step 1. Work out the mean
In the formula above, μ (the Greek letter "mu") is the mean of all our values.
For this example, the data points are: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
The mean is:
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 = 140/20 = 7
So:
μ = 7
Step 2. Then for each number: subtract the Mean and square the result
This is t.
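
The worked example breaks off at Step 2, so the sketch below simply carries the listed steps through on the rose-bush data. It is an illustration under stated assumptions rather than the original author's working: the variable names are invented here, and both divisors are shown because the text divides by N while the conventional sample variance divides by N − 1.

```python
import statistics

# Number of flowers on each of Sam's 20 rose bushes (data from the example).
flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]

# Step 1: work out the mean.
mean = sum(flowers) / len(flowers)

# Step 2: for each number, subtract the mean and square the result.
squared_diffs = [(x - mean) ** 2 for x in flowers]

# Step 3: average the squared differences.
variance_n = sum(squared_diffs) / len(flowers)                 # divide by N, as in the text
variance_n_minus_1 = sum(squared_diffs) / (len(flowers) - 1)   # conventional sample variance

print(f"mean = {mean}")
print(f"sum of squared differences = {sum(squared_diffs)}")
print(f"variance (divide by N) = {variance_n:.2f}")
print(f"variance (divide by N - 1) = {variance_n_minus_1:.2f}")

# Cross-check against the standard library.
assert abs(variance_n - statistics.pvariance(flowers)) < 1e-9
assert abs(variance_n_minus_1 - statistics.variance(flowers)) < 1e-9
```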

- 1. Central Tendency & Dispersion Types of Distributions: Normal, Skewed Central Tendency: Mean, Median, Mode Dispersion: Variance, Standard Deviation
- 2. DESCRIPTIVE STATISTICS are concerned with describing the characteristics of frequency distributions Where is the center? What is the range? What is the shape [of the distribution]?
- 3. Frequency Table: Test Scores
    Observation (score)    Frequency (# occurrences)
    65                     1
    70                     2
    75                     3
    80                     4
    85                     3
    90                     2
    95                     1
  What is the range of test scores? A: 30 (95 minus 65)
  When calculating the mean, one must divide by what number? A: 16 (total # occurrences)
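
The two answers on the frequency-table slide can be checked with a short sketch. The dictionary layout and variable names are assumptions made for illustration; the mean itself is not stated on the slide and is computed here by weighting each score by its frequency.

```python
# Score -> number of occurrences, copied from the frequency table.
frequency_table = {65: 1, 70: 2, 75: 3, 80: 4, 85: 3, 90: 2, 95: 1}

# Range: highest score minus lowest score.
score_range = max(frequency_table) - min(frequency_table)

# The divisor for the mean is the total number of observations,
# not the number of distinct scores.
total_count = sum(frequency_table.values())

# Weight each score by how often it occurs.
weighted_sum = sum(score * count for score, count in frequency_table.items())
mean_score = weighted_sum / total_count

print(f"range = {score_range}, N = {total_count}, mean = {mean_score}")
```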
- 4. Summarizing Distributions Two key characteristics of a frequency distribution are especially important when summarizing data or when making a prediction: CENTRAL TENDENCY What is in the “middle”? What is most common? What would we use to predict? DISPERSION How spread out is the distribution? What shape is it?
- 5. 3 measures of central tendency are commonly used in statistical analysis - MEAN, MEDIAN, and MODE. Each measure is designed to represent a “typical” value in the distribution. The choice of which measure to use depends on the shape of the distribution (whether normal or skewed). The MEASURES of Central Tendency
- 6. Mean - Average Most common measure of central tendency. Is sensitive to the influence of a few extreme values (outliers), thus it is not always the most appropriate measure of central tendency. Best used for making predictions when a distribution is more or less normal (or symmetrical). Symbolized as: x̄ for the mean of a sample, μ for the mean of a population.
- 7. Finding the Mean Formula for Mean: $\bar{X} = \frac{\sum x}{N}$ Given the data set: {3, 5, 10, 4, 3}, $\bar{X} = \frac{3 + 5 + 10 + 4 + 3}{5} = \frac{25}{5} = 5$
- 8. Find the Mean Q: 85, 87, 89, 91, 98, 100 A: 91.67 Median: 90 Q: 5, 87, 89, 91, 98, 100 A: 78.3 (Extremely low score lowered the Mean) Median: 90 (The median remained unchanged.)
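
The effect of the single extreme score can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names are chosen here for illustration.

```python
import statistics

scores = [85, 87, 89, 91, 98, 100]
scores_with_outlier = [5, 87, 89, 91, 98, 100]  # 85 replaced by an extreme low score

mean_original = statistics.mean(scores)
mean_outlier = statistics.mean(scores_with_outlier)
median_original = statistics.median(scores)
median_outlier = statistics.median(scores_with_outlier)

# The mean drops sharply, while the median does not move.
print(f"mean:   {mean_original:.2f} -> {mean_outlier:.2f}")
print(f"median: {median_original} -> {median_outlier}")
```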
- 9. Median Used to find middle value (center) of a distribution. Used when one must determine whether the data values fall into either the upper 50% or lower 50% of a distribution. Used when one needs to report the typical value of a data set, ignoring the outliers (few extreme values in a data set). Example: median salary, median home prices in a market Is a better indicator of central tendency than mean when one has a skewed distribution.
- 10. To compute the median, first order the values of X from low to high: 85, 90, 94, 94, 95, 97, 97, 97, 97, 98, then count the number of observations = 10. When the number of observations is even, average the two middle numbers to calculate the median. In this example, the two middle numbers are 95 and 97, so 96 is the median (middle) score.
- 11. Median Find the Median: 4 5 6 6 7 8 9 10 12. Find the Median: 5 6 6 7 8 9 10 12. Find the Median: 5 6 6 7 8 9 10 100,000.
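
The ordering-and-counting procedure from the median slides can be written out directly. The helper name `median` is an assumption made for this sketch, and the slides do not give answers to the three exercises, so the values printed are simply what this procedure produces.

```python
def median(values):
    """Middle value of the ordered data; average the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# Worked example from the slides (10 observations, even count).
worked_example = median([85, 90, 94, 94, 95, 97, 97, 97, 97, 98])

# The three exercises.
exercise_1 = median([4, 5, 6, 6, 7, 8, 9, 10, 12])
exercise_2 = median([5, 6, 6, 7, 8, 9, 10, 12])
exercise_3 = median([5, 6, 6, 7, 8, 9, 10, 100_000])

print(worked_example, exercise_1, exercise_2, exercise_3)
```

The last two exercises give the same result, which illustrates that replacing the largest value with an extreme one leaves the median unchanged.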
- 12. Mode Used when the most typical (common) value is desired. Often used with categorical data. The mode is not always unique. A distribution can have no mode, one mode, or more than one mode. When there are two modes, we say the distribution is bimodal. EXAMPLES: a) {1,0,5,9,12,8} - No mode b) {4,5,5,5,9,20,30} – mode = 5 c) {2,2,5,9,9,15} - bimodal, mode 2 and 9
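
A small sketch of the mode slide's three cases follows. The function name `modes` is hypothetical, and treating "no value repeats" as "no mode" is the slide's convention; other sources and libraries may instead report every value as a mode in that case.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list if nothing repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Convention from the slide: every value occurs once, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


no_mode = modes([1, 0, 5, 9, 12, 8])
one_mode = modes([4, 5, 5, 5, 9, 20, 30])
bimodal = modes([2, 2, 5, 9, 9, 15])

print(no_mode, one_mode, bimodal)
```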
- 13. Measures of Variability Central Tendency doesn’t tell us everything. Dispersion/Deviation/Spread tells us a lot about how the data values are distributed. We are most interested in: Standard Deviation (σ) and Variance (σ²)
- 14. Why can’t the mean tell us everything? Mean describes the average outcome. The question becomes how good a representation of the distribution is the mean? How good is the mean as a description of central tendency -- or how accurate is the mean as a predictor? ANSWER -- it depends on the shape of the distribution. Is the distribution normal or skewed?
- 15. Dispersion Once you determine that the data of interest is normally distributed, ideally by producing a histogram of the values, the next question to ask is: How spread out are the values about the mean? Dispersion is a key concept in statistical thinking. The basic question being asked is how much do the values deviate from the Mean? The more “bunched up” around the mean the better your ability to make accurate predictions.
- 16. Means Consider these means for hours worked each day: X = {7, 8, 6, 7, 7, 6, 8, 7}, X̄ = (7+8+6+7+7+6+8+7)/8, X̄ = 7. Notice that all the data values are bunched near the mean. Thus, 7 would be a pretty good prediction of the average hrs. worked each day. X = {12, 2, 0, 14, 10, 9, 5, 4}, X̄ = (12+2+0+14+10+9+5+4)/8, X̄ = 7. The mean is the same for this data set, but the data values are more spread out. So, 7 is not a good prediction of hrs. worked on average each day.
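
To put a number on "bunched near the mean" versus "more spread out", the sketch below compares the two hours-worked data sets. The variable names are invented here, and the population standard deviation (divide by N) is assumed, matching the divisor used on the later slides.

```python
import statistics

steady_hours = [7, 8, 6, 7, 7, 6, 8, 7]
varied_hours = [12, 2, 0, 14, 10, 9, 5, 4]

mean_steady = statistics.mean(steady_hours)
mean_varied = statistics.mean(varied_hours)

# Assumption: population form (divide by N).
spread_steady = statistics.pstdev(steady_hours)
spread_varied = statistics.pstdev(varied_hours)

# Same mean, very different spread.
print(f"steady: mean = {mean_steady}, std dev = {spread_steady:.2f}")
print(f"varied: mean = {mean_varied}, std dev = {spread_varied:.2f}")
```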
- 17. Data is more spread out, meaning it has greater variability. Below, the data is grouped closer to the center, less spread out, or smaller variability.
- 18. How well does the mean represent the values in a distribution? The logic here is to determine how much spread is in the values. How much do the values "deviate" from the mean? Think of the mean as the true value, or as your best guess. If every X were very close to the Mean, the Mean would be a very good predictor. If the distribution is very sharply peaked then the mean is a good measure of central tendency and if you were to use the Mean to make predictions you would be correct or very close much of the time.
- 19. What if scores are widely distributed? The mean is still your best measure and your best predictor, but your predictive power would be less. How do we describe this? Measures of variability Mean Absolute Deviation (You used in Math1) Variance (We use in Math 2) Standard Deviation (We use in Math 2)
- 20. Mean Absolute Deviation The key concept for describing normal distributions and making predictions from them is called deviation from the mean. We could just calculate the average distance between each observation and the mean. We must take the absolute value of the distance, otherwise they would just cancel out to zero! Formula: $\text{MAD} = \frac{\sum |X_i - \bar{X}|}{n}$
- 21. Mean Absolute Deviation: An Example
    1. Compute X̄ (Average)
    2. Compute X̄ – Xi and take the Absolute Value to get Absolute Deviations
    3. Sum the Absolute Deviations
    4. Divide the sum of the absolute deviations by N
  Data: X = {6, 10, 5, 4, 9, 8}; X̄ = 42 / 6 = 7
    X̄ – Xi     Abs. Dev.
    7 – 6       1
    7 – 10      3
    7 – 5       2
    7 – 4       3
    7 – 9       2
    7 – 8       1
    Total:      12
  12 / 6 = 2
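
The four steps of the mean absolute deviation example translate almost line for line into code. This is a sketch with a hypothetical function name, not part of the original slides.

```python
def mean_absolute_deviation(values):
    """Average absolute distance between each value and the mean."""
    n = len(values)
    mean = sum(values) / n                                 # 1. compute the average
    absolute_deviations = [abs(mean - x) for x in values]  # 2. absolute deviations
    total = sum(absolute_deviations)                       # 3. sum them
    return total / n                                       # 4. divide by N


data = [6, 10, 5, 4, 9, 8]
mad = mean_absolute_deviation(data)
print(f"mean absolute deviation = {mad}")
```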
- 22. What Does it Mean? On average, each value is two units away from the mean. Is it Really that Easy? No! Absolute values are difficult to manipulate algebraically. Absolute values cause enormous problems for calculus (the absolute value function is not differentiable at zero). We need something else…
- 23. Variance and Standard Deviation Instead of taking the absolute value, we square the deviations from the mean. This yields a positive value. This will result in measures we call the Variance and the Standard Deviation.
    Sample:      s  Standard Deviation    s²  Variance
    Population:  σ  Standard Deviation    σ²  Variance
- 24. Calculating the Variance and/or Standard Deviation Formulae: Variance: $s^2 = \frac{\sum (X_i - \bar{X})^2}{N}$ Standard Deviation: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}}$ Examples Follow . . .
- 25. Example: Data: X = {6, 10, 5, 4, 9, 8}; N = 6
    X       X – X̄     (X – X̄)²
    6       -1         1
    10      3          9
    5       -2         4
    4       -3         9
    9       2          4
    8       1          1
    Total: 42          Total: 28
  Mean: $\bar{X} = \frac{\sum X}{N} = \frac{42}{6} = 7$
  Variance: $s^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{28}{6} = 4.67$
  Standard Deviation: $s = \sqrt{s^2} = \sqrt{4.67} = 2.16$
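
The final example can be replayed step by step. This sketch follows the slides in dividing by N; the variable names are assumptions made for illustration.

```python
import math

data = [6, 10, 5, 4, 9, 8]
n = len(data)

mean = sum(data) / n                          # 42 / 6
deviations = [x - mean for x in data]         # X minus the mean
squared_deviations = [d ** 2 for d in deviations]

variance = sum(squared_deviations) / n        # divide by N, as on the slides
std_dev = math.sqrt(variance)

print(f"mean = {mean}")
print(f"sum of squared deviations = {sum(squared_deviations)}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
```

Taking the square root returns the measure to the original units of the data, which is why the standard deviation is usually the value reported.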