This document discusses statistical methods for comparing means, including t-tests and analysis of variance (ANOVA). It explains how t-tests can be used to compare two means or paired samples, and how ANOVA can compare two or more means. Key assumptions and procedures are outlined for one-sample t-tests, paired t-tests, independent t-tests with equal and unequal variances, and one-way between-subjects ANOVAs.
- Sampling distribution describes the distribution of sample statistics like means or proportions drawn from a population. It allows making statistical inferences about the population.
- The central limit theorem states that sampling distributions of sample means will be approximately normally distributed regardless of the population distribution, if the sample size is large.
- Standard error measures the amount of variability in values of a sample statistic across different samples. It is used to construct confidence intervals for population parameters.
This document provides an overview of analysis of variance (ANOVA) techniques. It discusses one-way ANOVA, which evaluates differences between three or more population means. Key aspects covered include partitioning total variation into between- and within-group components, assumptions of normality and equal variances, and using the F-test to test for differences. Randomized block ANOVA and two-factor ANOVA are also introduced as extensions to control for additional variables. Post-hoc tests like Tukey and Fisher's LSD are described for determining specific mean differences.
This document discusses inferential statistics, which uses sample data to make inferences about populations. It explains that inferential statistics is based on probability and aims to determine if observed differences between groups are dependable or due to chance. The key purposes of inferential statistics are estimating population parameters from samples and testing hypotheses. It discusses important concepts like sampling distributions, confidence intervals, null hypotheses, levels of significance, type I and type II errors, and choosing appropriate statistical tests.
The document discusses the normal distribution and related concepts. It describes how normal distributions can vary in their mean and standard deviation. It then discusses key features of normal distributions including that they are symmetric, have equal mean, median and mode, and are denser in the center than the tails. Finally, it discusses related statistical concepts like kurtosis and skewness, describing how kurtosis measures the thickness of a distribution's tails and how skewness measures a distribution's asymmetry.
MEASURES OF CENTRAL TENDENCY AND MEASURES OF DISPERSION (Tanya Singla)
Central tendency refers to typical or average values in a data set or probability distribution. The three most common measures of central tendency are the mean, median, and mode. The mean is the average calculated by summing all values and dividing by the total number. The median is the middle value when data is arranged in order. The mode is the most frequently occurring value. Other measures discussed include range, which is the difference between highest and lowest values, and quartiles, which divide a data set into four equal parts based on the distribution of values.
The document discusses t-tests, which are used to compare means between groups. It describes the assumptions of t-tests, the different types of t-tests including independent samples t-tests and dependent samples t-tests, and the steps to conduct t-tests by hand and using SPSS. It provides examples of conducting one-sample t-tests, independent samples t-tests, and dependent samples t-tests, including interpreting the results. It also discusses how to increase statistical power by increasing the difference between means, decreasing variance, increasing sample size, and increasing the alpha level.
The chi-square distribution is used to test hypotheses about categorical data. It is defined as the sum of the squares of the differences between observed and expected frequencies divided by the expected frequencies. The chi-square distribution depends on the degrees of freedom, with each number of degrees of freedom having a different distribution. The chi-square test can be used to test goodness of fit, independence, and homogeneity. It requires data to be in the form of frequencies in a contingency table and assumes independence between observations.
Standard error is used in place of the standard deviation; it shows how variation among samples relates to sampling error. The document lists formulas for the standard error of different statistics and applications of tests of significance in the biological sciences.
Measures of dispersion fall into two broad types, absolute measures and graphical measures, with further subtypes within each.
The points discussed in this slide are:
1. Dispersion & its types
2. Definition
3. Use
4. Merits
5. Demerits
6. Formula & math
7. Graph and pictures
8. Real-life applications.
This document discusses measures of variability, which refer to how spread out a set of data is. Variability is measured using the standard deviation and variance. The standard deviation measures how far data points are from the mean, while the variance is the average of the squared deviations from the mean. To calculate the standard deviation, you take the square root of the variance. This provides a measure of variability that is on the same scale as the original data. The standard deviation and variance are widely used statistical measures for quantifying the spread of a data set.
This document provides an introduction to the statistical concept of kurtosis. It defines kurtosis as a measure of the peakedness of a distribution that indicates how concentrated data is around the mean. There are three main types of kurtosis: leptokurtic distributions have higher peaks; platykurtic have lower peaks; and mesokurtic have normal peaks. Methods for calculating kurtosis include percentile measures and measures based on statistical moments. An example calculation demonstrates a leptokurtic distribution with a kurtosis value greater than 3. SPSS syntax for computing kurtosis from data is also presented.
1. The document discusses hypothesis testing using a one-sample t-test when the population variance is unknown.
2. It provides examples of when to use a z-test or t-test, and walks through the steps of conducting a one-sample t-test including stating hypotheses, determining critical values, computing test statistics, and making conclusions.
3. An example problem demonstrates these steps, testing if a therapy reduces test anxiety below a population mean of 20, finding the sample mean is significantly lower.
The document discusses stratified random sampling, which involves dividing a population into homogeneous subgroups called strata and randomly sampling from each stratum. It describes how to form strata based on common characteristics, how to select items from each stratum such as through systematic sampling, and how to allocate the sample size to each stratum proportionally according to the stratum's size within the overall population. An example is given of allocating a sample of 30 across 3 strata based on their relative population sizes.
The t-test is used to compare the means of two groups and has three main applications:
1) Compare a sample mean to a population mean.
2) Compare the means of two independent samples.
3) Compare the values of one sample at two different time points.
There are two main types: the independent-measures t-test for samples not matched, and the matched-pair t-test for samples in pairs. The t-test assumes normal distributions and equal variances between groups. Examples are provided to demonstrate hypothesis testing for each application.
This document discusses skewness and kurtosis in a financial context. It defines skewness as a measure of asymmetry in a distribution, with positive skewness indicating a long right tail and negative skewness a long left tail. Kurtosis is defined as a measure of the "peakedness" of a probability distribution, with positive excess kurtosis indicating flatness/long fat tails and negative excess kurtosis indicating peakedness. Formulas are provided for calculating skewness and kurtosis from a data set. Examples of positively and negatively skewed distributions are given to illustrate these concepts.
A variable is any characteristic that can take on different values for different individuals. There are two types of variables: quantitative, which can be discrete or continuous, and qualitative, which can be ordinal or categorical. A quantitative variable quantifies an element of a population, while a qualitative variable describes or categorizes an element.
This document introduces the concept of data classification and levels of measurement in statistics. It explains that data can be either qualitative or quantitative. Qualitative data consists of attributes and labels while quantitative data involves numerical measurements. The document also outlines the four levels of measurement - nominal, ordinal, interval, and ratio - from lowest to highest. Each level allows for different types of statistical calculations, with the ratio level permitting the most complex calculations like ratios of two values.
This document discusses measures of central tendency including the mean, median, and mode. It provides examples and definitions for each measure. The mean is the average and is calculated by summing all values and dividing by the total number. The median is the middle value when values are ranked in order. The mode is the most frequent value. The best measure depends on the scale of measurement and shape of the distribution, such as whether it is symmetrical or skewed.
This document discusses measures of dispersion in statistics. It defines dispersion as the extent of variation in a data set from the average value. There are two main types of dispersion - absolute and relative. Absolute measures express variation in units of the data and include range, variance, standard deviation, and quartile deviation. Relative measures allow comparison between data sets by being unit-free, such as the coefficient of variation. Key absolute measures are then explained in more detail, along with their merits and demerits.
Statistics is the methodology used to interpret and draw conclusions from collected data. It provides methods for designing research studies, summarizing and exploring data, and making predictions about phenomena represented by the data. A population is the set of all individuals of interest, while a sample is a subset of individuals from the population used for measurements. Parameters describe characteristics of the entire population, while statistics describe characteristics of a sample and can be used to infer parameters. Basic descriptive statistics used to summarize samples include the mean, standard deviation, and variance, which measure central tendency, spread, and how far data points are from the mean, respectively. The goal of statistical data analysis is to gain understanding from data through defined steps.
Parametric vs Nonparametric Tests: When to use which (Gönenç Dalgıç)
There are several statistical tests which can be categorized as parametric and nonparametric. This presentation will help readers identify which type of test is appropriate for particular data features.
Satyaki Aparajit Mishra presented on the topic of standard error and predictability limits. Standard error is used to estimate the standard deviation from a sample. It is calculated by dividing the standard deviation by the square root of the sample size. A larger standard error means the sample mean is less reliable at estimating the population mean. Standard error helps determine how far sample estimates may be from the true population values. Mishra discussed estimating standard error from a single sample and how standard error is used to test hypotheses. He provided an example of testing if a coin flip was unbiased using the standard error of the proportion of heads observed.
Degree of freedom refers to the number of independent pieces of information used to calculate a statistic. In an example where heights of 5 students were measured, taking a single sample of 1 student's height of 8 feet to calculate variance would have 1 degree of freedom. Taking 2 independent samples of heights 8 feet and 5 feet would have 2 degrees of freedom. When estimating the population mean from samples to then calculate variance, the degrees of freedom is the number of samples minus 1, as the values are not fully independent after estimating the mean.
This document discusses confidence intervals for population means and proportions. It explains how to construct confidence intervals using the normal distribution for large sample sizes (n ≥ 30) and the t-distribution for small sample sizes. Formulas are provided for calculating margin of error and determining necessary sample size. Guidelines are given for determining whether to use the normal or t-distribution based on sample size and characteristics. Confidence intervals can be constructed for variance and standard deviation using the chi-square distribution.
The document discusses correlation analysis and different types of correlation. It defines correlation as the linear association between two random variables. There are three main types of correlation:
1) Positive vs negative vs no correlation based on the relationship between two variables as one increases or decreases.
2) Linear vs non-linear correlation based on the shape of the relationship when plotted on a graph.
3) Simple vs multiple vs partial correlation based on the number of variables.
The document also discusses methods for studying correlation including scatter plots, Karl Pearson's coefficient of correlation r, and Spearman's rank correlation coefficient. It provides interpretations of the correlation coefficient r and the coefficient of determination r².
The document provides an overview of inferential statistics. It defines inferential statistics as making generalizations about a larger population based on a sample. Key topics covered include hypothesis testing, types of hypotheses, significance tests, critical values, p-values, confidence intervals, z-tests, t-tests, ANOVA, chi-square tests, correlation, and linear regression. The document aims to explain these statistical concepts and techniques at a high level.
This document provides an overview of analysis of variance (ANOVA) techniques, including one-way and two-way ANOVA. It defines key terms like factors, interactions, F distribution, and multiple comparison tests. For one-way ANOVA, it explains how to test if three or more population means are equal. For two-way ANOVA, it notes you must first test for interactions between two factors before testing their individual effects. The Tukey test is introduced for identifying specifically which group means differ following rejection of a one-way ANOVA null hypothesis.
1) The t-test is a statistical test used to determine if there are any statistically significant differences between the means of two groups, and was developed by William Gosset under the pseudonym "Student".
2) The t-distribution is used for calculating t-tests when sample sizes are small and/or variances are unknown. It has a mean of zero and variance greater than one.
3) Paired t-tests are used to compare the means of two related groups when samples are paired, while unpaired t-tests are used to compare unrelated groups or independent samples.
A brief description of the F test and ANOVA for MSc Life Science students. I have taken the example slides from YouTube, where an excellent explanation is available.
Here is the link: https://www.youtube.com/watch?v=-yQb_ZJnFXw
This presentation describes the concept of One Sample t-test, Independent Sample t-test and Paired Sample t-test. This presentation also deals about the procedure to do the t-test through SPSS.
This document discusses parametric statistical tests. It defines parametric tests as those that make assumptions about the population distribution parameters. The key parametric tests covered are: t-tests (paired, unpaired, one sample), ANOVA (one way, two way), Pearson's correlation, and the z-test. Details are provided on the assumptions, calculations, and applications of each test. T-tests are used to compare means, ANOVA compares multiple group means, Pearson's r measures correlation between variables, and the z-test is for large samples when the population standard deviation is known.
This document discusses various statistical tests used to analyze dental research data, including parametric and non-parametric tests. It provides information on tests of significance such as the t-test, Z-test, analysis of variance (ANOVA), and non-parametric equivalents. Key points covered include the differences between parametric and non-parametric tests, assumptions and applications of the t-test, Z-test, ANOVA, and non-parametric alternatives like the Mann-Whitney U test and Kruskal-Wallis test. Examples are provided to illustrate how to perform and interpret common statistical analyses used in dental research.
This document discusses analysis of variance (ANOVA) and chi-square tests. It covers F tests, one-way and two-way ANOVA, assumptions of ANOVA, and how to perform chi-square goodness of fit and independence tests. Examples are provided for variance ratio F tests, one-way ANOVA, two-way ANOVA, and chi-square tests. Limitations of chi-square tests include requiring independent observations and minimum expected frequencies of 5 per cell.
The document discusses different types of t-tests used to compare means:
- One-sample t-test compares a sample mean to a predefined value.
- Paired (dependent) t-test compares means of two conditions with the same participants.
- Independent t-test compares means of two unrelated groups.
It explains how to choose the appropriate t-test based on research design, number of means being compared, and data distribution. Formulas are provided for calculating each t-test statistic. Examples are given to demonstrate applying the one-sample and paired t-tests.
The document defines various statistical measures and types of statistical analysis. It discusses descriptive statistical measures like mean, median, mode, and interquartile range. It also covers inferential statistical tests like the t-test, z-test, ANOVA, chi-square test, Wilcoxon signed rank test, Mann-Whitney U test, and Kruskal-Wallis test. It explains their purposes, assumptions, formulas, and examples of their applications in statistical analysis.
1. Statistical analysis involves collecting, organizing, analyzing data, and drawing inferences about populations based on samples. It includes both descriptive and inferential statistics.
2. The document defines key terms used in statistical analysis like population, sample, statistical analysis, and discusses various statistical measures like mean, median, mode, interquartile range, and standard deviation.
3. The purposes of statistical analysis are outlined as measuring relationships, making predictions, testing hypotheses, and summarizing results. Both parametric and non-parametric statistical analyses are discussed.
The document discusses various parametric statistical tests including t-tests, ANOVA, ANCOVA, and MANOVA. It provides definitions and assumptions for parametric tests and explains how they can be used to analyze quantitative data that follows a normal distribution. Specific parametric tests covered in detail include the independent samples t-test, paired t-test, one-way ANOVA, two-way ANOVA, and ANCOVA. Examples are provided to illustrate how each test is conducted and how results are interpreted.
The document discusses different types of t-tests, including the one sample t-test, independent samples t-test, and paired t-test. It explains the assumptions and equations for each test and provides examples of their applications. The key differences between the t-test and z-test are also outlined. Specifically, t-tests are used for small sample sizes when the population variance is unknown, while z-tests are for large samples when the variance is known.
This lecture covered dependent and independent t-tests for comparing sample means. It discussed when to use a z-test, single sample t-test, dependent t-test, and independent t-test. The dependent t-test is used to compare related samples, such as pre-post scores, while the independent t-test compares unrelated samples. Both tests calculate the t-value, p-value, effect size, and confidence interval. The lecture provided examples comparing wine ratings and working memory training scores to demonstrate applying the t-tests.
This document discusses parametric and nonparametric statistical tests. Parametric tests like the t-test and ANOVA assume a normal distribution of data and compare population means. Nonparametric tests do not assume a normal distribution and can be used when sample sizes are small or distributions are unknown. Specific parametric tests covered include the t-test for comparing two groups, one-way ANOVA for comparing three or more groups on one factor, and two-way ANOVA for examining two factors. Examples of how and when to use these various tests are provided.
Basic statistics concepts include data collection and analysis. Data can be raw or processed, and from various sources like surveys. Probability is the number of favorable outcomes over total outcomes. Data distribution involves tabulation, class intervals, and measures like quartiles and percentiles. Prevalence rates describe disease cases at a point in time, while incidence rates describe new cases over a period of time. Study types include cross-sectional, longitudinal, randomized controlled trials, and case-control studies. Descriptive statistics describe, summarize, and interpret data to derive meaning from numbers. Variables can be qualitative like gender or quantitative like height. Measures of central tendency include the mean, median, and mode. Measures of dispersion include the range and standard deviation.
• Non-parametric tests are distribution-free methods, which do not rely on the assumption that the data are drawn from a given probability distribution. As such, they are the opposite of parametric statistics.
• In non-parametric tests we do not assume that a particular distribution is applicable or that a certain value is attached to a parameter of the population.
When to use a non-parametric test?
1) When the sample distribution is unknown.
2) When the population distribution is not normal.
Non-parametric tests focus on order or ranking
1) Data is changed from scores to ranks or signs
2) A parametric test focuses on the mean difference, while the equivalent non-parametric test focuses on the difference between medians.
1) Chi-square test
• First formulated by Helmert and later developed by Karl Pearson
• It is both a parametric and a non-parametric test, but is mostly used as a non-parametric test.
• The test involves calculation of a quantity called chi-square.
• It follows a specific distribution known as the chi-square distribution.
• It is used to test the significance of the difference between two proportions and can be used when there are more than two groups to be compared.
Applications
1) Test of proportion
2) Test of association
3) Test of goodness of fit
Criteria for applying the chi-square test
• Groups: more than 2 independent groups
• Data: qualitative
• Sample size: small or large, random sample
• Distribution: non-normal (distribution free)
• Lowest expected frequency in any cell should be greater than 5
• No group should contain fewer than 10 items
Example: If there are two groups, one of which has received oral hygiene instructions and the other has not, and it is desired to test whether the occurrence of new cavities is associated with the instructions (a sketch follows below).
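As a rough illustration of the chi-square test of association described above, the following Python sketch runs scipy's chi-square test on a hypothetical 2×2 table of instruction group versus new-cavity status; the counts are invented purely for illustration and are not taken from the slides.

```python
# Hedged sketch: chi-square test of association on a hypothetical 2x2 table.
# The counts below are invented for illustration only.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: received oral hygiene instructions (yes / no)
# Columns: new cavities observed (yes / no)
table = np.array([[12, 38],
                  [25, 25]])

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
print("expected frequencies:\n", expected)  # check the 'expected > 5' criterion
```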
2) Fisher Exact Test
• Used when one or more of the expected counts in a 2×2 table is small.
• Used to calculate the exact probability of obtaining the observed counts, using the Fisher exact probability test.
3) McNemar Test
• Used to compare before and after findings in the same individual or to compare findings in a matched analysis (for dichotomous variables).
Example: comparing medical students' confidence in statistical analysis before and after an intensive statistics course.
4) Sign Test
• The sign test is used to find out the statistical significance of differences in matched-pair comparisons.
• It is based on the + or − signs of observations in a sample and not on their numerical magnitudes.
• For each subject, subtract the 2nd score from the 1st, and write down the sign of the difference.
It can be used
a. in place of a one-sample t-test
b. in place of a paired t-test or
c. for ordered categorical data where a numerical scale is inappropriate but where it is possible to rank the observations.
5) Wilcoxon signed rank test
• Analogous to the paired t test
6) Mann-Whitney Test
• Similar to Student's t test
7) Spearman's rank correlation: similar to Pearson's correlation (scipy equivalents are sketched below).
This document provides information about non-parametric tests. It begins by explaining that non-parametric tests do not assume a specific distribution or make assumptions about the population. It then discusses tests for normality like the Kolmogorov-Smirnov test and Shapiro-Wilk test. Commonly used non-parametric tests like Spearman's rank correlation, Mann-Whitney U test, and Kruskal-Wallis H test are explained. The chi-square test and assumptions are also covered in detail. Advantages of non-parametric tests include fewer assumptions and applicability to small sample sizes. A disadvantage is they are less powerful than parametric tests.
The document discusses the t-test, including:
1. It was introduced in 1908 by William Gosset under the pseudonym "Student" to test hypotheses about population means using small samples with unknown standard deviations.
2. The t-test has assumptions such as normality and equal variances that must be met.
3. There are different types of t-tests for different study designs: single sample t-test, independent samples t-test, and paired t-test.
4. Examples are provided to demonstrate how to calculate and interpret t-tests.
This document provides information about the F test and t test. The F test is used to compare the variances of two data sets and involves calculating the variances of each data set and taking the ratio. The variance of a data set is calculated by taking the sum of the squared differences between each value and the mean, divided by the number of values. The t test is used to compare the means of two data sets and involves calculating the means and standard deviations of each data set and using them to determine a t value for comparison. Formulas for calculating variance, the F test, and the t test are provided.
This document provides an overview of inferential statistics. It defines inferential statistics as using samples to draw conclusions about populations and make predictions. It discusses key concepts like hypothesis testing, null and alternative hypotheses, type I and type II errors, significance levels, power, and effect size. Common inferential tests like t-tests, ANOVA, and meta-analyses are also introduced. The document emphasizes that inferential statistics allow researchers to generalize from samples to populations and test hypotheses about relationships between variables.
This document discusses tests of hypotheses that compare means. It examines statistical tests that are used to determine if two sample means are significantly different from each other or if a sample mean is significantly different from a hypothesized population mean. These tests include the t-test, analysis of variance, and nonparametric tests that can be used as alternatives to parametric tests if their assumptions are violated.
Regression analysis is a statistical technique for predicting a dependent variable based on one or more independent variables. Simple linear regression fits a straight line to the data to predict a continuous dependent variable (y) from a single independent variable (x). The output is an equation of the form y= b0 + b1x + ε, where b0 is the y-intercept, b1 is the slope, and ε is the error. Multiple linear regression extends this to include more than one independent variable. Regression analysis calculates the "best fit" line that minimizes the residuals, or differences between predicted and observed y values.
This document provides an overview of descriptive statistics. It discusses different types of descriptive statistics including measures of central tendency like mean, median and mode, and measures of variability. It also describes various ways of organizing and summarizing data, such as frequency distributions, histograms, stem-and-leaf plots and pie charts. The goal of descriptive statistics is to describe key characteristics of a data set in a simple and easy to understand way.
Correlation analysis is a statistical technique used to determine the degree of relationship between two quantitative variables. Scatterplots are used to graphically depict the relationship and identify if it is positive, negative, or no correlation. The correlation coefficient measures the strength and direction of correlation, ranging from -1 to 1. A significance test determines if a correlation is likely to have occurred by chance or is statistically significant. Different types of correlation include simple, multiple, partial, and autocorrelation.
The document provides guidance on effectively summarizing data through tables and graphs. It outlines key principles such as using the appropriate type of graphic for the data, clearly labeling all components, and indicating the source and sample size. The goal is to present information in a clear, concise and visually compelling manner for the intended audience. Examples are given of different types of tables and graphs and how to properly construct and interpret them.
This document provides an introduction to statistics. It discusses key concepts including the role of statistics in research, the typical research process, variables, scales of measurement, and descriptive and inferential statistics. Specifically, it describes how statistics is used for collecting, analyzing and interpreting data to answer research questions. It also outlines the typical steps in research including developing questions and hypotheses, choosing measures, designing the study, analyzing data, and drawing conclusions.
2. Inferential Statistics
• Hypothesis testing
• Drawing conclusions about differences between groups
• Are differences likely due to chance?
• Comparing means
• t-test: 2 means
• Analysis of variance: 2 or more means
3. Are there differences?
• One of the fundamental questions of survey research is whether there is a difference among respondents
• When seeking to evaluate differences in means, we can use a t-test or ANOVA (analysis of variance)
4. Comparison of the z-test and t-test
• Interval or ratio scaled variables
• t-test
• When groups are small
• When population standard deviation is unknown
• z-test
• When groups are large
• σ is known
6. The t Statistic
• The t statistic allows researchers to use sample data to test hypotheses about an unknown population mean.
• The particular advantage of the t statistic is that it does not require any knowledge of the population standard deviation.
• Thus, the t statistic can be used to test hypotheses about a completely unknown population; that is, both μ and σ are unknown, and the only available information about the population comes from the sample.
7. T-test
• Dependent t-test
– Compares two means based on related data.
– E.g., Data from the same people measured at different times.
– Data from ‘matched’ samples.
• Independent t-test
– Compares two means based on independent data
– E.g., data from different groups of people
• Significance testing
– Testing the significance of Pearson’s correlation coefficient
– Testing the significance of b in regression.
8. Type of the T-test
• One-sample t-test compares one sample
mean with a hypothesized value
• Paired sample t-test (dependent samples)
compares the means of two related
measurements (e.g., pre and post scores)
• Independent sample t-test compares the
means of two independent groups
• Equal variance
• Unequal variance
9. Assumptions of the t-test
• Both the independent t-test and the dependent t-test are
parametric tests based on the normal distribution.
Therefore, they assume:
– The sampling distribution is normally distributed. In the dependent
t-test this means that the sampling distribution of the differences
between scores should be normal, not the scores themselves.
– Data are measured at least at the interval level.
• The independent t-test, because it is used to test different
groups of people, also assumes:
– Variances in these populations are roughly equal (homogeneity of
variance).
– Scores in different treatment conditions are independent (because
they come from different people).
10. The One-sample t-test
• Evaluates a hypothesis about a population
• by taking a single sample
• Is the sample likely to have come from that population?
• Test statistic:
• z test if σ is known
• t test if σ is unknown
11. One sample t-test
• Compare a sample mean with a particular
(hypothesized) value
• H0: µ=c, Ha: µ≠c, where c is a particular value
• Degrees of freedom: n-1
$t = \dfrac{\bar{x} - c}{s/\sqrt{n}} \sim t(n-1)$
$\bar{x} - t_{\alpha/2}\,\dfrac{s}{\sqrt{n}} \;\le\; c \;\le\; \bar{x} + t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$
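A minimal sketch of a one-sample t-test in Python, assuming a small hypothetical sample and a hypothesized mean of c = 100; the hand formula above and scipy.stats.ttest_1samp should give the same t value.

```python
# One-sample t-test sketch (hypothetical data; hypothesized mean c = 100)
import numpy as np
from scipy import stats

x = np.array([102.1, 98.4, 105.3, 99.7, 101.2, 103.8])  # hypothetical sample
c = 100.0                                                # hypothesized mean

# By hand: t = (xbar - c) / (s / sqrt(n)), df = n - 1
n = len(x)
t_manual = (x.mean() - c) / (x.std(ddof=1) / np.sqrt(n))

# With SciPy (two-sided p-value)
t_stat, p_value = stats.ttest_1samp(x, popmean=c)
print(t_manual, t_stat, p_value)  # the two t statistics match
```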
13. t-test
t = [observed difference between sample means − expected difference between
population means (if the null hypothesis is true)] / [estimate of the standard
error of the difference between the two sample means]
14. Paired sample t-test 1
• Compare two paired (matched) samples.
• Ex. Compare means of pre- and post-
scores given a treatment. We want to know
the effect of treatment.
• Ex. Compare means of midterm and final
exam of K300.
• Each subject has two data points (pre and
post, or midterm and final)
15. Paired sample t-test 2
• Compute d=x1-x2 (order does not matter)
• H0: µd=c, Ha: µd≠c, where c is a particular value
(often 0)
• Degrees of freedom: n-1
$\bar{d} = \dfrac{\sum_i d_i}{n}, \qquad s_d^2 = \dfrac{\sum_i (d_i - \bar{d})^2}{n-1}$
$t = \dfrac{\bar{d} - c}{s_d/\sqrt{n}} \sim t(n-1)$
$\bar{d} - t_{\alpha/2}\,\dfrac{s_d}{\sqrt{n}} \;\le\; c \;\le\; \bar{d} + t_{\alpha/2}\,\dfrac{s_d}{\sqrt{n}}$
16. Paired sample t-test 3: Example
• Example: Cholesterol levels (paired measurements)
• H0: µd=0, Ha: µd≠0
• n=6, d̄=16.7, s_d=25.4
• Test size=.10, df=5, critical value=2.015
• The test statistic is 1.61, which is smaller than the CV.
• Do not reject the null hypothesis; 1.61 is likely
when the null hypothesis is true.
$t = \dfrac{\bar{d} - 0}{s_d/\sqrt{n}} = \dfrac{16.7 - 0}{25.4/\sqrt{6}} = 1.61 \sim t(6-1)$
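A minimal sketch reproducing this paired-test arithmetic from the summary statistics above (d̄ = 16.7, s_d = 25.4, n = 6); with the raw pre/post scores one would call scipy.stats.ttest_rel instead.

```python
# Paired t-test from summary statistics (values from the example above)
import numpy as np
from scipy import stats

d_bar, s_d, n = 16.7, 25.4, 6             # mean difference, SD of differences, pairs
t = d_bar / (s_d / np.sqrt(n))            # test statistic, about 1.61
df = n - 1
p_two_sided = 2 * stats.t.sf(abs(t), df)  # two-sided p-value
print(round(t, 2), df, round(p_two_sided, 3))
```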
18. Independent sample t-test
• Compare two independent samples
• Ex. Compare means of personal income between
Colombo and Jaffna
• Ex. Compare means of GPA between UOJ and
SJP
• Each sample contains different subjects who are
not related to each other
20. How to get standard error?
• If the variances of the two samples are equal, use the
pooled variance.
• Otherwise, use the individual variances to
get the standard error of the mean difference (µ1-
µ2)
• How do we know whether the two variances are equal?
• The (folded form) F test is the answer.
21. F-test for equal variance
• Compute variances of two samples
• Conduct the F-test as follows.
• Larger variance should be the numerator so that F is always
greater than or equal to 1.
• Look up the F distribution table with two degrees of freedom.
• degrees of freedom numerator (dfn) and degrees of
freedom denominator (dfd)
• If H0 of equal variance is not rejected, the two samples are
treated as having the same variance.
$H_0: \sigma_1^2 = \sigma_2^2$
$F = \dfrac{s_L^2}{s_S^2} \sim F(n_L - 1,\; n_S - 1)$
(where $s_L^2$ is the larger and $s_S^2$ the smaller sample variance)
22. F-test for equal variance
• Look up the F distribution table with two degrees of
freedom.
• degrees of freedom numerator (dfn) and degrees of
freedom denominator (dfd)
• For this two-sample variance test, dfn = nL-1 and dfd = nS-1
(from the larger- and smaller-variance samples, as above)
• In the ANOVA F-test, dfn = a-1 and dfd = N-a,
where "a" is the number of groups and "N" is the total
number of subjects
24. Independent sample t-test: Equal variance
• Compare means of two independent samples
that have the same variance
• The null hypothesis is µ1-µ2=c (often 0)
• Degrees of freedom is n1+n2-2
$s_{pool}^2 = \dfrac{\sum_i (y_{1i} - \bar{y}_1)^2 + \sum_j (y_{2j} - \bar{y}_2)^2}{n_1 + n_2 - 2} = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
$t = \dfrac{\bar{y}_1 - \bar{y}_2}{s_{pool}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(n_1 + n_2 - 2)$
25. Independent sample t-test: Equal variance
• Example
• X1bar=$26,800, s1=$600, n1=10
• X2bar=$25,400, s2=$450, n2=8
• F-test: F 1.78 is smaller than CV 4.82; do not
reject the null hypothesis of equal variance at
the .01 level.
• Therefore, we can use the pooled variance.
$F = \dfrac{s_L^2}{s_S^2} = \dfrac{600^2}{450^2} = 1.78 \sim F(10-1,\; 8-1)$
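A minimal sketch of this folded-form F test in Python, using the same summary statistics (s1 = 600, n1 = 10; s2 = 450, n2 = 8); the folded two-sided p-value is compared to the chosen alpha instead of looking up a table critical value.

```python
# Folded-form F test for equal variances (summary stats from the example above)
from scipy import stats

s1, n1 = 600.0, 10
s2, n2 = 450.0, 8

# Put the larger variance in the numerator so that F >= 1
(s_L, n_L), (s_S, n_S) = sorted([(s1, n1), (s2, n2)], key=lambda p: p[0], reverse=True)
F = s_L**2 / s_S**2                        # about 1.78
dfn, dfd = n_L - 1, n_S - 1                # (9, 7)
p_folded = 2 * stats.f.sf(F, dfn, dfd)     # two-sided (folded) p-value
print(round(F, 2), dfn, dfd, round(p_folded, 3))
```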
26. Independent sample t-test: Equal variance
• X1bar=$26,800, s1=$600, n1=10
• X2bar=$25,400, s2=$450, n2=8
• Since 5.47>2.58 and p-value <.01, reject the H0
at the .01 level.
$s_{pool}^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{(10-1)600^2 + (8-1)450^2}{10 + 8 - 2} = 291{,}093.75, \qquad s_{pool} = 539.5$
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_{pool}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{26{,}800 - 25{,}400}{539.5\sqrt{\dfrac{1}{10} + \dfrac{1}{8}}} = 5.47 \sim t(10 + 8 - 2)$
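A minimal sketch reproducing this pooled-variance test from the summary statistics; scipy.stats.ttest_ind_from_stats with equal_var=True performs the same calculation and also returns the p-value.

```python
# Pooled-variance (equal variance) t-test from summary statistics
import math
from scipy import stats

x1, s1, n1 = 26800.0, 600.0, 10
x2, s2, n2 = 25400.0, 450.0, 8

sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)    # pooled variance, 291093.75
t = (x1 - x2) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))  # about 5.47
df = n1 + n2 - 2

# Cross-check with SciPy
t_sp, p_sp = stats.ttest_ind_from_stats(x1, s1, n1, x2, s2, n2, equal_var=True)
print(round(t, 2), df, round(t_sp, 2), p_sp)
```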
27. Independent sample t-test: Unequal variance
• Compare means of two independent samples
that have different variances (if the null
hypothesis of the F-test is rejected)
• The null hypothesis is µ1-µ2=c (often 0)
• Individual variances need to be used
• Degrees of freedom is approximated; not
necessarily an integer
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim t(df_{Satterthwaite})$
28. Independent sample t-test: Unequal variance
• Approximation of degrees of freedom
• Not necessarily an integer
• Satterthwaite’s approximation (common)
• Cochran-Cox’s approximation
• Welch’s approximation
$df_{Satterthwaite} = \dfrac{(n_1 - 1)(n_2 - 1)}{(n_2 - 1)c^2 + (1 - c)^2(n_1 - 1)}, \qquad c = \dfrac{s_1^2/n_1}{s_1^2/n_1 + s_2^2/n_2}$
29. Independent sample t-test: Unequal variance
• Example
• X1bar=191, s1=38, n1=8
• X2bar=199, s2=12, n2=10
• F-test: F 10.03 (7, 9) is larger than CV 4.20,
indicating unequal variances. Reject H0 of
equal variance at the .05 level.
• Therefore, we have to use individual variances
$F = \dfrac{s_L^2}{s_S^2} = \dfrac{38^2}{12^2} = 10.03 \sim F(8-1,\; 10-1)$
30. Independent sample t-test: Unequal variance
• Example
• X1bar=191, s1=38, n1=8
• X2bar=199, s2=12, n2=10
• The test statistic |-.57| is small.
• The CV is 2.365 for 7 (8-1) degrees of freedom, so we do not
reject the null hypothesis.
• However, we need the approximation of degrees of
freedom to get a more reliable df.
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{191 - 199}{\sqrt{\dfrac{38^2}{8} + \dfrac{12^2}{10}}} = -0.57 \sim t(df_{Satterthwaite})$
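A minimal sketch reproducing this unequal-variance comparison from the summary statistics; scipy.stats.ttest_ind_from_stats with equal_var=False applies the Welch/Satterthwaite-style correction to the degrees of freedom.

```python
# Unequal-variance (Welch) t-test from summary statistics
import math
from scipy import stats

x1, s1, n1 = 191.0, 38.0, 8
x2, s2, n2 = 199.0, 12.0, 10

t = (x1 - x2) / math.sqrt(s1**2 / n1 + s2**2 / n2)   # about -0.57

# Satterthwaite approximation of the degrees of freedom (formula above)
c = (s1**2 / n1) / (s1**2 / n1 + s2**2 / n2)
df = (n1 - 1) * (n2 - 1) / ((n2 - 1) * c**2 + (1 - c)**2 * (n1 - 1))

# Cross-check with SciPy's Welch test
t_w, p_w = stats.ttest_ind_from_stats(x1, s1, n1, x2, s2, n2, equal_var=False)
print(round(t, 2), round(df, 1), round(t_w, 2), round(p_w, 3))
```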
31. Summary of Comparing Means
• One sample → One-sample t-test: H0: µ=c, df = n-1
• Two samples, dependent (paired) → Paired sample t-test: H0: µd=0, df = n-1
• Two samples, independent, equal variance → Independent sample t-test
(pooled variance): H0: µ1-µ2=0, df = n1+n2-2
• Two samples, independent, unequal variance → Independent sample t-test
(approximation of d.f.): H0: µ1-µ2=0, df approximated
32. ANOVA – Analysis of Variance
• Statistical technique specially designed
to test whether the means of more than 2
quantitative populations are equal.
33. ANOVA – Analysis of Variance
• ANOVA is similar to regression in that it is used to
investigate and model the relationship between a
dependent (response) variable and one or more
independent (explanatory) variables.
• It is different
• the independent variables are qualitative (categorical)
• no assumption is made about the nature of the relationship
• ANOVA extends the two-sample t-test for testing
the equality of two population means to the more
general null hypothesis that more than two means are
all equal, against the alternative that they are not all
equal.
34. ANOVA – Analysis of Variance
• An analysis of variance (ANOVA) is a statistical
method to detect whether there is a statistically significant
difference between the means of the populations.
• The null hypothesis in the simple ANOVA test is the
following:
• H0: μ1 = μ2 = … = μk
• Against the alternative
• H1: at least two μ’s differ
• The test statistic for ANOVA is the ANOVA F-statistic.
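A minimal sketch of the one-way ANOVA F-test in Python; the three groups are hypothetical scores used only for illustration.

```python
# One-way ANOVA F-test with SciPy (hypothetical groups)
from scipy import stats

group1 = [23, 25, 21, 27, 24]
group2 = [30, 28, 33, 29, 31]
group3 = [26, 24, 28, 25, 27]

F, p = stats.f_oneway(group1, group2, group3)
print(round(F, 2), round(p, 4))   # reject H0 if p is below the chosen alpha
```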
35. Example ANOVA research question
• Are there differences in the degree of religious
commitment between countries (UK, USA, and Australia)?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
36. Example ANOVA research question
• Do university students have different levels of satisfaction
for educational, social, and campus-related domains?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
37. Example ANOVA research questions
• Are there differences in the degree of religious
commitment between countries (UK, USA, and Australia)
and gender (male and female)?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
38. Example ANOVA research questions
• Does couples' relationship satisfaction differ between
males and females and before and after having children?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
39. Example ANOVA research questions
• Are there differences in university student satisfaction
between males and females (gender) after controlling for
level of academic performance?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
40. ANOVA
• Variance can be separated into two major components
• Within groups – variability or differences in particular groups
(individual differences)
• Between groups - differences depending on what group one is in or
what treatment was received
41. F test
• ANOVA partitions the sums of squares (variance from the
mean) into:
• Explained variance (between groups)
• Unexplained variance (within groups) – or
• error variance
F = ratio between explained & unexplained variance
p = probability that the observed mean differences between
groups could be attributable to chance
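A minimal sketch of this partition of the sums of squares, using the same hypothetical groups as above; the F ratio is the between-groups mean square divided by the within-groups mean square.

```python
# Partitioning variance into between-group and within-group sums of squares
import numpy as np

groups = [np.array([23, 25, 21, 27, 24]),
          np.array([30, 28, 33, 29, 31]),
          np.array([26, 24, 28, 25, 27])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # explained
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)             # unexplained

a, N = len(groups), len(all_obs)
F = (ss_between / (a - 1)) / (ss_within / (N - a))
print(round(ss_between, 2), round(ss_within, 2), round(F, 2))
```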
43. One-way ANOVA - Assumptions
• Dependent variable (DV) must be: Interval
or ratio
• Normality: Normally distributed for all
groups
• Variance: Equal variance across all
groups (homogeneity of variance)
• Independence: Participants' data should be
independent of others' data
44. One-way ANOVA:
• Are there differences in satisfaction
levels between students who get
different grades?
48. ANOVA – Analysis of Variance
• The sample mean and standard deviation are
calculated by double-clicking the variable to select it
and going to View/Descriptive
Statistics/Tests/Stats by classification...
50. The ANOVA test in Eviews
• To determine whether to reject the null hypothesis or not
we focus on the highlighted ANOVA F-test output. The
column named Probability contains the p-value of interest.
Since the p-value is below 5%, we reject the null
hypothesis and conclude that there is a statistically
significant difference in weight between the groups.
51. Testing assumptions
• A number of assumptions must be met to ensure the
validity of the above analysis of variance.
• The following three assumptions will be checked in this
section
• 1) Homogeneity of variance
• 2) Normally distributed errors
• 3) Independent error terms
52. Homogeneity of variance (1)
•To test for homogeneity of variance
between the different groups in the
analysis, we use Levene’s test for
equality of variance.
53. Homogeneity of variance (1)
• Having EViews run Levene’s test is similar to
running the ANOVA test in the first place. Once again you
need to select the variable of interest and choose
• View /Descriptive Statistics /Tests/Equality Tests by
Classification...
54. Homogeneity of variance (1)
• We get a p-value of 0.67, which is well above any
reasonable level of significance. Therefore we cannot
reject the null hypothesis, and the assumption of homogeneity
of variance is considered satisfied.
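A minimal sketch of Levene's test in Python; the hypothetical groups stand in for the EViews classification used above.

```python
# Levene's test for homogeneity of variance across groups (hypothetical data)
from scipy import stats

group1 = [23, 25, 21, 27, 24]
group2 = [30, 28, 33, 29, 31]
group3 = [26, 24, 28, 25, 27]

W, p = stats.levene(group1, group2, group3)
print(round(W, 2), round(p, 3))   # a large p-value means homogeneity is not rejected
```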
56. Normally distributed errors
• Cross-group analysis is done using Q-Q plots
to determine whether or not the observations follow a
normal distribution when analyzed within their group. To
make this analysis in EViews, do the following:
• Select Quick > Graph from the top toolbar, which should
open the following window:
58. Normally distributed errors
• First you need to choose Categorical Graph from the
dropdown menu (1). Then select the specific graph
Quantile – Quantile (2), which is also known as the Q-Q
plot. To make EViews create a separate graph for each
outcome of the grouping variable, type the
grouping variable into the Across Graphs field.
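A minimal sketch of per-group normal Q-Q plots in Python with SciPy and matplotlib; the data dictionary is hypothetical.

```python
# Normal Q-Q plot for each group (hypothetical data)
import matplotlib.pyplot as plt
from scipy import stats

data = {"A": [23, 25, 21, 27, 24, 22, 26],
        "B": [30, 28, 33, 29, 31, 27, 32]}

fig, axes = plt.subplots(1, len(data), figsize=(8, 3))
for ax, (name, values) in zip(axes, data.items()):
    stats.probplot(values, dist="norm", plot=ax)   # Q-Q plot for one group
    ax.set_title(f"Group {name}")
plt.tight_layout()
plt.show()
```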
60. Independent error terms
• Checking the assumption of independent error terms is
done by making scatter plots of the variable of
interest against the observation numbers. This is done to
ensure that no pattern exists related to the order in which the sample
was collected.
• Select Quick/Graph, type the variable of interest in
the resulting window, and click OK.
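A minimal sketch of this order-of-collection check in Python; the series is hypothetical, standing in for the EViews variable of interest.

```python
# Scatter plot of the variable of interest against observation order
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=50, scale=5, size=60)   # hypothetical measurements
order = np.arange(1, len(y) + 1)           # order in which the data were collected

plt.scatter(order, y)
plt.xlabel("Observation number")
plt.ylabel("Variable of interest")
plt.title("Check for patterns related to collection order")
plt.show()
```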